Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Server giving back only a bit of the content after asking for header

OK, I'm having this weird problem: if I request the content after doing a HEAD request, the host returns only part of the content. If I don't do the HEAD request before requesting the content, I receive the page correctly. After some testing, I discovered that if I ask for the content again, it returns the rest that was missing. The target page is example.com. Here's the code:

# It isn't printing the whole page now. Why?

import socket
import sys

usr_choice = input("Do you choose to download only the header (1) or the header and body (2)? ")

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Connecting with the server (example.com)
s.connect(('example.com', 80))
s.settimeout(10)

s.send(b'HEAD /index.html HTTP/1.1\r\nHost: example.com\r\nUser-Agent: py-bot\r\n\r\n')

# "response_header" first holds the raw bytes we receive, then the decoded string, and (for choice 2) the list of header lines.

response_header = s.recv(512)
response_header = response_header.decode()
if usr_choice == "2":
    response_header = response_header.split("\r\n")
print(response_header)

# Reviewing the HTTP Header "Content-Length"
if usr_choice == "2":
    response_size = 0
    print(response_size)
    for i in response_header:
        if "Content-Length" in i:
            response_size+=int(i.replace("Content-Length: ",""))

    s.send(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nUser-Agent: py-bot\r\n\r\n")

    if response_size == 0:
        print("Header incomplete.")
        sys.exit(1)
    print(response_size)
    full_response = s.recv(8192)

    print(full_response.decode())

And here's what it outputs:

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
        
    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style>    
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior co

If I ask the host for the content again, it returns the rest of the page:

ordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
question from:https://stackoverflow.com/questions/65840162/server-giving-back-only-a-bit-of-the-content-after-asking-for-header


1 Answer


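Answering my own question: TCP is a byte stream, not a message protocol. The response travels in several segments, and a single recv() call returns only the bytes that have arrived so far (up to the requested maximum), not necessarily the whole response. That is why the second recv() returned the rest of the page. To get the full page, recv() has to be called in a loop until the number of body bytes given by Content-Length has been read.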
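
Here is a minimal sketch of such a loop. It assumes the server sends a Content-Length header and does not use chunked transfer encoding, and that only one response is in flight on the socket at a time. The helper name recv_http_response and its has_body and bufsize parameters are my own, made up for illustration; they are not part of the socket module.

```python
import socket


def recv_http_response(sock, has_body=True, bufsize=4096):
    """Read one HTTP response from sock and return (header_text, body_bytes).

    Hypothetical helper for illustration. Assumes a Content-Length header
    and no chunked transfer encoding.
    """
    data = b""
    # Keep reading until the blank line that ends the header block arrives.
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before headers were complete")
        data += chunk
    header_bytes, _, body = data.partition(b"\r\n\r\n")
    header_text = header_bytes.decode("iso-8859-1")

    # A HEAD response reports Content-Length but carries no body.
    if not has_body:
        return header_text, body

    # Header names are case-insensitive, so compare them in lower case.
    content_length = 0
    for line in header_text.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())

    # recv() may return fewer bytes than requested, so loop until done.
    while len(body) < content_length:
        chunk = sock.recv(min(bufsize, content_length - len(body)))
        if not chunk:
            raise ConnectionError("connection closed before body was complete")
        body += chunk
    return header_text, body
```

With the socket from the question, this would be called as recv_http_response(s, has_body=False) after sending the HEAD request, and as recv_http_response(s) after sending the GET request.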

