OK, I'm having a weird problem: if I ask for the content after doing a HEAD request, the host gives back only part of the content. If I don't do the HEAD request before asking for the content, I receive the page correctly. After some testing, I discovered that if I ask for the content again, it gives me the rest of the content that was missing. The target page is example.com. Here's the code:
#it isn't printing the whole website now. why.
import socket
import sys
usr_choice = str(input("Do you choose to only download the header (1) or the header and body? (2)"))
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Connecting with the server (example.com)
s.connect(('example.com', 80))
s.settimeout(10)
s.send(b'HEAD /index.html HTTP/1.1\r\nHost: example.com\r\nUser-Agent: py-bot\r\n\r\n')
# "response_header" first holds the raw data we receive, then the decoded text, then the parsed list of lines.
response_header = s.recv(512)
response_header = response_header.decode()
if usr_choice == "2":
    response_header = response_header.split("\r\n")
print(response_header)
# Reviewing the HTTP Header "Content-Length"
if usr_choice == "2":
    response_size = 0
    print(response_size)
    for i in response_header:
        if "Content-Length" in i:
            response_size += int(i.replace("Content-Length: ", ""))
    s.send(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nUser-Agent: py-bot\r\n\r\n")
    if response_size == 0:
        print("Header incomplete.")
        sys.exit(1)
    print(response_size)
    full_response = s.recv(8192)
    print(full_response.decode())
And here's what it outputs:
<!doctype html>
<html>
<head>
<title>Example Domain</title>
<meta charset="utf-8" />
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<style type="text/css">
body {
background-color: #f0f0f2;
margin: 0;
padding: 0;
font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}
div {
width: 600px;
margin: 5em auto;
padding: 2em;
background-color: #fdfdff;
border-radius: 0.5em;
box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
}
a:link, a:visited {
color: #38488f;
text-decoration: none;
}
@media (max-width: 700px) {
div {
margin: 0 auto;
width: auto;
}
}
</style>
</head>
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior co
If I ask the host for the content again, it gives back the rest of the page:
ordination or asking for permission.</p>
<p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
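One possible explanation, offered as a sketch and not a confirmed diagnosis: `recv(8192)` returns whatever bytes have arrived so far, up to 8192, and not necessarily the whole response, which would match the second read returning the rest of the page. The helper below keeps calling `recv` until the header block and `Content-Length` bytes of body have arrived. The names `read_http_response`, `bufsize` and `expect_body` are made up for illustration, and the sketch assumes the server sends a `Content-Length` header and no chunked transfer encoding. A local `socket.socketpair()` stands in for the server so the example runs without network access.

```python
import socket


def read_http_response(sock, bufsize=4096, expect_body=True):
    """Read one HTTP response from sock and return (header, body) as bytes.

    Hypothetical helper: assumes a Content-Length header and does not
    handle chunked transfer encoding.
    """
    data = b""
    # recv() may return fewer bytes than requested, so loop until the
    # blank line that ends the header block has arrived.
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before headers were complete")
        data += chunk
    header, _, body = data.partition(b"\r\n\r\n")

    if not expect_body:
        # A HEAD response carries Content-Length but no body bytes.
        return header, b""

    # Find Content-Length case-insensitively, skipping the status line.
    content_length = 0
    for line in header.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())

    # Keep reading until the whole body has arrived.
    while len(body) < content_length:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before body was complete")
        body += chunk
    return header, body[:content_length]


# Local demonstration: one end of a socket pair plays the server, and a
# tiny bufsize forces many recv() calls for a single response.
client, server = socket.socketpair()
payload = b"<p>" + b"x" * 100 + b"</p>"
head = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(payload)
server.sendall(head + payload)
header, body = read_http_response(client, bufsize=16)
print(len(body), "of", len(payload), "body bytes received")
client.close()
server.close()
```

In the script above, this would mean calling the helper with `expect_body=False` after the HEAD request and with the default after the GET request, instead of a single `s.recv(...)` each time.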
question from:
https://stackoverflow.com/questions/65840162/server-giving-back-only-a-bit-of-the-content-after-asking-for-header