Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

python - BeautifulSoup trying to remove HTML data from list

As the title says, I am trying to strip the HTML from the printed output so that only the text and my separators (| and -) remain. Right now the output includes the span markup and other tags that I would like to remove. This code runs inside a loop over many pages, so I cannot search for specific text values because they change from page to page. The page structure stays the same, though, which is why the list indexes I print stay the same. What would be the easiest way to clean up the output? Here is the relevant code:

        infoLink = driver.find_element_by_xpath("//a[contains(@href, '?tmpl=component&detail=true&parcel=')]").click()
        driver.switch_to.window(driver.window_handles[1])
        aInfo = driver.current_url
        data = requests.get(aInfo)
        src = data.text
        soup = BeautifulSoup(src, "html.parser")
        parsed = soup.find_all("td")
        for item in parsed:
            Original = (parsed[21])
            Owner = parsed[13]
            Address = parsed[17]
            print (*Original, "|",*Owner, "-",*Address)

Example output is:

<span class="detail-text">123 Main St</span> | <span class="detail-text">Banner,Bruce</span> - <span class="detail-text">1313 Mockingbird Lane<br>Santa Monica, CA  90405</br></span>

Thank you!

question from:https://stackoverflow.com/questions/66067920/beautifulsoup-trying-to-remove-html-data-from-list

1 Answer

To get only the text between the tags, use get_text(). Make sure each cell you index actually exists, otherwise the lookup fails with an IndexError before get_text() is ever called:

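# No loop is needed here: the cells are picked by fixed index.
Original = parsed[21].get_text(strip=True)
Owner = parsed[13].get_text(strip=True)
# A separator keeps the two address lines apart where the <br> was.
Address = parsed[17].get_text(" ", strip=True)
print(Original, "|", Owner, "-", Address)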
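
For a self-contained illustration, here is a minimal sketch that applies get_text() to markup copied from the example output in the question. The SAMPLE_HTML string, the cell_text helper, and the indexes 0 to 2 are assumptions made for this sketch; the real page uses indexes 21, 13 and 17, and its surrounding table markup is not shown in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: three cells built from the
# example output in the question (the table wrapper is an assumption).
SAMPLE_HTML = """
<table><tr>
<td><span class="detail-text">123 Main St</span></td>
<td><span class="detail-text">Banner,Bruce</span></td>
<td><span class="detail-text">1313 Mockingbird Lane<br>Santa Monica, CA  90405</span></td>
</tr></table>
"""


def cell_text(cells, index, default=""):
    """Return the text of cells[index] (non-negative index), or default if missing."""
    if index >= len(cells):
        return default
    # The " " separator keeps text on either side of a <br> from running together.
    return cells[index].get_text(" ", strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
cells = soup.find_all("td")

original = cell_text(cells, 0)
owner = cell_text(cells, 1)
address = cell_text(cells, 2)
# Index 21 does not exist in this small sample, so the default is returned.
missing = cell_text(cells, 21, default="n/a")

print(original, "|", owner, "-", address)
```

Guarding the index in one small helper keeps the scraping loop from stopping on a page that happens to have fewer td cells than expected; whether a silent default is preferable to an exception depends on how the results are used.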
