python 3.x - Beautiful Soup Nested Loops

Question

asked Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I want to build a list of all of the firms featured on this list. I expected each winner to have its own section in the HTML, but it looks like several winners are grouped together across multiple divs. How would you recommend approaching this? I was able to pull all of the divs, but I don't know how to loop through them properly. Thanks!

import requests
from bs4 import BeautifulSoup
import csv

request = requests.get("https://growthcapadvisory.com/growthcaps-top-40-under-40-growth-investors-of-2020/")
text = request.text

soup = BeautifulSoup(text, 'html.parser')
element = soup.find()

person = soup.find_all('div', class_="under40")

Question from: https://stackoverflow.com/questions/65622852/beautiful-soup-nested-loops

1 Answer

深蓝 · Answer 1 · 2021-10-06T18:54:39+0000

This solution uses CSS selectors:

import requests
from bs4 import BeautifulSoup

# requests.get returns a Response object, so name it accordingly
response = requests.get("https://growthcapadvisory.com/growthcaps-top-40-under-40-growth-investors-of-2020/")
text = response.text

soup = BeautifulSoup(text, 'html.parser')
# Older Soup Sieve versions do not recognize :-soup-contains; use :contains there instead
firm_tags = soup.select('h5:-soup-contains("Firm") strong')
# Extract the text from the selected bs4 Tag objects
firms = [tag.text for tag in firm_tags]
# Strip any surrounding whitespace
clean_firms = [f.strip() for f in firms]

It works by selecting every strong tag nested inside an h5 tag whose text contains the word "Firm".
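To see the selector in isolation, here is a minimal sketch that runs against a small inline HTML string instead of the live page. The markup in SAMPLE_HTML, including the names and firms in it, is hypothetical and only assumed to resemble the page; the real structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="under40">
  <h5>Name: <strong>Jane Doe</strong></h5>
  <h5>Firm: <strong> Example Capital </strong></h5>
  <h5>Name: <strong>John Roe</strong></h5>
  <h5>Firm: <strong>Sample Partners</strong></h5>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Match <strong> tags that sit inside an <h5> whose text contains "Firm".
firm_tags = soup.select('h5:-soup-contains("Firm") strong')

# get_text(strip=True) returns the tag's text with surrounding whitespace removed.
clean_firms = [tag.get_text(strip=True) for tag in firm_tags]
print(clean_firms)
```

The "Name" headings are skipped because their text never contains "Firm", so only the firm names end up in clean_firms.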

See the Soup Sieve documentation for more information on bs4's CSS selectors.
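For comparison, the nested-loop approach the question asks about might look roughly like the sketch below. It reuses the under40 class from the question's code, but the markup in SAMPLE_HTML is hypothetical, and the h5/strong layout is an assumption carried over from the selector above rather than something confirmed against the page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="under40">
  <h5>Firm: <strong>Example Capital</strong></h5>
  <h5>Firm: <strong>Sample Partners</strong></h5>
</div>
<div class="under40">
  <h5>Title: <strong>Partner</strong></h5>
  <h5>Firm: <strong>Demo Ventures</strong></h5>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

firms = []
# Outer loop: each div that groups several winners together.
for div in soup.find_all("div", class_="under40"):
    # Inner loop: every h5 heading inside that div.
    for heading in div.find_all("h5"):
        if "Firm" not in heading.get_text():
            continue  # skip headings that do not describe a firm
        strong = heading.find("strong")
        if strong is not None:
            firms.append(strong.get_text(strip=True))

print(firms)
```

The CSS selector does the same filtering in a single call, which is why the answer prefers it; the explicit loops are mainly useful when each div needs extra per-group handling.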
