I am trying to extract the text inside a div tag using the BeautifulSoup package in Python.
For example, I want to extract the text inside the <p> tags
and the text inside the <dt> and <dd> tags.
When I run the code, it crashes and displays the error below:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in
     60 # # # article_body = s.find('div', {'class' :'card-content t-small bt p20'}).text
     61 # text_info = s.find_all("div",{"class":"card-content is-spaced"})
---> 62 text_desc = text_info.find('div', attrs={'class':'card-content t-small bt p20'}).getText(strip=True)
     63
     64 print(f"{date_published} {title}\n{text_desc}\n", "-" * 80)

f:aienvlibsite-packagess4element.py in __getattr__(self, key)
   2172         """Raise a helpful exception to explain a common code fix."""
   2173         raise AttributeError(
-> 2174             "ResultSet object has no attribute '%s'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?" % key
   2175         )

AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
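
The message points at the likely cause: find_all() returns a ResultSet, which behaves like a list of tags, while find() returns a single tag (or None). Below is a minimal sketch of that difference. The HTML string and variable names are made up for illustration, and the built-in "html.parser" is used so the snippet runs without network access or lxml.

```python
from bs4 import BeautifulSoup

# Illustrative markup only; the class names mimic the ones in the question.
html = """
<div class="card-content is-spaced"><p>first</p></div>
<div class="card-content is-spaced"><p>second</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all() gives a list-like ResultSet, find() gives the first matching tag.
result_set = soup.find_all("div", class_="card-content")
single_tag = soup.find("div", class_="card-content")

# Iterate over (or index into) the ResultSet to reach the individual tags.
texts = [div.get_text(strip=True) for div in result_set]

# Calling a tag method on the ResultSet itself raises AttributeError.
try:
    result_set.find("p")
    raised = False
except AttributeError:
    raised = True
```

So either loop over the result of find_all(), or call find() when only one element is expected.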
html:
<div class="card-content t-small bt p20" style="max-height:50vh" data-viewsize='{"d":{"height": {"max": 1}}, "offset":"JobSearch.jobViewSize"}'>
<h2 class="h6">Job Description</h2>
<p>The Executive Chef has full knowledge and capability of managing the general operations of the kitchen, specialty outlets kitchen including Stewarding.</p>
<h2 class="h6 p10t">Skills</h2>
<p>? Provide, develop, train and maintain a professional workforce? Excellent in English both in oral and written.? Computer knowledge is required and good in correspondences and reports writing.</p>
<h2 class="h6 p10t">Job Details</h2>
<dl class="dlist is-spaced is-fitted t-small m0">
<div>
<dt>Job Location</dt>
<dd> Al Olaya, Riyadh , Saudi Arabia </dd>
</div>
<div>
<dt>Company Industry</dt>
<dd>Food & Beverage Production; Entertainment; Catering, Food Service, & Restaurant</dd>
</div>
<div>
<dt>Company Type</dt>
<dd>Employer (Private Sector)</dd>
</div>
<div>
<dt>Job Role</dt>
<dd>Hospitality and Tourism</dd>
</div>
<div>
<dt>Employment Type</dt>
<dd>Unspecified</dd>
</div>
<div>
<dt>Monthly Salary Range</dt>
<dd>$4,000 - $5,000</dd>
</div>
<div>
<dt>Number of Vacancies</dt>
<dd>1</dd>
</div>
</dl>
<h2 class="h6 p10t">Preferred Candidate</h2>
<dl class="dlist is-spaced is-fitted t-small m0">
<div>
<dt>Career Level</dt>
<dd>Management</dd>
</div>
<div>
<dt>Years of Experience</dt>
<dd>Min: 10 Max: 20</dd>
</div>
<div>
<dt>Residence Location</dt>
<dd> Riyadh, Saudi Arabia ; Algeria; Bahrain; Comoros; Djibouti; Egypt; Iraq; Jordan; Kuwait; Lebanon; Libya; Mauritania; Morocco; Oman; Palestine; Qatar; Saudi Arabia; Somalia; Sudan; Syria; Tunisia; United Arab Emirates; Yemen</dd>
</div>
<div>
<dt>Gender</dt>
<dd>Male</dd>
</div>
<div>
<dt>Age</dt>
<dd>Min: 26 Max: 55</dd>
</div>
</dl>
</div>
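
For markup shaped like the block above, one possible way to pull out the <p> text and the <dt>/<dd> pairs is sketched here. It assumes an abridged copy of that HTML (the paragraph text is shortened and only two detail rows are kept), and the names card, paragraphs and details are hypothetical.

```python
from bs4 import BeautifulSoup

# Abridged copy of the markup in the question, enough to cover each tag type.
html = """
<div class="card-content t-small bt p20">
  <h2 class="h6">Job Description</h2>
  <p>The Executive Chef has full knowledge of the kitchen.</p>
  <h2 class="h6 p10t">Job Details</h2>
  <dl class="dlist is-spaced is-fitted t-small m0">
    <div><dt>Job Location</dt><dd> Al Olaya, Riyadh , Saudi Arabia </dd></div>
    <div><dt>Company Type</dt><dd>Employer (Private Sector)</dd></div>
  </dl>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A CSS selector matches the classes regardless of their order in the attribute.
card = soup.select_one("div.card-content.t-small.bt.p20")

# Text of every <p> inside the card.
paragraphs = [p.get_text(strip=True) for p in card.find_all("p")]

# Pair each <dt> label with the <dd> value that follows it.
details = {}
for dt in card.find_all("dt"):
    dd = dt.find_next_sibling("dd")
    if dd is not None:
        details[dt.get_text(strip=True)] = dd.get_text(" ", strip=True)
```

Collecting the pairs into a dict keeps each label next to its value, which is usually easier to work with than one long string from getText().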
================================================
code:
import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    requests.get("https://www.bayt.com/en/international/jobs/executive-chef-jobs/").content,
    "lxml"
)

links = []
for a in soup.select("h2.m0.t-regular a"):
    # Compare the full URL, since that is what gets stored in links
    url = "https://www.bayt.com" + a['href']
    if url not in links:
        links.append(url)

for link in links:
    s = BeautifulSoup(requests.get(link).content, "lxml")
    # find_all() returns a ResultSet (a list of tags), not a single tag
    text_info = s.find_all("div", {"class": "card-content is-spaced"})
    # Line 62 in the traceback: calling .find() on the ResultSet raises the AttributeError
    text_desc = text_info.find('div', attrs={'class': 'card-content t-small bt p20'}).getText(strip=True)
    # date_published and title are not defined in this excerpt
    print(f"{date_published} {title}\n{text_desc}\n", "-" * 80)
question from:
https://stackoverflow.com/questions/65905303/how-to-extract-the-text-from-the-div-tag-using-beautifulsoup-and-python
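
A possible corrected version of the scraping loop is sketched below, not a verified fix against the live site. It assumes the job pages still use the class names shown in the question. The helper names collect_links, extract_description and scrape are hypothetical, and date_published and title are left out because the question does not show how they are obtained.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

BASE_URL = "https://www.bayt.com"


def collect_links(listing_html):
    """Return the unique absolute job URLs found in a listing page."""
    soup = BeautifulSoup(listing_html, "html.parser")
    links = []
    for a in soup.select("h2.m0.t-regular a[href]"):
        url = urljoin(BASE_URL, a["href"])
        if url not in links:
            links.append(url)
    return links


def extract_description(job_html):
    """Return the text of the description card, or None if it is missing."""
    soup = BeautifulSoup(job_html, "html.parser")
    # select_one() returns a single tag (or None), so tag methods work on it.
    card = soup.select_one("div.card-content.t-small.bt.p20")
    if card is None:
        return None
    return card.get_text(" ", strip=True)


def scrape(listing_url):
    """Fetch the listing page, then print the description of every job."""
    listing_html = requests.get(listing_url, timeout=30).content
    for link in collect_links(listing_html):
        text_desc = extract_description(requests.get(link, timeout=30).content)
        print(f"{link}\n{text_desc}\n", "-" * 80)
```

Calling scrape("https://www.bayt.com/en/international/jobs/executive-chef-jobs/") would run the whole flow. Keeping the parsing in functions that take HTML strings means they can be checked offline against a saved copy of a page.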