Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

Clicking on Get Data button for Monthly Settlement Statistics on nseindia.com doesn't fetch results using Selenium and Python

I am trying to scrape data from here.

After selecting Capital Market and the year 2019-20, I want to click Get Data.

I have used following code:

driver = webdriver.Chrome(executable_path=chrome_path, options=chrome_options)
driver.get(nse_cash_keystats_page)

driver.find_element_by_xpath("//select[@id='h_filetype']/option[text()='Capital Market ']").click()
driver.find_element_by_xpath("//select[@id='yearField']/option[text()='2019-2020']").click()

downloadButton = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//input[@type="image"][@src="/common/images/btn-get-data.gif"]')))
driver.execute_script("arguments[0].click();", downloadButton)

With the above code I am able to click Get Data, but no results are displayed.

Please help me. Thanks in advance.


1 Answer


I took your code, added a few tweaks, and ran the test as follows:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait, Select
    from selenium.webdriver.support import expected_conditions as EC
    
    options = webdriver.ChromeOptions() 
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_monthly_statistics.htm')
    Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"select#h_filetype")))).select_by_visible_text("Capital Market ")
    Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"select#yearField")))).select_by_visible_text("2019-2020")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.getdata-button#get[type='image'][src^='/common/images/btn-get-data.gif']"))).click()
    

Observation

Like you, I hit the same roadblock: no results are returned, as the screenshot shows:

[Screenshot: nseindia_Monthly_Statement]


Deep Dive

It seems the click() on the Get Data element does happen. However, while inspecting the DOM tree of the webpage, you will find that some of the <script> tags refer to JavaScript files with the keyword akam in their URLs. As an example:

  • <script type="text/javascript" src="https://www1.nseindia.com/akam/11/52349752" defer=""></script>
  • <noscript><img src="https://www1.nseindia.com/akam/11/pixel_52349752?a=dD01ZDZiMTA5OGQ0MDljYTYxN2RjMjc3MzBlN2YwMDQ0NjlkZDNiNTMzJmpzPW9mZg==" style="visibility: hidden; position: absolute; left: -999px; top: -999px;" /></noscript>

This is a clear indication that the website is protected by Bot Manager, an advanced bot detection service provided by Akamai, and that the response gets blocked.
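
The manual DOM inspection can also be sketched in code. The snippet below is a heuristic only, using nothing but the Python standard library: it scans a page's HTML for <script> and <img> tags whose src path starts with /akam/. That prefix is an assumption taken from the two tags quoted above, and the names AkamaiMarkerParser and find_akamai_markers are made up for this illustration. Finding no such tags does not prove a site is unprotected.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AkamaiMarkerParser(HTMLParser):
    """Collect src values of <script>/<img> tags whose URL path starts with /akam/.

    The /akam/ prefix is an assumption based on the tags quoted above.
    """

    def __init__(self):
        super().__init__()
        self.markers = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are routed here as well.
        if tag not in ("script", "img"):
            return
        src = dict(attrs).get("src")
        if src and urlparse(src).path.startswith("/akam/"):
            self.markers.append(src)


def find_akamai_markers(page_source):
    """Return the matching src URLs found in the given HTML string."""
    parser = AkamaiMarkerParser()
    parser.feed(page_source)
    parser.close()
    return parser.markers
```

With a live session, the HTML could be passed in as find_akamai_markers(driver.page_source); a non-empty result would match the observation above.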


Bot Manager

As per the article Bot Manager - Foundations:

[Image: akamai_detection]


Conclusion

So it can be concluded that the request for the data is detected as coming from a Selenium-driven WebDriver instance, and the response is blocked.


