I'm working on a project that extracts all the information from news articles (media websites). For this I'm using the newspaper3k library, which works quite well.
However, I have a problem with some URLs (redirect links). According to my research, newspaper3k does not follow the redirection; it only processes the URL that is passed to it as a parameter.
Here is an example of a link I would like to handle:
url = "wtm.actualite.20minutes.fr/redirection.html?m=3e2b20a2f1f6dd3c60608f54d7ad4dc5&c=fr&u=https%3A%2F%2Fwww.20minutes.fr%2Fmonde%2F2943823-20210103-bahamas-disparition-bateau-20-personnes-bord%3Fxtor%3DEREC-182-%5Bactualite%5D&dc=yt0U%2FI8COMJyjwQQ1fA2kVEXpoP0nsZydMTZS6jTm2DdKasFuV%2FVA7rEphhqMfGAy%2FlztUlVN4MJt5tg%2FQXfJwmXMRQL8g3Gfwhl%2BsjkkYmd%2BDxDUhb%2BpPRL%2BNsiDETNQeP3MmrQ6ATGJT%2Blf46Zg4DHd%2FzaXy%2B7UAuxatp2UcVd39HKuuMfQHmyDV%2BAxSAJrd4x5CxHqy3uTtZoQEjwGdZ%2FRtoa7YLOWLKhN9tg4TM%3D"
The goal is to resolve this URL to the final one (after redirection) and then pass that to newspaper3k.
I have tried the following solutions, but neither gives me the final URL:
1 - using the requests library as follows:
import requests
response = requests.get(url, verify=False, allow_redirects=True)
2 - using the mechanize library as follows:
import mechanize
br = mechanize.Browser()
resp = br.open(url)
I would like the same behaviour as when I use the webbrowser module, but without actually opening a browser:
import webbrowser
webbrowser.open_new(url)
and in the end obtain the right URL:
https://www.20minutes.fr/monde/2943823-20210103-bahamas-disparition-bateau-20-personnes-bord?xtor=EREC-182-[actualite]
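One thing I noticed: in this example the final URL appears to be embedded, percent-encoded, in the `u` query parameter of the redirect link. Below is a minimal sketch that decodes it with the standard library only. It assumes every such link carries its target in `u`, which I have not verified beyond this example; the helper name `extract_target_url` is made up for illustration, and the `dc` parameter is omitted from the sample link for brevity.

```python
from urllib.parse import parse_qs, urlsplit


def extract_target_url(tracking_url, param="u"):
    """Return the percent-decoded value of `param` from the query string, or None."""
    # urlsplit still separates the query even though the link has no scheme.
    query = urlsplit(tracking_url).query
    # parse_qs percent-decodes values and returns a list per parameter name.
    values = parse_qs(query).get(param)
    return values[0] if values else None


# Shortened version of the link above (the `dc` parameter is left out).
tracking_url = (
    "wtm.actualite.20minutes.fr/redirection.html"
    "?m=3e2b20a2f1f6dd3c60608f54d7ad4dc5&c=fr"
    "&u=https%3A%2F%2Fwww.20minutes.fr%2Fmonde%2F2943823-20210103-"
    "bahamas-disparition-bateau-20-personnes-bord"
    "%3Fxtor%3DEREC-182-%5Bactualite%5D"
)

target_url = extract_target_url(tracking_url)
print(target_url)
```

If that assumption holds, the resulting string could be passed to newspaper3k's `Article(...)` in place of the original link. It would not help for redirect links that do not expose their target in the query string.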
Thank you in advance for your reply :)