Monday, June 29, 2015

Why don't BeautifulSoup and lxml work?


I'm using the mechanize library to log in to a website. I checked, and the login works well. The problem is that I can't parse the output of response.read() with BeautifulSoup or lxml.

# BeautifulSoup
from bs4 import BeautifulSoup

response = browser.open(url)
source = response.read()
soup = BeautifulSoup(source, 'html.parser')  # source.txt doesn't work either
for link in soup.findAll('a', {'class': 'someClass'}):
    some_list.add(link)

This doesn't work; it doesn't find any tags at all. It works well when I use requests.get(url).
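For reference, here is a minimal stdlib-only sketch (in Python 3 syntax; the HTML bytes and class name are made up for illustration) of the same pipeline: the raw bytes from a read() call are decoded and then scanned for anchors carrying a given class. Forgetting to decode, or feeding the parser something other than the markup itself, is a common reason a parser "finds no tags":

```python
from html.parser import HTMLParser

# Hypothetical raw bytes, standing in for response.read()
raw = b'<html><body><a class="someClass" href="/x">X</a><a href="/y">Y</a></body></html>'

class LinkCollector(HTMLParser):
    """Collect href values of <a> tags carrying a given class attribute."""
    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == self.wanted_class:
            self.links.append(attrs.get("href"))

parser = LinkCollector("someClass")
parser.feed(raw.decode("utf-8"))  # decode the bytes before parsing
print(parser.links)  # ['/x']
```

This is only a sketch of the decode-then-parse step, not a substitute for BeautifulSoup's tree API.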

# lxml -> html
from lxml import html

response = browser.open(url)
source = response.read()
tree = html.fromstring(source)  # source.txt doesn't work either
print tree.text
like_pages = buyers = tree.xpath('//a[@class="UFINoWrap"]')  # /text() doesn't work either
print like_pages

It doesn't print anything. I know the problem has to do with the return type of response, since everything works with requests.get(url). What could I do? Could you please provide sample code where response.read() is used for HTML parsing?
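As a point of comparison, an attribute predicate like the XPath above also works in the stdlib's xml.etree.ElementTree on a well-formed snippet (the HTML string here is made up for illustration), which helps confirm the query itself is sound and the issue lies with the input being parsed:

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed markup, standing in for the decoded page source
source = '<html><body><a class="UFINoWrap" href="/p">Page</a><a href="/q">Other</a></body></html>'

tree = ET.fromstring(source)
# ElementTree supports the limited XPath subset used here: tag plus [@attrib="value"]
hits = tree.findall('.//a[@class="UFINoWrap"]')
print([a.text for a in hits])  # ['Page']
```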

By the way, what is the difference between the response and requests objects?

Thank you!

