Monday, June 29, 2015

Why don't BeautifulSoup and lxml work?


I'm using the mechanize library to log in to a website. I checked, and the login works well. The problem is that I can't parse the output of response.read() with BeautifulSoup or lxml.

# BeautifulSoup
from bs4 import BeautifulSoup

response = browser.open(url)
source = response.read()
soup = BeautifulSoup(source, 'html.parser')  # source.txt doesn't work either
for link in soup.findAll('a', {'class': 'someClass'}):
    some_list.add(link)

This doesn't work; it doesn't find any tags at all. It works well when I use requests.get(url).
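For reference, here is a minimal stdlib-only sketch (in Python 3 syntax; the HTML bytes and class name are made up for illustration) of the same pipeline: the raw bytes from a read() call are decoded and then scanned for anchors carrying a given class. Forgetting to decode, or feeding the parser something other than the markup itself, is a common reason a parser "finds no tags":

```python
from html.parser import HTMLParser

# Hypothetical raw bytes, standing in for response.read()
raw = b'<html><body><a class="someClass" href="/x">X</a><a href="/y">Y</a></body></html>'

class LinkCollector(HTMLParser):
    """Collect href values of <a> tags carrying a given class attribute."""
    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == self.wanted_class:
            self.links.append(attrs.get("href"))

parser = LinkCollector("someClass")
parser.feed(raw.decode("utf-8"))  # decode the bytes before parsing
print(parser.links)  # ['/x']
```

This is only a sketch of the decode-then-parse step, not a substitute for BeautifulSoup's tree API.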

# lxml -> html
from lxml import html

response = browser.open(url)
source = response.read()
tree = html.fromstring(source)  # source.txt doesn't work either
print tree.text
like_pages = buyers = tree.xpath('//a[@class="UFINoWrap"]')  # /text() doesn't work either
print like_pages

It doesn't print anything. I know the problem has to do with the return type of response, since everything works with requests.get(url). What could I do? Could you please provide sample code where response.read() is used for HTML parsing?
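As a point of comparison, an attribute predicate like the XPath above also works in the stdlib's xml.etree.ElementTree on a well-formed snippet (the HTML string here is made up for illustration), which helps confirm the query itself is sound and the issue lies with the input being parsed:

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed markup, standing in for the decoded page source
source = '<html><body><a class="UFINoWrap" href="/p">Page</a><a href="/q">Other</a></body></html>'

tree = ET.fromstring(source)
# ElementTree supports the limited XPath subset used here: tag plus [@attrib="value"]
hits = tree.findall('.//a[@class="UFINoWrap"]')
print([a.text for a in hits])  # ['Page']
```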

By the way, what is the difference between the response and requests objects?

Thank you!

