Python - BeautifulSoup and lxml.html - what to prefer?


I am working on a project that will involve parsing HTML.

After searching around, I found two probable options: BeautifulSoup and lxml.html.

Is there any reason to prefer one over the other? I have used lxml for XML for some time and feel more comfortable with it, but BeautifulSoup seems to be much more common.

I know I should use the one that works for me, but I am looking for personal experiences with both.

The simple answer, IMO, is that if you trust your source to be well-formed, go with the lxml solution. Otherwise, BeautifulSoup all the way.
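For the well-formed case, a minimal sketch of the lxml.html route might look like the following. It assumes the lxml package is installed; the HTML string and the example.com URLs are made up for illustration.

```python
import lxml.html

# Hypothetical, well-formed input used only for this illustration.
html = (
    "<html><body>"
    "<p>See <a href='https://example.com/a'>A</a> and "
    "<a href='https://example.com/b'>B</a>.</p>"
    "</body></html>"
)

# Parse the string into an element tree.
doc = lxml.html.fromstring(html)

# XPath is available directly on the parsed elements.
links = doc.xpath("//a/@href")
texts = [a.text_content() for a in doc.xpath("//a")]

print(links)
print(texts)
```

If you already know lxml from XML work, this is the same element and XPath interface applied to HTML.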

Edit:

This answer is three years old now. It's worth noting, as Jonathan Vanasco does in the comments, that BeautifulSoup4 now supports using lxml as its internal parser, so you can use the advanced features and interface of BeautifulSoup without most of the performance hit, if you wish (although I still reach straight for lxml myself; perhaps it's just force of habit :)).
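As a rough sketch of that combination, assuming both the beautifulsoup4 and lxml packages are installed, you select the parser by passing its name as the second argument to the BeautifulSoup constructor. The HTML fragment here is invented and deliberately leaves its `<li>` tags unclosed.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with unclosed <li> tags, for illustration only.
html = "<ul><li><a href='/one'>One</a><li><a href='/two'>Two</a></ul>"

# "lxml" asks BeautifulSoup to use lxml's HTML parser internally.
soup = BeautifulSoup(html, "lxml")

# The usual BeautifulSoup interface works regardless of the parser chosen.
hrefs = [a["href"] for a in soup.find_all("a")]

print(hrefs)
```

Swapping "lxml" for "html.parser" would use the parser from the Python standard library instead, which avoids the extra dependency at some cost in speed.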

