python - BeautifulSoup and lxml.html

python - BeautifulSoup and lxml.html - what to prefer?

- May 15, 2013

This question already has an answer here:

Parsing HTML in Python - lxml or BeautifulSoup? Which of these is better for what kinds of purposes? (7 answers)

I am working on a project that will involve parsing HTML.

After searching around, I found two probable options: BeautifulSoup and lxml.html.

Is there any reason to prefer one over the other? I have used lxml for XML for some time and feel more comfortable with it, but BeautifulSoup seems to be much more common.

I know I should use the one that works for me, but I am looking for personal experiences with both.

The simple answer, IMO, is that if you trust your source to be well-formed, go with the lxml solution. Otherwise, BeautifulSoup all the way.
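For a feel of what the lxml route looks like, here is a minimal sketch. It assumes lxml is installed (`pip install lxml`); the `PAGE` markup and the `links_with_lxml` helper are made up for illustration and are not from the original question.

```python
from lxml import html  # third-party: pip install lxml

# Hypothetical, well-formed sample markup (illustration only).
PAGE = """
<html>
  <body>
    <h1>Example</h1>
    <a href="/first">First</a>
    <a href="/second">Second</a>
  </body>
</html>
"""


def links_with_lxml(markup):
    """Return (text, href) pairs for every <a> element that has an href."""
    tree = html.fromstring(markup)
    # XPath is one of the features that makes lxml convenient for trusted input.
    return [
        (a.text_content().strip(), a.get("href"))
        for a in tree.xpath("//a[@href]")
    ]


print(links_with_lxml(PAGE))
```

If you already know lxml from XML work, this should feel familiar: the result is an element tree, so `xpath()`, `get()` and friends behave the way they do for XML documents.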

Edit:

This answer is three years old now; it's worth noting, as Jonathan Vanasco does in the comments, that BeautifulSoup4 now supports using lxml as the internal parser, so you can use the advanced features and interface of BeautifulSoup without most of the performance hit, if you wish (although I still reach straight for lxml myself; perhaps it's just force of habit :)).
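As a rough sketch of that combination: BeautifulSoup4 takes the parser name as its second argument, so `"lxml"` selects lxml as the backend. This assumes both `beautifulsoup4` and `lxml` are installed; the `MESSY` markup and the `links_with_bs4` helper are hypothetical, and the fallback to the standard library's `html.parser` is my own addition, not something the answer prescribes.

```python
from bs4 import BeautifulSoup, FeatureNotFound  # third-party: pip install beautifulsoup4

# Hypothetical sloppy markup: no <html>/<body>, unclosed <p> tags (illustration only).
MESSY = '<p>Intro <a href="/first">First</a><p>More <a href="/second">Second</a>'


def links_with_bs4(markup, parser="lxml"):
    """Return (text, href) pairs using BeautifulSoup with the given parser backend."""
    try:
        soup = BeautifulSoup(markup, parser)
    except FeatureNotFound:
        # Assumption: if the requested backend is missing, the slower
        # built-in parser is an acceptable substitute.
        soup = BeautifulSoup(markup, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


print(links_with_bs4(MESSY))
```

The calling code stays the same whichever backend is used, which is the point of the edit above: you keep the BeautifulSoup interface and swap the parser underneath.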
