python - BeautifulSoup and lxml.html - what to prefer?
I am working on a project that will involve parsing HTML.
After searching around, I found two probable options: BeautifulSoup and lxml.html.
Is there any reason to prefer one over the other? I have used lxml for XML for some time and feel more comfortable with it, but BeautifulSoup seems to be more common.
I know I should use the one that works for me, but I am looking for personal experiences with both.
The simple answer, IMO, is that if you trust your source to be well-formed, go with the lxml solution. Otherwise, BeautifulSoup is the way to go.
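For a rough idea of what the lxml route looks like, here is a minimal sketch. The HTML snippet and the variable names are made up for illustration, and it assumes lxml is installed (`pip install lxml`).

```python
import lxml.html

# Hypothetical, well-formed input used only for illustration.
html = (
    "<html><body>"
    "<p class='intro'>Hello</p>"
    "<a href='/a'>A</a><a href='/b'>B</a>"
    "</body></html>"
)

# Parse the string into an element tree.
doc = lxml.html.fromstring(html)

# XPath queries return lists of matching strings or elements.
links = doc.xpath("//a/@href")
intro = doc.xpath("//p[@class='intro']/text()")

print(links)
print(intro)
```

If you already know XPath from working with XML in lxml, the same queries carry over to HTML documents.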
Edit:
This answer is three years old now; it's worth noting, as Jonathan Vanasco does in the comments, that BeautifulSoup4 now supports using lxml as the internal parser, so you can use the advanced features and interface of BeautifulSoup without most of the performance hit, if you wish (although I still reach straight for lxml myself; perhaps it's just force of habit :)).
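As a sketch of that combination: passing "lxml" as the second argument to the BeautifulSoup constructor selects lxml as the underlying parser. The sample markup below is invented for illustration (note the unclosed final tag), and the example assumes both `beautifulsoup4` and `lxml` are installed.

```python
from bs4 import BeautifulSoup

# Hypothetical, slightly sloppy input: the last <a> tag is never closed.
html = "<p class='intro'>Hello</p><a href='/a'>A</a><a href='/b'>B"

# "lxml" tells BeautifulSoup to use lxml as its internal parser.
soup = BeautifulSoup(html, "lxml")

# The usual BeautifulSoup interface is available on top of it.
hrefs = [a["href"] for a in soup.find_all("a")]
intro_text = soup.find("p", class_="intro").get_text()

print(hrefs)
print(intro_text)
```

Swapping "lxml" for "html.parser" falls back to the parser in the standard library, which needs no extra install but is generally slower.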