Wed Feb 08 2012 11:02 Beautiful Soup 4 Beta 4:
Beautiful Soup 4 beta 4 is out! You can install it with easy_install beautifulsoup4 or pip install beautifulsoup4. You can also download the tarball or check out the Bazaar repository.
Big changes:
html.parser is now reliable enough to use on its own. You don't need to install lxml or html5lib just to parse bad HTML (but lxml is still a lot faster). The forthcoming Python 2.7.3 should also work this way.
This is of course a feature of Python, but due to a pretty bad bug in html.parser, I wasn't taking advantage of it. I worked with Ezio Melotti to monkeypatch that bug from within BS, and now we're back in the very good situation of not needing any external dependencies.
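As a rough illustration of the no-external-dependencies setup, here is a minimal sketch that asks for the built-in parser by name. The broken markup is made up for the example, and how the tree gets repaired can vary between parsers and versions.

```python
from bs4 import BeautifulSoup

# Made-up markup with unclosed <p> and <b> tags.
broken = "<p>Unclosed paragraph<b>bold text"

# "html.parser" selects Python's built-in parser, so neither lxml
# nor html5lib needs to be installed for this to run.
soup = BeautifulSoup(broken, "html.parser")

# The unclosed tags are still reachable in the parsed tree.
bold_text = soup.b.string
print(bold_text)
```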
Tags created with new_tag() will follow the rules of whatever tree builder was used to create the original soup. For example, a new <p> tag will look like "<p />" if you're dealing with XML, but it'll look like "<p></p>" if you're dealing with HTML.
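The difference can be sketched like this. The helper render_new_p is a hypothetical name used only for this example, and the "xml" feature assumes lxml is installed, so that case is guarded. The exact spelling of the self-closing form may differ between versions.

```python
from bs4 import BeautifulSoup, FeatureNotFound

def render_new_p(markup, features):
    """Hypothetical helper: build a soup, create a <p> tag, return its string form."""
    soup = BeautifulSoup(markup, features)
    return str(soup.new_tag("p"))

# HTML tree builder: an empty <p> gets separate open and close tags.
html_form = render_new_p("<div></div>", "html.parser")
print(html_form)

# XML tree builder: needs lxml, so fall back to None when it is missing.
try:
    xml_form = render_new_p("<root/>", "xml")
except FeatureNotFound:
    xml_form = None
print(xml_form)
```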
Added a new_string() method to go along with new_tag().
Added PageElement.insert_before() and PageElement.insert_after().
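Here is a small sketch of how new_string(), insert_before() and insert_after() might be combined. The markup and text are invented for the example.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>bold</b></p>", "html.parser")
b = soup.b

# new_string() creates a text node that can be placed in the tree.
b.insert_before(soup.new_string("before "))

# Build a new <em> tag, give it some text, and put it right after <b>.
em = soup.new_tag("em")
em.append(soup.new_string("after"))
b.insert_after(em)

result = str(soup)
print(result)
```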
Replaced the substitute_html_entities argument with the more general formatter argument. You can do all sorts of crazy stuff with this.
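To give a feel for the formatter argument, here is a sketch based on how it behaves in current Beautiful Soup 4 releases, which may not match this beta exactly. It uses the "minimal" and "html" named formatters and a callable applied to each string. The markup is made up for the example.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>caf&eacute; &amp; co</p>", "html.parser")

# "minimal" escapes only what is needed to keep the markup valid.
minimal = soup.p.decode(formatter="minimal")

# "html" converts characters to named HTML entities where it can.
named = soup.p.decode(formatter="html")

# A callable receives each string and returns the text to output.
# Nothing is escaped unless the callable does it itself.
shouted = soup.p.decode(formatter=lambda text: text.upper())

print(minimal)
print(named)
print(shouted)
```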
