[Comments] (1) : Today's my day for publicity, as Beautiful Soup gets a favorable mention in Uche Ogbuji's xml.com column, which is actually about converting HTML into XML. Beautiful Soup tries to reduce the number of times you have to convert HTML into XML, but if you do have to, there are tools for it.

libxml2's tree object looks like something I could use as a model for a future version of Beautiful Soup (I do need to rewrite a big chunk of it; I'm painfully aware of numerous embarrassing flaws, but it's still the best screen-scraping library IMO).

Posted by Nick Moffitt at Thu Sep 09 2004 19:32

Of course, I usually just use xml2.

