(2) Mon Jun 29 2015 09:36 Beautiful Soup 4.4.0 beta:
I've found an agent for Situation Normal and the book is out to publishers and I don't have to think about it for a while. As seems to be my tradition after finishing a big project, I went through the accumulated Beautiful Soup backlog and closed it out. I've put out a beta release which I'd like you to try out and report any problems.
I've fixed 17 bugs, added some minor new features, and changed the implementations of __copy__ and __repr__ to work more like you'd expect from Python objects. But in my mind the major new change is this: I've added a warning that displays when you create a BeautifulSoup object without explicitly specifying a parser:
UserWarning: No parser was explicitly specified, so I'm using the
best available HTML parser for this system ("lxml"). This usually
isn't a problem, but if you run this code on another system, or in a
different virtual environment, it may use a different parser and
behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml")
It's a little annoying to get this message, but it's also annoying to have your code silently behave differently because you copied it to a machine that didn't have lxml installed, and it's also annoying when I have to check pretty much every reported bug to see whether this is the problem. Whenever I think I can eliminate a class of support question with a warning, I put in the warning. It saves everybody time.
The other possibility: now that Python's built-in HTMLParser is decent, I could make it so that it's always the default unless you specify another parser. This would cause a big one-time wrench, as even machines which have lxml installed would start using HTMLParser, but once it shook out the problem would be solved. I might still do that, but I think I'll give everyone about a year to get rid of this annoying warning.
Anyway, try out the beta. Unless there's a big problem I'll be releasing 4.4.0 on Friday.
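For anyone curious how a "no parser specified" warning like the one quoted above can be wired up, here is a minimal sketch in plain Python. It is an illustration of the pattern only, not Beautiful Soup's actual code: the names choose_parser and PREFERRED are made up for this example, and the preference order is an assumption.

```python
import importlib.util
import warnings

# Hypothetical preference order for this sketch. "html.parser" ships with
# Python itself, so it serves as the fallback that is always available.
PREFERRED = ["lxml", "html5lib"]


def choose_parser(requested=None):
    """Return a parser name, warning if the caller left the choice to us."""
    if requested is not None:
        # An explicit choice behaves the same on every machine: no warning.
        return requested

    best = "html.parser"
    for name in PREFERRED:
        # find_spec() checks whether the package is installed without
        # importing it.
        if importlib.util.find_spec(name) is not None:
            best = name
            break

    warnings.warn(
        'No parser was explicitly specified, so I\'m using "%s". '
        "On another system this may pick a different parser." % best,
        UserWarning,
        stacklevel=2,  # point the warning at the caller's line
    )
    return best
```

Calling choose_parser() with no argument warns and returns whatever is installed on that machine, while choose_parser("html.parser") stays silent. The same idea applies to real code: passing the parser name as the second argument, as in BeautifulSoup(markup, "html.parser"), pins the behavior regardless of what else is installed.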
- Comments:
Posted by Brendan at Mon Jun 29 2015 12:00
Buried lede in a shallow grave.
Posted by Danny at Wed Jul 08 2015 14:24
This error message looks different than what I'm getting (using beautifulsoup4, version: 4.4.0):
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml-xml")
markup_type=markup_type))
The error message is confusing - where do I assign markup_type? I tried specifying the markup_type as an argument, but then I get a keyword argument error. My code looks like this:
soup = BeautifulSoup(open(final), ["lxml-xml","xml"])