(4) Mon May 29 2006 13:50 PST Beautiful Soup Trickery:
Here's a nice bit of trickery I discovered, based on code sent to me by Staffan Malmgren. A common task among the target audience for Beautiful Soup is to strip HTML tags from a document. It turns out this is a one-liner:
''.join([e for e in soup.recursiveChildGenerator() if isinstance(e,unicode)])
Here, soup is your soup object. I really like the if conditional in list comprehensions; I don't think Ruby has anything like it. I can think of ways to simulate it but they kill its simplicity, which is what I like about it.
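To see the filtering trick in isolation, here is a minimal sketch that runs without Beautiful Soup installed. The Tag class and the recursive_children and strip_tags functions are hypothetical stand-ins invented for this illustration, not Beautiful Soup's API, and the sketch assumes Python 3, where text nodes are str rather than unicode.

```python
class Tag:
    """Hypothetical stand-in for a parsed element (not Beautiful Soup's class)."""

    def __init__(self, name, *children):
        self.name = name
        # Children are either nested Tag objects or plain strings (text nodes).
        self.children = list(children)


def recursive_children(node):
    """Yield every descendant of node, depth-first, in document order."""
    for child in node.children:
        yield child
        if isinstance(child, Tag):
            # Descend into nested tags so their text nodes are visited too.
            yield from recursive_children(child)


def strip_tags(root):
    # The trailing `if` keeps only string nodes and drops the Tag objects.
    return ''.join(e for e in recursive_children(root) if isinstance(e, str))


doc = Tag('p', 'Hello, ', Tag('b', 'bold ', Tag('i', 'and italic')), ' world.')
print(strip_tags(doc))
```

The walk visits tags and strings alike; the if clause is the only thing deciding what survives into the joined result, which is why the one-liner stays so short.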
- Comments:
Posted by anonymous at Mon May 29 2006 21:15
''.join(e for e in soup.recursiveChildGenerator() if isinstance(e,unicode))
nicer?
Posted by Leonard at Mon May 29 2006 22:16
Is that valid Python? It doesn't work for me.
Posted by anonymous at Mon May 29 2006 23:14
You are right, it is >= Python 2.4
http://docs.python.org/whatsnew/node4.html
Posted by Leonard at Mon May 29 2006 23:16
Coooooool, as Manoj would say.