I wrote a tiny HTML parser in Python using lxml. It's very useful, but I have a problem.
I have the following code:
tags = doc.xpath('//table//tr/td[@align="right"]/b')
for x in tags:
    print(x.text.strip())
It works fine. But if there is a <br> tag inside a <b> element, like this:
<b> first-half <br>
second-half </b>
this code prints only first-half from the <b> tag.
How can I get all of the text in <b> even if there is a <br> tag inside it?
Thanks.
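x.text holds only the text that appears before the element's first child, so it stops at the <br>; the text after the <br> is stored in that child's tail. Use text_content() to extract all of the non-markup text within an element, including the text of its descendants. Replace x.text with x.text_content(). Note that text_content() is provided by lxml.html elements; it is not available on plain lxml.etree elements.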
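For illustration, here is a minimal sketch of the difference between .text and text_content(). It assumes the document is parsed with lxml.html; the SAMPLE markup is made up to mirror the snippet in the question, and the whitespace normalization at the end is an optional extra, not part of the original code.

```python
from lxml import html

# Hypothetical sample markup mirroring the <b> element from the question.
SAMPLE = """
<table>
  <tr><td align="right"><b> first-half <br>
second-half </b></td></tr>
</table>
"""

doc = html.fromstring(SAMPLE)
tags = doc.xpath('//table//tr/td[@align="right"]/b')

for x in tags:
    # .text covers only the text before the first child element (<br>).
    first_only = x.text.strip()

    # text_content() concatenates the text of the element and all of its
    # descendants, including the text that follows the <br>.
    full = x.text_content()

    # Optional: collapse the line break and repeated spaces into single spaces.
    normalized = " ".join(full.split())

    print(first_only)
    print(normalized)
```

If the tree was built with lxml.etree rather than lxml.html, "".join(x.itertext()) should give the same concatenated text.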