I use the following expression to get the text of a paragraph:
soup.find("td", { "id" : "overview-top" }).find("p", { "itemprop" : "description" }).text
How would I exclude all text within an <a> tag? That is, get the text that is in the <p> but not in an <a>?
Find all text nodes in the p tag, keep only those whose parent is not an a tag, and join them:
p = soup.find("td", {"id": "overview-top"}).find("p", {"itemprop": "description"})
print(''.join(text for text in p.find_all(text=True)
              if text.parent.name != "a"))
Demo (note that the link text is not printed):
>>> from bs4 import BeautifulSoup
>>> 
>>> data = """
... <td id="overview-top">
...     <p itemprop="description">
...         text1
...         <a href="google.com">link text</a>
...         text2
...     </p>
... </td>
... """
>>> soup = BeautifulSoup(data, "html.parser")
>>> p = soup.find("td", {"id": "overview-top"}).find("p", {"itemprop": "description"})
>>> print(p.text)
        text1
        link text
        text2
>>>
>>> print(''.join(text for text in p.find_all(text=True) if text.parent.name != "a"))
        text1
        text2
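One caveat: checking text.parent.name only looks at the immediate parent, so text wrapped in another tag inside the link (for example <a><b>...</b></a>) would still get through. A minimal sketch of a stricter filter follows, assuming a reasonably recent bs4 where find_all accepts string=True; the helper name text_outside_links and the nested <b> in the sample markup are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup with a nested tag inside the link (hypothetical variation
# of the demo data above).
data = """
<td id="overview-top">
    <p itemprop="description">
        text1
        <a href="google.com"><b>bold link text</b> link text</a>
        text2
    </p>
</td>
"""

soup = BeautifulSoup(data, "html.parser")
p = soup.find("td", {"id": "overview-top"}).find("p", {"itemprop": "description"})


def text_outside_links(tag):
    """Join every text node under `tag` that has no <a> ancestor at any depth."""
    return "".join(
        s for s in tag.find_all(string=True)
        if s.find_parent("a") is None  # walks up the tree, not just one level
    )


result = text_outside_links(p)
print(result.split())
```

Here find_parent("a") walks all the way up from each text node, so both the bold text and the plain link text are dropped.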
Using lxml,
import lxml.html as LH
data = """
<td id="overview-top">
    <p itemprop="description">
        text1
        <a href="google.com">link text</a>
        text2
    </p>
</td>
"""
root = LH.fromstring(data)
print(''.join(root.xpath(
    '//td[@id="overview-top"]//p[@itemprop="description"]/text()')))
yields
        text1
        text2
To also get the text of child tags of <p>, just use a double forward slash, //text(), instead of a single forward slash:
print(''.join(root.xpath(
    '//td[@id="overview-top"]//p[@itemprop="description"]//text()')))
yields
        text1
        link text
        text2
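If the <p> can contain other child tags whose text you do want, while still skipping anything inside a link, the two XPath forms can be combined with a predicate on the ancestor axis. This is only a sketch: the nested <b> and the <i> in the sample data are invented for illustration.

```python
import lxml.html as LH

# Hypothetical variation of the sample data: one nested tag inside the link,
# and one non-link child tag whose text should be kept.
data = """
<td id="overview-top">
    <p itemprop="description">
        text1
        <a href="google.com"><b>bold link text</b> link text</a>
        <i>text2</i>
    </p>
</td>
"""

root = LH.fromstring(data)

# //text() selects text nodes at any depth below <p>;
# [not(ancestor::a)] drops those that sit anywhere inside an <a>.
texts = root.xpath(
    '//td[@id="overview-top"]//p[@itemprop="description"]'
    '//text()[not(ancestor::a)]')

result = ''.join(texts)
print(result.split())
```

Unlike the plain /text() query, this keeps the text of the <i> child, and unlike //text() it leaves out everything under the <a>.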