I have the following span:
<span class="name">
bla bla <a href="address">foo</a> bar
</span>
I want scrapy to extract the entire sentence without the link markup, meaning:
bla bla foo bar
How do I do that?
You can use the XPath expression descendant-or-self::*/text(), which selects the text nodes of the span itself and of every element nested inside it:
//span[@class="name"]/descendant-or-self::*/text()
Demo (using scrapy shell):
$ cat index.html
<span class="name">bla bla <a href="address">foo</a> bar</span>
$ scrapy shell index.html
>>> results = sel.xpath('//span[@class="name"]/descendant-or-self::*/text()').extract()
>>> ''.join(results)
u'bla bla foo bar'
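Note that the span in the question has line breaks around its content, so the joined string would start and end with a newline there. The following is a minimal sketch of the same "collect every text node, then join" idea with the whitespace cleaned up. It does not use Scrapy: as an assumption made to keep it self-contained, it swaps in the standard library's xml.etree.ElementTree, which only works because this snippet happens to be well-formed XML. The name span_text is hypothetical.

```python
import xml.etree.ElementTree as ET

# The span from the question, including the line breaks around its content.
HTML = '''<span class="name">
bla bla <a href="address">foo</a> bar
</span>'''


def span_text(markup):
    """Return all text inside the root element, with whitespace collapsed."""
    root = ET.fromstring(markup)
    # itertext() yields the element's own text and the text of all its
    # descendants in document order, much like descendant-or-self::*/text().
    pieces = list(root.itertext())
    # Join the pieces, then collapse runs of whitespace (including the
    # leading and trailing newlines) into single spaces.
    return ' '.join(''.join(pieces).split())


print(span_text(HTML))
```

In Scrapy itself, the same cleanup can be applied to the joined result with ' '.join(text.split()), or the query can be wrapped in the XPath 1.0 function normalize-space(), as in normalize-space(//span[@class="name"]), which returns the whitespace-normalized string value of the first matching span.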