The site I am scraping has an inconsistent layout. I'm currently using this, but it's not returning all the titles:
article['title'] = sel.css('p[class=title] ::text').extract()
I also need to use this to scrape the titles that are in span elements:
article['title'] = sel.css('span[class=newstitle] ::text').extract()
Is there a way to combine two CSS selectors in a single ArticleItem?
As simple as list concatenation:
article['title'] = response.css("p.title ::text").extract() + \
    response.css("span.newstitle ::text").extract()