
Scrapy: Selector is not JSON serializable

Tags:

python

scrapy

I am scraping information from some event websites. I got this error when I ran my spider, and I was wondering if anyone knew how to solve it.

  File "/usr/lib/python3.4/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <Selector xpath='.//a[@target="_top"]/text()' data='Artist Development Fellowship Informatio'> is not JSON serializable

I can scrape the text in the Scrapy shell with:

scrapy shell http://ofa.fas.harvard.edu/events

for event in response.xpath('.//article[@class="node node-event node-teaser article event-start clearfix"]'):
    event.xpath('.//a[@target="_top"]/text()')

My spider:

import scrapy


class FAS(scrapy.Spider):
    name = 'fas'
    start_urls = [
        'http://ofa.fas.harvard.edu/events',
    ]

    def parse(self, response):
        for event in response.xpath('.//article[@class="node node-event node-teaser article event-start clearfix"]'):
            yield {
                'title': event.xpath('.//a[@target="_top"]/text()'),
            }

Tank asked Jan 23 '26 14:01


1 Answer

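As the error says, anything.xpath('...') returns a Selector, not a string, and a Selector cannot be serialized to JSON. You are missing a call to the .extract_first() method, which returns the first matched text as a plain string: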

anything.xpath('...').extract_first()

eLRuLL answered Jan 25 '26 04:01


