I am trying to use the link parsing structure described by "warwaruk" in this SO thread: Following links, Scrapy web crawler framework
This works great when grabbing only a single item from each page. However, when I try to use a for loop to scrape all items within each page, the parse_item function appears to terminate upon reaching the first yield statement. I have a custom pipeline set up to handle each item, but currently it only receives one item per page.
Let me know if I need to include more code, or clarification. THANKS!
def parse_item(self, response):
    hxs = HtmlXPathSelector(response)
    prices = hxs.select("//div[contains(@class, 'item')]/script/text()").extract()
    for prices in prices:
        item = WalmartSampleItem()
        ...
        yield items
You should yield a single item inside the for loop: yield item, not items (there is no items variable in that scope, so the code raises a NameError). It is also better not to shadow the prices list with the loop variable:

    for price in prices:
        item = WalmartSampleItem()
        ...
        yield item
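The underlying behavior is worth spelling out: a function containing yield returns a generator, and each yield inside the loop produces one value without ending the function, so Scrapy keeps consuming items until the loop finishes. A minimal sketch in plain Python (no Scrapy, hypothetical data) illustrates this:

```python
def parse_item(prices):
    # Mimics a Scrapy callback: yields one dict per price found on a page.
    for price in prices:
        item = {"price": price}
        yield item  # does NOT terminate the function; the loop continues

# Scrapy iterates the generator and passes every yielded item to the pipeline.
items = list(parse_item(["$10", "$20", "$30"]))
print(len(items))  # 3 items, not 1
```

This is why fixing the yield statement is enough: once each iteration yields a valid item, the pipeline receives one item per price rather than stopping after the first.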