I can't change spider settings from inside the parse method, but there must be a way to do it.
For example:
class SomeSpider(BaseSpider):
    name = 'mySpider'
    allowed_domains = ['example.com']
    start_urls = ['http://example.com']
    settings.overrides['ITEM_PIPELINES'] = ['myproject.pipelines.FirstPipeline']
    print settings['ITEM_PIPELINES'][0]
    # printed 'myproject.pipelines.FirstPipeline'

    def parse(self, response):
        # ...some code
        settings.overrides['ITEM_PIPELINES'] = ['myproject.pipelines.SecondPipeline']
        print settings['ITEM_PIPELINES'][0]
        # printed 'myproject.pipelines.SecondPipeline'
        item = Myitem()
        item['name'] = 'Name for SecondPipeline'
But the item is still processed by FirstPipeline. The new ITEM_PIPELINES value has no effect. How can I change settings after the crawl has started? Thanks in advance!
If you want different spiders to use different pipelines, you can give each spider a pipelines attribute listing the pipelines that apply to it. Then, in each pipeline, check whether that pipeline's class name is in the spider's list:
class MyPipeline(object):
    def process_item(self, item, spider):
        # Pass the item through untouched if the spider did not list this pipeline.
        if self.__class__.__name__ not in getattr(spider, 'pipelines', []):
            return item
        ...
        return item

class MySpider(CrawlSpider):
    pipelines = set([
        'MyPipeline',
        'MyPipeline3',
    ])
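To see the check in isolation, here is a minimal sketch that runs without Scrapy installed. OptedInSpider, PlainSpider, the plain dict items and the processed_by key are stand-ins invented for this illustration; in a real project the spiders would subclass a Scrapy spider class and the pipeline would still have to be enabled in ITEM_PIPELINES.

```python
class MyPipeline(object):
    def process_item(self, item, spider):
        # Skip spiders that did not opt in to this pipeline by class name.
        if self.__class__.__name__ not in getattr(spider, 'pipelines', []):
            return item
        # Hypothetical processing step: mark the item so the effect is visible.
        item['processed_by'] = self.__class__.__name__
        return item


class OptedInSpider(object):
    """Hypothetical stand-in for a spider that lists MyPipeline."""
    name = 'opted_in'
    pipelines = set(['MyPipeline'])


class PlainSpider(object):
    """Hypothetical stand-in for a spider with no pipelines attribute."""
    name = 'plain'


pipeline = MyPipeline()
tagged = pipeline.process_item({'name': 'a'}, OptedInSpider())
untouched = pipeline.process_item({'name': 'b'}, PlainSpider())
print(tagged)
print(untouched)
```

Only the item coming from the spider that lists MyPipeline is modified. The other item is returned unchanged, so every pipeline can stay enabled globally while each spider decides which ones actually act on its items.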
If you want different items to be processed by different pipelines, you can do this:
class MyPipeline2(object):
    def process_item(self, item, spider):
        # Only act on MyItem instances; every other item passes through.
        if isinstance(item, MyItem):
            ...
            return item
        return item
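The same idea can be tried out as a small self-contained sketch. MyItem and OtherItem are defined here as plain dict subclasses purely for illustration (in a Scrapy project they would be item classes), and the handled key is an assumed placeholder for whatever the pipeline really does.

```python
class MyItem(dict):
    """Hypothetical stand-in for the item type this pipeline handles."""


class OtherItem(dict):
    """Hypothetical stand-in for an item type this pipeline ignores."""


class MyPipeline2(object):
    def process_item(self, item, spider):
        if isinstance(item, MyItem):
            # Placeholder for the real processing of MyItem instances.
            item['handled'] = True
            return item
        # Any other item type is passed along unchanged.
        return item


pipeline = MyPipeline2()
handled = pipeline.process_item(MyItem(name='x'), spider=None)
skipped = pipeline.process_item(OtherItem(name='y'), spider=None)
print(handled)
print(skipped)
```

Because the decision is made per item rather than per spider, one spider can yield several item types and have each one routed to its own pipeline logic.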