Is there a way to trigger a method in a Spider class just before it terminates?
I can terminate the spider myself, like this:
    class MySpider(CrawlSpider):
        # Config stuff goes here...

        def quit(self):
            # Do some stuff...
            raise CloseSpider('MySpider is quitting now.')

        def my_parser(self, response):
            if termination_condition:
                self.quit()
            # Parsing stuff goes here...

But I can't find any information on how to determine when the spider is about to quit naturally.
It looks like you can register a signal listener through dispatcher.
I would try something like:
    from scrapy import signals
    from scrapy.spiders import CrawlSpider
    from scrapy.xlib.pydispatch import dispatcher

    class MySpider(CrawlSpider):
        def __init__(self, *args, **kwargs):
            # Let CrawlSpider run its own setup before connecting the listener.
            super().__init__(*args, **kwargs)
            dispatcher.connect(self.spider_closed, signals.spider_closed)

        def spider_closed(self, spider):
            # The second parameter is the instance of the spider about to be closed.
            pass

In newer versions of Scrapy, scrapy.xlib.pydispatch is deprecated. Instead, you can use from pydispatch import dispatcher.
Just to update, you can simply define a closed method on the spider, and Scrapy will call it when the spider closes:

    class MySpider(CrawlSpider):
        def closed(self, reason):
            do_something()