I'm new to the Scrapy framework, and I've seen some tutorials that use LinkExtractor and a few that use SgmlLinkExtractor. I've tried searching for the differences and pros/cons of each, but the results haven't been satisfying.
Can someone explain the difference between the two? When should each extractor be used?
Thanks!
The reason you cannot find references to SgmlLinkExtractor is that it is now deprecated (related changeset). You can still find the SgmlLinkExtractor definition in the Scrapy 0.24 docs.
You should not be using SgmlLinkExtractor anymore. Scrapy now ships a single link extractor, LxmlLinkExtractor, which is the class the LinkExtractor alias points to.