Does anyone know how I could run the same Scrapy spider over 200 times on different websites, each with its own output file? Usually in Scrapy you indicate the output file when you run it from the command line, e.g. `scrapy crawl myspider -o output.json`.
You can use a pipeline to write out the items, with configurable parameters, by running something like
`scrapy crawl myspider -a output_filename=output_file.txt`. The `-a` flag adds `output_filename` as an argument to the spider, so it becomes an attribute you can read from a pipeline:
    class MyPipeline(object):
        def process_item(self, item, spider):
            filename = spider.output_filename
            # now do your magic with filename
            return item  # pipelines must return the item (or raise DropItem)
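To make that concrete, here is a minimal sketch of such a pipeline that writes each item to the file named by the spider's `output_filename` attribute (the attribute name is an assumption matching the `-a output_filename=...` argument above; the pipeline class name is also made up):

```python
import json

class JsonWriterPipeline:
    """Writes each scraped item, one JSON object per line, to the file
    named by spider.output_filename (set via -a output_filename=...)."""

    def open_spider(self, spider):
        # one file handle per spider run
        self.file = open(spider.output_filename, "w")

    def close_spider(self, spider):
        self.file.close()

    def process_item(self, item, spider):
        # dict(item) works for both plain dicts and scrapy.Item instances
        self.file.write(json.dumps(dict(item)) + "\n")
        return item  # pass the item on to any later pipelines
```

Remember to enable the pipeline in your project's `ITEM_PIPELINES` setting.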
You can also run Scrapy from within a Python script, queue up one crawl per website, and handle the output items there.