The recommended way to use Crawlera with Scrapy is through the Crawlera middleware, which can be installed with:

pip install scrapy-crawlera

You can enable the middleware by adding the following settings to your Scrapy project:

DOWNLOADER_MIDDLEWARES = {'scrapy_crawlera.CrawleraMiddleware': 300}
CRAWLERA_ENABLED = True
CRAWLERA_APIKEY = '<API key>'
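If you only want Crawlera for some spiders, the same settings can also be applied per spider through Scrapy's standard `custom_settings` class attribute instead of project-wide `settings.py`. A minimal sketch (the spider name and the API key placeholder are illustrative):

```python
import scrapy


class MySpider(scrapy.Spider):
    # Hypothetical spider name, for illustration only.
    name = 'myspider'

    # Per-spider settings override the project-wide settings.py values,
    # so Crawlera is enabled for this spider only.
    custom_settings = {
        'DOWNLOADER_MIDDLEWARES': {'scrapy_crawlera.CrawleraMiddleware': 300},
        'CRAWLERA_ENABLED': True,
        'CRAWLERA_APIKEY': '<API key>',
    }
```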

To achieve higher crawl rates when using Crawlera with Scrapy, it's recommended to disable the AutoThrottle extension and increase the maximum number of concurrent requests. You may also want to increase the download timeout. The following settings achieve that:

CONCURRENT_REQUESTS = 32
CONCURRENT_REQUESTS_PER_DOMAIN = 32
AUTOTHROTTLE_ENABLED = False
DOWNLOAD_TIMEOUT = 600
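As a rough back-of-the-envelope check of why concurrency matters here (illustrative only, not a Scrapy API): with C concurrent request slots and an average response time of t seconds per request, the theoretical ceiling on crawl rate is about C / t requests per second.

```python
def max_crawl_rate(concurrent_requests, avg_response_time_s):
    """Upper bound on requests per second for a given concurrency level.

    Illustrative sketch: assumes every slot is always busy and ignores
    Scrapy's own per-request overhead.
    """
    return concurrent_requests / avg_response_time_s


# With 32 concurrent requests and ~4 s average responses through the
# proxy, the ceiling is 8 requests/second:
print(max_crawl_rate(32, 4.0))  # 8.0
```

This is why leaving AutoThrottle enabled (which deliberately reduces concurrency) works against Crawlera's request management.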

To enable Crawlera, see Getting started with Crawlera.


NOTE: You can override the Crawlera settings in your settings.py file by adding them to your Scrapy Cloud project or spider settings. Settings entered in the Scrapy Cloud UI normally take precedence over settings defined in the spider, but not when using custom Docker images, in which case the UI settings are ignored.