When using Crawlera, it's important to keep in mind the following best practices:


Set download timeout

One of the most common problems our users run into is a download timeout that is set too low in their web crawler or scraping application. Handling a single request in Crawlera can take a long time. This is due to Crawlera's internal throttling, and it is the expected behavior: Crawlera will try to process your request with different slaves and delay times. The recommended timeout for Crawlera requests is 600 seconds. If you are using Scrapy, please check our example configuration.
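For Scrapy users, the 600-second recommendation could be applied with a settings fragment along these lines. This is a minimal sketch, not the official example configuration; DOWNLOAD_TIMEOUT is a standard Scrapy setting, and where the file lives in your project is an assumption.

```python
# settings.py (sketch; assumes a standard Scrapy project layout)

# Scrapy's download timeout, in seconds. Raised to the 600 seconds
# recommended for Crawlera requests so that slow, throttled responses
# are not cut off by the crawler.
DOWNLOAD_TIMEOUT = 600
```

If you use another HTTP client, set its equivalent request timeout to the same value.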


Adjust concurrency

Set your crawler's concurrency to match your plan limit (50 for C50, 100 for C100, and so on).
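In Scrapy, concurrency is controlled by a couple of standard settings. The fragment below is a sketch that assumes a C50 plan; substitute your own plan limit.

```python
# settings.py (sketch; assumes a C50 plan with a limit of 50)

# Upper bound on requests Scrapy performs at the same time.
CONCURRENT_REQUESTS = 50

# Scrapy also caps requests per domain; raise it as well,
# otherwise a single-site crawl never reaches the plan limit.
CONCURRENT_REQUESTS_PER_DOMAIN = 50
```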


Retry 503 responses

Even though Crawlera should protect you against bans, it sometimes runs out of capacity and returns a 503 response. Because of this, we recommend retrying 503 responses up to 5 times. Consider using the x-crawlera-next-request-in response header to time your retries more efficiently.
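The retry advice can be sketched in plain Python as follows. The names fetch_with_retry and retry_delay, and the fetch callable, are hypothetical and only for illustration. The code also assumes that the x-crawlera-next-request-in value is a number of milliseconds; check the Crawlera documentation for the actual unit before relying on it.

```python
import time

RETRY_STATUS = 503
MAX_RETRIES = 5  # retry 503 responses up to 5 times
DELAY_HEADER = "x-crawlera-next-request-in"


def retry_delay(headers, default=1.0):
    """Return the number of seconds to wait before the next attempt.

    Assumption: the header value is a number in milliseconds.
    Falls back to `default` when the header is missing or malformed.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(DELAY_HEADER)
    try:
        return max(float(raw) / 1000.0, 0.0)
    except (TypeError, ValueError):
        return default


def fetch_with_retry(fetch, url, sleep=time.sleep):
    """Call fetch(url) and retry while it returns a 503.

    `fetch` is a hypothetical callable supplied by the caller that
    returns a (status, headers, body) tuple. The last response is
    returned even if it is still a 503 after MAX_RETRIES retries.
    """
    for attempt in range(MAX_RETRIES + 1):  # 1 initial try + retries
        status, headers, body = fetch(url)
        if status != RETRY_STATUS or attempt == MAX_RETRIES:
            return status, headers, body
        sleep(retry_delay(headers))
```

Passing sleep as a parameter keeps the function easy to test without real delays. Scrapy users may not need this at all, since Scrapy's built-in retry middleware can be configured to retry 503 responses.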


What's the best way to use Crawlera with Scrapy?

See Using Crawlera with Scrapy.