Can I try Crawlera before subscribing?
We don't currently offer Crawlera trial periods, but we do have a 7-day money-back guarantee. For more info see Crawlera trials.
Can I use Crawlera service with my own crawler without using Scrapy, Scrapy Cloud or any other Scrapinghub service?
Definitely. Crawlera is a standalone service that can be used with any crawler or HTTP client, independently of the rest of the Scrapinghub platform.
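For example, with the Python requests library you can route traffic through Crawlera by pointing the client at the Crawlera proxy endpoint and sending your API key as the proxy username. The host and port below (proxy.crawlera.com:8010) reflect the commonly documented setup, but treat them as an illustration and confirm the endpoint in your account settings.

```python
def crawlera_proxies(api_key, host="proxy.crawlera.com", port=8010):
    """Build a proxies mapping for the requests library.

    The Crawlera API key is sent as the proxy username; the
    password is left empty.
    """
    proxy_url = "http://{key}:@{host}:{port}".format(
        key=api_key, host=host, port=port)
    return {"http": proxy_url, "https": proxy_url}

# Usage with the third-party requests package (not run here):
#   import requests
#   resp = requests.get("http://example.com",
#                       proxies=crawlera_proxies("<YOUR_API_KEY>"))
```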
Where is the Crawlera API documented?
How do I change my user-agent?
To change your User-Agent, set the X-Crawlera-Profile header to the value pass. This instructs Crawlera to use the User-Agent header you send in the request instead of assigning its own.
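As a minimal sketch, the request headers for this would look like the following (the helper name is my own; the header names are as described above):

```python
def passthrough_headers(user_agent):
    """Headers telling Crawlera to keep the client's own User-Agent.

    X-Crawlera-Profile: pass instructs Crawlera not to override the
    User-Agent header supplied with the request.
    """
    return {
        "X-Crawlera-Profile": "pass",
        "User-Agent": user_agent,
    }
```

These headers can then be passed to any HTTP client alongside the Crawlera proxy settings.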
How do I find the IP used in a request?
Crawlera adds an X-Crawlera-Slave response header containing the IP address and port of the slave used to make the request.
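A small helper (the name is hypothetical) can split that header value into an address pair, assuming the usual "ip:port" format:

```python
def parse_slave(header_value):
    """Split an X-Crawlera-Slave value like "192.0.2.10:3128"
    into an (ip, port) tuple."""
    ip, _, port = header_value.rpartition(":")
    return ip, int(port)

# e.g. parse_slave(resp.headers["X-Crawlera-Slave"])
```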
How do I measure Crawlera’s speed for a particular domain?
You can use the crawlera-bench tool. Check its GitHub page for more information on how to use it. Note that you will be consuming Crawlera traffic when using this tool.
Where can I monitor my Crawlera usage?
Go to the Scrapinghub dashboard and select Crawlera Overview, or a specific account you would like to zoom in on. If you click on a user, you can review that user's number of requests per day and per month. Note: the Recent Requests section is not populated in real time. Newly created accounts may see a "No usage detected so far" message even after the user has started sending requests; check the page again later to give Scrapy Cloud time to catch up on the logs.
Why are requests slower through Crawlera than other proxies?
If you use your own proxies, you may notice that requests are slower through Crawlera than when using those proxies directly. This is because Crawlera throttles requests, introducing delays to avoid being banned on the target website.
These delays can differ depending on the target domain, as some popular sites have more rigorous anti-scraping measures than others. Throttling also helps prevent inadvertently bringing down the target website should it lack the resources to handle a large volume of requests.
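Crawlera's per-domain delays are internal to the service, but the idea can be sketched client-side as a throttler that enforces a minimum interval between requests to the same domain. The class below is illustrative of the technique, not Crawlera's actual algorithm, and the delay values are made up:

```python
import time

class DomainThrottle:
    """Enforce a minimum delay between requests to the same domain."""

    def __init__(self, default_delay=1.0, per_domain=None,
                 clock=time.monotonic, sleep=time.sleep):
        self.default_delay = default_delay
        self.per_domain = per_domain or {}  # e.g. {"example.com": 5.0}
        self.clock = clock                  # injectable for testing
        self.sleep = sleep
        self.last_request = {}              # domain -> last request time

    def wait(self, domain):
        """Block until it is safe to hit `domain` again, then record the hit."""
        delay = self.per_domain.get(domain, self.default_delay)
        last = self.last_request.get(domain)
        now = self.clock()
        if last is not None and now - last < delay:
            self.sleep(delay - (now - last))
        self.last_request[domain] = self.clock()
```

Sites with stricter anti-scraping measures would get a larger per-domain delay, which is why speed through Crawlera varies by target domain.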