Scrapy Cloud Advanced Topics

Here you'll find articles on advanced settings and features of Scrapy Cloud.

Deploying Python Dependencies for your Projects in Scrapy Cloud
The environment where your spiders run on Scrapy Cloud includes a set of pre-installed packages. However, sometimes you'll need extra packages that ...
Tue, 10 Apr, 2018 at 1:25 PM
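As a hedged sketch, extra dependencies are typically listed in a `requirements.txt` file referenced from your project's `scrapinghub.yml` (the project ID below is a placeholder):

```yaml
# scrapinghub.yml
projects:
  default: 12345
requirements:
  file: requirements.txt
```

`requirements.txt` then lists the extra packages one per line, ideally pinned to exact versions (e.g. `requests==2.18.4`) so deploys stay reproducible.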
Deploying Python 3 spiders to Scrapy Cloud
By default, any Scrapy project deployed to Scrapy Cloud runs on Python 2.7. If you are creating your spiders using Python 3 features, you have to define the...
Wed, 26 Apr, 2017 at 1:38 PM
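A minimal sketch of selecting a Python 3 runtime by pointing `scrapinghub.yml` at a `-py3` stack (the project ID and the exact stack name are placeholders; the stack names available at deploy time may differ):

```yaml
# scrapinghub.yml
projects:
  default: 12345
stacks:
  default: scrapy:1.3-py3
```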
Running custom Python scripts
In addition to Scrapy spiders, you can also run custom, standalone Python scripts on Scrapy Cloud. They need to be declared in the scripts section of your p...
Fri, 24 Mar, 2017 at 11:52 PM
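A minimal sketch of the declaration, assuming a script at `bin/update_feeds.py` (the project and script names are hypothetical):

```python
# setup.py
from setuptools import setup, find_packages

setup(
    name='myproject',
    version='1.0',
    packages=find_packages(),
    # standalone scripts that Scrapy Cloud should expose as runnable jobs
    scripts=['bin/update_feeds.py'],
    entry_points={'scrapy': ['settings = myproject.settings']},
)
```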
Deploying non-code files
You need to declare the files in the package_data section of your setup.py file. For example, if your Scrapy project has the following structure: ...
Wed, 29 Nov, 2017 at 4:39 PM
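A minimal sketch, assuming the non-code files live under `myproject/resources/` (the package name and glob patterns are hypothetical):

```python
# setup.py
from setuptools import setup, find_packages

setup(
    name='myproject',
    version='1.0',
    packages=find_packages(),
    package_data={
        # ship these data files inside the deployed package
        'myproject': ['resources/*.json', 'resources/*.txt'],
    },
    entry_points={'scrapy': ['settings = myproject.settings']},
)
```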
Exporting scraped items to an AWS/S3 account (UI mode)
To store your items in an S3 account provided by AWS, you need to enable certain Scrapy settings in Scrapy Cloud. First, go to your Spider sett...
Tue, 5 Dec, 2017 at 10:14 AM
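The UI settings correspond to Scrapy's feed export settings; a sketch of the equivalent values (the bucket name and credentials are placeholders):

```python
# Feed export settings as they would appear in settings.py; in Scrapy Cloud
# you enter the same name/value pairs in the Spider settings UI instead.
FEED_URI = 's3://my-bucket/%(name)s/%(time)s.json'
FEED_FORMAT = 'jsonlines'
AWS_ACCESS_KEY_ID = '<your AWS access key>'
AWS_SECRET_ACCESS_KEY = '<your AWS secret key>'
```

The `%(name)s` and `%(time)s` placeholders are expanded by Scrapy with the spider name and job timestamp, so each run writes to a distinct key.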
Changing the deploy environment with Scrapy Cloud Stacks
You can select the runtime environment for your spiders from a list of pre-defined stacks. Each stack is a runtime environment containing certain versions o...
Wed, 17 May, 2017 at 4:02 PM
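A hedged sketch of pinning different stacks per deploy target in `scrapinghub.yml` (project IDs and stack names are placeholders):

```yaml
# scrapinghub.yml
projects:
  default: 12345
  dev: 67890
stacks:
  default: scrapy:1.3      # Python 2.7 runtime
  dev: scrapy:1.3-py3      # Python 3 runtime for the dev project
```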
Sharing data between spiders
If you need to provide data to a spider within a given project, you can use the API via the python-scrapinghub library to store the data in collections. ...
Mon, 27 Mar, 2017 at 1:33 PM
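A minimal sketch of preparing records for a collection, assuming the python-scrapinghub client; the API key, project ID and collection name are placeholders:

```python
# Collections require each record to carry a unique '_key' field.
def as_collection_records(items, key_field):
    """Copy each item, adding '_key' derived from one of its fields."""
    return [dict(item, _key=str(item[key_field])) for item in items]

# Typical usage against the API (requires network access and credentials):
#
#   from scrapinghub import ScrapinghubClient
#   client = ScrapinghubClient('<APIKEY>')
#   store = client.get_project(12345).collections.get_store('shared_urls')
#   store.set(as_collection_records([{'url': 'http://example.com'}], 'url'))
#   record = store.get('http://example.com')
```

Another spider in the same project can then read the records back through the same store, which is what makes collections useful for sharing state between spiders.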
Scrapy Cloud API
The Scrapy Cloud API (often also referred to as the Scrapinghub API) is an HTTP API that you can use to control your spiders and consume the scraped data, among...
Sat, 25 Mar, 2017 at 12:36 AM
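As a hedged sketch, scheduling a spider run goes through the `run.json` endpoint with the API key sent as the Basic-auth username; the helper below only builds the request (key and IDs are placeholders, nothing is sent):

```python
import base64
import urllib.parse

API_RUN = 'https://app.scrapinghub.com/api/run.json'

def run_request(apikey, project_id, spider):
    """Return the (url, headers, body) triple for scheduling a spider run."""
    # HTTP Basic auth: API key as username, empty password
    auth = base64.b64encode('{}:'.format(apikey).encode()).decode()
    headers = {'Authorization': 'Basic ' + auth}
    body = urllib.parse.urlencode({'project': project_id, 'spider': spider})
    return API_RUN, headers, body
```

The same triple can be handed to any HTTP client (urllib, requests, curl) as a POST request.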
Fetching latest spider data
To get the scraped items, you can use the Items API. In some cases, it's convenient to have a static URL that points to the last job, in a specific fo...
Mon, 16 Apr, 2018 at 10:56 AM
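As a hedged sketch, the Items API exposes a job's scraped items at a predictable URL built from the project, spider and job IDs (the IDs below are placeholders; requests must be authenticated with your API key):

```python
def items_url(project_id, spider_id, job_id, fmt='json'):
    """Build the Items API URL for one job's scraped items."""
    return ('https://storage.scrapinghub.com/items/'
            '{}/{}/{}?format={}'.format(project_id, spider_id, job_id, fmt))
```

The `format` query parameter selects the output serialization, e.g. `json` or `csv`.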
Understanding Job Outcomes
The job outcome indicates whether the job succeeded or failed. By default, it contains the value of the spider close reason from Scrapy. It’s available in t...
Wed, 16 Aug, 2017 at 7:02 PM
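A minimal sketch of interpreting the outcome value, assuming it is the spider close reason from the job metadata; `finished` is Scrapy's normal close reason, and the failure values listed are common examples rather than an exhaustive set:

```python
# Common outcome values and what they typically mean (not exhaustive).
COMMON_OUTCOMES = {
    'finished': 'spider completed normally',
    'cancelled': 'job was cancelled',
    'shutdown': 'spider was shut down gracefully',
    'memusage_exceeded': 'memory limit reached',
}

def job_succeeded(outcome):
    """True only for a normally finished job."""
    return outcome == 'finished'
```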