It's possible to build a custom Docker image with a given Portia project bundle.

This can be useful in cases such as:


- Scrapinghub-hosted Portia doesn't work for you for some reason

- you need a custom Portia version


To make it work, your working directory should contain at least these four files:


1) A zipped Portia project bundle (named project-slybot.zip in this guide)

For example, it can be downloaded from a hosted Portia instance with:


curl --user "$SHUB_APIKEY:" \
"https://portia-beta.scrapinghub.com/api/projects/$PROJECT_ID/download" > project-slybot.zip
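Because the curl command above redirects whatever the server returns into project-slybot.zip, a failed request (for example, a wrong API key) can leave an error message in the file instead of an archive. The following is a minimal sketch of a rough sanity check, assuming a POSIX shell and a `head` that supports `-c`; `looks_like_zip` is a hypothetical helper name, not part of Portia or shub. It only tests for the "PK" signature that zip archives begin with, so it does not prove the bundle is complete.

```shell
# Hypothetical helper (not part of Portia or shub): succeeds only if the
# given file exists and starts with the "PK" signature used by zip archives.
looks_like_zip() {
    # Fail early if the path is not a regular file
    [ -f "$1" ] || return 1
    # Read only the first two bytes of the file
    magic=$(head -c 2 "$1")
    [ "$magic" = "PK" ]
}
```

For example, `looks_like_zip project-slybot.zip || echo "download does not look like a zip archive" >&2` prints a warning when the check fails.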


2) scrapinghub.yml with an images section (more details here)


projects:
  default: PUT_YOUR_PROJECT_ID_HERE
requirements:
  file: requirements.txt
images:
  default: SomeDockerhubAccount/SomeDockerhubRepository


3) Dockerfile to build and deploy an image to Scrapy Cloud


FROM scrapinghub/scrapinghub-stack-portia:0.13
ENV SHUB_ENTRYPOINT=''
COPY project-slybot.zip /scrapy/
COPY list-spiders /usr/local/sbin/list-spiders
RUN chmod +x /usr/local/sbin/list-spiders


You can use any other Portia stack version from those listed here as a base.

Note: the SHUB_ENTRYPOINT line is important because it resets the entrypoint set by the original Portia stack. Otherwise, the stack's entrypoint will try to download a Portia bundle on job start.


4) list-spiders script


#!/bin/sh
/usr/local/bin/portiacrawl /scrapy/project-slybot.zip
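Before building, it can help to confirm that the four files described in steps 1 to 4 are actually present. The following is a minimal sketch, assuming a POSIX shell; `check_portia_workdir` is a hypothetical helper name, and it checks only that the files exist, not that their contents are valid.

```shell
# Hypothetical pre-flight helper (not part of shub or Portia): checks that
# the four files from this guide exist in the given directory.
# Usage: check_portia_workdir [DIR]   (DIR defaults to the current directory)
check_portia_workdir() {
    dir="${1:-.}"
    missing=0
    for f in project-slybot.zip scrapinghub.yml Dockerfile list-spiders; do
        if [ ! -f "$dir/$f" ]; then
            # Report every missing file rather than stopping at the first one
            echo "missing: $f" >&2
            missing=1
        fi
    done
    # Exit status 0 if all files were found, 1 otherwise
    return "$missing"
}
```

Running `check_portia_workdir .` in the working directory returns a non-zero status and names each missing file on stderr if anything is absent.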


Now you have everything ready to build and deploy your project with:


shub-image upload --username SomeName --password SomePassword