Answered

Importing from GitHub is failing

Good day. I'm trying to import from GitHub and getting the error below in the log.

Not sure what in the heck to do.


Fetching changes
Getting data for a given refname
Checking project tarball
scrapinghub.yml is not found, assume it's a Scrapy project
Setup build step for Scrapy project
setup.py is not found, creating it from template
Using default location for requirements.txt
Login succeeded
Building an image:
Step 1/12 : FROM scrapinghub/scrapinghub-stack-scrapy:1.4
# Executing 2 build triggers...
Step 1/1 : ENV PIP_TRUSTED_HOST $PIP_TRUSTED_HOST PIP_INDEX_URL $PIP_INDEX_URL
 ---> Using cache
Step 1/1 : RUN test -n $APT_PROXY && echo 'Acquire::http::Proxy "$APT_PROXY";' >/etc/apt/apt.conf.d/proxy
 ---> Using cache
 ---> 09db15b62bc7
Step 2/12 : ENV PYTHONUSERBASE /app/python
 ---> Using cache
 ---> f930abf17209
Step 3/12 : ADD eggbased-entrypoint shub-build-egg shub-list-scripts /usr/local/sbin/
 ---> Using cache
 ---> 485e43ed3c4a
Step 4/12 : ADD run-pipcheck /usr/local/bin/
 ---> Using cache
 ---> 58ed40cd34d5
Step 5/12 : RUN chmod +x /usr/local/bin/run-pipcheck /usr/local/sbin/shub-build-egg /usr/local/sbin/shub-list-scripts /usr/local/sbin/eggbased-entrypoint
 ---> Using cache
 ---> a0423a78ce50
Step 6/12 : RUN ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/start-crawl && ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/scrapy-list && ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/shub-image-info && ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/run-pipcheck
 ---> Using cache
 ---> 27a938d168d8
Step 7/12 : ADD requirements.txt /app/requirements.txt
 ---> Using cache
 ---> 50d4c9759835
Step 8/12 : RUN mkdir $PYTHONUSERBASE && chown nobody:nogroup $PYTHONUSERBASE
 ---> Using cache
 ---> f3728901fc0b
Step 9/12 : RUN sudo -u nobody -E PYTHONUSERBASE=$PYTHONUSERBASE pip install --user --no-cache-dir -r /app/requirements.txt
 ---> Using cache
 ---> 5a82e6f667a6
Step 10/12 : ADD project /tmp/project
 ---> 55372c27ee46
Removing intermediate container 858eac9a499a
Step 11/12 : RUN shub-build-egg /tmp/project
 ---> Running in 69659157bab2
running bdist_egg
running egg_info
creating project.egg-info
writing project.egg-info/PKG-INFO
writing top-level names to project.egg-info/top_level.txt
writing dependency_links to project.egg-info/dependency_links.txt
writing entry points to project.egg-info/entry_points.txt
writing manifest file 'project.egg-info/SOURCES.txt'
reading manifest file 'project.egg-info/SOURCES.txt'
writing manifest file 'project.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
warning: install_lib: 'build/lib' does not exist -- no Python modules to install
creating build
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying project.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying project.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying project.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying project.egg-info/entry_points.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying project.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating '/tmp/scrapinghub/project-1.0-py2.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
 ---> 774888771039
Removing intermediate container 69659157bab2
Step 12/12 : ENV PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
 ---> Running in cac0d7ae90da
 ---> 4c022b2db7e8
Removing intermediate container cac0d7ae90da
Successfully built 4c022b2db7e8
Step 1/3 : FROM alpine:3.5
 ---> 6c6084ed97e5
Step 2/3 : ADD kumo-entrypoint /kumo-entrypoint
 ---> Using cache
 ---> b7598b7168ec
Step 3/3 : RUN chmod +x /kumo-entrypoint
 ---> Using cache
 ---> 9bfe7ca53670
Successfully built 9bfe7ca53670
Entrypoint container is created successfully
>>> Checking python dependencies
No broken requirements found.
>>> Getting spiders list:
>>> Trying to get spiders from shub-image-info command
WARNING: There're some errors on shub-image-info call:
ERROR:root:Settings initialization failed
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/sh_scrapy/crawl.py", line 138, in _run_usercode
    settings = populate_settings(apisettings_func(), spider)
  File "/usr/local/lib/python2.7/site-packages/sh_scrapy/settings.py", line 235, in populate_settings
    return _populate_settings_base(apisettings, _load_default_settings, spider)
  File "/usr/local/lib/python2.7/site-packages/sh_scrapy/settings.py", line 164, in _populate_settings_base
    settings = get_project_settings().copy()
  File "/usr/local/lib/python2.7/site-packages/scrapy/utils/project.py", line 68, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "/usr/local/lib/python2.7/site-packages/scrapy/settings/__init__.py", line 292, in setmodule
    module = import_module(module)
  File "/usr/local/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named PocSpiders.settings
Traceback (most recent call last):
  File "/usr/local/bin/shub-image-info", line 11, in <module>
    sys.exit(shub_image_info())
  File "/usr/local/lib/python2.7/site-packages/sh_scrapy/crawl.py", line 210, in shub_image_info
    _get_apisettings, commands_module='sh_scrapy.commands')
  File "/usr/local/lib/python2.7/site-packages/sh_scrapy/crawl.py", line 138, in _run_usercode
    settings = populate_settings(apisettings_func(), spider)
  File "/usr/local/lib/python2.7/site-packages/sh_scrapy/settings.py", line 235, in populate_settings
    return _populate_settings_base(apisettings, _load_default_settings, spider)
  File "/usr/local/lib/python2.7/site-packages/sh_scrapy/settings.py", line 164, in _populate_settings_base
    settings = get_project_settings().copy()
  File "/usr/local/lib/python2.7/site-packages/scrapy/utils/project.py", line 68, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "/usr/local/lib/python2.7/site-packages/scrapy/settings/__init__.py", line 292, in setmodule
    module = import_module(module)
  File "/usr/local/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named PocSpiders.settings
{"message": "shub-image-info exit code: 1", "details": null, "error": "image_info_error"}

Best Answer

You need to add an __init__.py file to your PocSpiders folder.
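A minimal sketch, assuming the package directory is named PocSpiders (taken from the ImportError in the log). An empty __init__.py marks the directory as a regular Python package so that PocSpiders.settings becomes importable:

```shell
# Run from the repository root. "PocSpiders" stands in for your real
# project package; adjust the name if yours differs.
mkdir -p PocSpiders            # no-op if the folder already exists
touch PocSpiders/__init__.py   # an empty file is enough
```

After committing this file, the GitHub import should get past the settings-initialization step.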


Great! No problem :)

Just following up. I was able to upload and run a test spider with shub. I'll just need to validate the GitHub auto upload. Thanks again for the help!

Nick

Thank you for the reply and sorry for the late response. I've been unavailable for about a week ;)

I'll make sure there is an __init__.py file in the folder. I'll also check the shub deploy link from above.


Thanks again and hopefully I'll get this going. :)

Nick

Answer

You need to add an __init__.py file to your PocSpiders folder.

Also check https://shub.readthedocs.io/en/stable/deploying.html. Your repository should contain a scrapinghub.yml file describing the project to deploy to.
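As a rough sketch (the project id and stack name below are placeholders, and the exact keys are described in the deploying docs linked above), a minimal scrapinghub.yml might look like:

```yaml
# Minimal scrapinghub.yml sketch -- 12345 is a placeholder for your
# numeric project id; the stack should match the one in your build log.
project: 12345
stacks:
  default: scrapy:1.4
requirements:
  file: requirements.txt
```

The file goes in the repository root, next to scrapy.cfg.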


A few examples are shown there; if you run into issues, let us know and we can help.



Could you outline your project's folder structure?
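One easy way to capture the layout is to list the files from the repository root and paste the output into a reply. The tree shown in the comment below is a hypothetical example of what a project importable as PocSpiders.settings typically looks like, not your actual layout:

```shell
# List every file in the repo (skipping .git) so the structure can be shared.
# For a working project the output would resemble:
#   ./scrapy.cfg
#   ./PocSpiders/__init__.py
#   ./PocSpiders/settings.py
#   ./PocSpiders/spiders/__init__.py
find . -path ./.git -prune -o -type f -print | sort
```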
