
Error while obtaining start requests

Spider: https://portia.scrapinghub.com/#/projects/255795/spiders/www.trilliumbrewing.com_1


Feed URL: https://pastebin.com/raw/UmCQ6Kh9

Link Crawling: Don't follow links


I notice that when I re-open the project after running it, the Feed URL field is blank. I also notice that if I update any settings in the left panel without clicking and editing the sample, there is no way to publish the changes, so I'm hoping those are saved automatically?


I have successfully crawled the site before by setting trilliumbrewing.com as the homepage, but I'm unable to crawl with the feed URL. The error is below:



[scrapy.core.engine] Error while obtaining start requests

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/scrapy/core/engine.py", line 127, in _next_request
    request = next(slot.start_requests)
  File "/src/slybot/slybot/slybot/spider.py", line 178, in start_requests
    for req in start_requests:
  File "/src/slybot/slybot/slybot/spider.py", line 86, in _create_start_requests
    for start_url in self._start_urls:
  File "/src/slybot/slybot/slybot/starturls/__init__.py", line 22, in __iter__
    for url in chain(*(arg_to_iter(g) for g in generated)):
  File "/src/slybot/slybot/slybot/starturls/__init__.py", line 22, in <genexpr>
    for url in chain(*(arg_to_iter(g) for g in generated)):
  File "/src/slybot/slybot/slybot/starturls/__init__.py", line 21, in <genexpr>
    generated = (self._generate_urls(url) for url in self.start_urls)
  File "/src/slybot/slybot/slybot/starturls/__init__.py", line 38, in _generate_urls
    return generator(start_url.generator_value)
  File "/src/slybot/slybot/slybot/starturls/feed_generator.py", line 11, in __call__
    return Request(url, callback=self.parse_urls)
  File "/usr/local/lib/python2.7/site-packages/scrapy/http/request/__init__.py", line 25, in __init__
    self._set_url(url)
  File "/usr/local/lib/python2.7/site-packages/scrapy/http/request/__init__.py", line 58, in _set_url
    raise ValueError('Missing scheme in request url: %s' % self._url)
ValueError: Missing scheme in request url: 
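
The error message ends right after the colon, which suggests the URL that reached the request was an empty string. That would fit the Feed URL showing up blank when the project is re-opened, though it is not confirmed. Below is a minimal sketch of that kind of scheme check, using only the Python 3 standard library. The helpers `has_scheme` and `check_feed_url` are hypothetical names made up for illustration; they are not part of Scrapy or slybot.

```python
from urllib.parse import urlparse


def has_scheme(url):
    """Return True if the URL starts with a scheme such as http or https."""
    return bool(urlparse(url.strip()).scheme)


def check_feed_url(url):
    # Illustration only: raises the same kind of error seen in the traceback
    # when the URL has no scheme. This is not Scrapy's actual code.
    if not has_scheme(url):
        raise ValueError('Missing scheme in request url: %s' % url)
    return url


# The feed URL from the post has a scheme, so it passes the check.
check_feed_url('https://pastebin.com/raw/UmCQ6Kh9')

# An empty string (what the spider appears to have received) does not.
try:
    check_feed_url('')
except ValueError as exc:
    print(exc)
```

If the pasted feed URL passes a check like this but the job still fails, the value is probably being lost before the spider runs rather than being malformed.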
1 Comment

Updated with new feed URL: https://pastebin.com/raw/6S28RmTD
