Answered

How to overwrite settings.py API keys from Spiders - Settings tab

When I'm using API keys locally, I set them in settings.py using dotenv and `os.getenv("AWS_ACCESS_KEY_ID")`.
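For context, the relevant part of my settings.py looks roughly like this (a sketch; locally, python-dotenv's `load_dotenv()` populates `os.environ` from a .env file first, but that call is left out here to keep the sketch stdlib-only):

```python
# settings.py (sketch): read API keys from the environment instead of
# hardcoding them. Locally, load_dotenv() from python-dotenv would run
# first to fill os.environ from a .env file.
import os

AWS_ACCESS_KEY_ID = os.getenv("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.getenv("AWS_SECRET_ACCESS_KEY")
```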


I want to set them on Scrapinghub using the Spiders - Settings tab. However, it appears they don't override the values in settings.py when I set them there. I also don't want to hardcode API keys into settings.py.


Is there a solution to this problem? I've seen a suggestion elsewhere of simply commenting out the lines that set the API keys in settings.py before each deploy, but it seems like there must be a better solution. Thanks.


Best Answer

Settings in the UI do override the ones in settings.py. The precedence is: UI spider-level settings > UI project-level settings > settings.py. According to your project activity, you added the settings in the UI but never scheduled a new job; changes in settings don't apply to already-running jobs.


I tried commenting out the lines, and it turns out the settings I'm setting in the UI aren't being included in the spiders at all. I have created and saved these settings for the spiders via the Spiders - Settings tab. Are there any suggestions for how to debug why those settings aren't being passed to the spiders when they start?


I have set them in the Spiders - Settings tab and in the settings tab of the specific spider that is running (and failing with the error `AttributeError: module 'spida.settings' has no attribute 'AWS_ACCESS_KEY_ID'`).


Spider Details shows Custom Settings as:

Custom settings: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY

Ok, I've tried starting a new job and it is still giving the same error.


I'm not able to propagate settings from anywhere other than the settings.py file. I've tried custom_settings, the command line, and the UI at both project and spider level.


Do you have any idea why that might be? The fact that the issue also exists locally with custom_settings/CLI suggests this isn't a Scrapinghub/Scrapy Cloud issue but a Scrapy one. But any advice is still appreciated.

I seem to have found the solution to my specific issue, although I'm still somewhat confused about how the settings work. The problem was that I was trying to access these overridden settings in `__init__`, before they'd been overridden. I've fixed it by moving the logic that accesses them into `start_requests`. If there are any good resources on best practices for accessing overridden settings later, those would be useful (e.g. importing settings still imports the file settings.py, as you'd expect, not the spider's actual merged settings).


I'll go ahead and close this support ticket for now. Thanks!
