Answered
kzrster 2 months ago in Scrapy Cloud • updated by Pablo Vaz (Support Engineer) 2 weeks ago 2

Hi!
I need to scrape a site that relies heavily on JavaScript, so I use Scrapy + Selenium. It also has to run on Scrapy Cloud.
I wrote a spider that uses Scrapy + Selenium + PhantomJS and ran it on my local machine. Everything works.
Then I deployed the project to Scrapy Cloud using shub-image. The deployment succeeds, but the result of
webdriver.page_source is different: it is fine locally, but on the cloud I get an HTML page that says 403, even though the HTTP request itself returns 200.
Then I decided to use my Crawlera account. I added it with:

service_args = [
    '--proxy="proxy.crawlera.com:8010"',
    '--proxy-type=https',
    '--proxy-auth="apikey"',
]


For Windows (local):
self.driver = webdriver.PhantomJS(executable_path=r'D:\programms\phantomjs-2.1.1-windows\bin\phantomjs.exe', service_args=service_args)


For Docker:

self.driver = webdriver.PhantomJS(executable_path=r'/usr/bin/phantomjs', service_args=service_args, desired_capabilities=dcap)
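
For reference, a minimal sketch of how the proxy arguments and the executable path could be built in one place for both environments. The helper names `build_service_args` and `phantomjs_path` are hypothetical and not part of the spider. The `apikey:` form (API key as the username, empty password) is an assumption about how Crawlera expects basic auth, and the `https` proxy type is simply kept from the snippet above. The inner double quotes are left out here because Selenium passes each list item to the PhantomJS process as-is, without a shell, so quotes inside the string would reach PhantomJS literally; whether that matters in this case is untested.

```python
import sys


def build_service_args(api_key, host="proxy.crawlera.com", port=8010,
                       proxy_type="https"):
    """Build PhantomJS command-line flags for an authenticated proxy.

    Assumption: the proxy takes the API key as the username with an
    empty password, hence the trailing colon in --proxy-auth.
    """
    return [
        "--proxy={}:{}".format(host, port),
        "--proxy-type={}".format(proxy_type),
        "--proxy-auth={}:".format(api_key),
    ]


def phantomjs_path(platform=sys.platform):
    """Pick the PhantomJS binary by platform (paths taken from the post)."""
    if platform.startswith("win"):
        return r"D:\programms\phantomjs-2.1.1-windows\bin\phantomjs.exe"
    return "/usr/bin/phantomjs"
```

With these helpers, both branches could share one call such as `webdriver.PhantomJS(executable_path=phantomjs_path(), service_args=build_service_args("apikey"))`, which keeps the local and Docker setups from drifting apart.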

Again, everything works locally but not on the cloud.
I've checked the Crawlera stats and they look fine: requests are being sent from both environments (local and cloud).

I don't understand what's wrong.
I think it might be due to differences between the PhantomJS builds (Windows vs. Linux).
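
A rough sketch of how the two sessions could be compared to test that guess: it collects the browser version and a short preview of the returned page from a running driver, so the same summary can be logged locally and on the cloud. The function name `describe_session` is hypothetical, and the capability keys `browserName`, `version` and `platform` are assumptions about what the PhantomJS driver reports; missing keys simply come back as `None`.

```python
def describe_session(driver, preview_chars=200):
    """Summarize a Selenium session for side-by-side comparison.

    Works with any object exposing `capabilities`, `page_source`
    and `current_url`, as Selenium WebDriver instances do.
    """
    caps = getattr(driver, "capabilities", None) or {}
    source = driver.page_source or ""
    return {
        # Capability keys below are assumed; absent ones yield None.
        "browser": caps.get("browserName"),
        "version": caps.get("version"),
        "platform": caps.get("platform"),
        "url": driver.current_url,
        "source_length": len(source),
        "source_preview": source[:preview_chars],
    }
```

Logging the result of `describe_session(self.driver)` right after the page loads in both environments would show whether the PhantomJS versions actually differ and what the 403 page contains.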

Any ideas?

Answer

Hi Kzrster,

If the issue is related to fetching over SSL (HTTPS), it may be caused by our current version of Erlang, which returns errors for some languages and browsers.

Our team is working on an update of the Erlang version, which should be deployed in a matter of weeks.

Let us know if you find more information about the error you get.

Best,

Pablo

I didn't mean to post this in Ideas :(
I can't change it now.
