Answered

Crawlera API is not working with Python requests

I have been using Crawlera for 2 months. It was working fine, but now I get this error:

/home/vocso/.local/lib/python3.6/site-packages/urllib3/connection.py:362: SubjectAltNameWarning: Certificate for www.google.com has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)

Here is my code:

        proxy_host = "proxy.crawlera.com"


 proxy_port = "8010"

 proxy_auth = "<key>:" 

 proxies = {"https""https://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port),

 "http""http://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port)}

 photon_requests_session = requests.sessions.Session()

 photon_requests_session.verify = certifi.where()

 r = requests.get(url,proxies=proxies,verify="crawlera-ca.crt")

 soup = BeautifulSoup(r.text,'html5lib')

"


Best Answer

That's just a warning about urllib3 feature support, not an error. The request still goes through the proxy and gets a response, so it can be safely ignored.
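If the warning clutters your logs, it can also be silenced explicitly. A minimal sketch, assuming urllib3 1.x (where SubjectAltNameWarning still exists):

    import urllib3

    # Suppress only the subjectAltName fallback warning; other urllib3 warnings still show.
    urllib3.disable_warnings(urllib3.exceptions.SubjectAltNameWarning)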



Hi,

I am getting a response on localhost but not getting any response on the AWS server. The code is the same.

I got nothing on the server:
url = "https://www.google.com/search?q="+entry_keyword+"&gl="+entry_gl+"&start="+str(i*10)+"&as_qdr=y15"

proxy_host = "proxy.crawlera.com"

proxy_port = "8010"

proxy_auth = "<key>:"

proxies = {"https": "https://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port),

"http": "http://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port)}

photon_requests_session = requests.sessions.Session()

photon_requests_session.verify = certifi.where()

r = photon_requests_session.get(url,proxies=proxies,verify='crawlera-ca.crt')

print(r.text)

(Screenshot attached: whole.png, 95.7 KB)

Please add more verbose output, like in the sample at https://support.scrapinghub.com/solution/articles/22000203567-using-crawlera-with-python-requests (e.g. the response headers).

 

            url = "https://www.google.com/search?q="+entry_keyword+"&gl="+entry_gl+"&start="+str(i*10)+"&as_qdr=y15"

proxy_host = "proxy.crawlera.com"

proxy_port = "8010"

proxy_auth = "<key>:"

proxies = {"https": "https://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port),

"http": "http://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port)}

photon_requests_session = requests.sessions.Session()

photon_requests_session.verify = certifi.where()

r = photon_requests_session.get(url,proxies=proxies,verify='crawlera-ca.crt')

soup = BeautifulSoup(r.text,'html5lib')

print("""

 

Requesting [{}]

through proxy [{}]


Request Headers:

{}


Response Time: {}

Response Code: {}

Response Headers:

{}


""".format(url, proxy_host, r.request.headers, r.elapsed.total_seconds(),

r.status_code, r.headers, r.text))

(Screenshot attached: new.png, 88.7 KB)

It is showing bad proxy auth on the server but working perfectly on the local machine.

Are you sure you're using the same script both locally and on AWS?

Yes sir, I am 100% sure.

Bad Authentication is a client-side error. If the API key being used is the correct one, then the only thing I can think of is the installed python requests version, which might be causing problems.
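One quick way to confirm whether the two environments differ is to print the library versions on both machines and compare them. A small sketch (if they differ, upgrading with pip install --upgrade requests is the usual fix):

    import requests
    import urllib3

    # Compare these values between the local machine and the AWS server.
    print("requests:", requests.__version__)
    print("urllib3:", urllib3.__version__)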

Problem solved... the problem was the python requests version.

Thank you so much for the help.

No problem.
