signedup88 1 month ago in Crawlera • updated by Pablo Vaz (Support Engineer) 3 weeks ago

Hey guys, does anybody have experience connecting to Crawlera using WebDriver / Selenium?


I am running a project on a Windows PC.

Answer

Hey Signedup88,


Setting up proxy authentication directly in Selenium is not trivial, so a popular option is to run Polipo as a local proxy in front of Crawlera. Update the Polipo configuration file /etc/polipo/config to include your Crawlera credentials (if the file is not present, copy and rename config.sample from the Polipo source folder):

parentProxy = "proxy.crawlera.com:8010"
parentAuthCredentials = "<API key>:"
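
If you would rather generate those two lines than edit the file by hand, here is a minimal sketch in Python. The helper name polipo_parent_proxy_lines is hypothetical (it is not part of Polipo or Crawlera); it only formats the settings shown above, with the API key as the username and an empty password after the colon.

```python
def polipo_parent_proxy_lines(api_key, host="proxy.crawlera.com", port=8010):
    """Return the two Polipo settings that forward traffic to Crawlera.

    Hypothetical helper for illustration: it only builds the config text,
    it does not write /etc/polipo/config or restart Polipo.
    """
    if not api_key or ":" in api_key:
        # The key is used as the user name, so it must be non-empty
        # and must not contain the user:password separator.
        raise ValueError("api_key must be a non-empty string without ':'")
    return [
        'parentProxy = "{}:{}"'.format(host, port),
        # Trailing colon: API key as user name, empty password.
        'parentAuthCredentials = "{}:"'.format(api_key),
    ]


# Example: print the lines so they can be pasted into the config file.
for line in polipo_parent_proxy_lines("<API key>"):
    print(line)
```

Restart Polipo after changing the file so the new settings are picked up.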

For security reasons, the Polipo web interface displays the credentials value as (hidden). The next step is to point the Selenium automation script at the local Polipo proxy, e.g. for Python and Firefox:

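from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.firefox.options import Options

# Polipo listens locally and forwards requests to Crawlera
# using the credentials from its config file.
polipo_proxy = "localhost:8123"
proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': polipo_proxy,
    'ftpProxy' : polipo_proxy,
    'sslProxy' : polipo_proxy,
    'noProxy'  : ''
})
# Recent Selenium releases take the proxy through Options;
# older ones also accepted webdriver.Firefox(proxy=proxy).
options = Options()
options.proxy = proxy
driver = webdriver.Firefox(options=options)
driver.get("http://scrapinghub.com")
assert "Scrapinghub" in driver.title
elem = driver.find_element(By.CLASS_NAME, "portia")
actions = ActionChains(driver)
actions.click(on_element=elem)
actions.perform()
print("Clicked on Portia!")
driver.close()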
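
If the browser cannot load pages, it may help to first check that Polipo itself is forwarding requests, independently of Selenium. The following is a rough sketch using only the Python 3 standard library; the function names polipo_proxy_map and build_polipo_opener are made up for this example, and it assumes Polipo is listening on localhost:8123 as above.

```python
import urllib.request


def polipo_proxy_map(polipo_proxy="localhost:8123"):
    """Map the http and https schemes to the local Polipo proxy."""
    url = "http://" + polipo_proxy
    return {"http": url, "https": url}


def build_polipo_opener(polipo_proxy="localhost:8123"):
    """Build a urllib opener that sends its requests through Polipo."""
    handler = urllib.request.ProxyHandler(polipo_proxy_map(polipo_proxy))
    return urllib.request.build_opener(handler)


# Usage (needs Polipo running and network access):
# opener = build_polipo_opener()
# with opener.open("http://scrapinghub.com", timeout=30) as resp:
#     print(resp.status)
```

If this request fails too, the problem is most likely in the Polipo configuration or the Crawlera credentials rather than in the Selenium script.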

Best regards,


Pablo
