You can use Splash with Crawlera to render JavaScript and proxy all requests issued from Splash. This may be necessary if your crawler uses Splash heavily and the target website throttles or blocks requests from Splash.


How to do it?


You need to send your requests to Splash, and Splash must proxy its own requests via Crawlera.

This is best achieved by using the Splash /execute endpoint. You need to create a Lua script that tells Splash to use a proxy for its requests. Splash provides the splash:on_request callback function, which can be used for this purpose.


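function main(splash)
    -- Crawlera proxy settings: the API key is the username, the password stays empty.
    local host = "proxy.crawlera.com"
    local port = 8010
    local user = "<API key>"
    local password = ""
    local session_header = "X-Crawlera-Session"
    -- "create" is sent first; it is replaced below by the session ID Crawlera returns.
    local session_id = "create"

    -- Route every request issued by Splash through Crawlera.
    splash:on_request(function (request)
        request:set_header("X-Crawlera-UA", "desktop")
        request:set_header(session_header, session_id)
        request:set_proxy{host, port, username=user, password=password}
    end)

    -- Remember the session ID from the response headers for later requests.
    splash:on_response_headers(function (response)
        if response.headers[session_header] ~= nil then
            session_id = response.headers[session_header]
        end
    end)

    splash:go(splash.args.url)
    return splash:png()
end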


The previous example renders a page as a PNG image, and the binary content is returned in the HTTP response. The /execute endpoint reads the automation script from the lua_source parameter, which is a string containing the full script.


Example (using the Python requests library):


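# coding: utf-8
import requests

# Address of the Splash instance and the page to render (example values).
splash_server = 'http://192.168.99.100:8050'
url = 'http://example.com'  # placeholder: replace with the target page

# Read the Lua script shown above.
with open('crawlera-splash.lua') as lua:
    lua_source = lua.read()

splash_url = '{}/execute'.format(splash_server)
r = requests.post(
    splash_url,
    json={
        'lua_source': lua_source,
        'url': url,
    },
    timeout=100,
)
# Stop here if Splash reported an error instead of returning an image.
r.raise_for_status()

# Save the PNG returned by the script.
with open('crawlera-splash.png', 'wb') as fp:
    fp.write(r.content)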


Note: in the previous Python script, Splash was running at 192.168.99.100, the default IP address of the Docker machine hosting the Splash container. Replace it with the address of your own Splash instance.
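
Because the script returns raw PNG bytes, it can be useful to confirm that the response body really is an image before saving it. If the script fails, the body typically contains an error description rather than image data. The following is a minimal sketch of such a check; looks_like_png and save_png are hypothetical helper names, not part of Splash, Crawlera, or requests. It relies only on the fixed 8-byte signature that every PNG file starts with.

```python
# Hypothetical helpers for illustration; not part of Splash or Crawlera.
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # first 8 bytes of every PNG file


def looks_like_png(content):
    """Return True if the given bytes start with the PNG file signature."""
    return content.startswith(PNG_SIGNATURE)


def save_png(content, path):
    """Write content to path only if it looks like a PNG.

    Returns True if the file was written, False otherwise.
    """
    if not looks_like_png(content):
        return False
    with open(path, 'wb') as fp:
        fp.write(content)
    return True
```

With the script above, this could be used as save_png(r.content, 'crawlera-splash.png'), treating a False result as a sign that the response should be inspected rather than saved.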