cURL


Using the PHP binding for the libcurl library:


<?php

$ch = curl_init();

$url = 'https://httpbin.scrapinghub.com/get';
$proxy = 'proxy.crawlera.com:8010';
$proxy_auth = '<API KEY>:'; // API key as the username, empty password after the colon

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxy_auth);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_TIMEOUT, 180);
curl_setopt($ch, CURLOPT_CAINFO, '/path/to/crawlera-ca.crt'); //required for HTTPS
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 1); //required for HTTPS

$scraped_page = curl_exec($ch);

if($scraped_page === false)
{
    echo 'cURL error: ' . curl_error($ch);
}
else
{
    echo $scraped_page;
}

curl_close($ch);

?>


Please be sure to download the certificate provided in your Crawlera account's settings page (https://app.scrapinghub.com/o/<ORG_ID>/crawlera/setup) and set the correct path to the file in your script.


Refer to the curl_multi_exec function to take advantage of Crawlera's concurrency feature and process requests in parallel (within the concurrency limit of your Crawlera plan).
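
As a rough illustration of the parallel approach, the sketch below runs several requests at once with PHP's curl_multi_* functions, reusing the options from the single-request example. The helper name fetch_in_parallel and the shape of its return value are assumptions made for this example, not part of any Crawlera library. Keeping the number of URLs per call within your plan's concurrency limit is left to the caller.

```php
<?php

// Hypothetical helper (not a Crawlera API): fetch several URLs in parallel
// through the proxy. Returns one ['body' => ..., 'error' => ...] entry per input key.
function fetch_in_parallel(array $urls, string $proxy, string $proxy_auth, string $ca_path): array
{
    $results = [];
    if (count($urls) === 0) {
        return $results; // nothing to fetch
    }

    $mh = curl_multi_init();
    $handles = [];

    // One easy handle per URL, configured like the single-request example.
    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt_array($ch, [
            CURLOPT_URL => $url,
            CURLOPT_PROXY => $proxy,
            CURLOPT_PROXYUSERPWD => $proxy_auth,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_CONNECTTIMEOUT => 30,
            CURLOPT_TIMEOUT => 180,
            CURLOPT_CAINFO => $ca_path,      // assumed path to the downloaded certificate
            CURLOPT_SSL_VERIFYPEER => true,
        ]);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select failed; pause briefly to avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Read the result code of each finished transfer from the message queue.
    $codes = [];
    while (($info = curl_multi_info_read($mh)) !== false) {
        foreach ($handles as $key => $ch) {
            if ($info['handle'] === $ch) {
                $codes[$key] = $info['result'];
            }
        }
    }

    // Collect the body on success, or an error message otherwise.
    foreach ($handles as $key => $ch) {
        $code = $codes[$key] ?? null;
        if ($code === CURLE_OK) {
            $results[$key] = ['body' => curl_multi_getcontent($ch), 'error' => null];
        } else {
            $message = $code === null ? 'transfer did not complete' : curl_strerror($code);
            $results[$key] = ['body' => null, 'error' => $message];
        }
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}
```

A call might look like fetch_in_parallel(['https://httpbin.scrapinghub.com/get', 'https://httpbin.org/headers'], 'proxy.crawlera.com:8010', '<API KEY>:', '/path/to/crawlera-ca.crt'), after which each entry can be checked for a non-null 'error' before its 'body' is used.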


Guzzle


Using Guzzle, a PHP HTTP client, within the Symfony framework:


<?php

namespace AppBundle\Controller;

use GuzzleHttp\Client;
use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;
use Symfony\Component\HttpFoundation\Response;

class CrawleraController extends Controller
{
    /**
     * @Route("/crawlera", name="crawlera")
     */
    public function crawlAction()
    {
        $url = 'https://twitter.com';
        $client = new Client();
        // Route the request through Crawlera: API key as the proxy username, empty password.
        $body = $client->get($url, ['proxy' => 'http://<API KEY>:@proxy.crawlera.com:8010'])->getBody();

        return new Response(
            '<html><body> '.$body.' </body></html>'
        );
    }
}


A standalone Guzzle example that also sends custom request headers:


<?php

use GuzzleHttp\Client as GuzzleClient;

$proxy_host = 'proxy.crawlera.com';
$proxy_port = '8010';
$proxy_user = '<API KEY>';
$proxy_pass = '';
$proxy_url = "http://{$proxy_user}:{$proxy_pass}@{$proxy_host}:{$proxy_port}";

$url = 'https://httpbin.org/headers';

$guzzle_client = new GuzzleClient();
$res = $guzzle_client->request('GET', $url, [
    'proxy' => $proxy_url,
    'headers' => [
        'X-Crawlera-Cookies' => 'disable',
        'Accept-Encoding' => 'gzip, deflate, br',
    ]
]);

echo $res->getBody();

?>
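
The cURL example above passes the Crawlera certificate through CURLOPT_CAINFO for HTTPS requests. With Guzzle, the closest equivalent is the verify request option, which accepts a path to a CA bundle. The sketch below assumes the certificate was saved to /path/to/crawlera-ca.crt; the helper name crawlera_request_options is hypothetical and exists only to keep the example self-contained.

```php
<?php

// Hypothetical helper (not part of Guzzle or Crawlera): builds Guzzle request
// options for proxied HTTPS requests.
function crawlera_request_options(string $api_key, string $ca_path): array
{
    return [
        // API key as the proxy username, with an empty password.
        'proxy' => "http://{$api_key}:@proxy.crawlera.com:8010",
        // Guzzle's `verify` option: path to the CA bundle used to verify the peer.
        'verify' => $ca_path,
    ];
}

// Usage, assuming Guzzle is installed via Composer and autoloaded:
// $client = new GuzzleHttp\Client();
// $res = $client->request(
//     'GET',
//     'https://httpbin.org/headers',
//     crawlera_request_options('<API KEY>', '/path/to/crawlera-ca.crt')
// );
// echo $res->getBody();
```

The returned array can be merged with other request options, such as the headers shown in the previous example.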
