Portia Basics

Portia FAQ
Does Portia support HTTP authentication? No, but it is on the roadmap. In the meantime, you will have to write a Scrapy spider if you need HTTP authentica...
Thu, 23 Mar, 2017 at 10:23 PM
Learn Portia (video tutorials)
Meet Portia, a visual scraping tool that lets you point and click on the elements you want to scrape from a page. Check out the videos below to see Por...
Mon, 3 Apr, 2017 at 5:30 PM
Using Portia - The complete beginner's guide
In this tutorial, we are going to set up Portia and develop a project to scrape items from books.toscrape.com. Then we are going to export these items in CSV...
Fri, 24 Mar, 2017 at 10:41 PM
How do you extract data from a list of URLs?
With Portia, you can easily set a list of URLs as starting pages. Once in Portia, just enter any of the pages you want to scrape. Then, create a New...
Fri, 24 Mar, 2017 at 10:43 PM
Annotations and Data extraction
Portia provides two indicators about the data that is being extracted: the colorful counts next to each annotation on the left panel and the Extracted Items...
Tue, 25 Apr, 2017 at 9:08 PM
Using Regular expressions and Query cleaner Addon
In this article we'll cover two features you can use to help improve the efficiency of your spiders by restricting which pages are crawled and preven...
Fri, 24 Mar, 2017 at 10:44 PM
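As a rough illustration of restricting which pages are crawled with regular expressions, the sketch below filters a list of candidate links against a follow pattern. It is not Portia's own implementation: the `FOLLOW_PATTERNS` list, the `should_follow` helper, and the example URL paths are all hypothetical, chosen only to show the idea with Python's standard `re` module.

```python
import re

# Hypothetical follow pattern: only numbered catalogue listing pages qualify.
FOLLOW_PATTERNS = [re.compile(r"/catalogue/page-\d+\.html$")]


def should_follow(url, patterns=FOLLOW_PATTERNS):
    """Return True if the URL matches at least one follow pattern."""
    return any(pattern.search(url) for pattern in patterns)


# Illustrative candidate links (assumed paths, not taken from the article).
urls = [
    "http://books.toscrape.com/catalogue/page-2.html",
    "http://books.toscrape.com/catalogue/category/books/travel_2/index.html",
    "http://books.toscrape.com/catalogue/page-10.html",
]

# Keep only the links a crawler restricted by the pattern would visit.
followed = [url for url in urls if should_follow(url)]
print(followed)
```

Anchoring the pattern with `$` keeps it from matching longer paths that merely contain a listing-page name, which is usually what you want when narrowing a crawl.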
How to handle pagination in Portia
Portia can follow every link in a domain while looking for items, but this can be inefficient and is sometimes inadequate for our purposes. Sometimes ...
Thu, 4 May, 2017 at 1:44 AM
Scraping duplicate products
If you wish to scrape duplicate products using Portia, you need to add a new setting, SLYDUPEFILTER_ENABLED=0, as shown below: Hope you find this info...
Fri, 1 Sep, 2017 at 5:48 PM