I need to write a new spider, and it needs to:
- Download zip files from a website, about 3 GB each.
- Unzip each downloaded file, which yields many XML files.
- Parse the XML and extract the information I need into an item or MySQL tables.
But these steps raise some questions:
- Where should I store the downloaded files? Amazon S3?
- How can I unzip a file once it is stored in S3?
- If a file in S3 is very large, say 3 GB, how can I open it from Scrapinghub?
- Can I use FTP instead of Amazon S3 when the file is 3 GB?