Answered

Rejected message because it was too big: ITM {

Apparently, there is a 1 MB limitation on serialized items. Is there a way to remove the limitation? I need around 6 MB at least.


Best Answer

There's no way to remove the limitation. Depending on your use case, there are a few workarounds. One is to split your items into several smaller ones, for example when an item accumulates data from a paginated list. Another is to enable the Page Storage addon and access the raw HTML pages from Collections (if you are storing raw HTML as an item). A third is to store your items in Amazon S3 using FeedExport.
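For the first option, here is a rough sketch of how an oversized item could be split into several smaller ones. It assumes the bulk of the item is one long list, and that JSON size is a reasonable stand-in for the serialized size the platform checks (it may not be exact). The names `split_item`, `serialized_size`, the `rows` field, the `part` counter and the `MAX_ITEM_BYTES` budget are all made up for illustration; they are not part of Scrapy or the platform.

```python
import json
from typing import Any, Dict, Iterator, List

# Assumed budget, kept a little under the 1 MB limit discussed above.
MAX_ITEM_BYTES = 1_000_000


def serialized_size(obj: Any) -> int:
    """Size in bytes of obj when dumped as JSON (an approximation of the real limit check)."""
    return len(json.dumps(obj).encode("utf-8"))


def split_item(base: Dict[str, Any], rows: List[Any], field: str = "rows",
               max_bytes: int = MAX_ITEM_BYTES) -> Iterator[Dict[str, Any]]:
    """Yield copies of base, each carrying a slice of rows estimated to fit in max_bytes."""
    # Size of an item with no rows, plus some slack for multi-digit part numbers.
    overhead = serialized_size({**base, field: [], "part": 0}) + 16
    chunk: List[Any] = []
    size = overhead
    part = 0
    for row in rows:
        # +2 covers the ", " separator json.dumps puts between list elements.
        row_size = serialized_size(row) + 2
        if chunk and size + row_size > max_bytes:
            yield {**base, field: chunk, "part": part}
            chunk, size, part = [], overhead, part + 1
        chunk.append(row)
        size += row_size
    if chunk:
        yield {**base, field: chunk, "part": part}
```

In a spider callback this could be used as `yield from split_item({"url": response.url}, accumulated_rows)`. Note that a single row larger than the budget is still emitted on its own, so it would need separate handling.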


Answer

From time to time, my scraper is not able to parse the HTML.

I am trying to get access to the raw HTML. I have enabled the Page Storage addon and I raise an error, but I get a warning that says "Page not saved, body too large: ".

Any workaround?

