
Pocket import fails with memory error during pdf parsing #7460

Open
R-Rudolf opened this issue May 3, 2024 · 0 comments

R-Rudolf commented May 3, 2024

Environment

  • Version: 2.6.9
  • Installation: Container execution (host OS Fedora 12, container runtime Podman 4.3.1)
  • PHP version:
    PHP 8.1.27 (cli) (built: Feb 21 2024 14:48:59) (NTS)
    Copyright (c) The PHP Group
    Zend Engine v4.1.27, Copyright (c) Zend Technologies
  • OS: (within the container) Alpine Linux v3.18.6
  • Database: SQLite
  • Parameters: Default, only domain name changed.

What steps will reproduce the bug?

I initiated a Pocket import. After I click "Authorize" on the getpocket domain, the page loads, and after a while the following error message appears on the self-hosted domain:

500: Internal Server Error
Error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 15816539 bytes)
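For context, the exhausted limit of 134217728 bytes is exactly 128 MiB, which matches PHP's common default `memory_limit`; a quick arithmetic check:

```shell
# 134217728 bytes expressed in MiB (1 MiB = 1024 * 1024 bytes)
echo $((134217728 / 1024 / 1024))
```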

Detailed error logs from within the container

$ tail var/logs/prod.log

[2024-05-03T13:09:35.847268+00:00] httplug.INFO: Sending request: GET http://arxiv.org/pdf/2305.16291 1.1 {"request":{"GuzzleHttp\\Psr7\\Request":[]},"uid":"6634e20fced934.16553197"} []
[2024-05-03T13:09:36.547080+00:00] httplug.INFO: Received response: 200 OK 1.1 {"milliseconds":700,"uid":"6634e20fced934.16553197"} []
[2024-05-03T13:09:36.664665+00:00] graby.INFO: Data fetched: array{"effective_url":"http://arxiv.org/pdf/2305.16291","body":"(only length for debug): 18830859","headers":{"connection":"keep-alive","content-length":"18830859","content-type":"application/pdf","etag":"\"sha256:4ad0e876edf36c97290bf0a5431b28771580f39d44ebfefa463f1315387d0be9\"","last-modified":"Fri, 20 Oct 2023 01:18:19 GMT","access-control-allow-origin":"*","cache-control":"max-age=86400","content-disposition":"inline; filename=\"2305.16291v2.pdf\"","x-cloud-trace-context":"f31e38a84b3f8a29400e49cfd015b2a5;o=1","server":"Google Frontend","via":"1.1 google, 1.1 google, 1.1 varnish, 1.1 varnish","accept-ranges":"bytes","age":"6300","date":"Fri, 03 May 2024 13:09:35 GMT","x-served-by":"cache-lga21930-LGA, cache-vie6336-VIE","x-cache":"HIT, HIT","x-timer":"S1714741776.881680,VS0,VE1"},"status":200} {"data":{"effective_url":"http://arxiv.org/pdf/2305.16291","body":"(only length for debug): 18830859","headers":{"connection":"keep-alive","content-length":"18830859","content-type":"application/pdf","etag":"\"sha256:4ad0e876edf36c97290bf0a5431b28771580f39d44ebfefa463f1315387d0be9\"","last-modified":"Fri, 20 Oct 2023 01:18:19 GMT","access-control-allow-origin":"*","cache-control":"max-age=86400","content-disposition":"inline; filename=\"2305.16291v2.pdf\"","x-cloud-trace-context":"f31e38a84b3f8a29400e49cfd015b2a5;o=1","server":"Google Frontend","via":"1.1 google, 1.1 google, 1.1 varnish, 1.1 varnish","accept-ranges":"bytes","age":"6300","date":"Fri, 03 May 2024 13:09:35 GMT","x-served-by":"cache-lga21930-LGA, cache-vie6336-VIE","x-cache":"HIT, HIT","x-timer":"S1714741776.881680,VS0,VE1"},"status":200}} []
[2024-05-03T13:09:36.940533+00:00] request.CRITICAL: Uncaught PHP Exception Symfony\Component\ErrorHandler\Error\OutOfMemoryError: "Error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 15816539 bytes)" at /var/www/wallabag/vendor/smalot/pdfparser/src/Smalot/PdfParser/RawData/FilterHelper.php line 244 {"exception":"[object] (Symfony\\Component\\ErrorHandler\\Error\\OutOfMemoryError(code: 0): Error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 15816539 bytes) at /var/www/wallabag/vendor/smalot/pdfparser/src/Smalot/PdfParser/RawData/FilterHelper.php:244)"} []

I double-checked: the container itself had enough memory (2 GB), and it was using only around ~120 MB before it crashed, so the limit being hit is PHP's own `memory_limit`, not the container's.

I would expect that if a single article fails to import, the import would continue and perhaps list the failures at the end.
Even better would be for the process to use more memory and complete the import without crashing.
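As a possible workaround (not a fix for the per-article failure handling), the PHP limit can be raised with an ini drop-in. The paths and the 512M value here are assumptions for illustration; the actual scan directory depends on how PHP is built in the container image:

```shell
# Assumption: this Alpine-based image scans a conf.d directory
# (e.g. /etc/php81/conf.d) for extra ini files.
# Stage a drop-in that raises memory_limit from the 128M default:
echo 'memory_limit = 512M' > /tmp/zz-memory.ini
cat /tmp/zz-memory.ini   # then copy it into the container's conf.d directory
```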
