printPDF to return a stream #216

Closed
jifeon opened this issue Jul 25, 2017 · 6 comments

Comments

@jifeon

jifeon commented Jul 25, 2017

Currently printToPDF returns a base64-encoded string. That works well in general, but we generate fairly large PDFs (40 MB at minimum) with many pages of high-resolution images. Keeping that much data in memory causes performance and memory issues in Node. It would be nice to receive the PDF content as a stream or as a file.

We are currently considering a workaround that prints page by page, but it dramatically increases the complexity of the service. So my question is: are there any plans to support streams for printToPDF?
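
To make the memory concern concrete, here is a rough sketch of how large the base64 string gets for a PDF of that size. The helper name `base64Length` is hypothetical, and reading "40MBs" as 40 MiB is an assumption; the formula itself is the standard size of padded base64 output.

```javascript
// Length of the padded base64 string that encodes `byteLength` raw bytes.
function base64Length(byteLength) {
    return 4 * Math.ceil(byteLength / 3);
}

// Assumption: the "40MBs" mentioned above is taken as 40 MiB.
const pdfBytes = 40 * 1024 * 1024;
const encodedChars = base64Length(pdfBytes);

console.log(`${pdfBytes} raw bytes -> ${encodedChars} base64 characters`);
```

The encoded string is about a third larger than the PDF itself, and decoding it into a Buffer keeps a second copy alive until the string is released.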

@cyrus-and
Owner

While I agree that it would be a nice thing to have, this feature request is off topic here. Ping the Google Group, the official protocol repo or even file an upstream issue.

@jifeon
Author

jifeon commented Jul 25, 2017

@cyrus-and thx for quick response!

@nyroDev

nyroDev commented Feb 18, 2022

There is now an experimental transferMode option in the printToPDF method that accepts the "ReturnAsStream" value.
How can we consume the returned stream in Node?

@cyrus-and
Owner

@nyroDev something along the lines of:

const CDP = require('chrome-remote-interface');
const fs = require('fs');

async function streamPdfToFile(url, file, chunkSize = undefined /* means auto */) {
    const client = await CDP();

    try {
        const {IO, Page} = client;

        console.log(`Navigating to ${url}`);

        await Page.enable();
        await Page.navigate({url});
        await Page.loadEventFired();

        console.log(`Page loaded, requesting the PDF stream`);

        const {stream: handle} = await Page.printToPDF({
            transferMode: 'ReturnAsStream'
        });

        let fileStream;
        while (true) {
            const {base64Encoded, data, eof} = await IO.read({
                handle,
                size: chunkSize
            });

            if (!fileStream) {
                // the encoding of the first chunk is assumed to hold for all chunks
                fileStream = fs.createWriteStream(file, {
                    encoding: base64Encoded ? 'base64' : 'binary'
                });
            }

            // write before checking eof, in case the last read still carries data
            if (data) {
                console.log(`Processing the next chunk of ${data.length} (encoded) characters`);

                fileStream.write(data);
            }

            if (eof) {
                console.log(`PDF stream finished, saved to ${file}`);

                await IO.close({handle});
                // end() flushes pending writes before closing the file
                fileStream.end();
                break;
            }
        }
    } finally {
        await client.close();
    }
}

streamPdfToFile('https://nodejs.org/api/fs.html', '/tmp/out.pdf', 1 << 20);
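
Since the original request was for a stream rather than a file, the same read loop can be wrapped in an async generator and exposed as a Node Readable. This is only a sketch: `readProtocolStream` and `pdfReadable` are hypothetical names, `IO` is assumed to be any object with the same `read({handle, size})` and `close({handle})` methods used above, and writing the data that arrives together with `eof` is a defensive assumption.

```javascript
const {Readable} = require('stream');

// Yields decoded Buffers until the protocol reports eof, then closes the handle.
// `IO` is assumed to expose read({handle, size}) and close({handle}) as above.
async function* readProtocolStream(IO, handle, size) {
    try {
        while (true) {
            const {base64Encoded, data, eof} = await IO.read({handle, size});

            // emit data first: the final read may still carry bytes (assumption)
            if (data) {
                // same encoding choice as in the snippet above
                yield Buffer.from(data, base64Encoded ? 'base64' : 'binary');
            }

            if (eof) {
                return;
            }
        }
    } finally {
        // runs on normal completion and when the consumer stops early
        await IO.close({handle});
    }
}

// Wraps the generator in a Readable so it can be piped anywhere.
function pdfReadable(IO, handle, size) {
    return Readable.from(readProtocolStream(IO, handle, size));
}
```

With a real client this could be used as `pdfReadable(IO, handle, 1 << 20).pipe(fs.createWriteStream(file))`, keeping the client open until the write stream has finished.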

@nyroDev

nyroDev commented Feb 19, 2022

@cyrus-and thank you so much for your quick and detailed response, it works perfectly!

@nyroDev

nyroDev commented Mar 26, 2024

@cyrus-and Please note that in some edge cases (particularly with large images), leaving chunkSize undefined produces a truncated (corrupted) PDF file.

After reading this comment, I updated my code to set the default chunk size to 16384*2, and so far it's working great!
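
A truncated file like the one described above can be caught with a cheap sanity check after writing. The sketch below is a heuristic, not a validator: it relies on the fact that a PDF starts with `%PDF-` and ends with a `%%EOF` marker, and it assumes the marker sits within the last 1024 bytes. The function name `fileLooksLikeCompletePdf` and the `tailSize` parameter are hypothetical.

```javascript
const fs = require('fs');

// Heuristic check: "%PDF-" at the start and "%%EOF" near the end of the file.
// Only the head and tail are read, so a large PDF is never loaded whole.
function fileLooksLikeCompletePdf(file, tailSize = 1024) {
    const fd = fs.openSync(file, 'r');
    try {
        const {size} = fs.fstatSync(fd);
        if (size < 5) {
            return false;
        }

        const head = Buffer.alloc(5);
        fs.readSync(fd, head, 0, head.length, 0);

        const tail = Buffer.alloc(Math.min(tailSize, size));
        fs.readSync(fd, tail, 0, tail.length, size - tail.length);

        return head.toString('latin1') === '%PDF-' &&
            tail.toString('latin1').includes('%%EOF');
    } finally {
        fs.closeSync(fd);
    }
}
```

A file that passes can still be damaged in the middle, so this only flags the obvious case where the stream stopped early.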
