Invalid String Length When Saving PDF #1724

Closed
mackersD opened this issue Apr 26, 2018 · 9 comments · May be fixed by #3646

@mackersD

mackersD commented Apr 26, 2018

Thank you for submitting an issue to jsPDF. Please read carefully.

Are you using the latest version of jsPDF?
Yes (1.3.5)

Have you tried using jspdf.debug.js?
Yes

Steps to reproduce

  1. Add images to a jsPDF object that would generate a file string at around 384859244 characters. (~ 130 pages with full page canvas images generated via html2canvas)
  2. Save File.
  3. Observe Array.join error due to string being too long. (jspdf.debug.js; line 1051)

What I saw
The save function fails without saving the file, and the console shows an Invalid string length error:

index.js:2177 Error in function Array.join (native): Invalid string length RangeError: Invalid string length
    at Array.join (native)
    at buildDocument (jspdf.debug.js:1051)
    at getArrayBuffer (jspdf.debug.js:1073)
    at getBlob (jspdf.debug.js:1083)
    at Object.<anonymous> (jspdf.debug.js:1112)
    at Object.__safeCallWrapper [as output] (jspdf.debug.js:621)
    at Object.API.save (jspdf.debug.js:2191)

What I expected
I expected the pdf file to save.

@mackersD
Author

mackersD commented Apr 27, 2018

Below is the code that I'm running when this error is generated:

function saveToPDF(tiles) {
  var pdfDoc = new jsPDF({
    orientation: "l",
    unit: "pt",
    format: [612, 792] // 72pts per inch
  })

  var margins = {
    top: 36,
    bottom: 36,
    left: 36,
    width: 720,
    height: 540
  }

  var pdfHTML = document.getElementsByClassName("pdfPreview")

  var pageLength = pdfHTML.length
  var pageCanvases = new Array(pageLength)
  var canvasOperations = []
  for(var i = 0; i < pdfHTML.length; i++) {
    (j => {
      var createCanvas = html2canvas(pdfHTML[j], {
        async: true
      }).then(canvas => {
        pageCanvases[j] = canvas
      }, error => {
        alert(error)
      })
      canvasOperations.push(createCanvas)
    })(i)
  }

  Promise.all(canvasOperations).then(() => {
    pageCanvases.forEach((canvas, idx) => {
      if(idx > 0) {
        pdfDoc.addPage()
      }
      var img = canvas.toDataURL()
      pdfDoc.addImage(img, "jpg", margins.left, margins.top, margins.width, margins.height)
    })
    pdfDoc.save("file.pdf")    
  })
}

@Custardcs

I had this issue as well. Our reports generate 3000+ pages with images attached, and we were running out of memory. As it stands now, we have managed to fix the issue, but we are not using html2canvas.

@Uzlopak
Collaborator

Uzlopak commented May 27, 2018

The actual problem is that a JavaScript string has a maximum length. The line mentioned in the stack trace is content.join('\n'), which tries to build the whole document as a single string.

We cannot change the string length limit of the JavaScript engine. We will probably have to reduce the memory use and string length by using compression.

@dalalsurender

> I had this issue as well. Our reports generate 3000+ pages with images attached, and we were running out of memory. As it stands now, we have managed to fix the issue, but we are not using html2canvas.

Could you please tell us which API you are using and share a link to it? Thanks.

@qinlili23333

I got the same error when making a PDF of about 600 pages with an image of about 1.5M on every page.

@steve231293

Hello there,
Does anyone have a solution for this problem? Could you please share it with me?

@leviznull

@steve231293 I know this is late but it might still be useful to others.

Cause

As we've found out, the library is able to generate very large PDFs but crashes when it saves them, because it tries to concatenate an array of lines into a single string.

Node uses V8, which caps string length at roughly 2^29 characters (about 512M) on 64-bit platforms. That is the maximum it can support, given that any heap object's size in bytes must fit into a Smi (which are now 31-bit on all 64-bit platforms, with or without pointer compression).
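As a rough illustration of the cap described above, Node exposes the runtime's limit as `buffer.constants.MAX_STRING_LENGTH`. The sketch below computes the length a join would produce without building the string. The helper names `joinedLength` and `wouldOverflow` are made up for this example and are not part of jsPDF.

```javascript
const { constants } = require("buffer");

// Largest string length this runtime allows, in UTF-16 code units.
const MAX = constants.MAX_STRING_LENGTH;

// Length that parts.join(sep) would have, computed without building the string.
function joinedLength(parts, sep = "\n") {
  let total = 0;
  for (const part of parts) total += part.length;
  return total + sep.length * Math.max(parts.length - 1, 0);
}

// True when the join would throw "RangeError: Invalid string length".
function wouldOverflow(parts, sep = "\n", limit = MAX) {
  return joinedLength(parts, sep) > limit;
}

console.log(`max string length in this runtime: ${MAX}`);
```

A check like this could run before calling `save`, so the failure is reported with a clear message instead of a RangeError deep inside the library.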

Fix

To fix this we patched the library to return a string array instead of a single string.

To apply the patch automatically, add "postinstall": "npx patch-package" to the "scripts" section of your package.json file.

I've attached the patch file for jspdf version 2.5.1 here jspdf+2.5.1.zip. The patch is so big because I had to clone and rebuild the whole library in order to also patch the minified code.

Or, if you want to apply the patch manually, here is the diff:

diff --git a/src/jspdf.js b/src/jspdf.js
index fa3dd923..39418ae3 100644
--- a/src/jspdf.js
+++ b/src/jspdf.js
@@ -2975,7 +2975,7 @@ function jsPDF(options) {
     }
   });
 
-  var buildDocument = (API.__private__.buildDocument = function() {
+  var buildDocument = (API.__private__.buildDocument = function(raw = false) {
     resetDocument();
     setOutputDestination(content);
 
@@ -2998,6 +2998,14 @@ function jsPDF(options) {
 
     setOutputDestination(pages[currentPage]);
 
+    if (raw) {
+      for (let index = 0; index < content.length - 1; index++) {
+        content[index] += "\n";
+      }
+
+      return content;
+    }
+
     return content.join("\n");
   });
 
@@ -3048,6 +3056,8 @@ function jsPDF(options) {
     switch (type) {
       case undefined:
         return buildDocument();
+      case "stringArray":
+        return buildDocument(true);
       case "save":
         API.save(options.filename);
         break;
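To see what the `raw` branch in the diff above returns, here is a minimal standalone sketch. `toChunks` is a hypothetical name, and unlike the patch it returns a new array instead of mutating `content` in place. The sample lines are made up.

```javascript
// Append "\n" to every line except the last, so that concatenating the
// chunks in order gives the same bytes as content.join("\n").
function toChunks(content) {
  return content.map((line, i) => (i < content.length - 1 ? line + "\n" : line));
}

const content = ["%PDF-1.3", "1 0 obj", "endobj", "%%EOF"];
const chunks = toChunks(content);

console.log(chunks.join("") === content.join("\n")); // true
```

Because each chunk already carries its own line ending, a consumer can write the chunks one after another and never needs the full document as one string.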

Using the fix

We then turn this string array into a readable stream and stream it directly to S3, which bypasses the string size limitation. Note that JavaScript strings are Unicode, but the jsPDF output is ASCII (one byte per character), so streaming the strings with a Unicode encoding would corrupt the data. Instead, we convert each string to a byte array before pushing it.

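const { Readable, PassThrough } = require("stream");
// Assumption: Upload and S3 come from the AWS SDK v3 packages
const { Upload } = require("@aws-sdk/lib-storage");
const { S3 } = require("@aws-sdk/client-s3");

const pdfOutput = doc.output('stringArray');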

const readableStream = new Readable({
  read() {
    const chunk = pdfOutput.shift();
    if (chunk) {
      // convert string to bytes to keep PDF ascii encoding
      // if we don't do this the PDF will break and show blank
      this.push(Uint8Array.from(chunk, (x) => x.charCodeAt(0)));
    } else {
      this.push(null);
    }
  }
});

const passThrough = new PassThrough();

const uploadToS3 = new Upload({
  client: new S3({
    ...
  }),
  params: {
    ...
    Body: passThrough
  },
});

readableStream.on('data', (chunk) => {
  // S3 streaming also breaks unless we force ascii encoding here as well
  passThrough.write(chunk, 'ascii');
});

readableStream.on('end', () => {
  passThrough.end();
});

await uploadToS3.done();
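The byte conversion used in the stream above can be checked in isolation. This is a small sketch, assuming every character in the chunk has a code between 0 and 255. `binaryStringToBytes` is a name invented here, and the sample string is made up.

```javascript
// Convert a binary string to bytes, one byte per character.
// Char codes above 255 would wrap modulo 256 inside the Uint8Array.
function binaryStringToBytes(str) {
  return Uint8Array.from(str, (ch) => ch.charCodeAt(0));
}

const sample = "\u00e9\u00ff%PDF"; // two characters above 127, then plain ASCII
const bytes = binaryStringToBytes(sample);
const latin1 = Buffer.from(sample, "latin1");
const utf8 = Buffer.from(sample, "utf8");

console.log(bytes.length, latin1.length, utf8.length); // 6 6 8
```

The UTF-8 buffer is two bytes longer because each character above 127 becomes two bytes, which is the kind of corruption the comment in the stream code warns about.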

Testing the fix

  1. Download any 4k image (the larger the better) and save it as test.jpg next to the script
  2. Run the Node script below

This script begins generating a PDF with 400 pages, encoding a 4k image on each page. We also add some text over the image to ensure that the images cannot be re-referenced. Since each image is embedded as base64 we will quickly hit the JavaScript string size limit and it'll crash on the save step. You might need to tweak the number of pages (400) to align with the image size etc.
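For tuning the page count, a back-of-the-envelope estimate may help. The sketch below assumes each page adds roughly the same number of characters to the output and ignores fixed PDF overhead. Both are simplifications, and `pagesToExceed` is a hypothetical helper.

```javascript
const { constants } = require("buffer");

// Smallest page count whose total output would be longer than `limit`,
// assuming every page contributes about `charsPerPage` characters.
function pagesToExceed(charsPerPage, limit = constants.MAX_STRING_LENGTH) {
  return Math.floor(limit / charsPerPage) + 1;
}

// Figures from the original report: about 384859244 characters over about 130 pages.
const perPage = Math.round(384859244 / 130);
console.log(perPage); // 2960456
console.log(pagesToExceed(perPage));
```

The second number depends on the runtime's limit, so it will differ between Node versions and browsers.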

Here is the messy script

const fs = require("fs");
const process = require("process");
const sharp = require("sharp");
const sizeOf = require("image-size");
const { Readable } = require("stream");
const { jsPDF } = require("jspdf");
const file = fs.readFileSync("test.jpg");
const dimensions = sizeOf(file);

function* gen(start, end) {
  for (let i = start; i <= end; i++) yield i;
}

function logIt(num) {
  console.log(
    `${num} ${Object.entries(process.memoryUsage()).reduce(
      (carry, [key, value]) => {
        return `${carry}${key}:${
          Math.round((value / 1024 / 1024) * 100) / 100
        }MB    `;
      },
      ""
    )}`
  );
}

async function runTest() {
  try {
    let doc = new jsPDF();

    for await (const num of gen(1, 400)) {
      // render a number over the image to increase file size
      // prevents JSPDF reusing the same image reference
      await sharp(file)
        .composite([
          {
            input: Buffer.from(` <svg width="${dimensions.width}" height="${dimensions.height}"> <style> .title { fill: #fff; font-size: 1000px; font-weight: bold;} </style> <text x="50%" y="50%" text-anchor="middle" class="title">#${num}</text> </svg> `),
            top: 0,
            left: 0,
          },
        ])
        .toBuffer({
          resolveWithObject: true,
        })
        .then(({ data, info }) => {
          doc.setFontSize(40);
          doc.text(`test page ${num}`, 35, 25);
          doc.addImage(
            `data:image/${info.format};base64,${data.toString("base64")}`,
            "JPEG",
            15,
            40,
            180,
            180
          );
          doc.addPage();
        });

      logIt(num);
    }
    
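    // broken version - should throw an error
    // (the throw jumps straight to the catch block, so comment out the next
    // two lines when you want to exercise the patched version below)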
    const buffer = Buffer.from(doc.output("arraybuffer"));
    fs.writeFileSync("./x.pdf", buffer, {encoding: "ascii"});
    
    // patched version - should be working
    const readableStream = Readable.from(doc.output("stringArray"));
    const writableStream = fs.createWriteStream("./x.pdf", {flags: 'a', encoding: "ascii"});
    readableStream.on("data", (row) => {
      writableStream.write(row);
    });

  } catch (e) {
    console.log(e);
  }
}

runTest();
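If streaming is not needed, the chunks can also be written to disk synchronously, one at a time. This is a sketch of an alternative to the stream in the script above, not the code from the patch. `writeChunksSync` is a hypothetical helper, and the demo chunks stand in for what `doc.output("stringArray")` would return from the patched build.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Write binary-string chunks to a file one by one, so the whole document
// never has to exist as a single string.
function writeChunksSync(filePath, chunks) {
  const fd = fs.openSync(filePath, "w");
  try {
    for (const chunk of chunks) {
      // latin1 maps each character (code 0..255) to exactly one byte
      fs.appendFileSync(fd, Buffer.from(chunk, "latin1"));
    }
  } finally {
    fs.closeSync(fd);
  }
}

// Made-up stand-in data, including characters above 127.
const demoChunks = ["%PDF-1.3\n", "\u00e2\u00e3\u00cf\u00d3\n", "%%EOF"];
const demoPath = path.join(os.tmpdir(), "chunked-demo.pdf");
writeChunksSync(demoPath, demoChunks);

console.log(fs.statSync(demoPath).size === demoChunks.join("").length); // true
```

The file is opened with the "w" flag, so each run starts from an empty file instead of appending to an earlier result.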

@laucheukhim

@leviznull You are aware that there are 2 pending pull requests for this bug (#3625, #3646). There are already multiple solutions for this.

For now, you can just use my fork:

npm install https://github.com/laucheukhim/jsPDF.git

And do:

const pdfOutput = doc.output('blob');

@steve231293

Thank you @leviznull and @laucheukhim
