Initial PDF 1.4 support (embedded in core) #920

Flamenco · 2016-10-23T00:16:39Z

This commit paves the way for transparency, blending, masking, and grouping. The API is far from complete, and the implementation could use some optimizing in the generated PDF source.

There are many internal workarounds for the current pdf generation scheme, as it generates the PDF on the fly.

Setting pdfVersion: '1.4' in the constructor options enables the feature.
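As a rough illustration of this opt-in, the sketch below resolves the version from a constructor-style options object and builds the header line that the diff emits with out('%PDF-' + pdfVersion). The helper names resolvePdfVersion and pdfHeader are hypothetical and not part of jsPDF; the fallback to 1.3 is an assumption based on the discussion further down.

```javascript
// Hypothetical helpers, not part of jsPDF: they only mirror the opt-in idea.
function resolvePdfVersion(options) {
  // Assumption: fall back to '1.3' unless the caller explicitly asks for '1.4'.
  return options && options.pdfVersion === '1.4' ? '1.4' : '1.3';
}

function pdfHeader(options) {
  // Mirrors out('%PDF-' + pdfVersion) from the diff.
  return '%PDF-' + resolvePdfVersion(options);
}

console.log(pdfHeader({ pdfVersion: '1.4' })); // %PDF-1.4
console.log(pdfHeader({}));                    // %PDF-1.3
```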

Flamenco · 2016-10-23T00:18:52Z

jean343 · 2018-02-07T17:01:41Z

Can we have an update on this pull request?
It's a really useful feature to have.
Additionally, I am working on adding linear gradients to context2d, and I need this as a prerequisite.
master...jean343:background-gradient

As for the pull request itself, is there a need to specify the version? Could we just enable the code by default?

Thanks,

Flamenco · 2018-02-20T12:09:44Z

@jean343 The generated PDF needs v1.4, and that needs to be set in the core JSPDF.

Flamenco · 2018-02-20T12:12:38Z

@jean343 Also, I am not sure that JSPDF will be compatible with this PR since at least a year has passed.

jean343 · 2018-02-20T14:24:59Z

I understand that it needs a core change, and we could update the PR to be compatible with the new core. Why does it need version 1.4? Is it PDF-specific or simply to be backward compatible?

Flamenco · 2018-02-20T18:33:01Z

PDF version 1.3 has no transparency support

Uzlopak

I have some questions

Uzlopak · 2018-02-20T18:45:05Z

jspdf.js

+      //
+      // Graphics State
+      //
+      pdf.output


Nothing happens here?

Uzlopak · 2018-02-20T18:46:13Z

jspdf.js

@@ -895,6 +983,35 @@ var jsPDF = (function(global) {
        // putHeader()
        out('%PDF-' + pdfVersion);

+        if (pdfVersion == '1.4') {
+            // add objects that were created using the newObject2 function
+            newObject2List.forEach(function (newObject2) {


What is this newObject-stuff?

Flamenco · 2018-02-21

These functions are workarounds for the current jsPDF architecture. They allow for delayed rendering.

Uzlopak · 2018-02-20T18:47:09Z

jspdf.js

@@ -368,6 +427,11 @@ var jsPDF = (function(global) {
      putXobjectDict = function() {
        // Loop through images, or other data objects
        events.publish('putXobjectDict');
+        if (pdfVersion == '1.4') {


Why don't we use PubSub and subscribe this to putXobjectDict?

Uzlopak · 2018-02-20T18:47:34Z

jspdf.js

@@ -380,10 +444,29 @@ var jsPDF = (function(global) {
          }
        }
        out('>>');
+        if (pdfVersion == '1.4') {


Why don't we PubSub this?

Flamenco · 2018-02-21

Please explain what you mean by PubSub. Thanks.
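For context, the events.publish('putXobjectDict') call visible in the diff is one half of a publish/subscribe pattern. The sketch below shows what the reviewer's suggestion might look like: the 1.4 code subscribes to the event instead of adding an inline version check to the core. The PubSub class and the formXObjects list are hypothetical stand-ins, not jsPDF internals.

```javascript
// Minimal stand-in for an event bus; jsPDF's real implementation differs.
class PubSub {
  constructor() {
    this.topics = new Map();
  }
  subscribe(topic, handler) {
    if (!this.topics.has(topic)) this.topics.set(topic, []);
    this.topics.get(topic).push(handler);
  }
  publish(topic, ...args) {
    (this.topics.get(topic) || []).forEach((handler) => handler(...args));
  }
}

const events = new PubSub();
const output = [];
const out = (line) => output.push(line);

// Hypothetical 1.4-only state that would otherwise need an inline `if` in the core.
const pdfVersion = '1.4';
const formXObjects = [{ name: 'Xf1', objId: 7 }];

if (pdfVersion === '1.4') {
  // The feature hooks into the existing event instead of patching the core.
  events.subscribe('putXobjectDict', () => {
    formXObjects.forEach((x) => out('/' + x.name + ' ' + x.objId + ' 0 R'));
  });
}

// Core code stays unchanged: it only publishes the event.
events.publish('putXobjectDict');
console.log(output); // [ '/Xf1 7 0 R' ]
```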

Uzlopak · 2018-02-20T18:48:14Z

jspdf.js

+         * The returned object also has a convenience method to push lines.
+         * @returns {{objId: number, lines: *[]}}
+         */
+        newObject2 = function () {


Can't we merge this with the existing code?

Uzlopak · 2018-02-20T18:48:53Z

jspdf.js

@@ -200,6 +202,12 @@ var jsPDF = (function(global) {
      page = 0,
      currentPage,
      pages = [],
+      // pdf 1.4


Shouldn't we encapsulate this?

Uzlopak · 2018-02-20T18:49:40Z

I think it is a good thing, but maybe it should be encapsulated more?

jean343 · 2018-02-21T01:21:24Z

I see, the PDF version is related to the Adobe version.
Considering version 1.4 is back in 2001, could we enable it by default?
https://en.wikipedia.org/wiki/Adobe_Acrobat_version_history
https://acrobatusers.com/tutorials/understanding-pdf-compatibility-levels

@Flamenco, do you have time to resolve the conflicts?
JP

Flamenco · 2018-02-21T16:02:46Z

"Considering version 1.4 is back in 2001, could we enable it by default"

Probably best to leave it at 1.3 unless transparency and groupings are needed.

The reason I never went full speed with this is that I was trying to get a conversation about the API for declaring groupings, and that never happened. They can get rather complicated...

I was also hoping the core would stop streaming a string to build the PDF and instead build an AST, and render after it was assembled. So many 'workarounds' were needed because of that architecture.

Moving forward, I think it would be best to define an API before merging this, or at least marking it as a beta feature.

Uzlopak · 2018-02-21T16:09:36Z

By AST, do you mean that we should separate the data layer from the controller layer?

How about we make all the core methods (text, line, etc.) accept a configuration object as a parameter, which is then used to fill a model layer?
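A minimal sketch of that proposal, assuming each core method takes a single configuration object and only records it in a model rather than writing PDF strings. The createDocument factory and the shape of the model are hypothetical and do not exist in jsPDF.

```javascript
// Hypothetical model-first API; none of these names exist in jsPDF.
function createDocument() {
  const model = { type: 'pdf', pages: [{ type: 'page', objects: [] }] };
  const currentPage = () => model.pages[model.pages.length - 1];

  return {
    model,
    // Each drawing method takes exactly one configuration object
    // and records it in the model instead of emitting a string.
    text({ data, x, y }) {
      currentPage().objects.push({ type: 'text', data, x, y });
      return this;
    },
    line({ x1, y1, x2, y2 }) {
      currentPage().objects.push({ type: 'line', x1, y1, x2, y2 });
      return this;
    },
  };
}

const doc = createDocument();
doc.text({ data: 'Some text', x: 10, y: 10 }).line({ x1: 0, y1: 0, x2: 50, y2: 50 });
console.log(JSON.stringify(doc.model.pages[0].objects, null, 2));
```

Nothing is rendered at call time here; a separate pass would later turn the model into PDF output.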

Flamenco · 2018-02-21T16:16:23Z

By AST, I mean the PDF should be built as a model, and then rendered as a string in a single pass. This would have many added benefits, not just for transparency/grouping. I spoke to James about this a couple of years ago.

Uzlopak · 2018-02-21T16:25:20Z

I totally agree with you that a layer separation would improve this project a lot. What do you think about my proposal above? I mean first rewriting all core methods to accept a config object as a parameter.

Flamenco · 2018-02-21T16:51:48Z

I think your point of accepting config-object is good. I believe that methods should never take more or less than 1 argument...

But I think that is a different issue. IIRC most of the library generated strings and put them in lists, then flattened those lists into more strings. The nature of PDF is that you have object references and offsets, and those IDs and offsets really should not get generated until the final stage. That cost me a considerable amount of effort in developing plugins, and was a major setback for nested groups.
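To make the point about late-bound IDs and offsets concrete, here is a small sketch in which model objects refer to each other directly and only receive object numbers and offsets during one final serialization pass. The serialize function and the render callback convention are assumptions for illustration; the output omits the xref table and trailer, and character offsets equal byte offsets only because the content is ASCII.

```javascript
// Hypothetical sketch: objects carry no id or offset until serialization.
function serialize(objects) {
  let body = '%PDF-1.4\n';
  const offsets = [];

  // Resolves a reference to another model object into "<id> 0 R".
  const ref = (target) => {
    const index = objects.indexOf(target);
    if (index === -1) throw new Error('Reference to an object outside the document');
    return (index + 1) + ' 0 R';
  };

  objects.forEach((obj, index) => {
    const id = index + 1;        // ids are assigned only now
    offsets.push(body.length);   // offsets are measured only now
    body += id + ' 0 obj\n' + obj.render(ref) + '\nendobj\n';
  });

  return { body, offsets };
}

// The two objects reference each other without knowing any ids.
const page = { render: (ref) => '<< /Type /Page /Parent ' + ref(pages) + ' >>' };
const pages = { render: (ref) => '<< /Type /Pages /Kids [' + ref(page) + '] /Count 1 >>' };

const result = serialize([pages, page]);
console.log(result.body);
console.log(result.offsets);
```

Because references are resolved inside the final pass, reordering or inserting objects beforehand needs no backtracking.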

Uzlopak · 2018-02-21T17:14:21Z

I agree with you, but where do we want to start... Completely rewriting would be a pain in the ass :D

Flamenco · 2018-02-21T18:01:10Z

The current code relies so much on strings, that a total rewrite would be needed, and that's not going to happen anytime soon.

The newObject2 et al starts to do this though. It delays the rendering of the object id and offset. I did not want to pollute jspdf so I created a class to monkey-patch this stuff on top of the core. That's the solution we supplied to SAP for their opacity chart needs.

Quite frankly the code is small, and should not break the current API, so I say we put it in and continue to require a 1.4 flag to be set to activate it. The API to create groupings needs a bit of work though, and could use some user input along with 'borrowing' of other frameworks' approaches. @jean343 How do you feel about the API this exposes?

@arasabbasi I forgot this PR was even active, and most likely we made some changes anyway. If you want to add it, let me know and I will pull up the latest version on my end and resubmit it.

jean343 · 2018-02-21T20:01:49Z

I like the AST approach for sure, but I understand it's a lot of work.
From your patch, I really liked the jsonToPdfObject, it's much simpler to just write a JSON element to the PDF.
I would remove the unused function createPattern_Shading_Axial, and create real pattern functions taking gradient stops.
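To illustrate why writing a JSON element to the PDF is appealing, here is a sketch of a converter from a plain JavaScript value to PDF dictionary syntax. This is not the jsonToPdfObject code from the patch; toPdfValue is a hypothetical helper, and the convention that strings starting with '/' are PDF names is an assumption made for this example.

```javascript
// Hypothetical helper in the spirit of jsonToPdfObject; not the PR's code.
function toPdfValue(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(toPdfValue).join(' ') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value).map((key) => '/' + key + ' ' + toPdfValue(value[key]));
    return '<< ' + entries.join(' ') + ' >>';
  }
  // Assumption: strings starting with '/' are PDF names and pass through as-is;
  // other strings become literal strings (escaping is omitted here).
  if (typeof value === 'string') {
    return value.startsWith('/') ? value : '(' + value + ')';
  }
  return String(value); // numbers and booleans
}

// A graphics state dictionary with 50% fill (ca) and stroke (CA) opacity.
const extGState = { Type: '/ExtGState', ca: 0.5, CA: 0.5 };
console.log(toPdfValue(extGState)); // << /Type /ExtGState /ca 0.5 /CA 0.5 >>
```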

agilgur5 · 2018-02-25T07:06:01Z

Agree that an AST approach would be better. It would open up a lot of new ways to use jsPDF as well. Perhaps AST approach would be simpler to implement once this library is properly modularized per #839

Flamenco · 2018-02-25T16:57:42Z

I think the best approach to this is to:

1. Refactor the non-string-writing PDF code into a lib file to be shared by both approaches. This will not break existing code and keeps things DRY.
2. Build an AST from the current API using a JSON model.
3. Generate the PDF from the AST using as much of the refactored shared lib code as possible.

This will break many plugins that rely on streaming callbacks, so a new plugin API will need to emerge.

Uzlopak · 2018-02-25T18:21:57Z

Hmm, yeah, like making the core methods and essential plugins use one object as the options parameter and then writing a parser on top, like

pdf Object:

{
type: 'pdf',
pages: []
}

page Object

{
type: 'page',
objects: [],
width: 100,
height: 200,
orientation: 'portrait'
}

Text Object

{
type: 'text',
data: 'Some text',
x: 10,
y: 10
}

Image Object

{
type: 'image',
data: 'data:image/png;base64,BADFACE0....',
x: 10,
y: 10,
width: 280,
height: 210
}

Etc. etc.

?

Flamenco · 2018-02-25T18:30:50Z

@arasabbasi That looks good for a spec. There will be issues concerning PDF 1.4 involving nested group ops (like transparency, blend modes, masks). Those will need a tree structure. It will be a bit more complicated than just serializing the current instructions into a list. Another issue will be to introduce API for defining the 1.4 stuff, or just use the context2d API for that.

Uzlopak · 2018-02-25T18:38:51Z

Actually I thought that the pdf-Object will contain the page-Objects in the pages-attribute and the page-Object will contain text and image-Objects in the objects-attribute.

Hmm... or maybe much more like the real PDF file. Having Dictionary Objects and stuff...

TBH I would appreciate it if someone else wrote a spec and we recoded according to that spec. In fact I am more the debug-and-refactor guy and less the "hey, let's invent something cool and innovative" guy. You know what I mean :D ?

Flamenco · 2018-02-25T18:54:00Z

The problem with the current API and implementation is that they do not allow expressing PDF 1.4 semantics. The workaround for me was to create backtracking hacks.

For example, if I want to create a mask of shapes, and apply it to another group of shapes, each will have to be defined before any 'writing' is done.

That's why enhancing the current method calls will not be enough. The AST will not be very complicated. When all is said and done it will actually look like SVG!

agilgur5 · 2018-02-25T23:16:40Z

@arasabbasi the draft "spec" you wrote out seems good to me and would work for the purposes of an AST (it is a tree). I think implementing it as close to the PDF spec as possible would make the abstraction easier to understand.
Perhaps, if going the OO route (vs. another pattern), each of the "objects" (PDF, Page, Text, etc.) would have its own Class definition, each with a toString() method defined. The toString() method would print out the String version of the object for the PDF as well as call toString() on any child objects.

If the API were to use this spec, it would make modularity even easier -- implementing a new Plugin would just be creating a new Class with the same interface.
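A sketch of that OO route, assuming three hypothetical classes (PdfNode, PageNode, TextNode) that share a toString() interface. The output is a readable placeholder rather than a valid PDF file: the text operators are real content-stream syntax, but font selection, object wrappers, and string escaping are all omitted.

```javascript
// Hypothetical node classes; the output is a placeholder, not a valid PDF.
class TextNode {
  constructor({ data, x, y }) {
    Object.assign(this, { data, x, y });
  }
  toString() {
    // Td positions the text and Tj shows it; escaping and fonts are omitted.
    return 'BT ' + this.x + ' ' + this.y + ' Td (' + this.data + ') Tj ET';
  }
}

class PageNode {
  constructor({ objects = [] } = {}) {
    this.objects = objects;
  }
  toString() {
    // A page renders itself by delegating to its children.
    return this.objects.map((child) => child.toString()).join('\n');
  }
}

class PdfNode {
  constructor({ pages = [] } = {}) {
    this.pages = pages;
  }
  toString() {
    return this.pages
      .map((page, i) => '% page ' + (i + 1) + '\n' + page.toString())
      .join('\n');
  }
}

const pdf = new PdfNode({
  pages: [new PageNode({ objects: [new TextNode({ data: 'Some text', x: 10, y: 10 })] })],
});
console.log(pdf.toString());
```

Under this design a plugin would contribute another class with its own toString(), and PageNode would not need to change.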

@Flamenco makes a good point about applying "transformations" to "groups" and that it would be similar to SVG as such. The 1.4 semantics do require some added complexity. A "transform" attribute and "groups" like in SVG could handle that well as an abstraction.

For instance,

text {
  transform: 'mask(...)'
}

The transform would call out to a mask function, as implemented by a plugin. Transform functions could be added to the top-level PDF object, via plugins. Which would allow for easy customizability and forking of plugins as they're decentralized and not defined in core.

For instance,

pdf {
  transformFunctions: {
    'mask': function () {},
    'etc': function () {}
  }
}

Each key in the transformFunctions dict would be added by a plugin, so keys / transform functions can be replaced at any time by any plugin / fork / etc. They would only be applied at string generation, when toString() is called, and not before.

Groups, like in SVG, would just be another object in a Page object.
For instance,

group {
  objects: [ ... ],
  transform: 'mask(...)'
}

It would also have an objects key, which allows for as much nesting as necessary. Its toString() would apply the functions in transform to all of its child objects (perhaps duplicate them first for immutability) and then subsequently call toString() on the now transformed child objects.
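The group and transform ideas can be sketched together as below. Two simplifications are assumed: the transform wraps the already rendered output of the children instead of rewriting the child objects, and a simple opacity transform stands in for mask, which would need considerably more machinery. The transformFunctions registry, parseTransform, and render are hypothetical names; GS1 is assumed to be a graphics state defined elsewhere in the page resources.

```javascript
// Hypothetical registry: plugins would add or replace entries by key.
const transformFunctions = {
  // Stand-in transform: wraps output in q/Q (save/restore graphics state)
  // and selects a named graphics state with the gs operator.
  opacity: (rendered, gsName) => 'q /' + gsName + ' gs\n' + rendered + '\nQ',
};

// Parses 'name(arg)' into its parts; a real parser would be stricter.
function parseTransform(transform) {
  const match = /^(\w+)\((.*)\)$/.exec(transform);
  if (!match) throw new Error('Bad transform: ' + transform);
  return { name: match[1], arg: match[2] };
}

function render(node) {
  let result;
  if (node.type === 'group') {
    // Children render first, so groups can nest to any depth.
    result = node.objects.map(render).join('\n');
  } else if (node.type === 'text') {
    result = 'BT ' + node.x + ' ' + node.y + ' Td (' + node.data + ') Tj ET';
  } else {
    throw new Error('Unknown node type: ' + node.type);
  }
  if (node.transform) {
    // Transforms are looked up and applied only at string generation time.
    const { name, arg } = parseTransform(node.transform);
    const fn = transformFunctions[name];
    if (!fn) throw new Error('No transform registered for: ' + name);
    result = fn(result, arg);
  }
  return result;
}

const group = {
  type: 'group',
  transform: 'opacity(GS1)',
  objects: [{ type: 'text', data: 'Some text', x: 10, y: 10 }],
};
console.log(render(group));
```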

This might be straying from the PDF spec a bit (I haven't looked at it in ages), but I think this is a good draft and discussion point :)

An AST structure as such would also allow for something like a React interface for jsPDF that would be declarative instead of the current imperative API.

Flamenco · 2018-02-26T19:28:21Z

It might be best to have the 'AST model' have nothing at all to do with PDF. Just a nice structure of containing all the information needed to generate from the user's intentions.

Then validate it, clean it up a bit, ship it off to plugins to process it.

Then generate an almost PDF like structure, but without calculating references and such. Once again send that out to the plugins for processing.

Then generate it.
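The stages described in this comment could be wired together roughly as follows. This is a sketch under several assumptions: the stage names (validate, lower, generate), the plugin hooks (onModel, onPdfLike), and the model shape are all hypothetical, and the generated text is a placeholder rather than real PDF syntax.

```javascript
// Hypothetical pipeline; stage and hook names are illustrative only.
function validate(model) {
  if (model.type !== 'pdf' || !Array.isArray(model.pages)) {
    throw new Error('Model must be a pdf node with a pages array');
  }
  return model;
}

// Lowers the intent model to PDF-like objects that still lack ids and offsets.
function lower(model) {
  return model.pages.map((page) => ({
    kind: 'Page',
    content: page.objects.map((o) => o.data),
  }));
}

function generate(pdfLike) {
  // Ids are assigned only here, in the final stage.
  return pdfLike
    .map((obj, i) => (i + 1) + ' 0 obj % ' + obj.kind + ': ' + obj.content.join(', '))
    .join('\n');
}

function build(model, plugins) {
  let current = validate(model);
  plugins.forEach((p) => { if (p.onModel) current = p.onModel(current); });
  let pdfLike = lower(current);
  plugins.forEach((p) => { if (p.onPdfLike) pdfLike = p.onPdfLike(pdfLike); });
  return generate(pdfLike);
}

// Example plugin that rewrites the intent model before lowering.
const upperCasePlugin = {
  onModel: (m) => ({
    ...m,
    pages: m.pages.map((p) => ({
      ...p,
      objects: p.objects.map((o) => ({ ...o, data: o.data.toUpperCase() })),
    })),
  }),
};

const model = {
  type: 'pdf',
  pages: [{ type: 'page', objects: [{ type: 'text', data: 'Some text', x: 10, y: 10 }] }],
};
console.log(build(model, [upperCasePlugin]));
```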

jean343 · 2018-02-26T19:36:55Z

If we start to use plugins this extensively, could we have a real ES6 way of defining plugins?
I hate to import a plugin and set it to window.
I was forced to do:

import jsPDF from "./jspdf";
window.jsPDF = jsPDF;
window.adler32cs = require( "adler32cs" );
require( './plugins/addimage' );
...
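One possible shape for such a plugin, sketched without any window global: the plugin is a plain function that receives the API object to extend, and the library exposes an explicit registration hook. MiniPdf, its use method, and addWatermarkPlugin are hypothetical stand-ins for illustration and are not jsPDF APIs; in an ES module the plugin function would simply be exported and imported.

```javascript
// Hypothetical plugin shape: a plain function that receives the API object.
// In an ES module this would be written as `export default function ...`.
function addWatermarkPlugin(API) {
  API.watermark = function (text) {
    this.calls.push('watermark:' + text);
    return this;
  };
}

// Stand-in for the library: a constructor with an explicit plugin hook.
function MiniPdf() {
  this.calls = [];
}
MiniPdf.API = MiniPdf.prototype;
MiniPdf.use = function (plugin) {
  plugin(MiniPdf.API);
  return MiniPdf;
};

// Registration is explicit; nothing is read from or written to `window`.
MiniPdf.use(addWatermarkPlugin);
const doc = new MiniPdf().watermark('draft');
console.log(doc.calls); // [ 'watermark:draft' ]
```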

Uzlopak · 2018-02-26T19:48:45Z

I don't know... I removed the references to the window object as much as possible.

Flamenco · 2018-02-26T19:59:22Z

When all is said and done, you are going to have the same information as SVG. Just a subset of it. Since that language is already documented, why reinvent the wheel? Once you get into masks, groups, transparency, and blending modes, it's going to get complicated very quickly.

HackbrettXXX · 2020-06-25T15:03:09Z

I'm closing this since I guess it currently is very much out of scope and some of the transparency/graphics state features are now already implemented with the merge of the yWorks fork. Feel free to reopen :)

Initial PDF 1.4 support (embedded in core)

416b02e

Uzlopak reviewed Feb 20, 2018


Uzlopak mentioned this pull request Aug 31, 2018

Request to support opacity in setFillColor and setDrawColor function #1869

Closed

HackbrettXXX closed this Jun 25, 2020
