Initial PDF 1.4 support (embedded in core) #920

Closed

Flamenco
Contributor

This commit paves the way for transparency, blending, masking, and grouping. The API is far from complete, and the generated PDF source could use some optimization.

The implementation contains many internal workarounds for the current PDF generation scheme, which generates the PDF on the fly.

Passing pdfVersion: '1.4' in the constructor options enables the feature.

@Flamenco
Contributor Author

[Screenshot: 2016-10-22, 8:18:20 PM]

@jean343

jean343 commented Feb 7, 2018

Can we have an update on this pull request?
It's a really useful feature to have.
Additionally, I am working on adding linear gradients to context2d, and I need this as a prerequisite.
master...jean343:background-gradient

As for the pull request itself, is there a need to specify the version, or could we just enable the code by default?

Thanks,

@Flamenco
Contributor Author

@jean343 The generated PDF needs v1.4, and that needs to be set in the jsPDF core.

@Flamenco
Contributor Author

@jean343 Also, I am not sure that jsPDF will be compatible with this PR, since at least a year has passed.

@jean343

jean343 commented Feb 20, 2018

I understand that it needs a core change, and we could update the PR to be compatible with the new core. Why does it need version 1.4? Is it PDF-specific, or simply for backward compatibility?

@Flamenco
Contributor Author

PDF version 1.3 has no transparency support
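
For context: in PDF 1.4, fill and stroke opacity are set through an extended graphics state dictionary, using the `ca` and `CA` keys. A minimal sketch of what such a dictionary might look like when written out. The helper name `opacityGState` is hypothetical and is not part of jsPDF or of this PR:

```javascript
// Hypothetical helper (an assumption, not jsPDF API): builds the text of an
// ExtGState dictionary. `ca` is the non-stroking (fill) alpha constant and
// `CA` is the stroking alpha constant.
function opacityGState(fillAlpha, strokeAlpha) {
  return '<< /Type /ExtGState /ca ' + fillAlpha + ' /CA ' + strokeAlpha + ' >>';
}

console.log(opacityGState(0.5, 1));
```

A viewer that only understands PDF 1.3 has no defined meaning for these keys, which is presumably why the header version has to be raised.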

Collaborator

@Uzlopak left a comment

I have some questions

//
// Graphics State
//
pdf.output
Collaborator

Does nothing happen here?

@@ -895,6 +983,35 @@ var jsPDF = (function(global) {
// putHeader()
out('%PDF-' + pdfVersion);

if (pdfVersion == '1.4') {
// add objects that were created using the newObject2 function
newObject2List.forEach(function (newObject2) {
Collaborator

What is this newObject-stuff?

Contributor Author

@Flamenco commented Feb 21, 2018

These functions are workarounds for the current jsPDF architecture. They allow for delayed rendering.
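
The idea of delayed rendering can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: `createDeferredObject` and `renderDeferred` are hypothetical names, and the real newObject2 may allocate its ID differently.

```javascript
// Hypothetical sketch of "delayed rendering": content is collected first,
// and object numbers are only assigned when the document is written out.
function createDeferredObject() {
  return {
    objId: null, // unknown until output time (assumption of this sketch)
    lines: [],
    // Convenience method to push a line; returns the object for chaining.
    push: function (line) {
      this.lines.push(line);
      return this;
    }
  };
}

function renderDeferred(objects, firstFreeId) {
  var out = [];
  objects.forEach(function (obj, i) {
    obj.objId = firstFreeId + i; // the id is assigned late
    out.push(obj.objId + ' 0 obj');
    out.push.apply(out, obj.lines);
    out.push('endobj');
  });
  return out.join('\n');
}

var gs = createDeferredObject().push('<< /Type /ExtGState /ca 0.5 >>');
console.log(renderDeferred([gs], 7));
```

Because the object body exists before its number does, code that runs later can still add lines to it, which a purely streaming writer cannot offer.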

@@ -368,6 +427,11 @@ var jsPDF = (function(global) {
putXobjectDict = function() {
// Loop through images, or other data objects
events.publish('putXobjectDict');
if (pdfVersion == '1.4') {
Collaborator

Why don't we use PubSub and subscribe this to putXobjectDict?

@@ -380,10 +444,29 @@ var jsPDF = (function(global) {
}
}
out('>>');
if (pdfVersion == '1.4') {
Collaborator

Why don't we use PubSub for this?

Contributor Author

Please explain what you mean by PubSub. Thanks.
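
For readers unfamiliar with the term, the publish/subscribe pattern raised in the review might look roughly like this. The `PubSub` class below is a minimal stand-in written for illustration, not jsPDF's own event code, and passing an `out` callback to subscribers as well as the `/X1 12 0 R` entry are assumptions of the sketch:

```javascript
// Minimal publish/subscribe registry (illustrative; not jsPDF's implementation).
function PubSub() {
  this.topics = {};
}
PubSub.prototype.subscribe = function (topic, callback) {
  (this.topics[topic] = this.topics[topic] || []).push(callback);
};
PubSub.prototype.publish = function (topic) {
  var args = Array.prototype.slice.call(arguments, 1);
  (this.topics[topic] || []).forEach(function (cb) {
    cb.apply(null, args);
  });
};

var events = new PubSub();
var written = [];

// A 1.4 feature registers itself instead of being hard-coded behind an
// `if (pdfVersion == '1.4')` branch in the core.
events.subscribe('putXobjectDict', function (out) {
  out('/X1 12 0 R'); // hypothetical dictionary entry
});

// The core publishes the topic and hands subscribers a way to write lines.
events.publish('putXobjectDict', function (line) {
  written.push(line);
});
console.log(written);
```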

* The returned object also has a convenience method to push lines.
* @returns {{objId: number, lines: *[]}}
*/
newObject2 = function () {
Collaborator

Can't we merge this with the existing code?

@@ -200,6 +202,12 @@ var jsPDF = (function(global) {
page = 0,
currentPage,
pages = [],
// pdf 1.4
Collaborator

Shouldn't we encapsulate this?

@Uzlopak
Collaborator

Uzlopak commented Feb 20, 2018

I think it is a good thing, but maybe it should be encapsulated more?

@jean343

jean343 commented Feb 21, 2018

I see, the PDF version is tied to the Adobe Acrobat version.
Considering version 1.4 dates back to 2001, could we enable it by default?
https://en.wikipedia.org/wiki/Adobe_Acrobat_version_history
https://acrobatusers.com/tutorials/understanding-pdf-compatibility-levels

@Flamenco, do you have time to resolve the conflicts?
JP

@Flamenco
Contributor Author

"Considering version 1.4 dates back to 2001, could we enable it by default?"

Probably best to leave it at 1.3 unless transparency and groupings are needed.

The reason I never went full speed with this is that I was trying to get a conversation about the API for declaring groupings, and that never happened. They can get rather complicated...

I was also hoping the core would stop streaming a string to build the PDF and instead build an AST, and render after it was assembled. So many 'workarounds' were needed because of that architecture.

Moving forward, I think it would be best to define an API before merging this, or at least to mark it as a beta feature.

@Uzlopak
Collaborator

Uzlopak commented Feb 21, 2018

By AST, do you mean that we should separate the data layer from the controller layer?

How about we make all the core methods, like text, line, etc., accept a configuration object as a parameter, which is then used later to fill a model layer?

@Flamenco
Contributor Author

Flamenco commented Feb 21, 2018

By AST, I mean the PDF should be built as a model, and then rendered as a string in a single pass. This would have many added benefits, not just for transparency/grouping. I spoke to James about this a couple of years ago.

@Uzlopak
Collaborator

Uzlopak commented Feb 21, 2018

I totally agree with you that a layer separation would improve this project a lot. What do you think about my proposal above? I mean first rewriting all core methods to accept a config object as a parameter.

@Flamenco
Contributor Author

I think your point about accepting a config object is good. I believe that methods should never take more or less than 1 argument...

But I think that is a different issue. IIRC most of the library generated strings and put them in lists, then flattened those lists into more strings. The nature of PDF is that you have object references and offsets, and those IDs and offsets really should not get generated until the final stage. That cost me a considerable amount of effort in developing plugins, and was a major setback for nested groups.
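
The point about generating IDs and offsets only in the final stage can be sketched like this. It is a simplified illustration, not jsPDF code: `serialize` and `pad10` are hypothetical names, content is assumed to be ASCII so that string length equals byte length, and the output is not a complete, viewer-ready file.

```javascript
// Sketch: objects hold only their bodies; object numbers and byte offsets
// are computed in one final pass.
function pad10(n) {
  var s = String(n);
  while (s.length < 10) s = '0' + s;
  return s;
}

function serialize(bodies, version) {
  var out = '%PDF-' + version + '\n';
  var offsets = [];
  bodies.forEach(function (body, i) {
    offsets.push(out.length); // the offset is only known now
    out += (i + 1) + ' 0 obj\n' + body + '\nendobj\n';
  });
  var xrefStart = out.length;
  out += 'xref\n0 ' + (bodies.length + 1) + '\n';
  out += '0000000000 65535 f \n'; // entry for the reserved object 0
  offsets.forEach(function (off) {
    out += pad10(off) + ' 00000 n \n';
  });
  out += 'trailer\n<< /Size ' + (bodies.length + 1) + ' /Root 1 0 R >>\n';
  out += 'startxref\n' + xrefStart + '\n%%EOF';
  return out;
}

var pdf = serialize(['<< /Type /Catalog >>'], '1.4');
console.log(pdf);
```

Since nothing is numbered until `serialize` runs, bodies can be reordered, nested, or dropped beforehand without any backtracking.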

@Uzlopak
Collaborator

Uzlopak commented Feb 21, 2018

I agree with you, but where do we want to start... Completely rewriting would be a pain in the ass :D

@Flamenco
Contributor Author

The current code relies so much on strings, that a total rewrite would be needed, and that's not going to happen anytime soon.

newObject2 and its related functions start to do this, though. They delay the rendering of the object ID and offset. I did not want to pollute jsPDF, so I created a class to monkey-patch this on top of the core. That's the solution we supplied to SAP for their opacity chart needs.

Quite frankly, the code is small and should not break the current API, so I say we put it in and continue to require a 1.4 flag to activate it. The API to create groupings needs a bit of work though, and could use some user input along with 'borrowing' of other frameworks' approaches. @jean343 How do you feel about the API this exposes?

@arasabbasi I forgot this PR was even active, and most likely we made some changes anyway. If you want to add it, let me know and I will pull up the latest version on my end and resubmit it.

@jean343

jean343 commented Feb 21, 2018

I like the AST approach for sure, but I understand it's a lot of work.
From your patch, I really liked jsonToPdfObject; it is much simpler to just write a JSON element to the PDF.
I would remove the unused function createPattern_Shading_Axial, and create real pattern functions taking gradient stops.
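
To illustrate why writing a JSON element is attractive, here is a rough sketch of such a serializer. It is not the patch's jsonToPdfObject, whose signature is not shown in this thread: `toPdfDict` is a hypothetical name, and string values are assumed to already be valid PDF tokens such as `/Name` or `3 0 R`.

```javascript
// Hypothetical serializer: turns a plain JavaScript value into PDF
// dictionary/array syntax. Strings and numbers are emitted as-is.
function toPdfDict(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(toPdfDict).join(' ') + ']';
  }
  if (value !== null && typeof value === 'object') {
    var parts = Object.keys(value).map(function (key) {
      return '/' + key + ' ' + toPdfDict(value[key]);
    });
    return '<< ' + parts.join(' ') + ' >>';
  }
  return String(value);
}

console.log(toPdfDict({ Type: '/XObject', Subtype: '/Form', BBox: [0, 0, 10, 10] }));
```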

@agilgur5

Agree that an AST approach would be better. It would open up a lot of new ways to use jsPDF as well. Perhaps the AST approach would be simpler to implement once this library is properly modularized per #839.

@Flamenco
Contributor Author

Flamenco commented Feb 25, 2018

I think the best approach to this is to

  1. Refactor the PDF code that does not write strings into a lib file to be shared by both approaches. This will not break existing code and keeps things DRY.
  2. Build an AST from the current API using a JSON model.
  3. Generate PDF from the AST using as much refactored shared lib code as possible.

This will break many plugins that rely on streaming callbacks, so a new plugin API will need to emerge.
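
Step 2 of the plan above might be sketched as a recorder: the familiar imperative calls only add nodes to a JSON-friendly model, and nothing is written until a later render step. `createRecorder` and the node shapes are hypothetical, chosen only for illustration:

```javascript
// Sketch: imperative calls build a model instead of streaming strings.
function createRecorder() {
  var model = { type: 'pdf', pages: [{ type: 'page', objects: [] }] };
  var current = model.pages[0];
  return {
    addPage: function () {
      current = { type: 'page', objects: [] };
      model.pages.push(current);
      return this;
    },
    text: function (data, x, y) {
      current.objects.push({ type: 'text', data: data, x: x, y: y });
      return this;
    },
    // JSON.stringify picks this up, so the recorder serializes as its model.
    toJSON: function () {
      return model;
    }
  };
}

var doc = createRecorder().text('Hello', 10, 10).addPage().text('World', 10, 20);
console.log(JSON.stringify(doc));
```

A streaming callback has nothing to hook into in this design, which is one way to see why existing plugins would need a new API.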

@Uzlopak
Collaborator

Uzlopak commented Feb 25, 2018

Hmm, yeah, like making the core methods and essential plugins use one object as an option parameter and then writing a parser on top, like

pdf Object:

{
type: 'pdf',
pages: []
}

page Object

{
type: 'page',
objects: [],
width: 100,
height: 200,
orientation: 'portrait'
}

Text Object

{
type: 'text',
data: 'Some text',
x: 10,
y: 10
}

Image Object

{
type: 'image',
data: '....',
x: 10,
y: 10,
width: 280,
height: 210
}

Etc. etc.

?

@Flamenco
Contributor Author

@arasabbasi That looks good for a spec. There will be issues concerning PDF 1.4 involving nested group ops (like transparency, blend modes, masks). Those will need a tree structure. It will be a bit more complicated than just serializing the current instructions into a list. Another issue will be whether to introduce an API for defining the 1.4 features, or just use the context2d API for that.

@Uzlopak
Collaborator

Uzlopak commented Feb 25, 2018

Actually I thought that the pdf-Object will contain the page-Objects in the pages-attribute and the page-Object will contain text and image-Objects in the objects-attribute.

Hmm... or maybe much more like the real PDF file. Having Dictionary Objects and stuff...

TBH I would appreciate it if someone else wrote a spec and we recoded it according to that spec. In fact, I am more the debug and refactor guy and less the "hey, let's invent something cool and innovative" guy. You know what I mean :D ?

@Flamenco
Contributor Author

The problem with the current API and implementation is that they do not allow expressing PDF 1.4 semantics. The workaround for me was to create backtracking hacks.

For example, if I want to create a mask of shapes, and apply it to another group of shapes, each will have to be defined before any 'writing' is done.

That's why enhancing the current method calls will not be enough. The AST will not be very complicated. When all is said and done it will actually look like SVG!

@agilgur5

@arasabbasi the draft "spec" you wrote out seems good to me and would work for the purposes of an AST (it is a tree). I think implementing it as close to the PDF spec as possible would make the abstraction easier to understand.
Perhaps, if going the OO route (vs. another pattern), each of the "objects" (PDF, Page, Text, etc.) would have its own Class definition, each with a toString() method defined. The toString() method would print out the String version of the object for the PDF as well as call toString() on any child objects.

If the API were to use this spec, it would make modularity even easier -- implementing a new Plugin would just be creating a new Class with the same interface.
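
The OO route described above could be sketched as follows. The class names are hypothetical and the emitted strings are simplified placeholders rather than valid PDF operators; the point is only that each node renders itself and delegates to its children:

```javascript
// Illustrative node classes: every node has toString(), and containers
// call toString() on their children.
class TextNode {
  constructor(data, x, y) {
    this.data = data;
    this.x = x;
    this.y = y;
  }
  toString() {
    return 'text(' + this.x + ',' + this.y + ',' + this.data + ')';
  }
}

class PageNode {
  constructor(objects) {
    this.objects = objects;
  }
  toString() {
    return 'page[' + this.objects.map(String).join(';') + ']';
  }
}

class PdfNode {
  constructor(pages) {
    this.pages = pages;
  }
  toString() {
    return 'pdf{' + this.pages.map(String).join('') + '}';
  }
}

const doc = new PdfNode([new PageNode([new TextNode('Hi', 10, 10)])]);
console.log(String(doc));
```

Under this shape, a plugin would only need to supply another class with the same `toString()` interface.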

@Flamenco makes a good point about applying "transformations" to "groups" and that it would be similar to SVG as such. The 1.4 semantics do require some added complexity. A "transform" attribute and "groups" like in SVG could handle that well as an abstraction.

For instance,

text {
  transform: 'mask(...)'
}

The transform would call out to a mask function, as implemented by a plugin. Transform functions could be added to the top-level PDF object via plugins, which would allow for easy customization and forking of plugins, as they are decentralized and not defined in core.

For instance,

pdf {
  transformFunctions: {
    'mask': function () {},
    'etc': function () {}
  }
}

Each key in the transformFunctions dict would be added by a plugin, so keys / transform functions can be replaced at any time by any plugin / fork / etc. They would only be applied at string generation, when toString() is called, and not before.

Groups, like in SVG, would just be another object in a Page object.
For instance,

group {
  objects: [ ... ],
  transform: 'mask(...)'
}

It would also have an objects key, which allows for as much nesting as necessary. Its toString() would apply the functions in transform to all of its child objects (perhaps duplicating them first for immutability) and then subsequently call toString() on the now transformed child objects.
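
A rough sketch of that group behavior, under assumptions: a simple `opacity` transform stands in for the more involved `mask`, the registry and output format are hypothetical placeholders, and pushing a transform down into nested groups is left out for brevity.

```javascript
// Sketch of groups whose transforms are resolved only at render time.
const transformFunctions = {
  // Hypothetical transform: returns a copy of the node with an opacity set.
  opacity: function (node, amount) {
    return Object.assign({}, node, { opacity: Number(amount) });
  }
};

function applyTransform(node, transform) {
  const match = /^(\w+)\((.*)\)$/.exec(transform);
  if (!match || !transformFunctions[match[1]]) return node;
  return transformFunctions[match[1]](node, match[2]);
}

function render(node) {
  if (node.type === 'group') {
    return node.objects
      .map(function (child) {
        // Copy-on-transform keeps the original tree untouched.
        return node.transform ? applyTransform(child, node.transform) : child;
      })
      .map(function (child) {
        return render(child);
      })
      .join('\n');
  }
  const alpha = node.opacity === undefined ? 1 : node.opacity;
  return node.type + ' "' + node.data + '" alpha=' + alpha;
}

const group = {
  type: 'group',
  transform: 'opacity(0.5)',
  objects: [{ type: 'text', data: 'A' }, { type: 'text', data: 'B' }]
};
console.log(render(group));
```

Because the registry is a plain object, a plugin or fork could replace the `opacity` entry at any time before rendering.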

This might be straying from the PDF spec a bit (I haven't looked at it in ages), but I think this is a good draft and discussion point :)

An AST structure as such would also allow for something like a React interface for jsPDF that would be declarative instead of the current imperative API.

@Flamenco
Contributor Author

It might be best to have the 'AST model' have nothing at all to do with PDF. Just a nice structure containing all the information needed to generate the output from the user's intentions.

Then validate it, clean it up a bit, ship it off to plugins to process it.

Then generate an almost PDF like structure, but without calculating references and such. Once again send that out to the plugins for processing.

Then generate it.
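
The staged flow described in this comment might be sketched as below. Every name here (`generate`, `onModel`, `onObjects`, the placeholder output format) is hypothetical; the sketch only shows where validation and the two plugin passes could sit:

```javascript
// Sketch: user model -> validate -> plugins -> PDF-like structure
// -> plugins -> final string.
function generate(model, plugins) {
  if (model.type !== 'pdf' || !Array.isArray(model.pages)) {
    throw new Error('invalid model');
  }
  // Stage 1: plugins may rewrite the user-level model.
  var processed = plugins.reduce(function (m, p) {
    return p.onModel ? p.onModel(m) : m;
  }, model);
  // Stage 2: lower to PDF-like objects, still without ids or offsets.
  var objects = processed.pages.map(function (page) {
    return { dict: { Type: '/Page' }, content: page.objects.length + ' objects' };
  });
  objects = plugins.reduce(function (o, p) {
    return p.onObjects ? p.onObjects(o) : o;
  }, objects);
  // Stage 3: ids are assigned only here (placeholder output, not real PDF).
  return objects
    .map(function (obj, i) {
      return (i + 1) + ' 0 obj: ' + obj.content;
    })
    .join('\n');
}

// Hypothetical plugin that appends an empty page at the model stage.
var appendBlankPage = {
  onModel: function (m) {
    return Object.assign({}, m, { pages: m.pages.concat([{ objects: [] }]) });
  }
};

var input = { type: 'pdf', pages: [{ objects: [1, 2] }] };
console.log(generate(input, [appendBlankPage]));
```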

@jean343

jean343 commented Feb 26, 2018

If we start to use plugins this extensively, could we have a real ES6 way of defining plugins?
I hate to import a plugin and set it to window.
I was forced to do:

import jsPDF from "./jspdf";
window.jsPDF = jsPDF;
window.adler32cs = require( "adler32cs" );
require( './plugins/addimage' );
...

@Uzlopak
Collaborator

Uzlopak commented Feb 26, 2018

I don't know... I removed the references to the window object as much as possible.

@Flamenco
Contributor Author

When all is said and done, you are going to have the same information as SVG. Just a subset of it. Since that language is already documented, why reinvent the wheel? Once you get into masks, groups, transparency, and blending modes, it's going to get complicated very quickly.

@HackbrettXXX
Collaborator

I'm closing this since I guess it currently is very much out of scope and some of the transparency/graphics state features are now already implemented with the merge of the yWorks fork. Feel free to reopen :)
