Noto font for pdfs - proof of concept #5031

Draft
wants to merge 5 commits into
base: hypernext

MarcelBolten
Contributor

This is a proof of concept to show the feasibility of using the Noto font in PDFs created by eLabFTW via the mPDF library.

The idea is to avoid the CJK user setting, as a PDF should be created easily and independently of the language and characters used. To avoid this setting we can use mPDF's character substitution, but it "will add to the processing time". Perhaps the order of the backupSubsFont mPDF config setting should be changed according to the language selected by the user (UCP→General→Preferences→Language).
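
To make the reordering idea concrete, here is a minimal, language-agnostic sketch (written in Python only for brevity, not the PHP that would actually be needed). The font keys, the language codes, and the language-to-font mapping are all hypothetical placeholders, not the names used in this PR; the real list would be whatever is passed to mPDF's backupSubsFont setting.

```python
# Hypothetical backup font keys; the real keys depend on how the fonts
# are registered in the mPDF font configuration.
DEFAULT_BACKUP_FONTS = [
    "noto-symbols",
    "noto-cjk-sc",
    "noto-cjk-jp",
    "noto-cjk-kr",
    "noto-arabic",
]

# Assumed mapping from a user's language code to the backup font that is
# most likely to be needed for that language.
PREFERRED_FONT_BY_LANG = {
    "zh_CN": "noto-cjk-sc",
    "ja_JP": "noto-cjk-jp",
    "ko_KR": "noto-cjk-kr",
}


def order_backup_fonts(lang, fonts=DEFAULT_BACKUP_FONTS):
    """Return the backup fonts with the one preferred for `lang` first.

    The relative order of all other fonts is kept, so languages without
    a mapping simply get the default order.
    """
    preferred = PREFERRED_FONT_BY_LANG.get(lang)
    if preferred is None or preferred not in fonts:
        return list(fonts)
    return [preferred] + [f for f in fonts if f != preferred]


print(order_backup_fonts("ja_JP"))
```

The point of the sketch is only that the substitution lookup would try the most likely font first; whether this measurably reduces processing time would still need to be benchmarked.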

The Noto font is enormous and is split into individual files so that it can cover many scripts, glyphs, languages, and characters.
A comprehensive set is added with this PR but the files are added statically. Maybe a dynamic solution can be established.

I don't know enough about fonts but it seems like the added ttf files have overlapping coverage. This is not ideal as it potentially further increases the processing time during character substitution.
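
As a rough illustration of what "overlapping coverage" means, the sketch below compares sets of Unicode code points per font and reports which pairs share any. The coverage data is toy data invented for the example; reading the real coverage would require parsing each ttf file's character map with a font library, which is not shown here.

```python
from itertools import combinations


def coverage_overlap(coverage):
    """Return {(font_a, font_b): shared code points} for each overlapping pair.

    `coverage` maps a font name to the set of code points it covers.
    """
    overlaps = {}
    for a, b in combinations(sorted(coverage), 2):
        shared = coverage[a] & coverage[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


# Toy coverage data (hypothetical font names and ranges).
coverage = {
    "font-latin": set(range(0x20, 0x7F)),
    # This font also carries the ASCII digits, which duplicates font-latin.
    "font-cjk": set(range(0x4E00, 0x4E10)) | set(range(0x30, 0x3A)),
    "font-emoji": {0x1F600, 0x1F604},
}

for (a, b), shared in coverage_overlap(coverage).items():
    print(f"{a} and {b} share {len(shared)} code points")
```

A report like this would show which files duplicate each other and are therefore candidates for trimming.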

The noto dashboard and https://github.com/notofonts/ might be good entry points for further ideas, tools, solutions.

An important note: mPDF does not support fonts with PostScript outlines, which can, but don't have to, be found in the otf format.

And here is an example PDF created via eLab using Noto.
image

@NicolasCARPi
Contributor

Looks very good to me!

  1. We get rid of this use_cjk thing which is great
  2. the PDF/A files are smaller by two orders of magnitude! The same experiment in Japanese is 86 kB with Noto and 9.6 MB with the old code.

These two reasons by themselves are enough to go forward with this.

Regarding the processing time of font substitution, it does indeed have a heavy impact. It takes about 1.2 seconds versus 500 ms for a big enough experiment with a mix of CJK and emoji characters (quite unlikely in the real world).

If users complain I'll tell them to download more CPU 😄

BTW, now emojis show up properly, which is also a big win.

2 comments:

  • src/font should be src/fonts
  • can we make the font smaller? It's too big!

@MarcelBolten
Contributor Author

MarcelBolten commented Apr 5, 2024

  1. the PDF/A files are smaller by two orders of magnitude! The same experiment in Japanese is 86 kB with Noto and 9.6 MB with the old code.

This is due to the character substitution and should be similar for other fonts, too.

2 comments:

  • src/font should be src/fonts

That is easy

  • can we make the font smaller? It's too big!

font-size: smaller 😂🤪

But seriously, I will need to do some research on how to reduce the file size of the ttf files. As I mentioned, there is probably overlap of the glyphs between the different files, and that should be removed.

I see two options:

  1. We build the font ourselves (https://github.com/notofonts/notobuilder)
  2. We find/use a tool that removes overlap (https://github.com/notofonts/nototools)

I am open to additional suggestions, hints, and/or directions.

@NicolasCARPi
Contributor

I guess building the fonts ourselves would be best, no?

@MarcelBolten
Contributor Author

https://github.com/satbyy/go-noto-universal might be a better source for us.

@NicolasCARPi
Contributor

This is a very good example of why getting rid of this setting in UCP will be a good thing: #5056 (comment)
