ImageFont module: getsize is slow? #4651

Closed
cristianocca opened this issue May 27, 2020 · 15 comments

@cristianocca

cristianocca commented May 27, 2020

What did you do?

See code sample.

What did you expect to happen?

getsize should be fast. I'm trying to use it to implement a text-wrap mechanism (for SVGs), but as soon as I add a single call to it in a for loop, run times become 10 to 20 times slower.

Calling getsize on small text multiple times adds significant overhead. I wonder whether it can be improved somehow, or whether it is really as good as it can get. For example, 50 calls take about 50 ms (with short text), and the time increases significantly as the text gets longer.

What actually happened?

getsize is slow.

What are your OS, Python and Pillow versions?

  • OS: Windows
  • Python: 3.7
  • Pillow: 7.1.2

from PIL import ImageFont

# font_file, font_size and text are supplied by the caller
font = ImageFont.truetype(font_file, font_size)

# Run the following in a loop
font.getsize(text)
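To put a number on the "50 calls in about 50 ms" observation, a small timing harness along these lines could be used. This is only a sketch: `time_calls` is a hypothetical helper name, and the workload in the demo is a stand-in so the snippet runs without a font file.

```python
import timeit


def time_calls(func, calls=50, repeat=5):
    """Return the best time, in milliseconds, for `calls` invocations of `func`.

    `time_calls` is a hypothetical helper, not part of Pillow.
    """
    # timeit.repeat accepts a callable and returns one total per repetition
    timings = timeit.repeat(func, number=calls, repeat=repeat)
    # The minimum is the least noisy estimate of the cost of `calls` calls
    return min(timings) * 1000.0


if __name__ == "__main__":
    # Stand-in workload; replace with a call such as font.getsize(text)
    elapsed = time_calls(lambda: sum(range(1000)))
    print(f"{elapsed:.3f} ms for 50 calls")
```

With Pillow installed and a font loaded as in the sample above, `time_calls(lambda: font.getsize(text))` would time the loop in question.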
@nulano
Contributor

nulano commented May 27, 2020

You are using Windows, so I'm assuming your default layout_engine is basic layout. With a quick look at the code, I have noticed that basic layout renders all glyphs before looking at their advance values. I wonder whether this is actually necessary?

load_flags = FT_LOAD_RENDER|FT_LOAD_NO_BITMAP;

I would assume the advance value only depends on hinting, which should probably run without the FT_LOAD_RENDER flag. In fact, the FreeType docs mention the flag FT_LOAD_BITMAP_METRICS_ONLY which seems appropriate here and conflicts with FT_LOAD_RENDER. https://www.freetype.org/freetype2/docs/reference/ft2-base_interface.html#ft_load_xxx

The other potential improvement could be to preserve loaded glyphs between the different functions, as is recommended in the FreeType tutorial; I'm not sure how much of an impact this would have.

@cristianocca
Author

I'm pretty sure I observed the same speed on MacOS. I wonder if this is different/faster on Linux distros?

@nulano
Contributor

nulano commented May 27, 2020

On Linux (and also Mac to a degree) it is more likely that you have Raqm installed; this is rare on Windows. The default behaviour if you don't explicitly set ImageFont.truetype(layout_engine=...) is to use Raqm if available, and basic layout otherwise. The two layout engines use completely different code paths (with basic layout, glyphs are laid out in Pillow code; with Raqm layout, the string is passed to libRaqm; in both cases, the same function is then used to calculate the bounding box).
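The default selection rule described above can be restated as a tiny decision function. This is a plain-Python paraphrase of the comment, not Pillow's code; `effective_layout` and its parameters are hypothetical names.

```python
def effective_layout(requested=None, raqm_available=False):
    """Return which layout engine would be used under the rule described above.

    Hypothetical helper: an explicit request wins; otherwise Raqm is used
    when it is available, and basic layout is the fallback.
    """
    if requested is not None:
        return requested
    return "raqm" if raqm_available else "basic"


if __name__ == "__main__":
    print(effective_layout(raqm_available=False))  # typical Windows install
    print(effective_layout(raqm_available=True))   # Raqm present
```

In a real script, the availability flag could come from `PIL.features.check("raqm")`, and the engine can be pinned through the `layout_engine` argument of `ImageFont.truetype` so that timings are comparable across machines.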

@nulano
Contributor

nulano commented May 27, 2020

Testing with FreeMono.ttf at size 20 for text "Hello world" I get a 15% improvement in getsize by swapping the flags as I mentioned above, with a pixel-identical image. The change also passes the test suite. PR in #4652. I do not see any other potential changes that would improve getsize speed without a significant redesign.

@radarhere
Member

PR #4652 has been merged, and will be part of the next Pillow release, due out on July 1.

@cristianocca
Author

Thanks for the quick update!

@radarhere
Member

@cristianocca the PR introduced by @nulano is advertised to improve speed by 15%. Is that sufficient, or do you think there is still more work that needs to be done here?

@cristianocca
Author

Sorry I forgot to answer this. I think 15% is fair enough, and unless there's any other "obvious" improvement, it should be ok. I mean, it is still a bit slow in my opinion, but I guess the operation is by itself slow and nothing else can be done.

@millionhz
Contributor

To get the size you have to make an image and then get the size of the text on that image; there is no way to get the text size first and then make an image from the return value.
I think getsize is slow because it writes all the text onto an image and then gets the size; writing the text onto the image might take a lot of time.

@nulano
Contributor

nulano commented Jun 26, 2020

@millionhz Not quite. Text rendering is performed in four steps:

  1. Perform text layout. The given string is converted to a set of glyphs, which are loaded, measured, and placed in relative coordinates.
  2. Measure text bounding box. For each placed glyph, its bounding box is computed. The combined bounding box is returned.
  3. Render text. Each glyph is loaded, rendered by FreeType, then copied to an image buffer in Pillow. This image is returned to ImageDraw.
  4. Paste buffer image to target image.

The getsize function performs the first two steps only. The bottleneck in getsize is that each glyph needs to be loaded from the ttf file, which is a slow operation. Before #4652 there was an issue where glyphs were being rendered needlessly in step 2; now they are only rendered in step 3. In #4724 I have proposed adding another function, getlength (2-3x faster), which skips step 2 and only looks at the position of the last glyph from step 1 to get the length of the text without the height.
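The difference between "layout only" and "layout plus bounding box" can be illustrated with a toy model. This is not Pillow's or FreeType's code: `Glyph`, `layout`, `text_length`, and `text_bbox` are hypothetical names, and the glyph metrics are made-up numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    advance: int  # how far the pen moves after this glyph
    x_min: int    # ink extent, relative to the pen position
    x_max: int
    y_min: int
    y_max: int


def layout(glyphs):
    """Step 1: place each glyph at the running pen position."""
    pen = 0
    placed = []
    for glyph in glyphs:
        placed.append((pen, glyph))
        pen += glyph.advance
    return placed, pen


def text_length(glyphs):
    """Length only: the pen position after the last glyph (step 1 alone)."""
    _, pen = layout(glyphs)
    return pen


def text_bbox(glyphs):
    """Steps 1 and 2: union of the ink boxes of all placed glyphs."""
    placed, _ = layout(glyphs)
    if not placed:
        return (0, 0, 0, 0)
    return (
        min(pen + g.x_min for pen, g in placed),
        min(g.y_min for _, g in placed),
        max(pen + g.x_max for pen, g in placed),
        max(g.y_max for _, g in placed),
    )


if __name__ == "__main__":
    sample = [Glyph(10, 1, 9, 0, 12), Glyph(10, 0, 11, -3, 8)]
    print(text_length(sample), text_bbox(sample))
```

In this model the length needs one pass over the advances, while the bounding box needs the extents of every glyph as well, which is the extra work a length-only function avoids.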

@millionhz
Contributor

millionhz commented Jun 26, 2020

@nulano
Isn't the height gonna be important for multiline text?
One more suggestion: the concept of measuring all the bounding boxes is important, but what about monospace fonts? I think they have the same bounding box width and height. So if a user uses a monospace font, just calculate the bounding box for one character, then multiply the width by the number of characters in a line and the height by the number of lines. Am I correct here?

@nulano
Contributor

nulano commented Jun 26, 2020

Multiline text uses a constant line height; the current implementation uses the height of 'A' plus a user-specified pixel offset (4px by default).

Monospace fonts have equal advance length (used in step 1), but the bounding box may vary (used in steps 2-3). The bounding box merely describes which area is covered by a glyph, not how much area should be skipped during layout. Additionally, AFAIK there is no way to know in advance that a font is monospace, this can only be determined from the results of step 1.
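The constant line height rule can be sketched as follows. The helper takes a `measure` callable that returns `(width, height)` for a single line (for example `font.getsize`); `multiline_size` is a hypothetical name, and the way lines are combined here (stacking line heights and dropping the spacing after the last line) is an assumption rather than something stated in this thread.

```python
def multiline_size(text, measure, spacing=4):
    """Estimate the size of multiline text using a constant line height.

    `measure(line)` must return (width, height) for one line of text.
    Assumption: total height is n * (height of "A" + spacing) - spacing.
    """
    lines = text.split("\n")
    # Constant line height: height of "A" plus the user-specified spacing
    line_height = measure("A")[1] + spacing
    # Width is that of the widest line
    width = max(measure(line)[0] for line in lines)
    # No spacing is added below the last line
    height = len(lines) * line_height - spacing
    return width, height


if __name__ == "__main__":
    # Fake metrics so the sketch runs without a font: 6 px per character, 10 px tall
    fake_measure = lambda s: (len(s) * 6, 10)
    print(multiline_size("ab\ncdef", fake_measure))
```

Because the line height comes from "A" alone, a line containing unusually tall glyphs can exceed it, which is why the bounding box and the layout spacing are treated separately.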

@millionhz
Contributor

@nulano What if I want to generate an image whose size depends on the text I am going to put in it? What is the procedure?
What I am currently doing is

  1. Generating a temporary image with:
    img = Image.new("1", (10, 10))
  2. Getting the size of the text:
    x, y = ImageDraw.Draw(img).multiline_textsize(text, font=font, spacing=spacing)
  3. And using the return values to make the image:
    img_out = Image.new("L", (x, y), color=bg)

But this seems like the wrong way to do it.

@nulano
Contributor

nulano commented Jun 27, 2020

@millionhz There is also the identical function getsize_multiline in ImageFont.FreeTypeFont that you can call directly with font.getsize_multiline(text, spacing=spacing, ...): https://pillow.readthedocs.io/en/stable/reference/ImageFont.html#PIL.ImageFont.FreeTypeFont.getsize_multiline

You may want to also call font.getoffset on the first line of text to get the extra top-left padding necessary to make sure no part of the text is clipped (e.g. if your text exceeds the line-height). Unfortunately, there is currently no way to check how much extra padding you need on the bottom for multiline text. However, in most cases you will not need any extra padding on either edge (especially if you don't use/need Raqm layout).

@nulano
Contributor

nulano commented Oct 14, 2020

getsize should be fast. I'm trying to use it to implement a text-wrap mechanism (for SVGs), but as soon as I add a single call to it in a for loop, run times become 10 to 20 times slower.

@cristianocca Pillow 8.0.0 has now been released, adding font.getlength and draw.textlength, which are intended for text-wrapping (among other things) and are 2-3x faster. The body of the new function does little other than perform layout (which in the case of LAYOUT_RAQM is just passing data to Raqm), so I doubt there is any more speed to be gained here.
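A text-wrap routine built on a length function might look roughly like this. It is a minimal greedy sketch, not code from Pillow: `wrap_text` is a hypothetical name, and `measure` stands for any callable that returns the length of a string in pixels, such as `font.getlength`.

```python
def wrap_text(text, max_width, measure):
    """Greedy word wrap: break before the word that would exceed max_width.

    `measure(s)` must return the length of string `s` in pixels.
    A single word wider than max_width is left on a line of its own.
    """
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and measure(candidate) > max_width:
            # The word does not fit: close the current line, start a new one
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    # Fake metric so the sketch runs without a font: 10 px per character
    fake_length = lambda s: len(s) * 10
    print(wrap_text("the quick brown fox", 100, fake_length))
```

With Pillow 8.0.0 or later, `wrap_text(text, 300, font.getlength)` would wrap to 300 pixels. The sketch re-measures the growing line for every word, so caching word lengths may be worth considering for long paragraphs.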
