Text Refactor

Andy Williams edited this page May 21, 2021 · 12 revisions

Moved to proposal

You can find the continuation of this work at https://github.com/fyne-io/proposals/pull/3


Introduction

The next major release (probably Aberlour) will feature a refactor of the text handling code to address the issues and limitations encountered so far.

Background

Text handling and typesetting are among the hardest and most complex pieces of a UI toolkit. The current implementation was a good first step, but as the project has matured and developers have shared their feedback, it has become clear that we're ready to take the next step.

Problems

  • No rich text support; eg bulleted/numbered lists, headings, paragraphs, images, links
  • No bi-directional support (LTR, RTL, mixed)
  • Hard to convey semantic meaning of text; eg cannot configure widgets to be H1-H6, Subtitle, Body, Caption, et al. (https://gophers.slack.com/archives/CK5U4BU87/p1607524897399600, https://gophers.slack.com/archives/CK5U4BU87/p1610012275206000)
  • No support for mixing styles; eg neither widget.Label nor widget.Entry can show a clickable hyperlink
  • Difficult for devs to customize existing widgets; eg cannot create a center-aligned widget.Entry (https://gophers.slack.com/archives/CB4QUBXGQ/p1554989104009500, https://gophers.slack.com/archives/CK5U4BU87/p1603126884485900) <- Andy is not sure this is a problem to solve, as most of our widgets are not customisable in this way.
  • Difficult for devs to create custom text widgets and access toolkit behaviour such as wrapping, selecting, and editing. For example, a dev cannot render multiple editable lines in a custom font without re-implementing textProvider and Entry.
  • Performance is low as the entire text must be rendered - could be improved with "collectionification" (applying lessons learned from caches in collection widgets) - applies to TextGrid as well

Goal

The goal of this project is to develop a text handling solution that will have:

  • Extensibility - developers can create custom widgets and leverage the text handling utilities
  • Functionality - developers can provide a rich user experience
  • Testability & Performance - developers can test and benchmark their apps across all supported languages, locales, and regions.

Possibly out of scope for a refactor project, but we need to keep in mind:

  • Accessibility - users can read and write text via accessibility tools (eg screen readers)
  • Internationalization - users can configure an app to their mother-tongue

Prior art

Libraries of note that we should check out to understand the complexities ahead:

Scope (added 23 April)

There are 3 main areas of scope when planning this project: areas that can be addressed by a shared library, areas specific to Fyne that must be addressed for this to work, and items that could be worked on later.

Shared library areas

The following areas could/should be part of a shared library so that multiple Go toolkits can benefit

  • Shaping of script/complex string of characters
  • Splitting a string into constituent LTR/RTL sections so we can pass in BiDi content
  • A common return type that can be rendered (rasterised - see golang.org's freetype Context.DrawString perhaps? or defined as a path)

There may be more than one library in the shared space. The current plan is to treat shaping and direction parsing as separate concerns.
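
To make the direction-splitting item above more concrete, here is a minimal sketch of cutting a string into LTR/RTL runs. It is not the Unicode Bidirectional Algorithm: it assumes that only Hebrew and Arabic letters are RTL, treats every other letter as LTR, and lets spaces, digits and punctuation inherit the direction of the run they follow. The names Direction, Run and SplitRuns are hypothetical and are not taken from Fyne or from any shared library.

```go
package main

import (
	"fmt"
	"unicode"
)

// Direction is a hypothetical type used only in this sketch.
type Direction int

const (
	LTR Direction = iota
	RTL
)

// Run is a contiguous piece of text that shares one direction.
type Run struct {
	Text string
	Dir  Direction
}

// isRTL is a coarse stand-in for real bidi character classes:
// it only checks for Hebrew and Arabic script membership.
func isRTL(r rune) bool {
	return unicode.Is(unicode.Hebrew, r) || unicode.Is(unicode.Arabic, r)
}

// SplitRuns cuts text wherever the direction of the letters changes.
// Non-letters (spaces, digits, punctuation) stay in the current run.
func SplitRuns(text string) []Run {
	var runs []Run
	start := 0
	dir := LTR
	seenLetter := false
	for i, r := range text {
		if !unicode.IsLetter(r) {
			continue // neutral in this sketch
		}
		d := LTR
		if isRTL(r) {
			d = RTL
		}
		if !seenLetter {
			// The first letter decides the direction of the first run.
			dir = d
			seenLetter = true
			continue
		}
		if d != dir {
			runs = append(runs, Run{Text: text[start:i], Dir: dir})
			start = i
			dir = d
		}
	}
	if start < len(text) {
		runs = append(runs, Run{Text: text[start:], Dir: dir})
	}
	return runs
}

func main() {
	for _, run := range SplitRuns("abc שלום def") {
		fmt.Printf("%q dir=%d\n", run.Text, run.Dir)
	}
}
```

Each run could then be handed to the shaper separately. A real implementation would need the full bidi rules (embedding levels, neutral and weak character resolution, mirroring), which is part of why a shared library is attractive.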

Fyne areas that need addressing

These items are core to refactoring text within Fyne and need to be worked on as a priority

  • Text measurement cache, with baseline info etc not just size
  • Removal of old textPresenter/textProvider etc - capture this in a cleaner API that can be encapsulated
  • Layout of newlines, wrapping, ellipses etc
  • Layout of portions of text cut on style boundaries
  • Supporting rich text description, for example illustration at https://github.com/fyne-io/fyne/issues/21
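
As an illustration of the measurement cache item in the list above, the following sketch caches width, height and baseline per (text, font, size, style) key. Everything here is an assumption made for the example: Metrics, MeasureFunc and MeasureCache are hypothetical names, the font is identified by a plain string, and the measuring function in main is a fake that stands in for whatever the driver would provide.

```go
package main

import (
	"fmt"
	"sync"
)

// Metrics holds more than just the size, so widgets could align on baselines.
// Hypothetical type for this sketch.
type Metrics struct {
	Width, Height, Baseline float32
}

// MeasureFunc stands in for the expensive driver call.
type MeasureFunc func(text, font string, size float32, bold, italic bool) Metrics

// measureKey lists every input that can change the result.
type measureKey struct {
	text, font   string
	size         float32
	bold, italic bool
}

// MeasureCache remembers results so repeated layouts do not re-measure.
type MeasureCache struct {
	mu      sync.Mutex
	measure MeasureFunc
	entries map[measureKey]Metrics
}

// NewMeasureCache wraps the given measuring function.
func NewMeasureCache(measure MeasureFunc) *MeasureCache {
	return &MeasureCache{measure: measure, entries: make(map[measureKey]Metrics)}
}

// Measure returns the cached metrics, measuring only on a cache miss.
func (c *MeasureCache) Measure(text, font string, size float32, bold, italic bool) Metrics {
	key := measureKey{text: text, font: font, size: size, bold: bold, italic: italic}
	c.mu.Lock()
	defer c.mu.Unlock()
	if m, ok := c.entries[key]; ok {
		return m
	}
	m := c.measure(text, font, size, bold, italic)
	c.entries[key] = m
	return m
}

func main() {
	calls := 0
	// Fake measurer: half an em per rune, made-up height and baseline ratios.
	fake := func(text, font string, size float32, bold, italic bool) Metrics {
		calls++
		return Metrics{
			Width:    float32(len([]rune(text))) * size * 0.5,
			Height:   size * 1.2,
			Baseline: size * 0.9,
		}
	}
	cache := NewMeasureCache(fake)
	cache.Measure("Hello", "sans", 14, false, false)
	cache.Measure("Hello", "sans", 14, false, false) // served from the cache
	fmt.Println("measure calls:", calls)
}
```

The sketch never evicts entries. A real cache would need a size limit and would have to be cleared when fonts, theme or scale change.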

Could be later

Areas of important work that could be addressed after the main refactor

  • Loading UI translations
  • Managing multiple language files
  • Accessibility

Proposed approach

The shared library is being tracked by work in a separate repository. This will take a string and some Style information as input and return a rendered, or renderable, result. The shaping work (and probably some parsing) will be called from within our text rendering code.

API / Refactor

The remaining elements for the fyne widget package will be updated as follows:

  • new RichText widget that encapsulates the parsing, styling and shaping for all our text requirements.
  • The textPresenter capabilities will be moved to RichText and the textProvider interface moved to configuration calls.
  • To continue supporting widgets that expose a Text field we will support passing a *string or *[]rune into RichText, working around the previous need for copying and comparing large strings in Refresh.
  • The Label widget will be re-worked to simply wrap a RichText.

Over time, the selection and cursor elements of Entry will move to RichText as well, so other widgets can get "edit mode".
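
The *string idea in the list above could look roughly like the sketch below. It is an illustration only and not the proposed API: richText, its Refresh and Lines methods, and this cut-down Label are all hypothetical, and "layout" is reduced to splitting on newlines. The point shown is that the wrapping widget keeps its public Text field while the text component reads it through a pointer, without holding a second copy to compare against.

```go
package main

import (
	"fmt"
	"strings"
)

// richText is a hypothetical stand-in for the proposed RichText widget.
// It does not own the text; it reads it through a pointer to the
// owning widget's field.
type richText struct {
	source *string
	lines  []string // result of the most recent layout
}

func newRichText(source *string) *richText {
	return &richText{source: source}
}

// Refresh lays out whatever the source currently holds.
// Layout is reduced to line splitting for this sketch.
func (r *richText) Refresh() {
	r.lines = strings.Split(*r.source, "\n")
}

// Lines returns the lines from the last layout.
func (r *richText) Lines() []string {
	return r.lines
}

// Label keeps the familiar exported Text field and wraps a richText.
type Label struct {
	Text string
	rich *richText
}

// NewLabel wires the richText to the Label's own Text field.
func NewLabel(text string) *Label {
	l := &Label{Text: text}
	l.rich = newRichText(&l.Text)
	l.rich.Refresh()
	return l
}

// Refresh picks up any change made to Text since the last call.
func (l *Label) Refresh() {
	l.rich.Refresh()
}

func main() {
	l := NewLabel("one\ntwo")
	l.Text = "a\nb\nc" // the app updates the field as it does today
	l.Refresh()
	fmt.Println(len(l.rich.Lines()), "lines")
}
```

One consequence of this design is that the Label must always be used through a pointer, because copying the struct would leave the richText pointing at the original's Text field.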

Ideas

  • Content Hinting - enable devs to hint at the content of a UI Text element to leverage platform APIs for providing suggestions (from dictionary or contacts), check spelling, keyboard tailoring (numeric vs alphanumeric, action button type, input language)
  • Mathematical expressions
  • Wrapping Binding - takes a mode, width, and string binding as inputs, then splits the text from the binding into lines less than or equal to the given length depending on the wrapping mode, providing as output a list binding which can be bound to a text line list collection. <- Andy notes here that whatever we do should not require binding, but could have binding helpers.
  • Using a strings.Builder or bytes.Buffer to provide an efficient way to append a lot of information to the string (like when typing). It would also open up the option of having canvas.Text be an io.Writer (and an io.Reader in the case of the buffer).
  • Text Measurement Cache - measuring text is a time-consuming operation and is called frequently, so we could see a performance increase by caching the result for a given input (text, font, size, and style).
  • Adding a MeasureTextDistanceToBaseline function (or something like this) alongside MeasureText or the driver equivalent to enable future alignment according to a widget's text baseline.
  • String Bundling - to assist in internationalization fyne bundle or a new fyne translate could be used to bundle string resources into the binary in such a way that they can be keyed with language tag (eg. "en-US") and ID (eg. "welcome_message")
  • We will need to support at a minimum RTL (right to left) as well as the default LTR, but bi-directional will be required for a complete solution
  • Shaping is exceptionally hard - I recommend everyone become familiar with HarfBuzz and Pango
  • Gio project is keen to collaborate on these problems so a shared solution could be created for Go text rendering (https://gophers.slack.com/archives/CM87SNCGM/p1617180821092100)
  • k-d Trees could be used for Text/TextGrid, as well as possibly for other types of collections, to solve the scrollbar problem (that is, knowing where to put the scroller when a given item is selected). k-d trees would make it easy to determine how much space is "before" or "after" a given row in O(log n) time.
    • I (Charles) feel that the way to go here is to have a k-d tree of "rows", where each row has a height equal to the max() height across its elements, plus any intra-line padding. Each row could then contain a nested k-d tree of "cells", which in turn could contain runes, formatting markers, images, etc. This approach would make it possible to handle things like in-line images, while still maintaining good performance.
      • With this approach, the worst-case for an insert or delete is to re-compute the height of the row, requiring a Refresh() on only the cells within that row, plus average case O(log n) to maintain the row's k-d tree of cells (n = # of cells in row).
      • Adding or removing a row requires O(log n) work on average with a k-d tree (n = # of rows in widget).
      • Using a single k-d tree could represent arbitrary placement of text or images anywhere in an area, but would make it very complicated, or perhaps impossible, to have a concept of a single "line" that can be edited, which is why I propose the nested k-d tree idea.
      • This approach would allow for heterogeneous character widths, which means we might be able to unify Text and TextGrid, however characters that combine with each other (e.g. ligatures) could be complicated.
  • We could have a color and font map which identify colors and fonts from a palette, then associate keys into these palettes with each cell (using the formulation from the above bullet point), plus some flags to store style (bold, italic, underline, etc.). Given that most text will involve the same letters with the same styles, we could save time by using a cache so that the GPU texture for a given (letter, font, color, style, size) tuple is only rendered once.
    • There may be a more efficient way of storing this data, e.g. on runs of cells somehow, but doing that in an efficient way while also keeping it in sync with edits could be tricky, so storing it on a per-cell basis is less complex, and probably fast enough.
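
The scrollbar problem raised in the k-d tree idea above comes down to asking how much height sits before a given row. The sketch below answers that question with a Fenwick (binary indexed) tree, which is swapped in here instead of a k-d tree because it is a compact way to get O(log n) prefix sums and O(log n) height updates. It assumes a fixed number of rows, so it does not cover inserting or deleting rows, and RowHeights with its methods is a hypothetical name.

```go
package main

import "fmt"

// RowHeights answers "how much height is before row i" in O(log n).
// Hypothetical type for this sketch; the row count is fixed at creation.
type RowHeights struct {
	tree    []float32 // 1-based Fenwick tree of partial sums
	heights []float32 // current height of each row
}

// NewRowHeights builds the tree from the initial row heights.
func NewRowHeights(heights []float32) *RowHeights {
	r := &RowHeights{
		tree:    make([]float32, len(heights)+1),
		heights: make([]float32, len(heights)),
	}
	for i, h := range heights {
		r.Set(i, h)
	}
	return r
}

// Set changes the height of one row, for example after an edit makes
// the tallest cell in that row taller or shorter.
func (r *RowHeights) Set(row int, h float32) {
	delta := h - r.heights[row]
	r.heights[row] = h
	for i := row + 1; i < len(r.tree); i += i & -i {
		r.tree[i] += delta
	}
}

// OffsetOf returns the total height of all rows before the given row,
// which is where the scroller would need to be to show that row at the top.
func (r *RowHeights) OffsetOf(row int) float32 {
	var sum float32
	for i := row; i > 0; i -= i & -i {
		sum += r.tree[i]
	}
	return sum
}

func main() {
	rows := NewRowHeights([]float32{20, 20, 48, 20})
	fmt.Println("offset of row 3:", rows.OffsetOf(3))
	rows.Set(1, 30) // row 1 grew, for example an inline image was added
	fmt.Println("offset of row 3:", rows.OffsetOf(3))
}
```

Supporting row insertion and deletion would need a different structure, such as a balanced tree whose nodes store the summed height of their subtree.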