
Add alpha blending #230

Open
LeaVerou opened this issue Oct 22, 2022 · 17 comments · May be fixed by #231
Labels: beginner friendly (Tractable issues, good for first time contributors) · enhancement (New feature or request)

Comments

@LeaVerou (Member) commented Oct 22, 2022

We should have a method for alpha blending. This is required to be able to do contrast calculations properly (see w3c/csswg-drafts#7358 ).

Things to decide:

  • Naming: Perhaps overlay() or over()? Or overlayOn() or just on(), which is clearer as a Color method, but not so much in the procedural API (compare overlayOn(color1, color2) vs color1.overlayOn(color2)). Whatever we come up with should make sense in both APIs, so I’m leaning toward over(), which is also the operator name.
  • Is there any reason for it to take more than 2 arguments? My understanding is that the operator is associative.

The math seems pretty straightforward. @svgeesus are there any considerations when applying this to other color spaces? What color space do we use for the result? I guess the color space of the color being overlaid?

@LeaVerou added the labels enhancement (New feature or request) and beginner friendly (Tractable issues, good for first time contributors) on Oct 22, 2022
@svgeesus (Member) commented:

Yes, we need to have blending and compositing available.

The W3C Compositing spec is old, RGB-only, and does compositing in a broken way that happens to be compatible with what browsers do, and with what Adobe Photoshop does (or used to do).

It works in sRGB (or technically any RGB space), but luminance uses the old NTSC primaries (yes, really):

$\mathrm{Lum}(C) = 0.3 \times C_{red} + 0.59 \times C_{green} + 0.11 \times C_{blue}$

That means we need to calculate a separate "CSS blend luminance" which is not Y in CIE XYZ /facepalm

https://drafts.fxtf.org/compositing-1/
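For concreteness, here is that "CSS blend luminance" as a function (a minimal JavaScript sketch; channel values are assumed to be gamma-encoded sRGB in the 0–1 range):

```js
// "CSS blend luminance" per https://drafts.fxtf.org/compositing-1/
// These are the old NTSC luma coefficients applied to gamma-encoded
// channels, NOT Y from CIE XYZ.
function blendLuminance([r, g, b]) {
	return 0.3 * r + 0.59 * g + 0.11 * b;
}
```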

Be careful when choosing names: we want to be able to add the blend modes later, and source-over is the usual Porter-Duff name for what CSS blending calls normal, which is the initial value of mix-blend-mode.

Is there any reason for it to take more than 2 arguments? My understanding is that the operator is associative.

Like mixing, compositing is defined on two arguments; compositing more than that requires specifying an order of operations for several two-argument steps.

In addition to all that broken compositing in gamma-encoded sRGB space, we would also want to make available compositing in linear-light xyz-d65, which is what is used for commercial 2D and 3D graphics.

@LeaVerou (Member, Author) commented:

Let's focus on alpha blending for now please. Sure, we shouldn't make any decisions that make it harder to implement the other operators and blending modes later on, but the immediate need is for alpha blending, which is far more common than any other operator, and we don't need to figure out the API for blending modes to add alpha blending.

The W3C compositing spec is old, sure, but surely this is a solved problem in the literature? So we convert to XYZ and do it there, then convert back to the color space of the first color?

@svgeesus (Member) commented:

surely this is a solved problem in the literature? So we convert to XYZ and do it there, then convert back to the color space of the first color?

Yes, that is the correct way.

It doesn't predict the actual contrast you will get in a browser, of course.

@svgeesus (Member) commented:

the immediate need is for alpha blending, which is far more common than any other operator

All of the blend modes are doing alpha blending.

@svgeesus (Member) commented:

Sure, we shouldn't make any decisions that make it harder to implement the other operators and blending modes later on

Exactly. So the correct approach is to implement the General Formula for Compositing and Blending:

Apply the blend in place:

$C_s = (1 - \alpha_b) \times C_s + \alpha_b \times B(C_b, C_s)$

Composite:

$C_o = \alpha_s \times F_a \times C_s + \alpha_b \times F_b \times C_b$

and then provide useful defaults.

For the blend operator $B$, the default is normal, which is simply

$B(C_b, C_s) = C_s$

and for source-over compositing,

$F_a = 1$
$F_b = 1 - \alpha_s$
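A minimal sketch of that general formula in JavaScript, parameterized over the blend function and the Porter-Duff factors, with the normal / source-over defaults filled in (the {coords, alpha} shape is illustrative, not the Color.js internal representation; both colors are assumed to already be in the same color space):

```js
// General formula for compositing and blending
// (https://drafts.fxtf.org/compositing-1/#generalformula).
// `source` and `backdrop` are {coords: [...], alpha} in the same space.
function blendComposite(source, backdrop, {
	B = (Cb, Cs) => Cs,       // blend function; default: normal
	Fa = (as, ab) => 1,       // Porter-Duff factors;
	Fb = (as, ab) => 1 - as,  // defaults: source-over
} = {}) {
	const as = source.alpha;
	const ab = backdrop.alpha;
	const ao = as * Fa(as, ab) + ab * Fb(as, ab); // composite alpha

	const coords = source.coords.map((Cs, i) => {
		const Cb = backdrop.coords[i];
		// Apply the blend in place: Cs' = (1 - αb)·Cs + αb·B(Cb, Cs)
		const blended = (1 - ab) * Cs + ab * B(Cb, Cs);
		// Composite (result is premultiplied): Co = αs·Fa·Cs' + αb·Fb·Cb
		const Co = as * Fa(as, ab) * blended + ab * Fb(as, ab) * Cb;
		// Un-premultiply to get a plain-alpha color back
		return ao === 0 ? 0 : Co / ao;
	});

	return {coords, alpha: ao};
}
```

With the defaults, this reduces to plain source-over alpha blending; other blend modes and Porter-Duff operators drop in by swapping B, Fa, and Fb.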

@facelessuser (Collaborator) commented:

Let's focus on alpha blending for now please.

As someone who has already gone down this road: if you add support for source-over compositing and normal alpha blending, you have everything you need to do the others; you can just swap in different compositing and blend methods. So the key is just to make sure the compositing step and the blend step are generic. I think that is the main point being conveyed.

Though I had some missteps in the beginning 😅, I can pretty much mimic what browsers are doing now.

@LeaVerou (Member, Author) commented Oct 22, 2022

@svgeesus But in which color space? Will XYZ work well here? How does this change once we support CMYK spaces?

@facelessuser I see your point; what I meant was that, from an API design pov, alpha compositing should be quick and easy to specify, not just a parameter of a more general compositing API.

Though I had some missteps in the beginning 😅, I can pretty much mimic what browsers are doing now.

What missteps did you have?

@svgeesus (Member) commented Oct 22, 2022

But in which color space? Will XYZ work well here?

For browser-compatible simple alpha source-over blending, premultiplied sRGB will work (out-of-gamut values may need to be clipped; needs investigation of how the equations react to extended values)

For browser-compatible other blend modes, sRGB plus a bogus NTSC-luma will work; the luma is only needed for the non-separable blend modes:

  • hue
  • saturation
  • color
  • luminosity (sic)

For higher quality physical light normal blending and source-over compositing, premultiplied XYZ-D65 will be fine

For compositing in CMYK then a) are you crazy and b) read the PDF spec for the full horror and c) have a drink with @faceless2 to get extended horror. Also, to actually do compositing that is fully general and includes some CMYK, we would need ICC support to be able to get the Lab values. So, not for now.

@svgeesus (Member) commented:

From an API design pov, alpha compositing should be quick and easy to specify, not just a parameter of a more general compositing API.

An API design would have two methods, blend and composite. Each would take the source and destination colors as parameters, plus an optional options object where one could specify the blend mode, the compositing operator, and the blend colorspace; these would default to normal, source-over, and either xyz-d65 or srgb respectively.
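In code, that proposal might look like this (hypothetical signatures sketching the description above; none of this is an existing Color.js API):

```js
// Hypothetical API shape; names and defaults per the proposal above.
let blended = blend(source, backdrop, {
	blendMode: "normal",  // default
	space: "srgb",        // browser-compatible blending
});

let composited = composite(source, backdrop, {
	operator: "source-over", // default
	space: "xyz-d65",        // physical-light compositing
});
```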

@LeaVerou (Member, Author) commented Oct 22, 2022

For browser-compatible simple alpha source-over blending, premultiplied sRGB will work (out-of-gamut values may need to be clipped; needs investigation of how the equations react to extended values)

For browser-compatible other blend modes, sRGB plus a bogus NTSC-luma will work.

For higher quality physical light normal blending and source-over compositing, premultiplied XYZ-D65 will be fine

Let's use XYZ-D65 then, and allow for a parameter if people want to produce shittier results that are browser-compatible (or not, what's the use case?).

For compositing in CMYK then a) are you crazy and b) read the PDF spec for the full horror and c) have a drink with @faceless2 to get extended horror. Also, to actually do compositing that is fully general and includes some CMYK, we would need ICC support to be able to get the Lab values. So, not for now.

Of course we'd need ICC support to do CMYK properly, but that's in scope for the future, and wanting to overlay a CMYK value with transparency over another is not a crazy ask. Once we do have Lab values, does alpha blending work the same way?

@svgeesus (Member) commented:

Once we do have Lab values, does alpha blending work the same way?

For the proper way: yes. Lab-D50 → XYZ-D50 → XYZ-D65 and then as normal

For the web-compatible way, Lab-D50 → XYZ-D50 → XYZ-D65 → srgb-linear → sRGB and hope
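In Color.js terms, both chains collapse into conversions around the blend step; roughly (a sketch only: over() stands in for whatever method this issue lands on, and the chromatic adaptation from Lab's D50 to D65 happens inside to()):

```js
import Color from "colorjs.io";

let fg = new Color("lab", [55, 30, -60], 0.5);
let bg = new Color("lab", [80, -10, 20]);

// Proper way: blend in linear-light XYZ-D65
let result = fg.to("xyz-d65").over(bg.to("xyz-d65"));

// Web-compatible way: continue on to gamma-encoded sRGB, and hope
let webResult = fg.to("srgb").over(bg.to("srgb"));
```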

@facelessuser (Collaborator) commented Oct 22, 2022

What missteps did you have?

  • I forgot to undo the premultiplication after the blending was done. This broke the associative nature. That was just a dumb mistake on my part.
  • It also seems that some of the operator algorithms can push αo outside the 0–1 range (lighter, for instance). I initially didn't clamp it; I applied it as-is and only clamped the alpha of the final color. But I realized that to mimic browsers, I needed to clamp αo and then apply it.

EDIT: Just to clarify, clamping αo is specifically related to premultiplication as well; in my case, I was using it to undo the premultiplication on each channel without first clamping it to a realistic 0–1 range. Basically, all my issues were related to undoing premultiplication.
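A sketch of that failure mode and the fix (illustrative JavaScript, not Color.js code): the lighter operator has Fa = Fb = 1, so αo = αs + αb can exceed 1, and the clamp has to happen before un-premultiplying.

```js
// `lighter` (Fa = 1, Fb = 1) on premultiplied channels. αo = αs + αb
// can exceed 1, so to mimic browsers it must be clamped BEFORE it is
// used to un-premultiply the channels.
function lighterPremultiplied(src, bkd) {
	// Premultiply each channel by its alpha
	const premult = c => c.coords.map(v => v * c.alpha);
	const [s, b] = [premult(src), premult(bkd)];

	let ao = src.alpha + bkd.alpha;    // may exceed 1
	ao = Math.min(1, Math.max(0, ao)); // clamp first...

	const coords = s.map((Cs, i) => {
		const Co = Cs + b[i];            // premultiplied result
		return ao === 0 ? 0 : Co / ao;   // ...then un-premultiply
	});
	return {coords, alpha: ao};
}
```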

@LeaVerou (Member, Author) commented:

From an API design pov, alpha compositing should be quick and easy to specify, not just a parameter of a more general compositing API.

An API design would have two methods, blend and composite. Each would take the source and destination colors as parameters, plus an optional options object where one could specify the blend mode, the compositing operator, and the blend colorspace; these would default to normal, source-over, and either xyz-d65 or srgb respectively.

color1.composite(color2) is not an intuitive way to do alpha blending, which is by far the most common kind of compositing. I do think we should have a separate method for alpha blending (which can internally be a shortcut to compose or whatever once we have that). I laid out some options in the first post for what that function can be named.

@facelessuser (Collaborator) commented Oct 23, 2022

I personally always liked the name overlay. I figured if enough people ever complained that compose was too obtuse for basic alpha blending, that is what I'd add. For me, it creates a good mental picture of what is being attempted, but I can certainly understand the idea of using over with its tie to the operator name. If it were me, I'd vote for overlay or overlayOn.

@LeaVerou LeaVerou linked a pull request Oct 28, 2022 that will close this issue
@LeaVerou (Member, Author) commented:

I started a draft PR yesterday, so we can iterate: #231
Still needs tests

@kepstin commented Oct 8, 2023

I personally always liked the name overlay.

Note that overlay is the name of a specific blend mode defined in CSS which has some unusual/strange non-linear properties.

To avoid confusion, the name "overlay" should not be used for a function which performs normal alpha blending.
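For reference, this is what the CSS overlay blend mode computes per channel (per the Compositing spec, overlay is hard-light with the operands swapped), which is nothing like plain alpha blending:

```js
// CSS `overlay` blend mode: B(Cb, Cs) = HardLight(Cs, Cb).
// Cb = backdrop channel, Cs = source channel, both in [0, 1].
function overlayBlend(Cb, Cs) {
	return Cb <= 0.5
		? 2 * Cb * Cs                  // multiply over dark backdrops
		: 1 - 2 * (1 - Cb) * (1 - Cs); // screen over light backdrops
}
```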

@Myndex (Contributor) commented Oct 12, 2023

... the name "overlay" should not be used for ... normal alpha blending....

I agree. "over" is usually the correct term. Overlay is a specific blend mode. I believe the PR is using "over".

Things I Learned About Alpha the Hard Way

Alpha in video, and mattes in film, were oft treated like black-magic potions with arcane setups and workflows... in the transition from chemical imaging to digital, the old voodoo methods were dragged along, and strange monsters under-the-bed grumbled as flows shifted to linear space blendings...

If you are doing everything "in house" with no inter-facility interchange of files, then alpha issues are less likely to bite you -- however, interchange is common, and understanding what/how alpha is interpreted in various use cases across color spaces is key. (yea, that pun was intended).

Also, I hope this is the right thread for this post, feel free to move if it isn't.

Alpha Gamma, what big teeth you have...

The statement "alpha has no gamma" is technically true, but also not correct.

Wait... wut?

While the alpha channel just specifies a percentage of transparency, a specific percentage of transparency will not have the same perceptual effect among different color spaces and most especially, gamma vs linear tone response curves.

Under this twisty, a brief but useful background regarding alpha, mattes, and blending for various imaging technologies. The following history is intended to put best practices into context.

Nostalgia Lane

Historically (meaning NTSC and PAL video), switchers, DVO, and other video gear always dealt with the gamma encoded signal. All color adjustments, composites, whatnot, happened in gamma encoded space. In the earlier days the working space, input space, and output space, were all the same space—no LUTs, no ICC, and also no color management, as everything was aligned and calibrated to the one space.

That one space (YIQ for NTSC) was analog composite, meaning a gamma encoded $Y^\prime$ and the color difference quadrature encoded on a 3.58MHz subcarrier, as a single signal on coax going to analog switchers, where analog electronics could mix (for a dissolve) two signals.

If a video had an alpha, it was on a separate coax out of the DVO, or on a separate video tape (one machine with the image, one with the alpha, rolling in sync).

Composite (1") gave way to component (Betacam) YPbPr (analog), and eventually to YCbCr (digital)... and still all were gamma encoded (though back then we said "gamma-corrected").

Blending (like for a T-bar dissolve) was done using essentially linear analog electronics, but the gamma-encoded signal from camera or tape source remained gamma encoded throughout, until it hit the display.

Chemical imaging

Film mattes were high contrast black & white, bi-packed with the element to be matted in the optical printer. Film blends like dissolve were either interpositive dissolves, usually done on an optical printer as an effect cut into the O-neg, or negative dissolves done as part of an AB cut negative roll at the lab.

Interpositive dissolves look different than negative AB dissolves, and look different than analog gamma encoded video dissolves. And the same is true of alpha or matte transparencies.

When film started going digital, among the first formats was Kodak's Cineon uncompressed 10 bit integer log (later as DPX)... although cin/dpx was actually a 10 bit linear encoding of a film negative, which is itself log. The digital processes initially all worked in a log space as well.

10 bit integer was used because three color channels then fit across 4 bytes, and as integer data it didn't need computationally expensive decompression. Technology of the 90s could move data better than it could process data, and a 32-bit-per-pixel uncompressed file format (10 bits per channel, integer log) could play smoothly at 2K from RAM or from the SCSI RAIDs of the day.

The linear in the sand

Early 2000s, people were starting to experiment with linear working spaces. And just to be confusing, some were calling gamma-encoded video "linear" to distinguish it from "log", even though video is not linear either... Stu Maschwitz was a thought leader in correcting the terminology to what we know today: linear (1.0), log, gamma, etc.

And that was about the time computers were getting fast enough that a 32bit (per channel) floating point linear (gamma 1.0) working space was reasonably doable. It's not just the processing cost of the working space itself, but also the added costs of transforming back to a gamma-encoded space suitable for the display, and of emulating looks of final film output through LUTs etc.

In some cases, the added complexity of linear is well worth it for emulating physical light. But there are plenty of reasons to be in some gamma/perceptual/curvy space too, for some tasks.


Why History

The reason for the history lesson in the above twisty was to show that compositing operations were handled in non-linear spaces for a very long time, and people became accustomed to working that way. Linear does have advantages, but it's also not always the ideal working space, depending on the specific task.

Linear is good for a lot of compositing, but not so good for gradients, and of questionable utility for color grading, where you'd rather have controls that are perceptually uniform as opposed to illumination uniform.

The fact that compositing is good in linear, but perceptual spaces are better for some other things, brings us to the actual problem of "how best to handle alpha blends", which is answered with "it depends".

Alpha Workflows

If you are preparing an element in linear workspace, and that element is going to be exported with an alpha channel, it is important to know the kind of space it is going to be used in. The element is perhaps going to the DI, and what space are they grading in? P3? $X^{\prime}Y^{\prime}Z^{\prime}$ ? Rec709/2.4?

Let's say you are sending text elements with alpha. How you handle the alpha, and how the recipient handles the alpha, depends on your working space gamma, the interchange file format, and the working space & gamma of the recipient. For these examples we'll assume premultiplied alpha.

This twisty hides some verbose hypothetical workflow examples.

No gamuts were injured in these examples.

We are working in a linearized (gamma 1.0) working space based on P3 primaries.

  1. The DI is working in a linearized space and wants EXR files: Great, this is easy, send over linear EXR files with embedded premultiplied alpha channel, just as you had it adjusted in your linear working space. Easy, but almost never the actual case.

  2. The DI is working in a log space, and they want a 3 channel dpx (10bpc) for image and a 4 channel TIFF for alpha (up to 4 alpha channels, usually 8bpc), with separate alpha for the spaceship, the exhaust, and the text (for grading) and the compositing alpha (for over).

    • If you are doing most of the work in a linear working space, but must deliver the log DPX, an alpha straight out of your working space will have unexpected behavior in transparent areas; for instance, the antialiasing may seem jagged, and the density of the transparency will appear wildly different.
    • FLOW 1) In a case like this, one workflow is to set up lin2log nodes for the text, spaceship, and exhaust, comp with the merge nodes in log, and then output the alphas from the merge nodes, which are now adjusted in log.
    • FLOW 2) An alternate workflow is to keep everything in lin until output, where the output has a lin2log node. In this case, process the alpha as well—though probably not with a standard lin2log. Set up a viewer that takes the final log output image and does a merge onto a proxy background in log, compare in split-screen to the linear composite, and adjust the logified alpha to match (this flow is usually set up early and kept as a LUT for the alphas).
    • The above are for the compositing alpha. The separated grading alphas are used by the colorist for hold-outs for different grading nodes. Personally I prefer separated grading nodes as solids, since I'm going to adjust transparency as needed to suit, but I might use the compositing alpha for the grade also if it's a complex transparency like water or smoke... YMMV.
  3. The DI is grading in a Rec709 gamma 2.4 space, but they want files as an ARRI Log C 12bit ProRes 4444 with alpha.

    • Find out if they are doing composites/alpha blends in the 709/2.4 timeline (post LUT), or in LogC composite clips, or what, before cutting into the 709/2.4 timeline.
    • As in the previous example, adjust the alpha for the correct perceptual intent for the color space the composite is being performed in—when a clip is converted from one space to another, the alpha is usually untouched (with resultant unexpected behavior).
  4. You are doing the work in a vacuum on a low budget project, and have no idea what the DI is going to be, as one hasn't been brought on yet. The camera originals are an assortment of RED, dSLRs, a DJI mini, GoPro, an iPhone, and Super 8mm film telecined to 10bit ProRes422 Log C, with no indication of the space or primaries. The filmmakers say they want the DI to be in linear because they heard "linear is good for stuff".

    • Without smiling too patronizingly, do the work in linear and output linear 16bit half-float EXRs and an alpha that was adjusted in linear space. Use a linearized 709 or P3.
    • A year later, when they get funding, the filmmakers call you in a panic saying "nothing looks right". After some discussion, you learn they are working in $DCI\ X^{\prime}Y^{\prime}Z^{\prime}$ with $\gamma 2.6$, so you walk them through doing the comps in DaVinci in linear space.

The Notes that Fell Off a Cliff

Okay, the tidbit to glean from the foregoing is that:

  • The alpha channel, as adjusted/created, is only valid for the color space, and especially the gamma, of the instance where it was created/adjusted.
  • Transforming the image data to a different space, particularly the TRC or gamma, will invalidate any partial transparency of the alpha channel.
  • In other words, the alpha data must be adjusted to maintain perceptual intent if the image data is mapped to another space and/or gamma.
    • The TRC of the final compositing workspace is of course also important.
  • Thus, the alpha data must be managed/adjusted as the image data transforms into different working spaces.

Put another way, there is no "best" default for the alpha in a multi-space environment, there is only the "it needs to be handled like this" where "this" is the space the alpha was created or adjusted for.

The following table does not imply best practices, only common cases and potential flows.

| Source Working Space | Intermediate File Specs | Destination Working Space | Alpha Requirements |
| --- | --- | --- | --- |
| $\gamma 2.6$ | ProRes4444 | $\gamma 2.6$ | Same alpha as adjusted in source working space |
| Linear $\gamma 1.0$ | EXR (lin) | Linear $\gamma 1.0$ | Same alpha as adjusted in source working space |
| Log | DPX + TIFF alpha | Log | Same alpha as adjusted in source working space |
| Linear $709\ \gamma 1.0$ | ProRes4444 $\gamma 2.4$ | $\gamma 2.4$ | Alpha adjusted to work with the $\gamma 2.4$ transformed image data |
| Linear $P3\ \gamma 1.0$ | ProRes4444 $\gamma 2.2$ | $DCI\ P3\ \gamma 2.6$ | Alpha adjusted to work with the $\gamma 2.2$ transformed image data in the ProRes, and then likely adjusted further for the final comp at 2.6 |

Thread responses

...Be careful when choosing names...

over is appropriate, and the common term for basic alpha "sticking things on top of other things".

...So we convert to XYZ and do it there, then convert back to the color space of the first color?...

As I tried to illustrate above, it depends. What space is the alpha intended for? If the alpha is attached to a gamma encoded image, then usually that alpha is intended for providing transparency for that image data at its current gamma. Usually.

And of course, this applies only to alphas with partial transparency; it does not apply to alpha 0 or 1. Text, for instance, has antialiasing, which is partially transparent. It needs to be composited using image data and alpha at the gamma it was built for.

...But in which color space? Will XYZ work well here?...

The color space that the image+alpha was built/intended for first, or if the image data is transformed to another space, then a transform on the alpha as well to match perceptual intent.

XYZ is essentially an RGB space with imaginary primaries, so as long as fg/bg are both in XYZ, I suppose it should work—but keep in mind that, as a linear space with imaginary primaries, it makes it easy to exceed the gamut of the actual destination space.

Opinion: I like working spaces that are the same as the destination space, or at least the intermediate space. Additive chromatic spaces like RGB can easily be linearized, if linear blending is desired.

...How does this change once we support CMYK spaces?...

Not being snarky: convert CMY to RGB to blend, then back. K is reserved or used in some blend modes. You might like this paper on blend modes.

Some simple blend modes might (big maybe) be emulated while staying in CMY. I've tried doing subtractions for instance, though trying to find a workable CMYK blending space may be a rich-black alley.

RGB is already an additive space, with three primaries where each brandishes a share of the luminance. CMYK, as a subtractive space with 4 colors including one that modulates luminance much more than the others, adds non-trivial complexity. As a result, Grassmann's laws apply to RGB but not so much to CMYK. I.e. you can get straight-line mixes in RGB, but you don't get straight-line mixes in pigments or inks.

And you need a perceptually uniform space to transform CMYK to RGB... so now, thinking out loud: could a perceptually uniform CMYK mixing space be created... and is there a compelling reason to do so?
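For what the "convert, blend, convert back" idea looks like in its crudest form, here is a toy sketch using the naive device-CMY relationship (C = 1 - R, etc.), ignoring K, ICC profiles, and real ink behavior entirely; a sketch, not a recommendation:

```js
// Naive device-CMY ↔ RGB round trip for a simple `over` blend.
// Real CMYK needs ICC transforms; this is only the toy version.
const cmyToRgb = cmy => cmy.map(v => 1 - v);
const rgbToCmy = rgb => rgb.map(v => 1 - v);

function overCMY(fgCMY, bgCMY, alpha) {
	const [f, b] = [cmyToRgb(fgCMY), cmyToRgb(bgCMY)];
	// Plain source-over against an opaque backdrop
	const blended = f.map((Cs, i) => alpha * Cs + (1 - alpha) * b[i]);
	return rgbToCmy(blended);
}
```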


For those who read this, thank you for joining me on this trip down memory lane...
