
VideoFrame should use display size, not coded size #4525

Open
2 tasks
kainino0x opened this issue Mar 19, 2024 · 18 comments
Labels: api WebGPU API · needs investigation (Requires an investigation into implementation constraints)
Milestone: Milestone 0

Comments

@kainino0x
Contributor

kainino0x commented Mar 19, 2024

copyExternalImageToTexture() currently uses the coded size of VideoFrame inputs.

Source type | Width | Height
VideoFrame | VideoFrame.codedWidth | VideoFrame.codedHeight

Coded size (codedWidth/codedHeight):

Width of the VideoFrame in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.

Display size (displayWidth/displayHeight):

Width of the VideoFrame when displayed after applying aspect ratio adjustments.


  • While we're here, the table that describes the width/height for each source type needs to be moved so it applies to both copyExternalImageToTexture() and importExternalTexture().
  • Display size is what we want (it's what you see if you put a video on screen, and IIUC it should be the same as the "intrinsic" size of HTMLVideoElement for any given frame of a video). So we should use it instead.
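To make the two options concrete, here is a minimal sketch (not spec text; `frame` is a plain stand-in for a WebCodecs VideoFrame, and `copySize` is a hypothetical helper, not a real API):

```javascript
// Minimal sketch (not spec text): contrast the current behavior, which uses
// the coded size, with this issue's proposal to use the display size.
// "frame" is a plain object standing in for a WebCodecs VideoFrame.
function copySize(frame, useDisplaySize) {
  return useDisplaySize
    ? { width: frame.displayWidth, height: frame.displayHeight }
    : { width: frame.codedWidth, height: frame.codedHeight };
}

// A typical 1080p frame: coded 1920x1088 (rounded up to 16x16 macroblocks),
// display 1920x1080.
const frame1080p = { codedWidth: 1920, codedHeight: 1088, displayWidth: 1920, displayHeight: 1080 };
console.log(copySize(frame1080p, false)); // current spec: { width: 1920, height: 1088 }
console.log(copySize(frame1080p, true));  // proposed:     { width: 1920, height: 1080 }
```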
@kainino0x kainino0x added the tacit resolution queue Editors have agreed and intend to land if no feedback is given label Mar 19, 2024
@kainino0x kainino0x added this to the Milestone 0 milestone Mar 19, 2024
@kdashg
Contributor

kdashg commented Mar 27, 2024

That sounds correct from an expected behavior standpoint.
I'm somewhat nervous about this, especially for importExternalTexture (and its textureLoad operation, which I'm honestly surprised it has, given the complexity of the mappings between video data and its expected display/output).
I think it is probably fine but I am nervous about this area.

@greggman
Contributor

greggman commented Mar 27, 2024

I feel like I'd expect the coded size and if I want to display that coded size correctly then I'd apply the display size to CSS.

Otherwise it's lossy.

@kainino0x kainino0x removed the tacit resolution queue Editors have agreed and intend to land if no feedback is given label Apr 2, 2024
@kainino0x
Contributor Author

@greggman Not sure I understand. The coded size for a 1920x1080 video is typically 1920x1088 (120x68 16x16 macroblocks). The last half block of the last row is garbage data that is not part of the video and for most purposes doesn't exist. From the web's perspective I think it was totally invisible until WebCodecs came along.
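For illustration, the coded size can be derived by rounding the visible dimensions up to the codec's macroblock size. A sketch assuming 16x16 macroblocks (as in H.264; other codecs use different block sizes):

```javascript
// Sketch: coded dimensions round the visible size up to whole macroblocks.
// Assumes 16x16 macroblocks (H.264); other codecs may use different blocks.
function codedSize(visibleWidth, visibleHeight, block = 16) {
  const roundUp = (n) => Math.ceil(n / block) * block;
  return { width: roundUp(visibleWidth), height: roundUp(visibleHeight) };
}

// 1920x1080 -> 120x68 blocks -> 1920x1088; the extra 8 rows are padding.
console.log(codedSize(1920, 1080)); // { width: 1920, height: 1088 }
```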


Tangentially, while looking that up, I found some explanation of the aspect ratio thing that Kelsey remembered but didn't know the details of, here: w3c/webcodecs#26
(WebCodecs' VideoFrame has coded size, visible rect, and display size, but I don't know if that spec defines those three terms.)

Then, the question Kelsey raised would be, should the copyExternalImageToTexture() size and external_texture textureDimensions()/textureLoad() size match the visible rect, or the display size. I say it should be whatever WebGL's texImage2D() does - likely the display size, but I really don't know.

@kainino0x
Contributor Author

HTML is pretty clear that 2d canvas drawImage uses the "natural size" which must certainly be the display size. But I wouldn't be super confident that real WebGL implementations match that. @shaoboyan I wonder if you could find or create a video with non-square pixels so we can test this in WebGPU? I don't see one in the WebGL test resources, which means it needs one, too.

It seems to simply require the -aspect flag in ffmpeg to change the "Display Aspect Ratio":
https://superuser.com/a/738833
https://superuser.com/a/908257

@kainino0x kainino0x added the needs investigation Requires an investigation into implementation constraints label Apr 2, 2024
@shaoboyan
Contributor

shaoboyan commented Apr 2, 2024

@kainino0x
I did a quick try in current cts resources. By doing following commands:

// Generate four-colors-vp9-bt601-2-1.webm, mimeType: 'video/webm; codecs=vp9'
ffmpeg.exe -loop 1 -i .\four-colors.png -c:v libvpx-vp9 -pix_fmt yuv420p -frames 50 -colorspace smpte170m -color_primaries smpte170m -color_trc smpte170m -color_range tv -aspect 2:1 four-colors-vp9-bt601-2-1.webm

Dropping the video directly into a browser page, the chrome://media-internals page shows:

info | "FFmpegDemuxer: created video stream, config codec: vp9, profile: vp9 profile0, level: not available, alpha_mode: is_opaque, coded size: [320,240], visible rect: [0,0,320,240], natural size: [480,240], has extra data: false, encryption scheme: Unencrypted, rotation: 0°, flipped: 0, color space: {primaries:SMPTE170M, transfer:SMPTE170M, matrix:SMPTE170M, range:LIMITED}"

So, the info says coded size: [320,240], visible rect: [0,0,320,240], natural size: [480,240]. And the video plays stretched on screen.

Without -aspect 2:1:
[screenshot]

With -aspect 2:1:
[screenshot]
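The natural size in that log can be derived from the visible rect and the display aspect ratio set by -aspect. A sketch of the relationship (hedged: this keeps the height fixed and stretches the width, which matches this example, but an implementation could in principle scale either dimension; the function name is illustrative):

```javascript
// Sketch (illustrative names): derive the natural size from the visible rect
// and a display aspect ratio (DAR). Here the height is kept and the width is
// stretched so that width/height matches the DAR, as in the -aspect 2:1
// experiment above.
function naturalSize(visibleWidth, visibleHeight, darNum, darDen) {
  return { width: Math.round((visibleHeight * darNum) / darDen), height: visibleHeight };
}

// -aspect 2:1 on a 320x240 visible rect gives the 480x240 natural size
// reported by chrome://media-internals.
console.log(naturalSize(320, 240, 2, 1)); // { width: 480, height: 240 }
```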

@greggman
Contributor

greggman commented Apr 2, 2024

I definitely don't want garbage. I guess in my mind there are more sizes. I can imagine

  1. The number of pixels in the data (eg. 1920x1088)
  2. The number of valid pixels in the data (eg 1920x1080)
  3. The size to display the valid data (eg. 3000x2000 or maybe aspect ratio) - this could be anything? (does this exist? It seems to be ignored for images)

I'd expect to get 2, not 3?

@kainino0x
Contributor Author

Nice @shaoboyan, that'll be a useful test case. I'm also curious to see how it interacts with rotation (I'm guessing it applies before rotation, but not sure).

Before we land a decision here, I'd like to know the results of drawImage, texImage2D, copyExternalImageToTexture, and importExternalTexture in each browser's current implementations. I don't know what to expect, but it would be nice if they're all consistent (with the possible exception of textureDimensions and textureLoad on external_texture).

3. The size to display the valid data (eg. 3000x2000 or maybe aspect ratio) - this could be anything?

I'm unsure whether it's possible for the visible rect size and natural size to be very different, or if they have to be equal in one dimension like they are in Shaobo's test.

@shaoboyan
Contributor

shaoboyan commented Apr 3, 2024

@greggman
Per my understanding, the aspect attribute is part of the media stream/container metadata, and the stream is expected to be presented on screen with all metadata applied (e.g. the matrix).

  3. The size to display the valid data (eg. 3000x2000 or maybe aspect ratio) - this could be anything?

Per my experiment, I think situation 3 only happens when the aspect attribute is defined; otherwise we get a frame at its visible dimensions. And I think that behaviour (applying the aspect scale/downscale) is expected.

@shaoboyan
Contributor

shaoboyan commented Apr 3, 2024

@kainino0x @greggman @kdashg
I did experiments to "render" video frames in multiple ways. Here are the results on Chrome Canary (version 125.0.6396.0 (Official Build) canary (64-bit)).

I reused an existing uploading sample with some changes (e.g. removed the video size assignment), and I added canvas2d context rendering logic like this:

  const canvas = container.appendChild(document.createElement('canvas'));
  canvas.width = videos[0].videoWidth;
  canvas.height = videos[0].videoHeight;

  const context = canvas.getContext('2d');
  context.drawImage(videos[0], 0, 0);

Canvases are all sized to video.videoWidth × video.videoHeight. The HTMLVideoElement itself is not assigned any size.

I still use the cts resource. By executing the commands:

// Generate temp.mp4, the base for four-colors-h264-bt601-rotate-90.mp4; mimeType: 'video/mp4; codecs=avc1.4d400c'
ffmpeg.exe -loop 1 -i .\four-colors.png -c:v libx264 -pix_fmt yuv420p -frames 50 -colorspace smpte170m -color_primaries smpte170m -color_trc smpte170m -color_range tv -vf transpose=2 temp.mp4

I got a video frame with info (by chrome://media-internals):

"FFmpegDemuxer: created video stream, config codec: h264, profile: h264 high, level: not available, alpha_mode: is_opaque, coded size: [240,320], visible rect: [0,0,240,320], natural size: [240,320], has extra data: true, encryption scheme: Unencrypted, rotation: 0°, flipped: 0, color space: {primaries:SMPTE170M, transfer:SMPTE170M, matrix:SMPTE170M, range:LIMITED}"

According to the video info, coded size: [240,320], visible rect: [0,0,240,320], natural size: [240,320].

Querying the video element's size gives video.videoWidth = 240; video.videoHeight = 320.

Using multiple ways to 'draw' this frame:

WebGPU: ImportExternalTexture:
Screenshot 2024-04-03 161115

WebGPU: CopyEI2T:
Screenshot 2024-04-03 161128

WebGL2: TexImage2D:
Screenshot 2024-04-03 161138

WebGL2: TexStorage2D:
Screenshot 2024-04-03 161148

WebGL: TexImage2D:
Screenshot 2024-04-03 161158

Canvas2D: DrawImage:
Screenshot 2024-04-03 161211

VideoElement: Playback:
Screenshot 2024-04-03 161221

@shaoboyan
Contributor

shaoboyan commented Apr 3, 2024

(Cont'd)
Executing ffmpeg commands to get a 90-degree rotated video frame based on the above video frame:

ffmpeg.exe -display_rotation 270 -i temp.mp4 -c copy four-colors-h264-bt601-rotate-90.mp4

I got a video frame with info (by chrome://media-internals):

"FFmpegDemuxer: created video stream, config codec: h264, profile: h264 high, level: not available, alpha_mode: is_opaque, coded size: [240,320], visible rect: [0,0,240,320], natural size: [240,320], has extra data: true, encryption scheme: Unencrypted, rotation: 90°, flipped: 0, color space: {primaries:SMPTE170M, transfer:SMPTE170M, matrix:SMPTE170M, range:LIMITED}"

According to the video info, coded size: [240,320], visible rect: [0,0,240,320], natural size: [240,320].

Querying the video element's size gives video.videoWidth = 320; video.videoHeight = 240.
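A hedged sketch of the pattern observed here (modeling only these Chrome logs, not any spec): the demuxer reports natural size in coded orientation, while videoWidth/videoHeight reflect the rotation metadata, swapping dimensions for 90°/270°:

```javascript
// Sketch modeling the observed Chrome behavior: the demuxer's natural size is
// pre-rotation, and videoWidth/videoHeight swap dimensions for 90° and 270°.
function intrinsicSize(naturalWidth, naturalHeight, rotationDegrees) {
  const swapped = rotationDegrees % 180 !== 0;
  return swapped
    ? { width: naturalHeight, height: naturalWidth }
    : { width: naturalWidth, height: naturalHeight };
}

// Natural size [240,320] with rotation 90° yields videoWidth=320,
// videoHeight=240, matching the media-internals log above.
console.log(intrinsicSize(240, 320, 90)); // { width: 320, height: 240 }
```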

Using multiple ways to 'draw' this frame:
WebGPU: ImportExternalTexture:
webgpu-import

WebGPU: CopyEI2T:
webgpu-copyEI2T

WebGL2: TexImage2D:
webgl2-texImage2D

WebGL2: TexStorage2D:
webgl2-texStorage2D

WebGL: TexImage2D:
webgl-texImage2D

Canvas2D: DrawImage:

canvas2d-drawImage

VideoElement: Playback:
videoElement

@shaoboyan
Contributor

shaoboyan commented Apr 3, 2024

(Cont'd)
Executing ffmpeg commands to get a 90-degree rotated video frame combined with -aspect 2:1, based on the original video frame:

ffmpeg.exe -display_rotation 270 -i temp.mp4 -aspect 2:1 -c copy four-colors-h264-bt601-rotate-90-2-1.mp4

NOTE: FFmpeg has a warning: Overriding aspect ratio with stream copy may produce invalid files

I got a video frame with info (by chrome://media-internals):

"FFmpegDemuxer: created video stream, config codec: h264, profile: h264 high, level: not available, alpha_mode: is_opaque, coded size: [240,320], visible rect: [0,0,240,320], natural size: [640,320], has extra data: true, encryption scheme: Unencrypted, rotation: 90°, flipped: 0, color space: {primaries:SMPTE170M, transfer:SMPTE170M, matrix:SMPTE170M, range:LIMITED}"

According to the video info, coded size: [240,320], visible rect: [0,0,240,320], natural size: [640,320].

Querying the video element's size gives video.videoWidth = 320; video.videoHeight = 640.

Using multiple ways to 'draw' this frame:
WebGPU: ImportExternalTexture:
webgpu-import-2-1

WebGPU: CopyEI2T:
webgpu-copyEI2T-2-1

WebGL2: TexImage2D:
webgl2-texImage2D-2-1

WebGL2: TexStorage2D:
Failed, with console log: WebGL: INVALID_OPERATION: texSubImage2D: source sub-rectangle specified via pixel unpack parameters is invalid

WebGL: TexImage2D:
webgl-texImage2D-2-1

Canvas2D: DrawImage:
Canvas2D-DrawImage-2-1

VideoElement: Playback:
VideoElement-2-1

@shaoboyan
Contributor

shaoboyan commented Apr 3, 2024

(Cont'd)
Executing ffmpeg commands to get a 90-degree rotated video frame combined with -aspect 1:2, based on the original video frame:

ffmpeg.exe -display_rotation 270 -i temp.mp4  -aspect 1:2 -c copy four-colors-h264-bt601-rotate-90-1-2.mp4

NOTE: FFmpeg has a warning: Overriding aspect ratio with stream copy may produce invalid files

I got a video frame with info (by chrome://media-internals):

"FFmpegDemuxer: created video stream, config codec: h264, profile: h264 high, level: not available, alpha_mode: is_opaque, coded size: [240,320], visible rect: [0,0,240,320], natural size: [240,480], has extra data: true, encryption scheme: Unencrypted, rotation: 90°, flipped: 0, color space: {primaries:SMPTE170M, transfer:SMPTE170M, matrix:SMPTE170M, range:LIMITED}"

According to the video info, coded size: [240,320], visible rect: [0,0,240,320], natural size: [240,480].

Querying the video element's size gives video.videoWidth = 480; video.videoHeight = 240.

Using multiple ways to 'draw' this frame:
WebGPU: ImportExternalTexture:
WebGPU-import-1-2

WebGPU: CopyEI2T:
WebGPU-copyEI2T-1-2

WebGL2: TexImage2D:
WebGL2-TexImage2D-1-2

WebGL2: TexStorage2D:
Failed, with console log: WebGL: INVALID_OPERATION: texSubImage2D: source sub-rectangle specified via pixel unpack parameters is invalid

WebGL: TexImage2D:
WebGL-TexImage2D-1-2

Canvas2D: DrawImage:
Canvas2D-DrawImage-1-2

VideoElement: Playback:
VideoElement-1-2

@kainino0x
Contributor Author

kainino0x commented Apr 3, 2024

This is fantastic, thank you! Could you put the test pages up somewhere so they can be easily run in other browsers or in local builds? (like github pages, or send it to me and I can put it on github pages)

This is the most interesting outcome:

WebGL2: TexStorage2D:
Failed, with console log : WebGL: INVALID_OPERATION: texSubImage2D: source sub-rectangle specified via pixel unpack parameters is invalid

This sounds like the texSubImage2D call is treating the video as having a resolution that doesn't match the texture you've allocated with texStorage2D (which is the videoWidth/videoHeight, the intrinsic size). But also it might be a Chromium bug with the unpack parameters? Because video_webgl.js doesn't seem to use unpack parameters at all.

@shaoboyan
Contributor

Could you put the test pages up somewhere so they can be easily run in other browsers or in local builds? (like github pages, or send it to me and I can put it on github pages)

Sure, I'll send the sample after some code cleanup.

@shaoboyan
Contributor

@kainino0x Please refer to https://shaoboyan.github.io/webgpu-samples/?sample=webgl-texImage2D under the category “Video uploading through various APIs”, at the bottom of the page.

@kdashg
Contributor

kdashg commented Apr 8, 2024

Different frames of the video can technically have different sizes iirc.
Don't base future work off of what webgl specifies, because I certainly don't have confidence in it any further than "it seems to work for enough people that it's useful".
I believe it was a mistake to try to specify the resulting texture size requirements based on anything short of the actual decoded frame.
This class of uncertainties is why I pushed for importExternalTexture, because we can sidestep almost all this uncertainty.

@shaoboyan
Contributor

(Please correct me if I'm wrong.)
@kdashg @greggman @kainino0x
One thing I'd like to mention:
For HTMLVideoElement, developers have no way to get the coded size (a WebCodecs VideoFrame, however, does expose it). The only size-related attributes on HTMLVideoElement are videoWidth and videoHeight, and these are the so-called display size.

(Based on the above fact,) if developers want to create a texture and copy a video frame from an HTMLVideoElement into it, they can only use videoWidth and videoHeight.

@Kangz
Contributor

Kangz commented Apr 9, 2024

GPU Web WG 2024-03-27
* KG: coded size likely to be larger than display size. Likely to be rotated w.r.t. display size. Might have unusual scaling, etc.
* At WebGPU's level - with developer guarantees we're giving to users (video frames are simpler than reality) - think display size is correct. Was enough to make me nervous. Two things - if you're copying from this image to something, it's strange - it's more than a copy, more like a resolve operation. Gives you something different from the input as the output. And for external textures in WGSL, you can sample them with uber-sampler but also textureLoad them - saying you textureLoad but don't operate on the coded size makes me nervous. Not sure why we added textureLoad from external textures…
* GT: feels strange to not just get the data, but rather some interpolation of the data. And if you want to get back to the original data you can't because it's been manipulated. The data inside a video isn't necessarily color data. Like Kinect 3D data. If all info was there - here's what is needed to display it correctly - would be easier.
* KN: would be more advanced usage of WebCodecs' VideoFrame. VideoFrame can give you a lot of raw stuff about the frame. We can iterate on giving more metadata about the video frames. Thought it was more useful to get zero-copy with magic, then putting it off until we could do it without magic.
* KG: just as one example, coded size for 1080p video is often 1088 pixels tall. (rounded to 16x16 blocks)
* GT: is that stretched, or cropped?
* KG / KN: cropped.
* KG: there's also aspect ratio - e.g. something like a square video but it's stretched.
* GT: bug in WebGL - a 320x240 video showing as 256x256 in Chrome. Metadata we were treating differently.
* KG: it's challenging. Still, think displaySize is the right thing to use.

@kainino0x kainino0x added the api WebGPU API label Apr 30, 2024