CSS Rich Browsing Proposal #175

Open
HongZheng opened this issue May 23, 2023 · 18 comments

@HongZheng

HongZheng commented May 23, 2023

Hi everyone,
We are from Intel and want to introduce a CSS effects/animations rich workload into Speedometer3. The design doc is at https://docs.google.com/document/d/19vK5G11Kc4xbvhpkkXDf5WXdFQWyiK_a0WRwTUcc9j4/
I copied the contents of the document here in case you don't have access to it.

Objective

The objective of this proposal is to introduce a CSS effects/animations rich test case that can help browsers measure and improve CSS performance.

Motivation

CSS is an essential component of modern web development that helps developers build a large number of appealing and engaging websites. As CSS enables richer webpages, CSS performance becomes an important factor in user experience on the web. Web benchmarks typically measure CSS/DOM operations and JS tasks together, which makes it hard to assess the performance impact of CSS alone. Therefore, we propose adding a CSS-heavy test case to Speedometer3 to help browsers measure and improve CSS performance.

Description

By learning from some real-life scenarios, such as image switching and table updating in https://top10.netflix.com/tv and https://www.imdb.com/, this proposal simulates a food menu with 5 kinds of food. Each food category contains 100 choices, and the first one is recommended. The workload uses JS to click the food pictures at the top of the page, automatically switching through the 5 kinds of food one by one. In the real world, web developers generally use CSS effects/animations to make web pages more appealing and engaging to end users. The proposal also exercises many CSS property operations (chosen with reference to the statistics from https://chromestatus.com/), such as transform animations, opacity/color settings, and position/size adjustments. The web page of this proposal looks like the image below.
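For readers who want a concrete picture of the data behind the menu, here is a minimal sketch of how 5 categories of 100 choices each, with the first choice flagged as recommended, might be modeled. The function name `buildMenu` and the `id`/`name`/`recommended` fields are hypothetical and are not taken from the workload's code.

```javascript
// Sketch only: builds placeholder menu data shaped like the description above.
// All names here (buildMenu, id, name, recommended) are illustrative assumptions.
function buildMenu(categoryCount = 5, choicesPerCategory = 100) {
  return Array.from({ length: categoryCount }, (_, c) => ({
    id: `category-${c + 1}`,
    choices: Array.from({ length: choicesPerCategory }, (_, i) => ({
      name: `category-${c + 1}-item-${i + 1}`,
      // The first choice in each category is the recommended one.
      recommended: i === 0,
    })),
  }));
}
```

Calling `buildMenu()` with no arguments yields the 5 x 100 layout; smaller values are handy when experimenting with how table size affects rendering cost.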

 

Measurement methodology

For performance measurement, the proposal utilizes performance.mark API to mark CSS animation frame start in requestAnimationFrame callback, and mark frame completion in the callback of afterframe API (https://www.npmjs.com/package/afterframe). The image below shows how to measure frame duration in Chrome trace. The final time reported is the average duration to render a frame in milliseconds.
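As a rough illustration of the approach described above (not the workload's actual code), a single frame could be timed as in the sketch below. `measureFrame`, `applyStyles`, `onDone`, `averageFrameTime`, and the `env` parameter are hypothetical names; `env.afterFrame` is assumed to have the same shape as the default export of the afterframe package, i.e. `afterFrame(callback)`.

```javascript
// Sketch only: times one frame from the start of a requestAnimationFrame
// callback until the afterFrame callback fires.
// `env` is a hypothetical injection point so the function can be exercised
// outside a browser. In a page it could be something like
//   { requestAnimationFrame: (cb) => requestAnimationFrame(cb), afterFrame, performance }
// where afterFrame comes from the afterframe package (assumed: afterFrame(callback)).
function measureFrame(applyStyles, onDone, env) {
  env.requestAnimationFrame(() => {
    env.performance.mark("frame-start");
    applyStyles(); // e.g. toggle a class that starts a CSS transition
    env.afterFrame(() => {
      env.performance.mark("frame-end");
      // performance.measure returns a PerformanceMeasure in current browsers.
      const measure = env.performance.measure("frame", "frame-start", "frame-end");
      onDone(measure.duration);
    });
  });
}

// Average of the collected frame durations, in milliseconds.
function averageFrameTime(durations) {
  return durations.reduce((sum, d) => sum + d, 0) / durations.length;
}
```

Passing the scheduling functions in through `env` is only a convenience for this sketch; it keeps the timing logic separate from the browser globals so the same function can be driven by fake timers.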


Code can be reviewed at https://github.com/intel-staging/Speedometer/tree/rich_css/resources/tentative/rich-css

Note: The workload can be run by launching index.html from the rich-css folder. It is not integrated into the benchmark runner yet. 

A live demo is available at https://hongzheng.github.io/


Results

We ran the workload for 10 rounds in Chrome, Firefox, and Safari on an M2 machine, took the median of the per-round average frame times as the final result, and measured the variation using the coefficient of variation (standard deviation divided by the mean).
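The aggregation described above can be sketched as follows. This is a minimal illustration, not the script used to produce the numbers below: the function names are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the post does not say which variant was used.

```javascript
// Median of a list of numbers (mean of the two middle values for even length).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Coefficient of variation: standard deviation divided by the mean.
// Assumption: sample standard deviation (n - 1 in the denominator).
function coefficientOfVariation(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (values.length - 1);
  return Math.sqrt(variance) / mean;
}

// Takes one average frame time (ms) per round and summarizes the run.
function summarizeRounds(avgFrameTimesMs) {
  return {
    medianMs: median(avgFrameTimesMs),
    cv: coefficientOfVariation(avgFrameTimesMs),
  };
}
```

With 10 rounds per browser, `summarizeRounds` would receive a 10-element array and return the two values reported per row in the results table.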


| M2 (macOS Ventura 13.3.1) | Time (ms) | CV (coefficient of variation) |
| --- | --- | --- |
| Chrome (112.0.5615.137) | 11.7845 | 2.25% |
| Firefox (112.0.2) | 6.709 | 2.05% |
| Safari (16.4) | 21.072 | 7.63% |

@camillobruni
Contributor

  • Are you measuring Score or Time? (Your table header might be a bit confusing :))
  • I tend towards excluding CSS animation from speedometer, as we might introduce frame-rate-based measurements here
  • Other CSS and compositing properties do seem fine with me

@MayuraRam

  • Yes, Time and not Score (will edit)
  • This methodology aims to avoid a frame-rate based measurement while reflecting the performance of heavy CSS operations seen in common real world use cases. It could be a good addition to Speedometer 3, which aims to reflect real-world user experiences. It stays away from Pure CSS animation, which would be hard to measure in a real world scenario.

@bgrins
Contributor

bgrins commented May 24, 2023

the proposal utilizes performance.mark API to mark CSS animation frame start in requestAnimationFrame callback, and mark frame completion in the callback of afterframe API (https://www.npmjs.com/package/afterframe).

I'm not familiar with that library, but it looks like it's essentially stopping the timer after the next rAF is fired following the rAF callback which starts the timer?

This isn't currently possible within the Speedometer framework, since we don't have the ability to perform async steps (and most likely won't within the version 3 timeframe, though we are definitely interested in developing this ability so we can test things like Workers). So if this is a requirement for the test we should look at this test as a potential addition for a future version. Though I wonder if it's possible to build a test with this content using a similar pattern to the NewsSite in #167 - for example having content get appended or toggling classes etc in a sync step.
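To make the sync-step idea concrete, here is a minimal sketch of a step that toggles a class on a set of rows and forces layout inside the timed region. It is not Speedometer's runner API: `timeSyncStep`, its parameters, and the class name in the usage note are hypothetical, and reading `offsetHeight` is used only as one well-known way to force a synchronous style and layout pass.

```javascript
// Sketch only: times a synchronous class toggle plus the layout it triggers.
// `rows` is any array of elements; `now` is injectable so the function can be
// exercised without a browser clock.
function timeSyncStep(rows, className, now = () => performance.now()) {
  const start = now();
  for (const row of rows) {
    row.classList.toggle(className);
  }
  if (rows.length > 0) {
    // Reading a layout-dependent property forces style recalculation and
    // layout to happen here, inside the timed region, instead of later.
    void rows[0].offsetHeight;
  }
  return now() - start;
}
```

In a page this might be called as `timeSyncStep([...document.querySelectorAll("tr")], "highlighted")`. Painting and compositing still happen after the function returns, so this only captures the synchronous part of the work.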

@rniwa
Member

rniwa commented May 26, 2023

Yeah, we don't support this kind of async workload at the moment. Deferring this to v4 seems like the right course of action here.

@rniwa rniwa added the v4 label May 26, 2023
@HongZheng
Author

Thanks all for your great comments!

Yes, measuring the whole set of CSS animation frames depends on async steps in the SP driver, which apparently won't be ready until SP4. We are updating the workload to adopt a measurement similar to the NewsSite case mentioned by Brian. This works well because the initial frame is still measured, particularly with the rAF-based async measurement (#173).

The workload thus reflects the web runtime's responsiveness to user actions in CSS-heavy, animation-rich real-world scenarios such as Netflix and Amazon. The animations we use are based on statistics from chromestatus, which indicate that animations are used by more than 40% of webpages.

Again, thanks for the comments. We'll have a new update soon and welcome your further insights on it!

@HongZheng
Author

HongZheng commented May 29, 2023

We have integrated the CSS rich workload into SP3. You can review the code at https://github.com/intel-staging/Speedometer/tree/rich_css/resources/tentative/rich-css
A live demo is available at https://hongzheng.github.io/sp3-rich_css/?suite=Rich-CSS#home

@MayuraRam

MayuraRam commented Jun 7, 2023

From the discussion during the meeting on 06/07/2023: please review the workload above. Could we open a PR and add the test to tentative, labeled accordingly? Currently it's labeled as v4.

@MayuraRam

Discussion and comments from the slack Channel

smfr
5 days ago
this seems more like a painting benchmark, with things like CSS filters

smfr
5 days ago
is it intended to stress CSS parsing etc, or painting, or both?

bgrins
1 day ago
I do like the idea in general of a rich data table / image carousel etc that rely on complex CSS as being part of Speedometer. But in practice I'd prefer the test to be more heavily inspired by real world content & patterns so we can be confident that optimizations to it will drive increased performance for content on the Web. A couple things for example in the test that I don't think I've seen on pages are having a table where the focused row is sharp and the others have a blur effect, and big multistop CSS gradient background images on the body element.
One thing that I think would help me understand how to review this is to reposition this test from "Rich CSS" (the tech that's being tested) to something more descriptive of the experience that's being modeled. Is it meant to be an "Interacting with a dashboard to find popular content" (a la Netflix / IMDB), "Searching for a restaurant" (a la Yelp or a Maps app), or something else? It may seem silly, but since there's so much open space on implementation details I've found doing this helps set some constraints around what content to study and model. See also my proposal around workload definition
https://docs.google.com/document/d/1BCAlKWqILFtoqH6wLRQc1RtFukY60nuEotjvEQZANgg/edit#heading=h.i542mtoho4us - and I know we've done a bit of documentation on these types of tests in https://docs.google.com/document/d/155PztxZ-I-Epk_Fm_l7FmCerhKFwyoGnYpOhMsdVdY8/edit#heading=h.2uml6sq22frj but I think it could be a bit more clear for this case.

bgrins
1 day ago
FWIW I can see the inspiration from https://top10.netflix.com/tv and think something of this shape could be a good test. I haven't studied this closely, but clicking around there the most interesting work I see is driven from the UI change from i.e. TV to Films and not a linkage between the table and the carousel. There are even more permutations with "type" and "country" dropdowns on https://www.netflix.com/tudum/top10/united-states - when changing these params it drives a change to the images in the carousel, the contents of the table, and a complex "card" for each show beneath the table. Not sure if it's a similar story for IMDB which is also referenced in the issue

@MayuraRam

[From Hong Sheng] Thanks for the great and valuable comments! Yes, it would be better to rename it to something like “Interacting with Featured Page to Navigate or Search Items” as suggested. We’re also considering removing the CSS gradient and blur from the implementation and replacing them with something more typical of the real world. The proposed case stresses CSS processing, DOM and a little bit of painting.

Regarding real-world relevance, we were inspired by websites that pair an image-based carousel/slideshow with a list/table of contents as the UI pattern for organizing and presenting their information. Some of them are listed below.

| Real World Scenario | URL | Typical UI Elements | Typical Interactions | Typical Web tech exercised |
| --- | --- | --- | --- | --- |
| Netflix Top 10 TV | https://top10.netflix.com/tv | Carousel & Table of Popular contents | Change selection in Dropdown triggers the update of carousel images and contents in table | CSS transform, opacity/size settings |
| IMDB Index | https://www.imdb.com/ | Carousel & List of contents in DIV blocks | Click the arrow button in carousel to update the list of the contents | CSS transform, opacity/color settings |
| Facebook photo | https://www.facebook.com/photo/?fbid=742932955788401&set=a.636245978533785 | Slideshow & List of contents in DIV blocks | Click the arrow button in carousel to update the list of the contents | CSS opacity/color/padding settings |
| Walmart Complete the look | https://www.walmart.com/ip/The-Beatles-Men-s-Abbey-Road-Graphic-T-Shirt-with-Short-Sleeves/1786976753?athbdg=L1600 | Carousel & List of contents in DIV blocks | Click the arrow button in carousel to update the list of the contents in “View details” window | CSS fade-out animation, opacity/visibility settings |

@rniwa
Member

rniwa commented Jun 16, 2023

Is there a PR to add this workload somewhere?

@HongZheng
Author

Is there a PR to add this workload somewhere?

I have submitted a PR #247, please review.

@camillobruni
Contributor

I like the CSS heavy part of this workload, but I still have some doubts:

  • A lot of time is spent in just building up a table (~ 25% of the time), maybe that's something to optimise
  • Having animations in this workload seems a bit counterintuitive. In order to boost the score, the most straightforward (and wrong) thing to do would be to not do any animation (in the extreme case) to free up resources for the main-thread JS / DOM part.
  • The CSS is rather small for a typical website (maybe you had plans to extend this?)

Maybe we can keep the CSS rules with animations in there, but don't trigger them.

@HongZheng
Author

As I said in PR #247, the generated table contains some heavy CSS effects and animations, which reflects the purpose of the workload. However, the current solution may introduce significant HTML parsing overhead, which we will optimise later.
Since async steps won’t be ready until SP4, the workload currently measures only the initial frame after the table is generated, so animations don't happen in the scoring region. The result is something like "keep the CSS rules with animations in there, but don't trigger them".

@rniwa
Member

rniwa commented Jun 23, 2023

I'm at a bit of a loss as to what state this proposal is in. What is the proposed PR for Speedometer 3? Is it #247?

If so, that workload seems to induce a bunch of stray async tasks that run outside of the measured time window both on Chrome & Safari so we should fix that.

More generally, while I appreciate & value your contributions, this workload seems to be more about measuring page load speed than web app responsiveness. Since the goal of Speedometer is measuring web app responsiveness, not page load speed, this test might be out of scope by concept. If we were to include this test in Speedometer 3, we would need to put more focus on app responsiveness after the page load has completed.

@camillobruni
Contributor

I share rniwa's sentiment here. Maybe let's not rush this and put investigation / development on pause for this workload (and maybe think about it again for version 4) until we have stabilised everything else.

@MayuraRam

Thanks for your comments. rniwa - regarding your comment "measuring the page load speed than web app responsiveness" - The measured part of the workload occurs after the page is loaded. So it does measure page responsiveness, by measuring the time it takes to create the initial frame and not the page load speed.

@rniwa
Member

rniwa commented Jun 28, 2023

Thanks for your comments. rniwa - regarding your comment "measuring the page load speed than web app responsiveness" - The measured part of the workload occurs after the page is loaded. So it does measure page responsiveness, by measuring the time it takes to create the initial frame and not the page load speed.

Measuring the initial frame still sounds like a page load test to me. I think we should not measure the initial frame / page load at all for this workload to stay focused.
