Propose adding to TESTING pass without crash or performance regression on WebGL and WebGPU conformance tests #41

Open
brierjon opened this issue Apr 10, 2023 · 6 comments


@brierjon

I would like to propose running the WebGL test suites, and evaluating the WebGPU test suite as well once Firefox officially supports WebGPU (currently in draft).

  1. WebGL 1 latest test suite currently: https://registry.khronos.org/webgl/sdk/tests/webgl-conformance-tests.html?version=1.0.4
  2. WebGL 2 latest test suite currently: https://registry.khronos.org/webgl/sdk/tests/webgl-conformance-tests.html?version=2.0.1
  3. (evaluate adding) WebGPU via https://github.com/gpuweb/cts -> https://gpuweb.github.io/cts/standalone/

In the past two months I've discovered two crashes in Firefox just by running the web-based WebGL conformance test suite; both seem to be isolated to Linux. The WebGL conformance testing doesn't appear to be catching these issues upstream in the Firefox continuous integration (I'm not sure whether or where the tests are integrated).

Given the tighter integration of browsers and GPU drivers to allow access to GPU capabilities, adding these tests would help ensure the experience remains smooth. It may also be worth considering running the test procedures when a GPU driver or another Firefox dependency is updated.

Example of browser crash bugs discovered while running the WebGL test suite:

@mmstick
Member

mmstick commented Apr 10, 2023

Does Mozilla run these tests before making a Firefox release? I'm not sure if it would be useful to us, since some WebGPU crashes would still be better than having unpatched security vulnerabilities. These builds come straight from Mozilla, so any crashes should be reported to Mozilla.

@kdashg

kdashg commented Apr 11, 2023

Yes, we (Mozilla) run the WebGL CTS tests as part of CI, so by the time something hits release it has run the CTS many many times. We also just added the WebGPU CTS tests to CI, though we fail so many of them still that it's not very useful to talk about their results except as a ratchet for forward improvement.

We do omit the deqp tests since those historically have been high-cost-low-value. It looks like 1820914 was from that section, so that's unfortunate.

We don't run it via the test runner though, as we have our own harness and we chunk test sets to keep test job times reasonable. So we run the individual tests, but we might indeed not hit the issue you ran into with running the whole suite in one page. Both of the other bugs here (1826981 and the crash report) are for OOMs, so that's something we'll have to look into.
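The chunking described above can be sketched roughly as follows. This is only an illustration and not Mozilla's actual harness: the function name `chunk_tests`, the chunk count, and the test paths in the usage note are all hypothetical.

```python
from typing import List


def chunk_tests(tests: List[str], total_chunks: int) -> List[List[str]]:
    """Split test paths into total_chunks contiguous, near-equal chunks.

    Hypothetical helper: each chunk would run as its own CI job, so no
    single job has to run the whole suite in one page.
    """
    if total_chunks < 1:
        raise ValueError("total_chunks must be at least 1")
    # Every chunk gets `base` tests; the first `extra` chunks get one more.
    base, extra = divmod(len(tests), total_chunks)
    chunks = []
    start = 0
    for i in range(total_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(tests[start:start + size])
        start += size
    return chunks
```

For example, `chunk_tests(["a.html", "b.html", "c.html"], 2)` yields two job lists of two and one test. Splitting the suite this way keeps job times bounded, but a problem that only appears when the whole suite runs in one page could go unnoticed.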

@jacobgkau
Member

As mentioned in #42, the WebGL 2 test takes significant time (on lemp10 with Intel integrated graphics and 40GB of RAM), and would likely delay approval/release by at least a few hours for every future Firefox version if we decide to implement it as part of our QA procedures. WebGL 1 was faster, but had some failing tests. I'm currently running the WebGL 2 test on oryp8 (with discrete NVIDIA graphics but only 16GB of RAM) to see if it does better with discrete graphics.

If we were to implement the tests, we'd need to decide what to do if we see, e.g., more tests failing after an update than before, or a new crash happening in a release containing a security fix. Since we don't maintain Firefox's codebase, that would likely mean delaying releases until Mozilla fixes the regressions (and ideally reporting issues upstream to Mozilla when we come across them). But if Mozilla has these tests as part of their own CI, then that may not be very useful.
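A before/after comparison like the one described above could be sketched as below. This is a minimal illustration under stated assumptions: the result format (a mapping from test name to a status string such as "pass", "fail", or "crash") and the names `Comparison` and `compare_to_baseline` are hypothetical, not the conformance runner's actual output format.

```python
from typing import Dict, List, NamedTuple


class Comparison(NamedTuple):
    new_failures: List[str]  # passed in the baseline, not passing now
    fixed: List[str]         # not passing in the baseline, passing now
    missing: List[str]       # in the baseline, absent from the current run


def compare_to_baseline(baseline: Dict[str, str],
                        current: Dict[str, str]) -> Comparison:
    """Compare two runs, each mapping test name -> status string.

    Assumption: any status other than "pass" counts as not passing.
    Tests that appear only in the current run are ignored.
    """
    new_failures, fixed, missing = [], [], []
    for name in sorted(baseline):
        if name not in current:
            # e.g. the tab crashed before the run reached this test
            missing.append(name)
        elif baseline[name] == "pass" and current[name] != "pass":
            new_failures.append(name)
        elif baseline[name] != "pass" and current[name] == "pass":
            fixed.append(name)
    return Comparison(new_failures, fixed, missing)
```

A non-empty `new_failures` or `missing` list would be the signal to investigate or report upstream; whether it should also hold back a release is the policy question raised above.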

@jacobgkau
Member

The tab crash occurred much more quickly on oryp8, on the currently released version of Firefox. So the crash I saw while testing #42 was not a regression, and we wouldn't be able to get a full pass/fail count as long as that's happening. (I suppose I could watch or record it to try to see exactly which test is causing the crash; then we could uncheck that test and run the suite again.)

@brierjon
Author

Regarding why CTS and/or other tests might be worth considering: per a discussion in the Mozilla Firefox testing channel on Matrix, "we only run it on one combo of hardware though, and rely on external reports for broader testing." Judging by an example CTS run, the tested software version appears to be Ubuntu 18.04.6 LTS (Bionic Beaver).

I wouldn't have CTS test failures block new releases, as security fixes are more important, but I would consider them added quality checks for the GPU-accelerated browser on System76 hardware and Pop!_OS software combinations, with baselines as suggested by jacobgkau. Failures may also prompt checking whether newer drivers or patches exist or are needed to fix the issue(s) (mainly looking at Mesa).

An alternative that wouldn't delay new releases, and that should decrease the likelihood of issues landing in the stable release, would be to run periodic (weekly?) automated QA tests on earlier builds (Firefox Nightly or Firefox Beta) and report issues. It would have the same goal: helping verify that Firefox runs smoothly on System76 hardware/software/driver combinations before users experience problems.

Also of note: https://support.mozilla.org/en-US/kb/upgrade-graphics-drivers-use-hardware-acceleration

@brierjon
Author

Linking to an issue found while testing the upstream Firefox Nightly & a subgroup of the WebGPU CTS - pop-os/mesa#19
