
With privacy.resistFingerprinting set to true in Firefox, Speedometer shows the result Infinity #399

Open
julienw opened this issue Mar 13, 2024 · 9 comments · May be fixed by #406

julienw (Contributor) commented Mar 13, 2024

This was reported by some people on social networks. We found that setting privacy.resistFingerprinting to true was triggering it, but it might happen in some other cases too.

With that preference enabled, some subtests take 0 ms on every run, which may mess up the results.
Any chance we can fix this?

bgrins (Contributor) commented Mar 13, 2024

The simplest fix here, IMO, would be to detect an Infinity score and show an error message instead.
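
For illustration, a minimal sketch of such a guardrail (the function and element names are placeholders, not the actual Speedometer code):

```js
// Hypothetical guardrail: refuse to display a non-finite or NaN score.
// `scoreElement` stands in for whatever element the results UI actually writes to.
function displayScore(score, scoreElement) {
    if (!Number.isFinite(score)) {
        scoreElement.textContent = "Error: invalid score (is the timer resolution too coarse?)";
        return;
    }
    scoreElement.textContent = score.toFixed(2);
}
```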

bgrins (Contributor) commented Mar 13, 2024

I could imagine a warmup step that looks for anomalies and shows a warning, like MotionMark does with refresh rate, but this case is likely rare in practice, so a guardrail on the score display may be sufficient.

julienw (Contributor, Author) commented Mar 13, 2024

But I don't understand why one single substep being at 0 would yield an Infinity result. I'm not great at statistics, but I smell something fishy here.

camillobruni self-assigned this Mar 13, 2024
julienw (Contributor, Author) commented Mar 13, 2024

(Update: I got it wrong, so I rewrote the comment.)

> But I don't understand why one single substep being at 0 would yield an Infinity result.

The problem isn't a single substep, but all substeps of one test being 0. I've seen it happen for TodoMVC-Preact-Complex-DOM, for example, but it could happen for svelte-complex and lit-complex too.

The geomean is based on a product, so a suite total of 0 poisons the geomean for this run.
Then, I think, we geomean the geomeans to get the result for all iterations.
Then the score is 1000 / geomean, so we get Infinity.
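
For illustration, a sketch of that chain with made-up suite totals (not the actual Speedometer code):

```js
// The geometric mean is the nth root of the product, so a single 0 collapses it to 0.
const geomean = (values) => Math.pow(values.reduce((product, v) => product * v, 1), 1 / values.length);

const suiteTotals = [120.5, 98.3, 0 /* every step reported 0 ms */, 143.2];
const runGeomean = geomean(suiteTotals); // 0
const score = 1000 / runGeomean;         // Infinity
```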

The Wikipedia page for the geometric mean says:

> The geometric mean can be an unreliable measure of central tendency for a dataset where one or more values are extremely close to zero in comparison to the other members of the dataset.

Therefore I wonder whether the use of the geomean is appropriate.
Geomeaning the scores between runs, or the tests between runs, makes sense. But geomeaning the various tests together, I'm not so sure. I guess this ship has sailed for v3 though.

A quick fix might be to replace 0 by 1 when adding values in

```js
const suiteTotal = this._measuredValues.tests[suiteName].total;
product *= suiteTotal;
```

(possibly also in

```js
add(value) {
    if (typeof value !== "number")
        throw new Error(`Adding invalid value=${value} to metric=${this.name}`);
    this.values.push(value);
}
```

but after a quick look I don't think the geomean calculated here is actually used).
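
For illustration, that clamp could look roughly like this (a sketch on top of the first snippet above, not a tested patch):

```js
// Sketch: substitute 1 for a zero suite total so it cannot zero out the geomean product.
const suiteTotal = this._measuredValues.tests[suiteName].total;
product *= suiteTotal === 0 ? 1 : suiteTotal;
```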

What do you think, @camillobruni?

julienw (Contributor, Author) commented Mar 13, 2024

Note: I also noticed errors with NaN values in the SVG rendering of the Details view; I believe they come from the same issue.

julienw (Contributor, Author) commented Mar 13, 2024

I find our score computation very difficult to follow: we do the same thing in different ways in different files (benchmark-runner.mjs vs metric.mjs vs Statistics.mjs), with no clear comment about the formula we use to calculate the score. It's hard to tell where the geomean that we use as the basis for the score is actually computed.

camillobruni (Contributor) commented
You are absolutely right :) I kind of kept some backwards compatibility with v2.1, but ultimately we should all migrate to metrics.mjs (Statistics.mjs is from the original v2.1 version, kept so as not to alter any existing scoring). So:

- Geomean over all the different tests makes sense, to not favour the longer-running ones disproportionally (we could spend some effort to make each test run the same length in the future?).
- For everything else, the arithmeticMean makes sense IMO.

Given that this is something that should never happen, we could also just fail and not display anything if the sum of all subtests equals 0.
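
For illustration, a sketch of such a fail-fast check (the variable names and data shape are assumptions, not the actual code path):

```js
// Hypothetical validation before scoring: a suite whose subtest totals sum to 0
// points at a clamped timer, so refuse to produce a score for it.
const subtestTotals = Object.values(measuredTests).map((test) => test.total);
const suiteSum = subtestTotals.reduce((sum, total) => sum + total, 0);
if (suiteSum === 0)
    throw new Error(`Suite "${suiteName}" reported 0 ms in total; refusing to compute a score.`);
```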

julienw (Contributor, Author) commented Mar 13, 2024

Yeah indeed, 0 is an absurd number in this case. Is it practical to ignore the test instead of failing outright?

rniwa (Member) commented Mar 13, 2024

The point of using the geometric mean across suites is to give each suite the same weight. If we used the arithmetic mean, we would give more weight to the long-running / slow suites.
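
For illustration, a quick numeric comparison with made-up suite totals:

```js
const arithmeticMean = (values) => values.reduce((sum, x) => sum + x, 0) / values.length;
const geomean = (values) => Math.pow(values.reduce((product, x) => product * x, 1), 1 / values.length);

// Baseline: one fast suite (50 ms) and one slow suite (500 ms).
// Make the fast suite 2x faster:
arithmeticMean([25, 500]); // 275 -> 262.5 (~5% change)
geomean([25, 500]);        // ~158 -> ~112 (~29% change)
// Make the slow suite 2x faster instead:
arithmeticMean([50, 250]); // 275 -> 150 (~45% change)
geomean([50, 250]);        // ~158 -> ~112 (the same ~29% change)
```

With the arithmetic mean, the same 2x improvement moves the result by ~5% or ~45% depending on which suite it lands in; with the geomean it moves it by ~29% either way.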

julienw added the v3.1 label Mar 20, 2024
camillobruni linked a pull request Apr 8, 2024 that will close this issue