Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LCP and CLS report callback are significantly delayed #405

Closed
jashsayani opened this issue Dec 4, 2023 · 5 comments
Closed

LCP and CLS report callback are significantly delayed #405

jashsayani opened this issue Dec 4, 2023 · 5 comments

Comments

@jashsayani
Copy link

According to the README:

Also, in some cases a metric callback may never be called:

  • FID and INP are not reported if the user never interacts with the page.
  • CLS, FCP, FID, and LCP are not reported if the page was loaded in the background.

Currently, I see FCP and TTFB being reported immediately when page is loaded, but LCP and CLS are significantly delayed (page is not loaded in background and callback is fired, but its too late). FYI: I am not using reportAllChanges.

My network timeline:

Screenshot 2023-12-01 at 5 31 09 PM

FCP and TTFB happen at 6.8 secs which is reasonable:

Screenshot 2023-12-01 at 5 31 32 PM

LCP and CLS are at 43 seconds!! Users don't stay on a page that long!

Screenshot 2023-12-01 at 5 31 40 PM

Is this a bug? If I setup a PerformanceObserver with type: "layout-shift", the callback happens instantly. Looking at the code, I see it tracks "session" in the 5 sec window or creates a new session, etc. But that should not take 40 secs to report. LCP is also delayed and it just checks visibilityWatcher.firstHiddenTime which should be Infinity as the window is always visible in this case.

@tunetheweb
Copy link
Member

This is working as intended.

Both LCP and CLS (and also INP btw) can change throughout the life of the page (1). For example a larger element could still be inserted into the page at any time and become the LCP element and time. Therefore it is not possible to know when the value we have is the "final" value of this metric and since you are not using reportAllChanges we should only report the final value (2). When it does report it will report the largest value up until then - even if it occured much earlier. For LCP the timestamp will have been much earlier. It just delays reporting that time until it's sure that is the final LCP (and similarly for CLS and INP).

This differs to TTFB (which can be finalised and reported as soon as the first bytes are received as it will not change after that), FCP (similarly as soon as first contentful paint happens) and FID (when first interaction happens).

(1) Note for LCP it can be finalised on interaction:

The browser will stop reporting new entries as soon as the user interacts with the page (via a tap, scroll, or keypress), as user interaction often changes what's visible to the user (which is especially true with scrolling).

In these case it is reported earlier:

web-vitals/src/onLCP.ts

Lines 99 to 107 in 16e6e93

// Stop listening after input. Note: while scrolling is an input that
// stops LCP observation, it's unreliable since it can be programmatically
// generated. See: https://github.com/GoogleChrome/web-vitals/issues/75
['keydown', 'click'].forEach((type) => {
// Wrap in a setTimeout so the callback is run in a separate task
// to avoid extending the keyboard/click handler to reduce INP impact
// https://github.com/GoogleChrome/web-vitals/issues/383
addEventListener(type, () => setTimeout(stopListening, 0), true);
});

(2) Other than the above case for LCP when an interaction happens, LCP, CLS, and INP should really only be reported when the user navigates away from the page. The web-vitals.js library does do this, but also reports on other other case, which is when the page visibility changes (i.e. the tab is backgrounded). We do this because there this is the last reliable time to report for a page in certain circumstances (e.g. if the page is never foregrounded again), though this can also lead to duplicate reporting (e.g. if the CLS is reported on backgrounding, the page is foregrounded again, has some more CLS for example, and then you leave, then you will get two CLS reports), but it's the best we can do (FYI, the Chrome team are working on a new API for circumstances like this to only report this once!).

You can see LCP stops listening with the onHidden function (which includes both backgrounding or navigating away, and for LCP also reports the current value):

onHidden(stopListening);

CLS reports (but then continues listening):

web-vitals/src/onCLS.ts

Lines 113 to 116 in 16e6e93

onHidden(() => {
handleEntries(po.takeRecords() as CLSMetric['entries']);
report(true);
});

and INP does similar:

web-vitals/src/onINP.ts

Lines 227 to 238 in 16e6e93

onHidden(() => {
handleEntries(po.takeRecords() as INPMetric['entries']);
// If the interaction count shows that there were interactions but
// none were captured by the PerformanceObserver, report a latency of 0.
if (metric.value < 0 && getInteractionCountForNavigation() > 0) {
metric.value = 0;
metric.entries = [];
}
report(true);
});

LCP is also delayed and it just checks visibilityWatcher.firstHiddenTime which should be Infinity as the window is always visible in this case.

LCP has an additional check that ignores pages that were opened in the background (as measuring LCP in this case is tricky, as technically the paint doesn't happen until the user foregrounds the tab but that might be way after the LCP content has actually loaded). But that is unrelated to late reporting (it never reports in this case).

So all in all it is expected to report LCP, CLS, and INP much later.

@jashsayani
Copy link
Author

it is not possible to know when the value we have is the "final" value of this metric

Yes, but at some point the value has to be "final". I can have window.setTimeout(renderNewLargeContent, 50000) and change the largest content right after it being reported. I think there should maybe be a "maxTimeout" that I can set of like maybe 15 secs, where the value is reported (whatever it happens to be until that point). If you want to be accurate throughout the lifecycle of the page, then this should not fire/report at 40 secs, but rather on page unload/navigation away.

It just delays reporting that time until it's sure that is the final LCP

You cannot know its the final LCP unless the lifecycle of the page is over (like on unload). Otherwise its just an arbitrary point in time when it says "i guess page won't change now". So it being reported at 40 seconds right now is a bug! Should not report till I navigate away or interact.

(1) Note for LCP it can be finalised on interaction

Yes, I saw this in the implementation but when recording field metrics, it is not great to rely on user interaction to report. If they open the page, then close it in 5 secs, the metric is lost. We are not blocking unload till the reporting call is successful (which would be bad anyway). Also for websites that are not Single Page Apps, clicking on any content can mean navigating away and lost metrics.

LCP, CLS, and INP should really only be reported when the user navigates away from the page

I think having a maxTimeout of 10 or 15 secs and reporting is better as relying on navigating away is not reliable and its basically lost metrics. In any case, the LCP and CLS being reported at 43 secs is a bug?

Tl;dr: (1) What do you think of maxTimeout? (2) What is the recommended approach if I want to have a "final" value at say 15 seconds and not rely on interact/end of lifecycle?

@tunetheweb
Copy link
Member

it is not possible to know when the value we have is the "final" value of this metric

Yes, but at some point the value has to be "final". I can have window.setTimeout(renderNewLargeContent, 50000) and change the largest content right after it being reported. I think there should maybe be a "maxTimeout" that I can set of like maybe 15 secs, where the value is reported (whatever it happens to be until that point). If you want to be accurate throughout the lifecycle of the page, then this should not fire/report at 40 secs, but rather on page unload/navigation away.

There is no 40 second timeout. You must have backgrounded the tab at the 40 second mark. That is also why CLS was reported at the same time.

It also seems somewhat contradictory that you are complaining about a max timeout (that does not exist) but then asking for a user-defined max timeout.

It just delays reporting that time until it's sure that is the final LCP

You cannot know its the final LCP unless the lifecycle of the page is over (like on unload). Otherwise its just an arbitrary point in time when it says "i guess page won't change now". So it being reported at 40 seconds right now is a bug! Should not report till I navigate away or interact.

As discussed above the web-vitals library reports metrics on visibility change (including page unload), or sooner if they can be finalised sooner. It is not arbitrary.

(1) Note for LCP it can be finalised on interaction

Yes, I saw this in the implementation but when recording field metrics, it is not great to rely on user interaction to report. If they open the page, then close it in 5 secs, the metric is lost. We are not blocking unload till the reporting call is successful (which would be bad anyway). Also for websites that are not Single Page Apps, clicking on any content can mean navigating away and lost metrics.

No, if they open the page and then close it after 5 seconds, the page visibility will change, and there the beacons will fire. The page will hold navigating away until the event fires. Pages do block navigating on event handlers firing, otherwise there would be no point in having event handlers for these.

LCP, CLS, and INP should really only be reported when the user navigates away from the page

I think having a maxTimeout of 10 or 15 secs and reporting is better as relying on navigating away is not reliable and its basically lost metrics. In any case, the LCP and CLS being reported at 43 secs is a bug?

If it happened like that without backgrounding the page that would certainly be a bug, but I don't believe it does. Please provide a site where that happens.

Tl;dr: (1) What do you think of maxTimeout? (2) What is the recommended approach if I want to have a "final" value at say 15 seconds and not rely on interact/end of lifecycle?

This has been discussed before (for example in #394), but the point of the web-vitals.js library is as detailed in the home page of this project:

for measuring all the Web Vitals metrics on real users, in a way that accurately matches how they're measured by Chrome and reported to other Google tools (e.g. Chrome User Experience Report, Page Speed Insights, Search Console's Speed Report).

So we are against adding such an option because we believe the current methodology are the most accurate ways of measuring the Web Vitals.

@jashsayani
Copy link
Author

You must have backgrounded the tab at the 40 second mark

This is possible, if I can repro it reporting, I will file an issue.

complaining about a max timeout (that does not exist) but then asking for a user-defined max timeout

Yeah not sure why I saw it reported at round 40 secs, but I am mainly asking for a timeout.

Pages do block navigating on event handlers firing, otherwise there would be no point in having event handlers for these

Yes, the unloading JS will run, but I am not sure if there is a guarantee of network calls for reporting being fired successfully (with 200 OK response; I think HTTP/2 server would close connection if the client fires request but connection is closed/lost before 200 OK?). There are other issues, for example on mobile, bad connection/network when closing it, etc. Like in the metro/tube, you might lose network and close the page.

we are against adding such an option because we believe the current methodology are the most accurate ways of measuring the Web Vitals

Thanks, I will maybe use reportAllChanges and implement my own timeout. I think that would work.

@tunetheweb
Copy link
Member

Yes, there are definitely cases where beacons can be lost, and this only increases with late beacons. However, that must also be weighed against the cases that beaconing early reports incorrect data. The library currently chooses to choose correctness over more beaconing.

Thanks, I will maybe use reportAllChanges and implement my own timeout. I think that would work.

Do read my comments here: #394 (comment).

Will close this for now, but let me know if you have any more questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants