Cumulative Layout Shift (CLS) tests the stability of content on a page: how much it shifts around after it appears to have loaded. To calculate the score, CLS measures and sums every individual layout shift throughout the lifespan of the page. If you go to read or click an element and another butts in, pushing everything around it to a new position, that is exactly the frustrating part of the user experience CLS aims to remove.
Because CLS adds up every shift throughout a session, the longer the session, the more likely it becomes that the site gets a high CLS score.
Even shifts that have very little effect on the overall quality of the session, and are barely noticed across a long visit, can add up to a 'poor' rating. This is the unintended consequence that Google is trying to refine out of the process.
What causes high CLS in long sessions?
In a long session, the majority of the page content is loaded and has stopped moving, so high CLS is brought on by:
Infinite scrollers shifting content as the user scrolls.
Input handlers that take more than 500ms to update the UI after user interaction, without implementing a placeholder or a 'skeleton' framework.
Pages that automatically load the next story at the bottom of the page, or pages with many slow interactive elements, can build up high scores. This was having a negative impact on pages that were keeping users engaged for longer.
CLS needs to evaluate the user experience through the full page lifetime, as users can have negative experiences while scrolling or navigating through longer sessions. But pages shouldn't get a bad score just because users interact with a lot of content.
What is the solution?
To test the real-world implications of this possible flaw in the scoring system, Google recorded videos and Chrome traces of 34 user journeys through various websites.
The result is one tangible change to the measurement. Rather than collecting all of the shifts throughout a long session, Google will now look at the amount pages shift across what it defines as 'session windows'. These are periods when the user is active on the page. If the user is inactive for over a second, a new session window starts.
The goal of this change is to reduce the correlation between time spent on the page and cumulative layout shift. If users are engaging with the page, that alone shouldn't hurt the site's SEO.
What is a session window?
Often pages have multiple layout shifts bunched closely together because elements can shift multiple times as new content comes in piece by piece.
A session window starts from the first layout shift and continues to expand until there is a gap with no shifts (this is set as one second). If or when another layout shift occurs, a new session window starts. With session windows defined, there are then two ways to measure the CLS across them.
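The grouping just described can be sketched with a small helper. This is an illustrative simplification, not how CLS is actually collected (browsers report individual layout-shift entries via `PerformanceObserver`); shifts are assumed here to be `{ time, score }` objects with times in milliseconds:

```javascript
// Group layout-shift entries into "session windows": a new window
// starts whenever more than `gapMs` passes with no shift.
// Illustrative sketch only -- in the browser, layout shifts are
// reported via PerformanceObserver, not computed by hand like this.
function sessionWindows(shifts, gapMs = 1000) {
  const windows = [];
  let current = null;
  for (const shift of shifts) {
    if (current === null || shift.time - current.end > gapMs) {
      current = { start: shift.time, end: shift.time, score: 0 };
      windows.push(current);
    }
    current.end = shift.time;
    current.score += shift.score;
  }
  return windows;
}

// Three shifts close together, then a quiet gap, then one more:
// two session windows, with scores of roughly 0.15 and 0.05.
const shifts = [
  { time: 0, score: 0.05 },
  { time: 300, score: 0.05 },
  { time: 700, score: 0.05 },
  { time: 2500, score: 0.05 },
];
console.log(sessionWindows(shifts).map(w => w.score.toFixed(2)));
```

The one-second gap is what separates a burst of shifts from unrelated shifts later in the session.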
The average score of all the session windows, for very large session windows (uncapped windows with 5-second gaps between them).
The maximum score of all the session windows, for smaller session windows (capped at 5 seconds, with 1-second gaps between them).
Imagine loading an article, letting the page settle, and reading the content. Once you're done with that viewport, the next screen loads and there is a little CLS. In one method, the CLS from the first window is used for the scoring because it is the greater of the two. In the other, the average score across the two windows would be used. In this second example, the CLS score would be much lower than the first.
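To put hypothetical numbers on that scenario (the window scores below are invented purely for illustration), the two candidate methods diverge like this:

```javascript
// Two session windows: a larger shift while the article settles,
// then a small one when the next screen loads. Scores are invented.
const windowScores = [0.25, 0.05];

const maxScore = Math.max(...windowScores);
const avgScore = windowScores.reduce((a, b) => a + b, 0) / windowScores.length;

console.log(maxScore); // 0.25 -- score under the "maximum window" method
console.log(avgScore); // 0.15 -- score under the "average window" method
```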
In the end, Google chose to measure the maximum CLS in a session window, with a 1-second gap between sessions, capped at 5 seconds, rather than taking the average.
This is because, if the average is taken, the score can depend too heavily on shifts that don't really affect the experience. In the case above, if the webmaster fixed the small shift in the second window, only the large first window would remain, so the average score would actually rise.
The site has improved the experience, but its score would worsen because of it. Measuring the biggest window instead captures the CLS most likely to be noticed by the user and actually have an impact on their experience.
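Putting the chosen definition together, the reported CLS is the largest session-window total, where a window splits after a gap of more than one second between shifts or once it spans more than five seconds. A simplified sketch under those assumptions (again, the browser computes real CLS; this is only illustrative):

```javascript
// Sketch of the updated CLS definition: sum shifts into session
// windows (split on gaps of more than 1 s, or once a window spans
// more than 5 s) and report the largest window's total.
function maxSessionWindowCls(shifts, gapMs = 1000, capMs = 5000) {
  let best = 0;
  let windowStart = null;
  let prevTime = null;
  let sum = 0;
  for (const { time, score } of shifts) {
    const startNewWindow =
      windowStart === null ||
      time - prevTime > gapMs ||
      time - windowStart > capMs;
    if (startNewWindow) {
      windowStart = time;
      sum = 0;
    }
    sum += score;
    prevTime = time;
    best = Math.max(best, sum);
  }
  return best;
}

// A burst of shifts early on, then one small shift much later:
// the early burst forms the largest window, so CLS is 0.2.
const example = [
  { time: 0, score: 0.1 },
  { time: 400, score: 0.1 },
  { time: 8000, score: 0.02 },
];
console.log(maxSessionWindowCls(example)); // 0.2
```

Because only the largest window counts, the small shift at 8 seconds no longer drags the score up, no matter how long the session runs.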
What will the effect on Core Web Vital scores be?
This change will not have a major impact on most publishers. It is mostly designed to prevent extreme examples where there is a strong correlation between time on site and cumulative layout shift.
In Google's own words, 'Most will only see a slight improvement, but about 3% will see their scores improve from having a "needs improvement" or "poor" rating to having a "good" rating.'
This change is a good example of how Core Web Vitals can evolve and adapt to make sure they promote the best experiences. Even simple metrics can require refinement to make sure that they truly match the users’ definition of a good experience.