Understanding the Inconsistencies in PageSpeed Insights Performance Metrics
Achieving outstanding website performance is a continuous effort, especially when optimizing for core metrics such as First Contentful Paint (FCP), Largest Contentful Paint (LCP), and Speed Index (SI). I recently set out to fine-tune the front end of my website to consistently score 100/100 on Google’s PageSpeed Insights (PSI), and I have made significant progress, particularly on desktop. My mobile scores are close behind, usually hovering around 99/100, which I find acceptable.
However, an intriguing challenge has emerged: the PSI results display significant variability over short periods. I’ve observed the same page source yielding a performance score as low as 79/100 in one test and soaring to 99/100 just five minutes later. This inconsistency—despite consistent source code and similar testing conditions—has become a source of frustration.
In my efforts to diagnose the issue, I’ve employed multiple strategies:
– Integrating LogRocket to monitor behavior during PSI testing, then removing it once I realized that console errors are factored into PSI reports.
– Implementing comprehensive frontend logging that records every event, so I can analyze errors and performance metrics line by line (a minimal logging sketch follows this list).
– Optimizing critical resources by inlining CSS, compressing HTTP responses, and adding preload hints for essential fonts and hero images.
– Structuring JavaScript loading into multiple layers: above-the-fold functionality first, then consent and tracking scripts and core UI components, with less critical scripts deferred until user interaction (see the loading sketch after this list).
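For reference, the logging mentioned above works roughly as sketched below. This is a minimal illustration rather than my exact implementation: the /log endpoint and the event shapes are assumptions made for the example.

```js
// Minimal sketch of client-side event and performance logging (illustrative only).
// The /log endpoint and event shapes are assumptions, not my exact setup.
const events = [];

function record(type, detail) {
  events.push({ type, detail, t: performance.now() });
}

// Capture uncaught errors and unhandled promise rejections,
// since console errors also surface in PSI reports.
window.addEventListener('error', (e) => record('error', e.message));
window.addEventListener('unhandledrejection', (e) => record('rejection', String(e.reason)));

// Record paint and largest-contentful-paint entries as the browser reports them.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    record(entry.entryType, { name: entry.name, startTime: entry.startTime });
  }
}).observe({ type: 'paint', buffered: true });

new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    record('largest-contentful-paint', { startTime: entry.startTime });
  }
}).observe({ type: 'largest-contentful-paint', buffered: true });

// Flush the buffer when the page is hidden, so nothing is lost on navigation.
addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden' && events.length) {
    navigator.sendBeacon('/log', JSON.stringify(events));
  }
});
```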
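The layered loading itself looks roughly like the sketch below. The script URLs are hypothetical placeholders; the point is the ordering: paint-critical work in the markup, consent/tracking and core UI after the load event, and everything else deferred until the first user interaction.

```js
// Rough sketch of layered script loading (all script URLs are hypothetical).
function loadScript(src) {
  return new Promise((resolve, reject) => {
    const s = document.createElement('script');
    s.src = src; // dynamically inserted scripts load asynchronously by default
    s.onload = resolve;
    s.onerror = reject;
    document.head.appendChild(s);
  });
}

// Layer 1: above-the-fold functionality is inlined or deferred in the markup itself.

// Layer 2: consent/tracking and core UI, kicked off once the page has loaded.
window.addEventListener('load', () => {
  loadScript('/js/consent.js');       // e.g. the CookieYes banner
  loadScript('/js/analytics.js');     // e.g. the GA4 bootstrap
  loadScript('/js/ui-components.js');
});

// Layer 3: non-critical scripts wait for the first user interaction.
const interactionEvents = ['pointerdown', 'keydown', 'scroll'];
const loadLazyLayer = () => {
  loadScript('/js/comments.js');
  loadScript('/js/chat-widget.js');
  interactionEvents.forEach((t) => removeEventListener(t, loadLazyLayer));
};
interactionEvents.forEach((t) =>
  addEventListener(t, loadLazyLayer, { once: true, passive: true })
);
```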
Despite these measures, the core problem persists: PSI’s performance evaluations remain inconsistent. Interestingly, local Lighthouse runs stay stable at a perfect 100 in all four categories, provided that network and CPU are throttled to reflect real-world mobile conditions.
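For context, the local Lighthouse runs I am referring to are roughly equivalent to the following Node script, based on Lighthouse’s documented programmatic API; the URL is a placeholder, and the throttling setting mirrors Lighthouse’s simulated mobile defaults.

```js
// Sketch of a throttled local Lighthouse run via the Node API (URL is a placeholder).
import lighthouse from 'lighthouse';
import * as chromeLauncher from 'chrome-launcher';

const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
const result = await lighthouse('https://example.com/', {
  port: chrome.port,
  // Simulated throttling approximates a mid-range mobile device on a slow network.
  throttlingMethod: 'simulate',
});

// Print the 0-100 score for each category (performance, accessibility, etc.).
for (const category of Object.values(result.lhr.categories)) {
  console.log(category.title, Math.round(category.score * 100));
}
await chrome.kill();
```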
From my observations, the testing environment’s behavior points toward external scripts, particularly cookie consent and analytics (such as CookieYes and GA4), as potential culprits. These scripts often involve dynamically injected content and asynchronous loading, which could influence how PSI perceives performance. For example, even though the window load event fires within roughly 0.5 seconds in every test, the reported FCP and LCP times sometimes lag well behind it, suggesting other factors are at play.
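To see that gap concretely, I compare the navigation load timing with the paint entries the browser reports; the sketch below uses only standard browser APIs and logs the three numbers side by side.

```js
// Compare the window load time against FCP and LCP as the browser reports them.
let lcp = 0;
new PerformanceObserver((list) => {
  const entries = list.getEntries();
  // Keep the most recent LCP candidate observed so far.
  lcp = entries[entries.length - 1].startTime;
}).observe({ type: 'largest-contentful-paint', buffered: true });

window.addEventListener('load', () => {
  // Give LCP a moment to settle after the load event before reporting.
  setTimeout(() => {
    const [nav] = performance.getEntriesByType('navigation');
    const fcp = performance.getEntriesByName('first-contentful-paint')[0];
    console.log('load:', Math.round(nav.loadEventStart), 'ms');
    console.log('FCP:', fcp ? Math.round(fcp.startTime) : 'n/a', 'ms');
    console.log('LCP:', Math.round(lcp), 'ms');
  }, 1000);
});
```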
Here are examples of the variance:
– Low score example: https://bit.ly/45S8W3b —

