Boost Your Website Performance By Learning Caching Policies
But for the browser, painting the DOM is not the only concern. It also deals with files and network calls: a bunch of HTML, CSS, JS, fonts, images, etc. are used to build the page.
As Sam Thorogood discusses in a video, frontend engineers often focus on improving the Lighthouse score, which mainly measures the first-load experience of a website. We should also prioritize the speed of the second (and subsequent) loads, when users revisit the website. This is where caching and efficient caching policies come into the picture.
Caching is the idea of storing reusable data in a high-speed data storage layer, so it can be retrieved quickly to meet future requests.
If certain files on a website don’t change frequently, browsers can cache them so that future requests are served faster, saving a bunch of network calls (and internet bill!).
“There are only two hard things in computer science: cache invalidation and naming things.” Phil Karlton
In this article, I want to share what I’ve learned about caching on the web.
Here’s a simplified picture to better understand the main players involved. Along with client-side caches, there are also shared caches, implemented on proxy servers or CDNs, that can serve multiple clients.
Caching for a resource is controlled with the Cache-Control HTTP header. It carries directives, for both requests and responses, that tell the browser and shared caches how to handle the resource (more on that later).
Here’s my Twitter profile page loading for the first time (or after emptying the cache and hard reloading). Look at the Size and Time columns.
Here it is the second time I open it. Note that the load times are much shorter.
And here it is on every load after that. It loads in 0 ms.
Items like the Twitter logo, icon, my profile banner, profile image, etc. don’t change often and are therefore cached.
But why is there a disk cache and a memory cache?
Google Chrome typically has four types of cache:
- Memory Cache: A short-term, non-persistent cache that holds the resources cached during the current document’s lifetime. It lives in memory, and its contents remain only until the tab/session is closed.
- Disk Cache: The disk cache (HTTP cache) is a persistent, disk-based cache that allows resources to be reused across sessions and sites.
- Service Worker Cache: A service worker has a Cache API that we can use to control caching ourselves, and this cache is persistent. A service worker is a JS file and a fundamental building block of PWAs (Progressive Web Apps).
- Push Cache: The push cache is where HTTP/2 push resources are stored. Pushing is a performance optimization technique where the server sends some resources to the browser before the browser requests them. We won’t go into too much depth here, as I’m also just figuring out how to use it (lemme know if you know more about push caching).
Read more about it here.
Note: The service worker cache only exists if a service worker is registered on the page.
Twitter’s JS files are loaded from the service worker cache. This is because the Twitter web app is a PWA, so it has a service worker registered.
Browser’s caching flow
According to web.dev, the browser checks its caches in a fixed order when looking for a resource. The memory cache sits above the service worker cache (it is missing from the picture below).
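To make the idea of a lookup order concrete, here is a minimal sketch of a layered cache lookup. The layer names and their order (memory, then service worker, then disk) are assumptions taken from the description above, and `lookup` and `layers` are hypothetical names. Real browsers are far more involved, and the push cache is left out.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a cache layer: (name, {url: body}).
CacheLayer = Tuple[str, Dict[str, bytes]]


def lookup(url: str, layers: List[CacheLayer],
           fetch: Callable[[str], bytes]) -> Tuple[str, bytes]:
    """Return (source, body): the first layer holding the URL wins, else the network."""
    for name, store in layers:
        if url in store:
            return name, store[url]
    body = fetch(url)
    # Simplification: store the response only in the last (persistent) layer.
    layers[-1][1][url] = body
    return "network", body


# Assumed order: memory cache first, then service worker cache, then disk cache.
layers: List[CacheLayer] = [
    ("memory cache", {}),
    ("service worker cache", {}),
    ("disk cache", {}),
]
first = lookup("/logo.svg", layers, lambda url: b"<svg/>")
second = lookup("/logo.svg", layers, lambda url: b"<svg/>")
print(first[0], "->", second[0])
```

The first call falls through every layer and hits the (fake) network; the second call is answered by the layer that stored the response.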
Let’s say your website has a style.css file. You want it to be cached for a long time (like a year or more), but you also make changes to style.css and deploy regularly. Since the name doesn’t change, users will keep being served the old file from the cache.
Cache busting is one way of dealing with this problem: we put a version or hash in the file name, so that a changed file gets a new name and the browser has to load it and update the cache.
We prevent the browser from caching the HTML file, and the files loaded by its tags also have a version in their names (most often a content hash), so the browser can load them and update the cache.
If you have used webpack or a similar tool to build your app (or something like CRA), you may have seen bundled CSS and JS files with strange names in a format like main.[content-hash].js. Whenever the content of a file changes, its content hash changes too.
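As a rough illustration of content-hashed file names, the sketch below derives a name like main.[content-hash].js from a file's bytes. The `hashed_name` helper is hypothetical, and using the first 8 hex characters of a SHA-256 digest is an assumption for illustration; bundlers such as webpack use their own hashing schemes.

```python
import hashlib


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the extension: main.js -> main.<hash>.js."""
    # Assumption: SHA-256 truncated to `length` hex characters.
    digest = hashlib.sha256(content).hexdigest()[:length]
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension: append the hash at the end
        return f"{filename}.{digest}"
    return f"{stem}.{digest}.{ext}"


v1 = hashed_name("main.js", b"console.log('v1');")
v2 = hashed_name("main.js", b"console.log('v2');")
print(v1, v2)
print("names differ:", v1 != v2)
```

Because the name depends only on the content, an unchanged file keeps its name (and stays cached), while any edit produces a new name that the browser has never seen.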
The Cache-Control header is the holy grail of caching on the web. Sent as part of the server’s response, it can tell the browser or a proxy whether to cache the resource, for how long it should be cached, whether it should be revalidated, etc.
For example, a configuration like Cache-Control: max-age=604800, must-revalidate says that the response can be stored in the cache for a week and reused while it is fresh. If the response is stale, it must be validated with the origin server before it is reused (because of the must-revalidate response directive).
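The freshness rule described here can be sketched in a few lines. This is a simplified model, not how any particular browser implements it: `parse_cache_control` and `cache_decision` are hypothetical helpers, the header value `max-age=604800, must-revalidate` is an assumed example (604800 seconds is one week), and only the `no-store`, `no-cache`, and `max-age` directives are considered.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def cache_decision(header: str, age_seconds: int) -> str:
    """Return 'reuse', 'revalidate', or 'fetch' for a stored response of the given age."""
    d = parse_cache_control(header)
    if "no-store" in d:
        return "fetch"  # nothing may be stored, so there is nothing to reuse
    if "no-cache" in d:
        return "revalidate"  # stored, but checked with the server before every reuse
    max_age = int(d.get("max-age") or 0)
    if age_seconds < max_age:
        return "reuse"  # still fresh
    # Stale. This sketch always revalidates stale responses, which is the
    # behaviour that must-revalidate requires.
    return "revalidate"


header = "max-age=604800, must-revalidate"  # assumed example: one week
print(cache_decision(header, 3600))        # one hour old
print(cache_decision(header, 604800 + 1))  # just over a week old
```

With the assumed header, a response that is an hour old is reused straight from the cache, while one older than a week goes back to the origin server for validation.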
Let's say a file is in the cache, and the browser needs to validate it again before it can use it. This can be done with the help of the Entity Tag (ETag) HTTP header. The value of an ETag is an identifier that represents a specific version of the file. ETags also help prevent simultaneous updates of a file from overwriting each other.
Here's a simple overview of how it works. Suppose the browser requests a file from the server:
- The server sends the file back, generates an ETag, and adds it to the response headers, with a 200 HTTP status code. The browser then proceeds to cache the file.
- Now, if the browser needs to validate the cached file again, it sends a request with the ETag it received earlier (in the If-None-Match request header).
- The server then checks the file's current ETag, and if the version hasn't changed, it sends back a response with a 304 HTTP status code, which tells the browser that the file hasn't been modified and that it can reuse the old one.
- Otherwise, the server sends the new version of the file with a new ETag and a 200 HTTP status code.
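The exchange above can be sketched from the server's side as follows. This is a minimal illustration under stated assumptions: `make_etag` and `handle_get` are hypothetical helpers, the ETag is derived from a truncated SHA-256 of the content (servers are free to generate it differently), and the client's ETag is compared as a single exact value, ignoring lists and weak validators.

```python
import hashlib
from typing import Dict, Optional, Tuple

# (status code, response headers, body)
Response = Tuple[int, Dict[str, str], bytes]


def make_etag(content: bytes) -> str:
    """Derive an ETag from the content; the value is wrapped in double quotes."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def handle_get(content: bytes, if_none_match: Optional[str] = None) -> Response:
    """Answer 304 with an empty body if the client's ETag still matches, else 200."""
    etag = make_etag(content)
    if if_none_match is not None and if_none_match == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, content


first = handle_get(b"body { color: red }")                          # first visit
again = handle_get(b"body { color: red }", first[1]["ETag"])        # file unchanged
changed = handle_get(b"body { color: blue }", first[1]["ETag"])     # file redeployed
print(first[0], again[0], changed[0])  # prints: 200 304 200
```

The 304 response carries no body, which is where the saving comes from: the browser pays for a round trip, but not for downloading the file again.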
Caching is an important part of improving web performance, so it is worth understanding how it works. This post is a simple introduction to caching, not an in-depth guide.
Thanks for reading!