Building Touch Interfaces with HTML5: Speeding Up the Next Visit with Browser Cache, LocalStorage, and the Application Cache
- Caching in HTTP
- Optimizing for mobile
- Using web storage
- The application cache
- Wrapping up
So much about computing performance depends on caching. Fundamentally, caching is putting data somewhere after you get it the first time so you can access it much more quickly the next time. On the web, we want to take advantage of caching as often as possible to speed up users’ subsequent visits to the site, keeping in mind that their next visit is quite frequently within seconds of their first, when they ask for another page.
On mobile, as much as anywhere, we want to make the best possible use of caching. The main tools we have for caching on touch devices are the normal browser cache, localStorage, and the application cache. In this chapter we’ll look at the normal browser cache, which isn’t as good as it should be; localStorage, a newish API for persistent storage that’s an incredibly powerful tool for manual caching; and the application cache.
Caching in HTTP
HTTP was designed with caching in mind. The cache we’re most familiar with is the browser cache, but additional caching proxies often exist as well, and they follow the same rules defined in the specification. There are three ways to control HTTP caches:
- Freshness
- Validation
- Invalidation
Freshness
Freshness, sometimes called the TTL (time to live), is the simplest of the three. Using headers, caching agents are told how long to hold on to a cached resource before it should be considered stale and refetched. The most basic way to do this is with the Expires header. You might remember that YSlow and PageSpeed recommend setting far-future Expires headers for static content.
The goal here is that so-called static assets (like CSS and JavaScript) are never fetched again, if possible. YSlow advises that you set an expiration some time in the distant future:
Expires: Tue, 15 Apr 2025 20:00:00 GMT
The intent is that the browser (or a caching proxy) will keep this file around until it runs out of room in cache.
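If you generate this header yourself, the value has to be in the HTTP date format. Here is a minimal sketch in JavaScript, assuming a Node.js-style environment; farFutureExpires is a hypothetical helper name, not part of any library, and the one-year offset is only an example.

```javascript
// Build an Expires header value a given number of years after `now`.
// `farFutureExpires` is a hypothetical helper, named here for illustration.
function farFutureExpires(now, years) {
  const expires = new Date(now.getTime());
  expires.setUTCFullYear(expires.getUTCFullYear() + years);
  // toUTCString() produces the HTTP date format used by the Expires header.
  return expires.toUTCString();
}

const header = "Expires: " + farFutureExpires(new Date(Date.UTC(2024, 0, 1)), 1);
console.log(header); // Expires: Wed, 01 Jan 2025 00:00:00 GMT
```

In practice you would usually let the web server or CDN configuration set this header for static files rather than computing it in application code.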
Validation
Validation provides a way for a caching agent to determine if a stale cached item is actually still good, without requesting the full resource. The browser can make a conditional request with an If-Modified-Since header carrying the date of its cached copy. If the resource hasn’t changed since then, the server can send a 304 Not Modified response with no body, and the browser uses the file already in the cache rather than refetching it from the server.
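To make the exchange concrete, here is a rough sketch of the server’s side of that decision. It assumes request headers arrive as an object with lowercase names (as Node’s http module provides them); respondToConditionalGet is a hypothetical function, and a real server would handle more cases.

```javascript
// Decide how a server might answer a GET that may carry If-Modified-Since.
// `lastModified` is a Date for when the resource last changed.
function respondToConditionalGet(requestHeaders, lastModified, body) {
  const since = Date.parse(requestHeaders["if-modified-since"] || "");
  // HTTP dates have one-second resolution, so compare whole seconds.
  const modifiedSeconds = Math.floor(lastModified.getTime() / 1000);
  if (!Number.isNaN(since) && modifiedSeconds <= Math.floor(since / 1000)) {
    // The client's copy is still good: no body is sent.
    return { status: 304, headers: {}, body: "" };
  }
  return {
    status: 200,
    headers: { "Last-Modified": lastModified.toUTCString() },
    body,
  };
}

const changed = new Date(Date.UTC(2024, 0, 1));
const css = "body { margin: 0 }";
const first = respondToConditionalGet({}, changed, css);
const second = respondToConditionalGet(
  { "if-modified-since": first.headers["Last-Modified"] },
  changed,
  css
);
console.log(first.status, second.status); // 200 304
```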
Another validation feature is the ETag. ETags are unique identifiers, usually hashes, which allow cache validation without dates by comparing a short string. The requesting agent makes a conditional request as well, but this time with an If-None-Match header containing the ETag. If the current content matches the client’s ETag, then the server can again return a 304 response.
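The same idea can be sketched for ETags. This example assumes Node.js and its built-in crypto module; makeETag and respondWithETag are hypothetical names, SHA-1 is just one possible choice of hash, and the comparison is simplified (a real If-None-Match header may list several ETags or contain `*`).

```javascript
const crypto = require("crypto");

// Derive an ETag from the response body. ETags are quoted strings.
function makeETag(body) {
  const hash = crypto.createHash("sha1").update(body).digest("hex");
  return '"' + hash + '"';
}

// Answer 304 when the client's ETag matches the current content.
function respondWithETag(requestHeaders, body) {
  const etag = makeETag(body);
  if (requestHeaders["if-none-match"] === etag) {
    return { status: 304, headers: { ETag: etag }, body: "" };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

const initial = respondWithETag({}, "alert('v1')");
const revalidated = respondWithETag(
  { "if-none-match": initial.headers.ETag },
  "alert('v1')"
);
const afterDeploy = respondWithETag(
  { "if-none-match": initial.headers.ETag },
  "alert('v2')"
);
console.log(initial.status, revalidated.status, afterDeploy.status); // 200 304 200
```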
Validating the cache does require a full round trip to the server. That is better than redownloading a file, but avoiding a round trip altogether is preferable. That’s the reason for the far-future expiration date. If the cached item hasn’t expired, then the browser won’t attempt to validate it.
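Put together, the choice a cache makes for each request looks roughly like the sketch below. This is a simplified model, not how any particular browser is implemented; cacheDecision and the shape of the entry object are assumptions made for illustration.

```javascript
// Decide what to do for a request, given an optional cache entry.
// `entry.expires` holds the Expires header value stored with the response.
function cacheDecision(entry, now) {
  if (!entry) return "fetch"; // nothing cached: full download
  if (now.getTime() < Date.parse(entry.expires)) {
    return "use-cache"; // still fresh: no network traffic at all
  }
  return "validate"; // stale: one round trip, hoping for a 304
}

const entry = { expires: "Wed, 01 Jan 2025 00:00:00 GMT" };
console.log(cacheDecision(undefined, new Date(Date.UTC(2024, 5, 1)))); // fetch
console.log(cacheDecision(entry, new Date(Date.UTC(2024, 5, 1)))); // use-cache
console.log(cacheDecision(entry, new Date(Date.UTC(2025, 5, 1)))); // validate
```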
Invalidation
Browsers invalidate cached items after some actions, the most common being a request to the same URL that uses a method capable of changing data on the server, such as POST, PUT, or DELETE.
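As a toy model of invalidation, the sketch below treats POST, PUT, and DELETE as the requests that knock a URL out of the cache. The request function and the Map-based cache are assumptions for illustration only; real caches apply more detailed rules.

```javascript
// Methods treated as invalidating in this toy model (an assumption).
const UNSAFE_METHODS = new Set(["POST", "PUT", "DELETE"]);

// `cache` is a Map from URL to a stored response body.
function request(cache, method, url) {
  if (method === "GET") {
    if (cache.has(url)) return "cache hit";
    cache.set(url, "response for " + url); // pretend we fetched and stored it
    return "fetched";
  }
  if (UNSAFE_METHODS.has(method)) {
    cache.delete(url); // the stored copy may no longer match the server
  }
  return "sent to server";
}

const cache = new Map();
const log = [
  request(cache, "GET", "/cart"),
  request(cache, "GET", "/cart"),
  request(cache, "POST", "/cart"),
  request(cache, "GET", "/cart"),
];
console.log(log.join(", ")); // fetched, cache hit, sent to server, fetched
```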
What is normal cache behavior?
So what is the normal behavior of the browser cache, if you don’t mess with the headers or do anything else? Most browsers have a maximum cache size. When that size is reached they begin removing items from the cache that were least recently used. So a cached item that hasn’t been used in a long time will be purged, keeping items used more frequently.
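The least-recently-used idea is easy to sketch with a JavaScript Map, which remembers insertion order. This is only an illustration of the eviction policy: LruCache is a hypothetical class, and it counts entries, whereas browsers limit the cache by total size on disk.

```javascript
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // oldest entry first, most recent last
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}

const lru = new LruCache(2);
lru.set("a.css", "...");
lru.set("b.js", "...");
lru.get("a.css"); // a.css is now more recently used than b.js
lru.set("c.png", "..."); // cache is full, so b.js is purged
console.log([...lru.entries.keys()].join(", ")); // a.css, c.png
```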
The result of this algorithm is that what is purged is completely based on user behavior and there’s no reliable way to predict how it will work. It’s safe to assume that if you don’t think about cache headers, then some browser will cache something you don’t want cached and won’t cache something you do.