The application cache
The traditional browser cache, as mentioned previously in this chapter, isn’t particularly reliable on mobile. On the other hand, the HTML5 application cache is very reliable on mobile—maybe even too reliable.
What is the application cache?
With features like localStorage, you can easily see how a web application could continue to be useful even when not connected to the network. The application cache is designed for that use case.
The idea is to provide, up front, a list of all the resources your app needs to function, so that the browser can download and cache them. This list is called the manifest. The manifest is identified with the manifest attribute on the <html> tag:
<!DOCTYPE html>
<html manifest="birds.appcache">
<head>
The manifest file must be served with the MIME type text/cache-manifest. If it’s not, it will be ignored. If you can’t configure a custom MIME type on your server, you can’t use the application cache.
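If you control the server code, setting the type is usually a small change. As a minimal sketch, assuming a small Node.js static file server (the MIME_TYPES table, contentTypeFor, and createStaticServer are illustrative names, not part of the Birds of California site), the manifest can be mapped to the right type by its file extension:

```javascript
var http = require('http');
var fs = require('fs');
var path = require('path');

// Hypothetical extension-to-type table; extend it for your own assets.
var MIME_TYPES = {
  '.appcache': 'text/cache-manifest',
  '.html': 'text/html',
  '.js': 'application/javascript',
  '.jpg': 'image/jpeg'
};

// Returns the Content-Type for a file path, based on its extension.
function contentTypeFor(filePath) {
  var ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || 'application/octet-stream';
}

// Builds (but does not start) a server that serves files from rootDir.
function createStaticServer(rootDir) {
  return http.createServer(function (req, res) {
    // Drop the query string and keep only the file name, so a request
    // cannot escape rootDir. This sketch serves a flat directory only.
    var fileName = path.basename(req.url.split('?')[0]) || 'index.html';
    fs.readFile(path.join(rootDir, fileName), function (err, data) {
      if (err) {
        res.writeHead(404, { 'Content-Type': 'text/plain' });
        res.end('Not found');
        return;
      }
      res.writeHead(200, { 'Content-Type': contentTypeFor(fileName) });
      res.end(data);
    });
  });
}
```

Calling createStaticServer(__dirname).listen(8080) would start it locally. On a hosted web server the equivalent is normally a configuration setting rather than code.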
The manifest contains four types of entries:
- MASTER
- CACHE
- NETWORK
- FALLBACK
MASTER
MASTER entries are the files that reference the manifest in their HTML. By including a manifest, these files are implicitly adding themselves to the list. The rest of the entries are included in the manifest file.
CACHE
The CACHE entries define what to cache. Anything in this list will be downloaded the first time a visitor comes to the page. The entries will then be cached forever, or until the manifest (not the resource in question) changes.
NETWORK
Because the application cache is designed for offline use, network access actually has to be whitelisted. That means that if a network resource is not listed under NETWORK it will be blocked, even if the user is online. For example, if the site includes a Facebook “like” widget inside an iframe and http://www.facebook.com is not listed in the NETWORK entry, that iframe will not load. To allow all network requests you can use the ‘*’ wildcard character.
FALLBACK
These entries allow you to specify fallback content if the user is not online. Entries here are listed as pairs of URLs: the first is the resource requested, the second is the fallback. You have to use relative paths, and everything listed here has to be on the same domain. For example, if you serve images from a CDN on a separate domain you can’t define a fallback for that.
Creating the cache manifest
Here’s a manifest for the Birds of California site from the previous chapter:
CACHE MANIFEST
# Timestamp:
# 2013-03-15r1
CACHE:
jquery-1.8.0.min.js
gull-360x112.jpg
gull-640x360.jpg
gull-720x225.jpg
FALLBACK:
NETWORK:
*
Notice that there are entries for all the different images. Because these are explicit, the browser will download and cache all of them on the first visit to the page, but will never again need to fetch them.
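To make the structure of the file concrete, here is a simplified reader for it. This is only a sketch: parseManifest is a hypothetical helper, not the browser’s own parser, and it assumes a well-formed manifest that uses only the CACHE, NETWORK, and FALLBACK headers.

```javascript
// Splits manifest text into its CACHE, NETWORK, and FALLBACK entries.
function parseManifest(text) {
  var sections = { CACHE: [], NETWORK: [], FALLBACK: [] };
  var current = 'CACHE'; // entries before any header count as CACHE
  var lines = text.split(/\r?\n/);

  if (lines[0].trim() !== 'CACHE MANIFEST') {
    throw new Error('Manifest must start with "CACHE MANIFEST"');
  }

  lines.slice(1).forEach(function (rawLine) {
    var line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === '' || line.charAt(0) === '#') { return; }

    // A section header switches where the following entries go.
    var header = /^(CACHE|NETWORK|FALLBACK):$/.exec(line);
    if (header) { current = header[1]; return; }

    if (current === 'FALLBACK') {
      // FALLBACK entries are pairs: requested resource, then fallback.
      var parts = line.split(/\s+/);
      sections.FALLBACK.push({ request: parts[0], fallback: parts[1] });
    } else {
      sections[current].push(line);
    }
  });

  return sections;
}
```

Run against the Birds of California manifest, this yields four CACHE entries, no FALLBACK pairs, and a single NETWORK entry, the wildcard.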
Pitfalls of the application cache
The application cache is the nuclear option. That’s because the files in here will never expire until the manifest file itself changes, the user clears the cache, or the cache is updated via JavaScript (more on that later). That’s why we included a timestamp in the manifest so we can easily force a change to the file if we want to invalidate cached versions in the wild.
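Editing the timestamp by hand is easy to forget. One option is to generate the manifest in a build step so the stamp changes on every deploy. The sketch below assumes such a step exists; buildManifest and its parameters are hypothetical names, and the stamp is whatever string your build supplies.

```javascript
// Builds manifest text with a stamp comment that forces invalidation.
// stamp: any string that changes per deploy (date, revision, hash).
function buildManifest(stamp, cacheEntries, networkEntries) {
  var lines = ['CACHE MANIFEST', '# Timestamp: ' + stamp, 'CACHE:']
    .concat(cacheEntries)
    .concat(['NETWORK:'])
    .concat(networkEntries);
  return lines.join('\n') + '\n';
}
```

Because the browser compares the manifest contents, two builds with identical file lists but different stamps still count as a change.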
The application cache is also completely separate from the browser cache. For example, it is possible to create an application cache that will never revalidate. If you set a far-future Expires header on the manifest file, the browser will cache that file forever. When the application cache checks whether it has changed, it will get the version in the browser cache, see that it is unchanged, and then hold on to the cached files forever (or until the user explicitly clears her cache).
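One way to avoid this trap is to make sure the manifest itself is always revalidated. As a sketch, assuming your server lets you choose response headers per file (cacheHeadersFor is a hypothetical helper, and the one-hour default for other files is only a placeholder):

```javascript
// Returns caching headers for a response, keyed on the file name.
function cacheHeadersFor(fileName) {
  if (/\.appcache$/.test(fileName)) {
    // Make the browser check with the server before reusing the manifest.
    return { 'Cache-Control': 'no-cache' };
  }
  // Placeholder policy for everything else: cache for one hour.
  return { 'Cache-Control': 'max-age=3600' };
}
```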
Once the page is cached, it’s possible to visit Birds of California without network connectivity. On iOS, offline is guaranteed to work only if the user has bookmarked the page on her home screen. The cache will still be used in iOS Safari, but its contents may be evicted if the browser needs to reclaim the space for the browser cache.
One of the other pitfalls of the application cache is that when the manifest changes, the browser keeps serving the old cached files for the current visit and only shows the update on the next one. So if a user comes to your site with a stale cache, she’ll still see the cached version, even though the site has been updated. To make sure users get the latest and greatest bird info, we’ll take advantage of the application cache JavaScript API to programmatically check for a stale cache.
Avoiding a stale cache with JavaScript
The API for the cache hangs off the window.applicationCache object. The most important property there is “status.” As shown in Table 4.2, it has an integer value that represents the current state of the application cache.
Table 4.2. Application Cache Status Codes
Code | Name | Description
0 | UNCACHED | The cache isn’t being used.
1 | IDLE | The application cache is not currently being updated.
2 | CHECKING | The manifest is being downloaded and checked for updates.
3 | DOWNLOADING | Updated resources are being downloaded, if available.
4 | UPDATEREADY | The new cache is downloaded and ready to use.
5 | OBSOLETE | The current cache is stale and cannot be used.
Thankfully, you don’t have to remember these numbers; there are constants on the applicationCache object that keep track of the association:
> console.log(window.applicationCache.CHECKING)
2
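When debugging, it can help to go the other way and turn the number back into a name. A minimal sketch follows; statusName and STATUS_NAMES are hypothetical helpers, and the list holds the six constants the HTML5 specification defines on the applicationCache object.

```javascript
var STATUS_NAMES = [
  'UNCACHED', 'IDLE', 'CHECKING', 'DOWNLOADING', 'UPDATEREADY', 'OBSOLETE'
];

// Returns the name of the constant matching cache.status, or 'UNKNOWN'.
function statusName(cache) {
  for (var i = 0; i < STATUS_NAMES.length; i++) {
    if (cache[STATUS_NAMES[i]] === cache.status) {
      return STATUS_NAMES[i];
    }
  }
  return 'UNKNOWN';
}
```

In the browser you would call it as statusName(window.applicationCache).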
On the Birds of California site, we’ll add a short script to check the cache every time the page loads:
//alias for convenience
var appCache = window.applicationCache;
appCache.update();
This goes at the bottom of the page and doesn’t need to wait for the window onload event. At this point we could start polling appCache.status to see whether a new version has been downloaded. When the status is UPDATEREADY, calling the swapCache method forces the browser to swap the updated files into the cache (it will not change what the user is seeing; a reload is still required). It’s simpler, though, to use the built-in events that the applicationCache object provides. We can add an event handler to automatically reload the page when the cache is refreshed:
var appCache = window.applicationCache;
appCache.addEventListener('updateready', function(e) {
  //let's be defensive and double check the status
  if (appCache.status == appCache.UPDATEREADY) {
    //swap in the new cache!
    appCache.swapCache();
    //Reload the page
    window.location.reload();
  }
});
appCache.update();
In addition to the extremely useful “updateready” event, there’s a bigger set of events available on the applicationCache object, which fire as the cache moves between the states we already saw in the status property.
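A quick way to get a feel for these events is to log all of them while you load the page. This is a debugging sketch only: logCacheEvents and CACHE_EVENTS are hypothetical names, and the list holds the event names the HTML5 specification defines for the application cache.

```javascript
var CACHE_EVENTS = [
  'checking', 'noupdate', 'downloading', 'progress',
  'cached', 'updateready', 'obsolete', 'error'
];

// Attaches a listener for every cache event.
// log is called with the event name and the status at that moment.
function logCacheEvents(cache, log) {
  CACHE_EVENTS.forEach(function (name) {
    cache.addEventListener(name, function () {
      log(name, cache.status);
    });
  });
}
```

In the browser, passing window.applicationCache and a function that writes to the console shows the order in which the events fire on a first visit versus a return visit.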
Having the page automatically reload, particularly when the user is in the middle of looking at the site, is a terrible user experience. There are several ways to handle this. Using a confirm dialog box or whisper tip to ask the user to reload to fetch new content is better, but still not great. In the next chapter we’ll explore a much better way to handle this, and other cases, by dynamically updating the content with AJAX.
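For reference, the confirm-dialog variant might look like the following sketch. The function name promptForReload and the message text are assumptions; the confirm and reload steps are passed in as functions so the logic can be tried outside a browser.

```javascript
// Swaps in the new cache, then asks before reloading the page.
function promptForReload(cache, confirmFn, reloadFn) {
  cache.addEventListener('updateready', function () {
    // Same defensive status check as in the earlier handler.
    if (cache.status !== cache.UPDATEREADY) { return; }
    cache.swapCache();
    if (confirmFn('A new version of this site is available. Load it now?')) {
      reloadFn();
    }
  });
}

// Browser wiring; skipped where there is no window or application cache.
if (typeof window !== 'undefined' && window.applicationCache) {
  promptForReload(
    window.applicationCache,
    function (message) { return window.confirm(message); },
    function () { window.location.reload(); }
  );
}
```

If the user declines, the new files are already swapped in, so she will see them on her next page load anyway.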
The 404 problem
If any of the resources in the CACHE entry can’t be retrieved when the browser attempts to fetch them, the browser ignores the cache manifest. This means that if a user visits your site and for some reason one of the requests fails, it will be as if she were a completely new visitor the next time she visits—the cache will be useless. That means the cache is quite brittle: unless all the requests are successful, there’s no caching at all—it’s all or nothing.
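You can’t prevent a request from failing on the user’s network, but you can at least catch a manifest that lists a file you never deployed. The sketch below assumes a Node.js build step and that the CACHE entries are relative paths to files under one directory; findMissingEntries is a hypothetical helper.

```javascript
var fs = require('fs');
var path = require('path');

// Returns the entries that do not exist as files under rootDir.
function findMissingEntries(rootDir, entries) {
  return entries.filter(function (entry) {
    return !fs.existsSync(path.join(rootDir, entry));
  });
}
```

Failing the build when the returned array is not empty keeps a typo in the manifest from silently disabling the cache for every visitor.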
The application cache: Worth the pain?
The application cache is obviously fraught with difficulties, not the least of which is how difficult it is to invalidate. It gives you a lot of power, but at the cost of flexibility and maintainability. Users love an app that launches instantly, but everyone hates strange errors. The stickiness of the application cache leads necessarily to strange bugs that are hard to chase down. When you use it, you’ll eventually end up with a file that you just can’t seem to get out of cache. It isn’t that the application cache is buggy; it’s that it’s completely unforgiving. If you deploy a bad cache, it can be a real problem to undo the error.
Optimizing for browser cache and using the much more flexible web storage API is usually a better choice, but when you want the fastest possible launch time, and you’re willing to accept the difficulties, the application cache is an incredible tool.