Speed Up Your Site: Web Site Optimization
URL abbreviation is one of the most effective techniques you can use to optimize your HTML. First seen on Yahoo!'s home page, URL abbreviation substitutes short redirect URLs (like r/ci) for longer ones (like Computers_and_Internet/) to save space. The Apache and IIS web servers, as well as Manila (http://www.userland.com) and Zope (http://www.zope.org), all support this technique. In Apache, the mod_rewrite module transparently handles URL expansion. In IIS, ISAPI filters handle URL rewrites. Here are some IIS rewriting filters:
ISAPI_Rewrite (http://www.isapirewrite.com/)
OpCode's OpURL (http://www.opcode.co.uk/components/rewrite.asp)
Qwerksoft's IISRewrite based on mod_rewrite (http://www.qwerksoft.com/products/iisrewrite/)
Cocoon
Many of these fancy browser detection and performance techniques can be handled elegantly with Cocoon. Cocoon is an open source web application platform based on XML and XSLT. Actually a big Java servlet, Cocoon runs on most servlet engines. Cocoon can automatically transform XML into (X)HTML and other formats using XSLT. You can set up a style sheet and an output format for certain types of browsers (WAP, DOM, etc.), and Cocoon does the rest. It makes the difficult task of separating content, layout, and logic trivial. Cocoon can handle the following:
Server-side programming
URL rewriting
Browser detection
PDF, legacy file formats, and image generation
Server-side compression
You can read more about Cocoon at http://xml.apache.org/cocoon/ or in the New Riders book (http://www.informit.com/content/index.asp?product_id={C3C05052-BE3B-4E06-A60A-13FB40AF58F6}).
URL abbreviation is especially effective for home or index pages, which typically have a lot of links. It can trim 20 to 30 percent from your HTML file size. The more links you have, the more you'll save.
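To get a feel for how the savings scale with the number of links, here is a minimal Python sketch that abbreviates the href values in a small page and reports the reduction. Only the r/ci and Computers_and_Internet/ pair comes from the text; the other table entries, the sample page, and the function names are hypothetical, and the percentage you see on a real page depends on how much of it is links.

```python
import re

# Abbreviation table: long directory path -> short redirect path.
# Only the first pair appears in the text; the rest are made up for illustration.
ABBREVIATIONS = {
    "Computers_and_Internet/": "r/ci",
    "Business_and_Economy/": "r/be",
    "Arts_and_Humanities/": "r/ah",
}

def abbreviate_links(html: str, table: dict) -> str:
    """Replace each href value found in the table with its short form."""
    def swap(match):
        url = match.group(2)
        # Leave URLs that have no abbreviation untouched.
        return match.group(1) + table.get(url, url) + match.group(3)
    return re.sub(r'(href=")([^"]*)(")', swap, html)

def percent_saved(original: str, optimized: str) -> float:
    """Return the size reduction as a percentage of the original length."""
    return 100.0 * (len(original) - len(optimized)) / len(original)

# Build a tiny link-heavy page, one anchor per table entry.
lines = []
for long_url in ABBREVIATIONS:
    label = long_url.rstrip("/").replace("_", " ")
    lines.append('<a href="' + long_url + '">' + label + "</a>")
page = "\n".join(lines)

small = abbreviate_links(page, ABBREVIATIONS)
print(len(page), "bytes ->", len(small), "bytes,",
      round(percent_saved(page, small), 1), "percent saved")
```

Because only the href values shrink while the link text stays the same, pages dominated by long directory-style links benefit the most.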
NOTE
As with most of these techniques, there's always a tradeoff. Using abbreviated URLs can lower search engine relevance, although you can alleviate this somewhat by having mod_rewrite expand them cleverly.
The popular Apache web server [1] has an optional module, mod_rewrite, that enables your server to automatically rewrite URLs. [2] Created by Ralf Engelschall, this versatile module has been called the "Swiss Army knife of URL manipulation." [3] mod_rewrite can handle everything from URL layout and load balancing to access restriction. We'll be using only a small portion of this module's power: expanding abbreviated URLs with regular expressions.
The module first examines each requested URL. If it matches one of the patterns you specify, the URL is rewritten according to the rule conditions you set. Essentially, mod_rewrite replaces one URL with another, allowing abbreviations and redirects.
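The match-then-rewrite flow described above can be imitated in a few lines of Python. This is a sketch of the idea only, not mod_rewrite's own syntax or implementation; the /r/ prefix pattern, the EXPANSIONS table, and the rewrite function are assumptions made for illustration.

```python
import re

# Hypothetical lookup table: abbreviation -> full path.
EXPANSIONS = {"ci": "/Computers_and_Internet/"}

# Pattern for abbreviated URLs such as /r/ci or /r/ci/ (assumed layout).
ABBREVIATED = re.compile(r"^/r/([^/]+)/?$")

def rewrite(url: str) -> str:
    """Return the expanded URL if the pattern matches, else the URL unchanged."""
    match = ABBREVIATED.match(url)
    if match is None:
        return url  # no rule applies, so the request passes through
    # Unknown abbreviations also fall through unchanged.
    return EXPANSIONS.get(match.group(1), url)

print(rewrite("/r/ci"))
print(rewrite("/about.html"))
```

A single pattern plus a lookup table covers every abbreviation, which is why the number of rules stays constant as the table grows.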
This URL rewriting machine manipulates URLs based on various tests, including environment variables, time stamps, and even database lookups. You can have up to 50 global rewrite rules without any discernible effect on server performance. [4] Abbreviated URI expansion requires only one.
Tuning mod_rewrite
To install mod_rewrite on your Apache web server, you or your IT department needs to edit one of your server configuration files. The best place to configure mod_rewrite is the httpd.conf file, because it is read only once per server restart. Without access to the configuration file, you'll have to use an .htaccess file in each directory. Keep in mind that the same performance caveats that apply to mod_include apply here: .htaccess files are slower because, for every requested URL, the server must traverse each directory on the path and read any .htaccess file it finds.
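To make the per-request cost concrete, this Python sketch lists the .htaccess files a server would have to look for when serving one file. It assumes per-directory overrides are allowed in every directory on the path; the function name and the sample path are hypothetical.

```python
from pathlib import PurePosixPath

def htaccess_candidates(file_path: str) -> list:
    """List the .htaccess files checked for a file, from the root downward."""
    directory = PurePosixPath(file_path).parent
    # .parents runs from the nearest ancestor up to the root, so reverse it
    # to walk from the root down, then add the file's own directory.
    dirs = list(reversed(directory.parents)) + [directory]
    return [str(d / ".htaccess") for d in dirs]

for candidate in htaccess_candidates("/www/htdocs/example/index.html"):
    print(candidate)
```

Each extra directory level adds one more lookup per request, whereas rules placed in httpd.conf are loaded once at startup.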