Crunching and Obfuscation
Not to be confused with compression, crunching (or crushing or packing) is a term programmers have adopted to describe removing excess to reduce code to a minimum size. Although you can crunch code manually by removing whitespace and comments and abbreviating names, automated programs are a more practical option for larger projects. There are several JavaScript crunchers available, including these:
JavaScript Crunchinator from BrainJar's Mike Hall (http://www.brainjar.com/js/crunch/): Removes whitespace and comments from JavaScript files and combines literal strings.
ESC (ECMAScript Cruncher) from Saltstorm (http://www.saltstorm.net/depo/esc/): This free Windows program is an ECMAScript pre-processor written in JScript. In addition to removing whitespace and comments from JavaScript, it can optionally rename variables in JavaScript. For IE5.5+ Win.
JSCruncher, by Nebiru Software, on the DOMAPI page (http://www.domapi.com/): Based on BrainJar's specifications, this free Windows application packs CSS and JavaScript files. Requires semicolons.
Script Squisher by Darren Semotiuk (http://batman.getmyip.com/projects/scriptsquisher/): This updated 5K entry squishes JavaScript by removing whitespace and comments. Does not require semicolons.
SpaceAgent from Insider Software, Inc. (http://www.insidersoftware.com/): This powerful Windows/Mac web site optimizer optimizes (X)HTML, XML, JavaScript, GIFs, and JPEGs. Server version also available.
VSE HTML Turbo from VSE Online (http://www.vse-online.com/): Like SpaceAgent, this Mac application optimizes (X)HTML, JavaScript, GIFs, and JPEGs.
These programs all work the same way, removing whitespace and comments to compact your code. Some, like ESC, optionally abbreviate object and variable names.
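To make the idea concrete, here is a minimal sketch of what a cruncher does. It is not the code of any tool listed above; the function name crunch is hypothetical, and the sketch assumes that every statement ends with a semicolon and that the source contains no regular-expression literals.

```javascript
// Minimal cruncher sketch (hypothetical, not taken from any listed tool).
// Assumes semicolon-terminated statements and no regular-expression literals.
function crunch(source) {
  var isWord = /[\w$]/;
  var out = '';
  var i = 0;
  var n = source.length;
  while (i < n) {
    var c = source.charAt(i);
    if (c === '"' || c === "'") {
      // Copy a string literal through untouched, honoring backslash escapes.
      var start = i++;
      while (i < n && source.charAt(i) !== c) {
        if (source.charAt(i) === '\\') i++; // skip the escaped character
        i++;
      }
      i++; // step past the closing quote
      out += source.slice(start, i);
      continue;
    }
    // Skip a whole run of whitespace and comments.
    var skipped = false;
    while (i < n) {
      c = source.charAt(i);
      var two = source.slice(i, i + 2);
      if (/\s/.test(c)) {
        i++;
      } else if (two === '//') {
        var eol = source.indexOf('\n', i);
        i = (eol === -1) ? n : eol;
      } else if (two === '/*') {
        var end = source.indexOf('*/', i + 2);
        i = (end === -1) ? n : end + 2;
      } else {
        break;
      }
      skipped = true;
    }
    if (skipped) {
      // Keep one space only where removing it would change the meaning:
      // between two names ("var a") or between repeated signs ("a + +b").
      var prev = out.charAt(out.length - 1);
      var following = source.charAt(i);
      var sameSign = (prev === '+' || prev === '-') && prev === following;
      if ((isWord.test(prev) && isWord.test(following)) || sameSign) out += ' ';
      continue;
    }
    out += c;
    i++;
  }
  return out;
}
```

Called on the two-line input "var a = 1; // set a" followed by "var b = 'x // y';  /* note */ a = a + +b;", this sketch returns var a=1;var b='x // y';a=a+ +b; with the comment-like text inside the string left alone. A production cruncher has to handle far more cases, which is one reason some of the tools above require semicolons.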
Obfuscation Anyone?
Because JavaScript is an interpreted language, hiding your scripts is impossible. You can, however, make them more difficult to decipher. Crunching certainly makes your code more difficult to read. But for some this is not enough. That's where obfuscators come in. Code obfuscators substitute cryptic string tokens and scramble names to make your code virtually unintelligible but still functional.
Blue Clam: JavaScript Obfuscator
By the time you read this, Solmar Solutions, Inc., will have released Blue Clam, a Java-based JavaScript obfuscator designed to protect your intellectual property and optimize JavaScript files. In development for two years, Blue Clam includes features not found in other JavaScript obfuscators, including recursive directory tree parsing, a user-defined keyword dictionary, variable-length obfuscated keyword support, extended file types (such as .js, .jsp, and .asp), and a graphical environment. For more information, see http://www.solmar.ca.
All obfuscators work in a similar way to transform your program internally while preserving the same external functionality. One common obfuscation is to substitute short meaningless sequences like "cq" for longer descriptive names like "setAvatarMood." Let's look at some real-world obfuscated code. So this:
function setAvatarMood(theMood) {
  try {
    //see if we have to reset the mood's duration
    var resetMoodDuration = ((theMood != null) && (theMood != 'anim'));
    //make sure there is a mood
    if (!theMood) theMood = avatar.data.mood;
    //store the new mood in the avatar
    if (theMood != 'anim') avatar.data.mood = theMood;
    //see if the mood exists
    if (!globals.moods[theMood]) theMood = globals.defaultMood;
    //set the appropriate mood-image to visible and all others to invisible
    //by moving them in or out of view
    for (var aMood in globals.moods) {
      avatar.labeledElements['avatar' + globals.moods[aMood] + 'Image'].style.top = ((aMood == theMood)?0:-10000) + 'px';
    }
    //let the mood expire if it is not equal to the default mood
    if (resetMoodDuration && (theMood != globals.defaultMood)) {
      if (theMood != 'anim') delayedEval(avatar.id + ".setMood", null);
      delayedEval(avatar.id + ".setMood", "try { engine.getAvatarByID('Quek', '" + avatar.id + "').setMood('" + globals.defaultMood + "'); } catch(e){;}", avatar.MOODDURATION);
    }
  } catch(e){}
}
Becomes this (without whitespace removal):
function cq(de) {
  try {
    var ch = ((de != null) && (de != 'anim'));
    if (!de) de = kj.data.hu;
    if (de != 'anim') kj.data.hu = de;
    if (!io.uy[de]) de = io.we;
    for (var ty in io.uy) {
      kj.op['avatar' + io.uy[ty] + 'Image'].style.top = ((ty == de)?0:-10000) + 'px';
    }
    if (ch && (de != io.we)) {
      if (de != 'anim') qw(kj.id + ".pw", null);
      qw(kj.id + ".setMood", "try { po.pp('Quek', '" + kj.id + "').pw('" + io.we + "'); } catch(e){;}", kj.ua);
    }
  } catch(e){}
}
Even better (with whitespace removed):
function cq(de){try{var ch=((de!=null)&&(de!='anim'));if(!de)de=kj.data.hu; if(de!='anim')kj.data.hu=de;if(!io.uy[de])de=io.we;for(var ty in io.uy){ kj.op['avatar'+io.uy[ty]+'Image'].style.top=((ty==de)?0:-10000)+'px';} if(ch&&(de!=io.we)){if(de!='anim')qw(kj.id+".pw",null); qw(kj.id+".setMood","try{po.pp('Quek', '" +kj.id+"').pw('"+io.we+"');} catch(e){;} ",kj.ua);}}catch(e){}}
Without a map, these internal transformations make your program extremely difficult to reverse engineer, and the result is 65 percent smaller (from 1,091 to 376 characters). The code is part of Quek (http://www.quek.nl), a browser-based surf/animate/chat application written in JavaScript; this function changes the mood of an avatar. Lon Boonen of Q42 (http://www.q42.nl) obfuscates his JavaScript with a home-grown script and some manual tweaking to keep prying eyes out. Thanks to Lon Boonen for these snippets.
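The name substitution itself can be sketched in a few lines. This is not Lon Boonen's script, which is not shown here; renameIdentifiers is a hypothetical helper, and the map reuses a few of the short names from the listing above. It assumes comments have already been stripped, and it deliberately leaves string literals untouched, so names that appear inside strings passed to eval would need separate handling.

```javascript
// Hypothetical renaming sketch: replaces identifiers found in `map`,
// leaving everything inside string literals alone.
function renameIdentifiers(source, map) {
  // Group 1 matches a whole string literal, group 2 an identifier.
  var token = /("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')|([A-Za-z_$][\w$]*)/g;
  return source.replace(token, function (match, str, name) {
    if (str !== undefined) return str; // string literal: copy as-is
    // Only rename names listed in the map; keywords and other names stay.
    return Object.prototype.hasOwnProperty.call(map, name) ? map[name] : name;
  });
}

// The "map" that a reader would need to reverse the transformation.
var moodMap = { setAvatarMood: 'cq', theMood: 'de', avatar: 'kj', mood: 'hu' };

var obfuscated = renameIdentifiers(
  "function setAvatarMood(theMood) { if (theMood != 'anim') avatar.data.mood = theMood; }",
  moodMap
);
// obfuscated is now:
// "function cq(de) { if (de != 'anim') kj.data.hu = de; }"
```

Because every occurrence of a name is rewritten the same way, the program behaves as before as long as the map covers only names the script itself defines. Built-in names such as style or top must stay out of the map.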
JavaScript Obfuscators
JavaScript obfuscators are few and far between. Here are some examples:
JavaScript Scrambler (http://www.quadhead.de/jss.html)
Jmyth (http://www.geocities.com/SiliconValley/Vista/5233/jmyth.htm)
You can go further and substitute extended ASCII characters to obfuscate and tokenize your code even more. For maximum confusion, obfuscate built-in names that cannot be renamed, such as DOM method names, by breaking them into string fragments and accessing the property through a concatenated variable. So instead of this:
bc.getElementById = kj;
Do this:
jh='ge';kl='tEleme';oi='ntB';zy='yId';ui=jh+kl+oi+zy; bc[ui]=kj;
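The trick relies on the fact that bracket notation accepts any string expression as a property name. The following sketch runs outside a browser by using a plain object as a stand-in for bc and a dummy function for kj; both stand-ins are assumptions for illustration only. Property names are case-sensitive, so the fragments must join to exactly getElementById.

```javascript
// Stand-ins so the sketch runs anywhere (assumptions, not real DOM objects).
var bc = {};
var kj = function (id) { return 'looked up ' + id; };

// Fragments of the property name, joined at run time.
var jh = 'ge', kl = 'tEleme', oi = 'ntB', zy = 'yId';
var ui = jh + kl + oi + zy;

bc[ui] = kj; // same effect as: bc.getElementById = kj;

// Reading the property back with dot notation finds the same function.
var sameProperty = (bc.getElementById === kj);

// A fragment with the wrong case names a different property.
var wrongCaseExists = ('ge' + 'tEleme' + 'ntB' + 'yID') in bc;
```

Here sameProperty is true and wrongCaseExists is false.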
Self-Extracting Archives
Some extreme programmers have gone so far as to create their own self-extracting archives. Trading time for space, they store their encoded script in one long string by substituting shorter tokens for longer repeated strings. Tack on a small decompressor at the end to replace the tokens with the original strings and eval the decompressed code, and voilà: a self-extracting script.
These self-extracting archives take longer to decompress and execute, but download much faster. Some 5K contestants (http://www.the5k.org/) have adopted this approach to squeeze the maximum functionality into as little space as possible.
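A minimal sketch of the token-substitution idea follows. It is not the code of any 5K entry or published packer; pack, unpack, and selfExtracting are hypothetical names, the dictionary is written by hand rather than discovered automatically, and the tokens are assumed to be characters that never occur in the original script.

```javascript
// dictionary: array of [token, phrase] pairs (hand-written for this sketch).
function pack(source, dictionary) {
  var packed = source;
  for (var i = 0; i < dictionary.length; i++) {
    var token = dictionary[i][0];
    // A token that already occurs in the script could not be told apart later.
    if (source.indexOf(token) !== -1) {
      throw new Error('token already occurs in source: ' + token);
    }
    packed = packed.split(dictionary[i][1]).join(token);
  }
  return packed;
}

// Reverse the substitutions, last one first.
function unpack(packed, dictionary) {
  var source = packed;
  for (var i = dictionary.length - 1; i >= 0; i--) {
    source = source.split(dictionary[i][0]).join(dictionary[i][1]);
  }
  return source;
}

// Build a script that carries its own packed text, dictionary, and decompressor.
function selfExtracting(source, dictionary) {
  return 'var s=' + JSON.stringify(pack(source, dictionary)) +
    ',d=' + JSON.stringify(dictionary) +
    ';for(var i=d.length;i--;)s=s.split(d[i][0]).join(d[i][1]);eval(s);';
}

// Example: one extended ASCII character stands in for a repeated phrase.
var original = 'var total = 0; total = total + 20; total = total + 22; total';
var tokens = [['\u00a1', 'total = total + ']];
var packedText = pack(original, tokens);    // 'var total = 0; ¡20; ¡22; total'
var roundTrip = unpack(packedText, tokens); // identical to original
var archive = selfExtracting(original, tokens);
```

Evaluating archive unpacks the script and runs it. For a source this small the wrapper outweighs the savings, so the technique only pays off once the script contains enough repetition.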
Fans of Chris Nott's 1K DOM API used a similar technique to reduce his tiny API to 634 bytes. Chris has automated the process with his compression utilities at http://www.dithered.com/experiments/compression/.
Compression ratios average about 25 percent for 5K files and higher for larger files. Because the decompressor adds about 130 bytes, smaller files actually can become larger. Nott recommends using files over 500 bytes for his client-side compression scheme.
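The break-even point can be estimated from these two figures. The sketch below reads the 25 percent figure as the fraction of bytes saved and assumes it holds at every file size, which is a simplification because the savings reportedly grow with file size. The function names are hypothetical.

```javascript
// Estimated size after packing: the shrunken script plus the fixed decompressor.
function packedSize(originalBytes, savingsRatio, overheadBytes) {
  return Math.round(originalBytes * (1 - savingsRatio)) + overheadBytes;
}

// File size at which the bytes saved equal the decompressor overhead.
function breakEvenSize(savingsRatio, overheadBytes) {
  return overheadBytes / savingsRatio;
}

var breakEven = breakEvenSize(0.25, 130);   // 520 bytes
var smallFile = packedSize(400, 0.25, 130); // 430 bytes, larger than the 400-byte original
var fiveK = packedSize(5120, 0.25, 130);    // 3970 bytes, a net saving of 1150 bytes
```

Under these assumptions the break-even point is about 520 bytes, which is in line with the recommendation to use files over 500 bytes.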
Chris Johnson's Extended ASCII JavaScript Packer packs JavaScript efficiently by substituting single-byte extended ASCII characters as tokens for longer strings (http://members.optusnet.com.au/~kris_j/javacomp.html).
Both of these programs reserve certain characters and require you to avoid certain coding techniques for the packed script to work. For the 5K contest, only client-side techniques are allowed. For most sites, server-side compression is a more practical solution.