Introducing ColdFusion
IN THIS CHAPTER
The Basics
If you're embarking on learning ColdFusion, then you undoubtedly have an interest in applications that are Web (shorthand for World Wide Web) based. ColdFusion is built on top of the Internet (and the Web), so before getting started, a good understanding of the Internet and related technologies is a must.
There is no need to introduce you to the Internet and the Web. The fact that you're reading this book is evidence enough that these are important to you (as they should be). The Web is everywhere, and Web site addresses appear on everything from toothpaste commercials to movie trailers to cereal boxes to car showrooms. In August 1981, 213 hosts (computers) were connected to the Internet. By the turn of the millennium that number had grown to about 100 million! And most of them are accessing the Web.
What has made the World Wide Web so popular? That, of course, depends on whom you ask. But most will agree that these are the two primary reasons:
Ease of use. Publishing information on the Web and browsing for information are relatively easy tasks.
Quantity of content. With millions of Web pages from which to choose and thousands more being created each day, there are sites and pages to cater to almost every surfer's tastes.
A massive potential audience awaits your Web site and the services it offers. Of course, massive competition awaits you too. Most Web sites still primarily consist of static information, sometimes dubbed brochureware. That's rather sad, because the Web is a powerful medium and is capable of so much more. You could, and should, be offering much more than just static text and images. You need features like:
Dynamic, data-driven Web pages
Database connectivity
Intelligent, user-customized pages
Sophisticated data collection and processing
Email interaction
Rich and engaging user interfaces
ColdFusion enables you to do all this, and more.
But you need to take a step back before starting ColdFusion development. As I mentioned, ColdFusion takes advantage of existing Internet technologies. As such, a prerequisite to ColdFusion development is a good understanding of the Internet, the World Wide Web, Web servers and browsers, and how all these pieces fit together.
The Internet
Much ambiguity and confusion surround the Internet, so we'll start with a definition. Simply put, the Internet is the world's largest network.
The networks found in most offices today are local area networks (LANs), comprised of a group of computers in relatively close proximity to each other and linked by special hardware and cabling (see Figure 1.1). Some computers are clients (more commonly known as workstations); others are servers (also known as file servers). All these computers can communicate with each other to share information.
Figure 1.1 A LAN is a group of computers in close proximity linked by special cabling.
Now imagine a bigger network, one that spans multiple geographical locations. This type of network is typically used by larger companies with offices in multiple locations. Each location has its own LAN, which links the local computers together. All these LANs in turn are linked to each other via some communications medium. The linking can be anything from simple dial-up modems to high-speed T1 or T3 connections and fiber-optic links. The complete group of interconnected LANs, as shown in Figure 1.2, is called a wide area network (WAN).
Figure 1.2 A WAN is made up of multiple, interconnected LANs.
WANs are used to link multiple locations within a single company. Suppose you need to create a massive network that links every computer everywhere. How would you do this?
You'd start by running high-speed backbones, connections capable of moving large amounts of data at once, between strategic locations, perhaps large cities or different countries. These backbones would be similar to high-speed, multilane, interstate highways connecting various locations. You'd build in fault tolerance to make these backbones fully redundant so that if any connection broke, at least one other way to reach a specific destination would be available.
You'd then create thousands of local links that would connect every city to the backbones over slower connections, like state highways or city streets. You'd allow corporate WANs, LANs, and even individual users with dial-up modems to connect to these local access points. Some would stay connected at all times, whereas others would connect as needed.
You'd create a common communications language so that every computer connected to this network could communicate with every other computer.
Finally, you'd devise a scheme to uniquely identify every computer connected to the network. This would ensure that information sent to a given computer actually reached the correct destination.
Congratulations, you've just created the Internet!
Even though this is an oversimplification, it is exactly how the Internet works.
The high-speed backbones do exist. Many are owned and operated by the large telecommunications companies.
The local access points, more commonly known as points of presence (POPs), are run by phone companies, online services, cable companies, and local Internet service providers (also known as ISPs).
The common language is IP, the Internet protocol, except that the term language is a misnomer. A protocol is a set of rules governing behavior in certain situations. Foreign diplomats learn local protocol to ensure that they behave correctly in another country. The protocols ensure that no communication breakdowns or serious misunderstandings occur. Computers also need protocols to ensure that they can communicate with each other correctly and that data is exchanged correctly. IP is the protocol used to communicate across the Internet, so every computer connected to the Internet must be running a copy of IP.
The unique identifiers are IP addresses. Every computer, or host, connected to the Internet has a unique IP address. These addresses are made up of four sets of numbers separated by periods (208.193.16.100, for example). Some hosts have fixed (or static) IP addresses, whereas others have dynamically assigned addresses (assigned from a pool each time a connection is made). Regardless of how an IP address is obtained, no two hosts connected to the Internet can use the same IP address at any given time. That would be like two homes having the same phone number or street address. Information would end up in the wrong place all the time.
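To make the address format concrete, here is a minimal Python sketch that splits the example address into its four numbers using the standard ipaddress module. The helper name octets is made up for this illustration and is not part of any library.

```python
import ipaddress


def octets(address: str) -> list[int]:
    """Return the four numbers that make up a dotted IPv4 address."""
    # IPv4Address validates the text and raises ValueError if it is malformed
    # (for example, if one of the numbers is larger than 255).
    parsed = ipaddress.IPv4Address(address)
    # .packed is the 4-byte form of the address; each byte is one of the numbers.
    return list(parsed.packed)


if __name__ == "__main__":
    print(octets("208.193.16.100"))
```

Each of the four numbers fits in a single byte, which is why no number in an address of this form can exceed 255.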
Internet Applications
The Internet itself is simply a massive communications network and offers very little to most users, which is why it took 20 years for the Internet to become the phenomenon it is today.
The Internet has been dubbed the Information Superhighway, and that analogy is quite accurate. Highways themselves are not nearly as exciting as the places you can get to by traveling them, and the same is true of the Internet. What makes the Internet so exciting are the applications that run over it and what you can accomplish with them.
The most popular application now is the World Wide Web. It is the Web that single-handedly transformed the Internet into a household word. In fact, many people mistakenly think that the World Wide Web is the Internet. This is definitely not the case, and Table 1.1 lists some of the more popular Internet-based applications.
All these various applications, and many others, use IP to communicate across the Internet. The information transmitted by these applications is broken into packets, small blocks of data, which are sent to a destination IP address. The application at the receiving end processes the received information.
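The idea of breaking information into packets can be sketched in a few lines of Python. This is a simplified model, not how IP is actually implemented: the Packet class, the 8-byte block size, and the sequence field are all assumptions made for illustration (real packets carry more header information, and ordering is handled by protocols layered on top of IP, such as TCP).

```python
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str  # IP address of the receiving host
    sequence: int     # position of this block in the original data (illustrative)
    payload: bytes    # the small block of data itself


def to_packets(data: bytes, destination: str, size: int = 8) -> list[Packet]:
    """Break data into blocks of at most `size` bytes, each addressed to destination."""
    return [
        Packet(destination, number, data[start:start + size])
        for number, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets) -> bytes:
    """Rebuild the original data, even if the packets arrived out of order."""
    ordered = sorted(packets, key=lambda packet: packet.sequence)
    return b"".join(packet.payload for packet in ordered)


if __name__ == "__main__":
    message = b"Hello from a ColdFusion page"
    packets = to_packets(message, "208.193.16.100")
    print(len(packets), reassemble(reversed(packets)) == message)
```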
Table 1.1 Some Internet-Based Applications
APPLICATION | DESCRIPTION
Email | Simple Mail Transfer Protocol (SMTP) is the most popular email transmission mechanism, and the Post Office Protocol (POP) is the most used mail access interface.
FTP | File Transfer Protocol is used to transfer files between hosts.
Gopher | This menu-driven document retrieval system was very popular before the creation of the World Wide Web.
IRC | Internet Relay Chat enables real-time, text-based conferencing over the Internet.
NFS | Network File System is used to share files among various hosts.
Newsgroups | Newsgroups are threaded discussion lists, of which thousands exist (accessed via NNTP).
Telnet | Telnet is used to log on to a host from a remote location.
VPN | Virtual Private Networks facilitate the secure access of private networks over the Internet.
WWW | The World Wide Web.
DNS
IP addresses are the only way to uniquely specify a host. When you want to communicate with a host (a Web server, for example), you must specify the IP address of the Web server you are trying to contact.
As you know from browsing the Web, you rarely specify IP addresses directly. You do, however, specify a hostname, such as www.forta.com (my Web site). If hosts are identified by IP addresses, how does your browser know which Web server to contact if you specify a hostname?
The answer is the Domain Name System (DNS). DNS is a mechanism that maps hostnames to IP addresses. When you specify the destination address www.forta.com, your browser sends an address resolution request to a DNS server asking for the IP address of that host. The DNS server returns an actual IP address, in this case 208.193.16.100. Your browser can then use this address to communicate with the host directly.
If you've ever mistyped a hostname, you've seen error messages similar to the one seen in Figure 1.3, which tell you the host could not be found, or that no DNS entry was found for the specified host. These error messages mean the DNS server was unable to resolve the specified hostname.
Figure 1.3 Mistyping a URL often causes DNS errors.
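The lookup just described, including the failure case, can be modeled with a small Python sketch. This is a toy: the RECORDS table stands in for a DNS server's data, and the names resolve and HostNotFound are made up for illustration. A real program would typically ask the operating system's resolver instead, for example with socket.gethostbyname from the standard library.

```python
# Toy lookup table standing in for a DNS server's records (illustrative only).
RECORDS = {
    "www.forta.com": "208.193.16.100",
}


class HostNotFound(LookupError):
    """Raised when no entry exists for the requested hostname."""


def resolve(hostname: str) -> str:
    """Map a hostname to its IP address, as a DNS server would."""
    try:
        # Hostnames are not case sensitive, so normalize before looking up.
        return RECORDS[hostname.lower()]
    except KeyError:
        raise HostNotFound(f"no DNS entry found for {hostname!r}") from None


if __name__ == "__main__":
    print(resolve("www.forta.com"))
    try:
        resolve("www.fotra.com")  # mistyped hostname
    except HostNotFound as error:
        print(error)
```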
DNS is never actually needed (well, usually; there is an exception that I'll get to in a moment). Users can always specify a destination host by its IP address to connect to the host. There are, however, some very good reasons not to:
IP addresses are hard to remember and easy to mistype. Users are more likely to find www.forta.com than they are 208.193.16.100.
IP addresses are subject to change. For example, if you switch service providers, you might be forced to use a new set of IP addresses for your hosts. If users identified your site only by its IP address, they'd never be able to reach your host if the IP address changed. Your DNS name, however, stays the same even if your IP address switches. You need to change only the mapping so the hostname maps to the new, correct IP address (the new service provider usually handles that).
IP addresses must be unique, as already explained, but DNS names need not be. Multiple hosts, each with a unique IP address, can all share the same DNS name. This enables load balancing between servers, as well as the establishment of redundant servers (so that if a server goes down, another server will still process requests).
A single host, with a single IP address, can have multiple DNS names. This enables you to create aliases if needed. For example, ftp.forta.com, www.forta.com, and even just plain forta.com might point to the same IP address, and thus the same server.
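The last two points (several hosts sharing one name, and several names pointing at one host) can be sketched as follows. Everything here is an assumption made for illustration: the make_resolver helper, the www.example.com name, and the 192.0.2.x addresses are invented, and simple rotation is only one of several ways load can be spread across servers.

```python
from itertools import cycle

# Hypothetical records. One name may list several addresses (shared name),
# and several names may list the same address (aliases).
RECORDS = {
    "www.example.com": ["192.0.2.10", "192.0.2.11"],  # two servers, one name
    "www.forta.com": ["208.193.16.100"],
    "ftp.forta.com": ["208.193.16.100"],              # alias for the same host
    "forta.com": ["208.193.16.100"],                  # another alias
}


def make_resolver(records: dict[str, list[str]]):
    """Return a resolve(hostname) function that rotates through each name's addresses."""
    rotations = {name: cycle(addresses) for name, addresses in records.items()}

    def resolve(hostname: str) -> str:
        # Each call hands out the next address, spreading requests across servers.
        return next(rotations[hostname])

    return resolve


if __name__ == "__main__":
    resolve = make_resolver(RECORDS)
    print([resolve("www.example.com") for _ in range(4)])
    print(resolve("ftp.forta.com") == resolve("forta.com"))
```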
DNS servers are special software programs. Your ISP will often host your DNS entries, so you don't need to install and maintain your own DNS server software.
You can host your own DNS server and gain more control over the domain mappings, but in doing so, you inherit the responsibility of maintaining the server. If your DNS server is down, there won't be any way of resolving the hostname to an IP address, and no one will be able to find your site.
Intranets and Extranets
Intranets and extranets were the big buzzwords a few years back, and while some of the hype has worn off, intranets and extranets are still in use and still of value. It was not too long ago that most people thought intranet was a typo; but in a very short period of time, intranets and extranets became recognized as legitimate and powerful new business tools.
An intranet is nothing more than a private Internet. In other words, it is a private network, usually a LAN or WAN, that enables the use of Internet-based applications in a secure and private environment. As on the public Internet, intranets can host Web servers, FTP servers, and any other IP-based services. Companies have been using private networks for years to share information. Traditionally, office networks have not been information friendly. Old private networks did not have consistent interfaces, standard ways to publish information, or client applications that were capable of accessing diverse data stores. The popularity of the public Internet has spawned a whole new generation of inexpensive and easy-to-use client applications. These applications are now making their way back into the private networks. The reason intranets are now getting so much attention is that they are a new solution to an old problem.
Extranets take this new communication mechanism one step further. Extranets are intranet-style networks that link multiple sites or organizations using intranet-related technologies. Many extranets actually use the public Internet as their backbones and employ encryption techniques to ensure the security of the data being moved over the network.
The two things that distinguish intranets and extranets from the Internet are who can access them and from where they can be accessed. Don't be confused by hype surrounding applications that claim to be intranet ready. If an application can be used over the public Internet, it will work on private intranets and extranets, too.
Web Servers
As mentioned earlier, the most commonly used Internet-based application is now the World Wide Web. The recent growth of interest in the Internet is the result of growing interest in the World Wide Web.
The World Wide Web is built on a protocol called the Hypertext Transfer Protocol (HTTP). HTTP is designed to be a small, fast protocol that is well suited for distributed, multimedia information systems and hypertext jumps between sites.
The Web consists of pages of information on hosts running Web-server software. The host is often referred to as the Web server, which is technically inaccurate. The Web server is software, not the computer itself. Versions of Web server software can run on almost all computers. There is nothing intrinsically special about a computer that hosts a Web server, and no rules dictate what hardware is appropriate for running a Web server.
The original World Wide Web development was all performed under various flavors of Unix. The majority of Web servers still run on Unix boxes, but this is changing. Now Web server versions are available for almost every major operating system. Web servers hosted on high-performance operating systems, such as Windows 2000 and Windows XP, are becoming more and more popular. This is because Unix is still more expensive to run than Windows and is also more difficult for the average user to use. Windows XP (built on top of Windows NT) has proven itself to be an efficient, reliable, and cost-effective platform for hosting Web servers. As a result, Windows' slice in the Web server operating system pie is growing. At the same time, Linux (a flavor of Unix) is growing in popularity as a Web platform thanks to its low cost, its robustness, and the fact that it is slowly becoming more usable to less technical users.
What exactly is a Web server? A Web server is a program that serves Web pages upon request. Web servers typically don't know or care what they are serving. When a user at a specific IP address requests a specific file, the Web server tries to retrieve that file and send it back to the user. The requested file might be a Web page's HTML source code, a GIF image, a Flash file, an XML document, or an AVI file. It is the Web browser that determines what should be requested, not the Web server. The server simply processes that request, as shown in Figure 1.4.
Figure 1.4 Web servers process requests made by Web browsers.
It is important to note that Web servers typically do not care about the contents of these files. HTML code in a Web page, for example, is markup that the Web browser, not the Web server, will process. The Web server returns the requested page as is, regardless of what the page is and what it contains. If HTML syntax errors exist in the file, those errors will be returned along with the rest of the page.
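The "return it as is" behavior can be shown with a deliberately tiny Python model. This is not a working Web server: the handle_request function and the in-memory SITE dictionary are assumptions made for illustration, and the file contents are stand-in bytes. The point is only that the server hands back whatever bytes it finds, without inspecting or correcting them.

```python
def handle_request(files: dict[str, bytes], path: str) -> tuple[int, bytes]:
    """Return (HTTP status code, body) for a requested path."""
    body = files.get(path)
    if body is None:
        # The requested file does not exist.
        return 404, b"Not Found"
    # The body goes back byte for byte; nothing here parses or repairs it.
    return 200, body


# Hypothetical site contents. The HTML below has an unclosed <p> tag on purpose.
SITE = {
    "/index.html": b"<html><body><p>Unclosed paragraph</body></html>",
    "/logo.gif": b"GIF89a",  # stand-in bytes, not a complete image
}

if __name__ == "__main__":
    print(handle_request(SITE, "/index.html"))
    print(handle_request(SITE, "/missing.html"))
```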
Connections to Web servers are made on an as-needed basis. If you request a page from a Web server, an IP connection is made over the Internet between your host and the host running the Web server. The requested Web page is sent over that connection, and the connection is broken as soon as the page is received. If the received page contains references to additional information to be downloaded (for example, GIF or JPG images), each would be retrieved using a new connection. Therefore, it takes at least six requests, or hits, to retrieve all of a Web page with five pictures in it.
NOTE
This is why the number of hits is such a misleading measure of Web server activity. When you learn of Web servers that receive millions of hits in one day, it might not mean that there were millions of visitors. Hits do not equal the number of visitors or pages viewed. In fact, hits are a useful measure only of changes in server activity.
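The arithmetic behind hits is easy to check with a short Python sketch. It assumes, as the text above describes, one request for the page itself plus one request per image; the ImageCounter class, the hits_for_page helper, and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Count the <img> tags in a page."""

    def __init__(self):
        super().__init__()
        self.images = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images += 1


def hits_for_page(html: str) -> int:
    """Return the number of requests needed: one for the page, one per image."""
    counter = ImageCounter()
    counter.feed(html)
    return 1 + counter.images


# A hypothetical page containing five pictures.
PAGE = (
    "<html><body>"
    + "".join(f'<img src="pic{number}.gif">' for number in range(1, 6))
    + "</body></html>"
)

if __name__ == "__main__":
    print(hits_for_page(PAGE))
```

A single visitor viewing this one page accounts for six hits, which is why hit counts overstate both visitors and page views.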
Web servers often are not the only IP-based applications running on a single host. In fact, aside from performance issues, there is no reason a single host cannot run multiple services. For example, a Web server, an FTP server, a DNS server, and an SMTP/POP3 mail server can run at the same time. Each server is assigned a port address to ensure that each server application responds only to requests and communications from appropriate clients. If IP addresses are like street addresses, ports can be thought of as apartment or suite numbers. A total of 65,536 ports are available on every host. Ports 0 through 1023 are the Well Known Ports, ports reserved for special applications and protocols (such as HTTP). Vendor-specific applications that communicate over the Internet (such as America Online's Instant Messenger, Microsoft SQL Server, and the Real Media player) typically use ports 1024 through 49151. No two applications can share a port at the same time.
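The port ranges above can be expressed as a small Python helper. The function name port_range and the labels it returns are chosen for this sketch; "registered" and "dynamic" are the names commonly used for the two upper ranges and do not come from the text above.

```python
def port_range(port: int) -> str:
    """Classify a port number by the range it falls in."""
    # 65,536 ports in total means valid numbers run from 0 through 65535.
    if not 0 <= port <= 65535:
        raise ValueError(f"port must be between 0 and 65535, got {port}")
    if port <= 1023:
        return "well known"   # reserved for special applications and protocols
    if port <= 49151:
        return "registered"   # typically used by vendor-specific applications
    return "dynamic"          # everything above 49151


if __name__ == "__main__":
    for port in (80, 1723, 50000):
        print(port, port_range(port))
```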
Most servers use a standard set of port mappings, and some of the more common ports are listed in Table 1.2.
Table 1.2 Common IP Port Numbers
PORT | USE
20 | FTP
21 | FTP
23 | Telnet
25 | SMTP
43 | Whois
53 | DNS
70 | Gopher
79 | Finger
80 | HTTP
107 | Remote Telnet service
109 | POP2
110 | POP3
119 | NNTP
143 | IMAP4, Interactive Mail Access Protocol version 4 (previously used by IMAP2)
194 | IRC
220 | IMAP3
389 | LDAP, Lightweight Directory Access Protocol
443 | HTTPS, HTTP running over secure sockets
540 | UUCP, Unix to Unix Copy
1723 | PPTP (used by VPNs, Virtual Private Networks)
Most Web servers use port 80, but you can change that. If desired, Web servers can be installed on nonstandard ports to hide Web servers, as well as host multiple Web servers on a single computer by mapping each one to a different port. Remember that if you do use a nonstandard port mapping, users will need to know the new port number.
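Users supply a nonstandard port by appending it to the hostname in the URL, separated by a colon. The following Python sketch uses the standard urllib.parse module to show how a client might decide which port to contact. The port_for helper and port 8080 are assumptions made for illustration, and the sketch only considers plain HTTP URLs.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # the standard Web server port


def port_for(url: str) -> int:
    """Return the port a client should contact for an http:// URL."""
    # .port is None when the URL does not name a port explicitly.
    explicit = urlsplit(url).port
    return explicit if explicit is not None else DEFAULT_HTTP_PORT


if __name__ == "__main__":
    print(port_for("http://www.forta.com/"))
    print(port_for("http://www.forta.com:8080/"))  # hypothetical nonstandard port
```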
NOTE
This discussion of port numbers is very important in ColdFusion MX; we'll come back to it in a few pages.