Scaling with ColdFusion MX
IN THIS CHAPTER
The Importance of Scaling
Scaling Considerations
How to Write ColdFusion MX Applications that Scale
Keeping Web Site Servers in Sync
Hardware Versus Software Load-Balancing Options
Scaling with ClusterCATS
Hardware Load-Balancing Options
Finishing Up
In the first two chapters of this book, you learned about high availability and monitoring system performance. In the following two chapters you will learn about scaling with Java and managing session state in a cluster. This chapter concentrates on what you need to know about scaling with ColdFusion MX: scaling considerations, writing ColdFusion MX applications that will scale, keeping server data in sync, the differences between hardware and software load-balancing options, scaling with ClusterCATS, and scaling with hardware-based load-balancing devices. I'll focus on the developer's point of view and highlight what to do in order to build highly scalable ColdFusion MX applications that can be deployed on one, two, or many ColdFusion MX servers.
The Importance of Scaling
There are at least two methods for hosting a single Web site across multiple Web servers:
Distributed Functionality. Hosting a site's functionality across multiple machines.
Clustered. Combining two or more servers and mirroring all Web site functionality on each machine. The clustered servers take turns serving the site's users.
If you find that indexing your site and running full-text searches is slow, you can set up a separate Web server that just does indexing and call it search.mycompany.com. If e-commerce and credit card validation are your bottleneck, you can set up another machine called store.mycompany.com. Many successful Web sites use machines added in this manner to accomplish dedicated tasks; these machines enable the Web servers to focus on what they do best. Some sites under particularly heavy traffic even put images on a separate server, such as images.mycompany.com, to speed up processing by separating their images from other traffic. This type of distributed scaling is relatively easy because each machine can have a unique configuration and perform a very specific duty. You don't have to deal with the issues involved in keeping content consistent and synchronized across servers.
This strategy might work in some situations, but it has many weaknesses. The first weakness is that this strategy doesn't provide any server redundancy. For example, if you move a search to a separate server, and your search machine crashes, you've just lost all search functionality, even though your main Web server is delivering content properly. However, if you provide identical services on the two machines, the failure of one server has an impact only on your ability to handle high traffic. The failure doesn't deactivate any features of your site.
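The difference in redundancy can be made concrete with a small sketch. The Python below is illustrative only: the server names and feature sets are hypothetical, and the model simply asks which features are still served after a given server fails.

```python
# Hypothetical layouts: each server name maps to the set of features it hosts.
distributed = {
    "www": {"content"},
    "search": {"search"},
}
clustered = {
    "node1": {"content", "search"},
    "node2": {"content", "search"},
}


def available_features(layout, failed):
    """Return the features still served when the servers in `failed` are down."""
    features = set()
    for server, hosted in layout.items():
        if server not in failed:
            features |= hosted
    return features


# Losing the dedicated search box removes search entirely.
print(sorted(available_features(distributed, {"search"})))  # ['content']

# Losing one mirrored node removes capacity, not features.
print(sorted(available_features(clustered, {"node2"})))  # ['content', 'search']
```

In the distributed layout every feature has a single point of failure; in the mirrored layout a feature disappears only if every node hosting it fails.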
The second problem you might encounter with this distributed strategy happens when you get so much traffic that one box isn't enough to handle your dedicated function. What happens when your search function becomes so popular that your single search box is running out of resources? Do you subdivide your search into, say, a site search and news feed search and set up site.search.mycompany.com and newsfeed.search.mycompany.com? This approach is more complicated than it might sound. You probably have to go into your existing code and change all the old references to search.mycompany.com. You also must deal with people who might have bookmarked your search site. And that's assuming you can subdivide your search function into two separate pieces. How do you split an e-commerce application that just does credit card queries? Into visa.store.mycompany.com and amex.store.mycompany.com? You can see that this solution isn't reasonable.
Clustering, the second alternative for hosting your Web site across multiple servers, is potentially more scalable and viable in the long term. You can still distribute Web site functionality onto separate servers. When a dedicated server cannot handle the current volume, you can add another dedicated server to provide this functionality as well. The two dedicated servers can then be clustered. This method provides users a seamless experience on your site; ideally, your users don't know the site they're visiting is a collection of servers. This group of computers providing identical content and services is generally known as a cluster. Your entire Web site infrastructure, including Web servers, ColdFusion servers, database servers, and file servers, can be called a server farm.
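The idea of clustered servers taking turns can be sketched as a simple round-robin rotation. This is a minimal Python illustration under stated assumptions: the class name and server names are hypothetical, and it is not a description of how ClusterCATS or any hardware load balancer actually distributes requests.

```python
from itertools import cycle


class RoundRobinCluster:
    """Hand out servers in a fixed rotation (hypothetical, for illustration)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def next_server(self):
        """Return the server that should handle the next incoming request."""
        return next(self._rotation)


cluster = RoundRobinCluster(["web1", "web2", "web3"])

# Five consecutive requests wrap around the three mirrored servers.
assignments = [cluster.next_server() for _ in range(5)]
print(assignments)  # ['web1', 'web2', 'web3', 'web1', 'web2']
```

Because every server in the rotation hosts the same content, the user cannot tell which machine answered. A plain rotation like this ignores server load and health, which is one reason real load-balancing products do more than rotate.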
Running one Web site on one server is relatively straightforward: You know that every Web request goes to the same Web server software and ColdFusion MX service, with the same settings and environment. But as soon as you add a second server, you are faced with a host of technical challenges. I'll discuss some of these implications in the following sections. Later in this chapter, we'll review some of the main technologies that enable you to distribute your traffic effectively across multiple servers and how such technologies are implemented.