How to Write ColdFusion MX Applications that Scale
Pay special attention to scaling issues when you are writing applications for a clustered environment. Poorly written code can suffocate any Web site, no matter how much hardware you throw at it. Building applications that scale comes down to good coding technique: writing clean, well-thought-out code. Scalable code is well organized, modular, structured, and free of common bottlenecks.
Code Organization
A stable and scalable Web site typically contains well-organized code. This code is commented and easy to follow. All images are located in their own directories and are not intermixed with CFML and HTML templates. Subdirectories partition the application into manageable units. With this structure, different applications, each self-contained in its own directory, can be placed on different servers for distributed scaling. Good code organization and application partitioning ease deployment to multiple servers and reduce maintenance time. Bottlenecks are easier to identify and existing code is easier to maintain, which makes for a more stable and error-free Web site. Limiting the number of templates in any directory encourages good code organization, which can lead to a more scalable site.
All Web pages for an application should follow a defined coding style, and conformity needs to be enforced. Coding styles and standards ease application maintenance: a template that follows the same style as the templates around it is easier to understand. Well-documented templates tell the developer what functions the template is supposed to perform. Not following a coding style encourages random coding habits that are hard for other developers to understand and maintain. In turn, applications that do not follow a coding style are harder to test for quality assurance and eventually crumble under their own unmanageable weight.
Modularity
Modular code helps promote code re-use. Code that is used many times in an application, either in a custom tag or inline, might become more stable over time as developers fix bugs and tweak it for performance. The code will have undergone quality assurance testing multiple times and endured many load tests, therefore proving its durability. Well-written modular code follows good coding practices and avoids common bottlenecks. It also eases development efforts because developers do not have to rewrite this code every time they need similar functionality.
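As a rough illustration of this kind of reuse, a lookup that many templates need can be wrapped once in a user-defined function and included wherever it is required. The sketch below is only one possible arrangement: the file name lib/userFunctions.cfm and the function name getUserNameById are hypothetical, and the table and column names are borrowed from the query examples later in this chapter.

```cfml
<!--- lib/userFunctions.cfm (hypothetical shared library file) --->
<cffunction name="getUserNameById" returntype="string" output="false"
            hint="Returns the name for a user ID, or an empty string if no such user exists.">
    <cfargument name="userID" type="numeric" required="true">
    <cfargument name="dsn" type="string" required="true">

    <!--- Keep the query variable local to this function --->
    <cfset var getUser = "">

    <cfquery name="getUser" datasource="#arguments.dsn#">
        SELECT vc_name
        FROM tbl_User
        WHERE int_userID = <cfqueryparam value="#arguments.userID#" cfsqltype="cf_sql_integer">
    </cfquery>

    <cfif getUser.RecordCount GT 0>
        <cfreturn getUser.vc_name>
    </cfif>
    <cfreturn "">
</cffunction>

<!--- In a calling template: include the library once, then call the function --->
<cfinclude template="lib/userFunctions.cfm">
<cfoutput>Hello #getUserNameById(26, "db_Utility")#</cfoutput>
```

Any fix or tuning made inside the function then benefits every template that calls it, which is the stabilizing effect described above.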
Streamlined, Efficient Code
Implementing best practices for Web site development is an important discipline for developers building highly scalable applications. The code in this example illustrates that point. The code attempts to find the name of the first administrator user. Each administrator user has a security level of 1. It queries all users, loops through the record set searching for the first administrator record, and returns that user's name:
<cfquery name="getAdminUser" datasource="db_Utility">
    SELECT * FROM tbl_User
</cfquery>
<!--- Loop until you find first user with security level of 1 --->
<cfloop query="getAdminUser">
    <cfif trim(getAdminUser.int_Security) IS 1>
        <cfset AdminName = getAdminUser.vc_name>
    </cfif>
</cfloop>
Admin User Name: <cfoutput>#AdminName#</cfoutput>
This inefficient code can slow your Web site if it sustains many hits. It retrieves every column of every user, and even after it finds the first administrator record it does not stop looping through the returned record set, so AdminName ends up holding the last administrator found rather than the first. What if the user table contained thousands of records? This code would take a long time to process and consume valuable system resources.
Here's an example of more efficient code for finding the first administrator record and returning the name:
<cfquery name="getAdminUser" datasource="db_Utility">
    SELECT TOP 1 vc_name
    FROM tbl_User
    WHERE int_security = 1
</cfquery>
<!--- Default to an empty string so the output below never references an undefined variable --->
<cfif getAdminUser.RecordCount GT 0>
    <cfset AdminName = getAdminUser.vc_name>
<cfelse>
    <cfset AdminName = "">
</cfif>
Admin User Name: <cfoutput>#AdminName#</cfoutput>
This code is much more efficient and easier to understand. The query retrieves only the column and rows the code needs: it returns at most one record, and only if at least one user has a security level of 1.
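The TOP 1 keyword in the query above is specific to certain databases (Microsoft SQL Server and Access, for example). Where that syntax is unavailable, one possible alternative is the maxrows attribute of cfquery, which caps the number of rows ColdFusion returns to the template. The following is a minimal sketch under that assumption; how much work the database itself still performs depends on the database and driver, and the "(none found)" placeholder is purely illustrative.

```cfml
<!--- maxrows="1" limits the record set handed back to the template to one row --->
<cfquery name="getAdminUser" datasource="db_Utility" maxrows="1">
    SELECT vc_name
    FROM tbl_User
    WHERE int_security = 1
</cfquery>

<cfif getAdminUser.RecordCount GT 0>
    <cfset AdminName = getAdminUser.vc_name>
<cfelse>
    <!--- Illustrative placeholder for the no-administrator case --->
    <cfset AdminName = "(none found)">
</cfif>
Admin User Name: <cfoutput>#AdminName#</cfoutput>
```

In either form, which administrator counts as "first" is up to the database unless the query adds an ORDER BY clause.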
Avoiding Common Bottlenecks
The preceding example illustrated a simple way to write more efficient code. Let's look at other coding bottlenecks and discuss ways to avoid them.
Querying a Database
Pay careful attention to the number of records to be returned and the structure of the SQL itself when writing queries to retrieve data for outputting on the screen or into form variables. A bottleneck, common to complex queries, results from a query returning more records than are required and using only a subset of the returned records. Such a query should be rewritten to return only the required records.
In addition, database software is much more efficient at processing database requests than ColdFusion is. For a highly scalable Web site, it is best to create views for selecting data and stored procedures for inserting, updating, and deleting data in the database. Design your ColdFusion templates to call these views and stored procedures to interact with the database. Asking the database server to perform this kind of work is much more efficient and tends to stabilize performance.
Here is an example of a poorly coded set of queries to retrieve data. This code is not scalable and will affect Web site performance. Notice that the same table is queried twice to return different data. One query, in this case, is sufficient:
<cfquery name="getUser" datasource="db_Utility">
    SELECT vc_name
    FROM tbl_User
    WHERE int_userID = 26
</cfquery>
<cfset userName = getUser.vc_name>
Hello <cfoutput>#userName#</cfoutput>

some more code here ......

<cfquery name="getUserInfo" datasource="db_Utility">
    SELECT int_userid, vc_username, vc_password, vc_email, dt_createdate
    FROM tbl_User
    WHERE vc_name = '#userName#'
</cfquery>
Here is the information you requested:<br>
<cfoutput query="getUserInfo">
    Your User ID: #int_userid#<br>
    Your User Name: #vc_username#<br>
    Your Password: #vc_password#<br>
    Your Email: #vc_email#<br>
    Date you joined: #dt_createdate#
</cfoutput>
As you can see, only one query needs to be called to return this data. This is a common mistake.
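One way the two queries above might be collapsed into a single call is sketched here. It assumes int_userID uniquely identifies a user, so that filtering on it returns the same row the second query located by name; the use of cfqueryparam is an addition for safe parameter binding, not part of the original listing.

```cfml
<!--- One round trip: fetch the greeting name and the detail columns together --->
<cfquery name="getUserInfo" datasource="db_Utility">
    SELECT int_userid, vc_name, vc_username, vc_password, vc_email, dt_createdate
    FROM tbl_User
    WHERE int_userID = <cfqueryparam value="26" cfsqltype="cf_sql_integer">
</cfquery>

Hello <cfoutput>#getUserInfo.vc_name#</cfoutput>

<!--- ... other page code can run here without touching the database again ... --->

Here is the information you requested:<br>
<cfoutput query="getUserInfo">
    Your User ID: #int_userid#<br>
    Your User Name: #vc_username#<br>
    Your Password: #vc_password#<br>
    Your Email: #vc_email#<br>
    Date you joined: #dt_createdate#
</cfoutput>
```

The record set stays available for the rest of the request, so both the greeting and the detail listing read from the same result.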
Absolute Path, Relative Path, and Other Links
One of the more common problems I have seen in Web applications is confusion about when to use an absolute or a relative path for a link. Both methods can be employed while coding, but you must be cognizant of the impact of each approach when you are coding for a clustered environment. Questions to ask before using absolute or relative paths in your application include:
- Will the link be moved at any point in time? If the answer is yes, an absolute path is the more viable option, on the assumption that the new location can be mapped on the Web server to the same path as before.
- Does the path exist under the current subdirectory? If the answer is yes, then relative path mapping will work.
NOTE
A relative path is resolved from the location of the current template. An absolute path is resolved from the root of the Web site.
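The difference can be seen in a few links. This is only a sketch: every file, directory, and server name in it is hypothetical.

```cfml
<!--- Relative links: resolved from the directory of the current template --->
<a href="detail.cfm">Product detail</a>
<a href="../index.cfm">Home</a>

<!--- Absolute links: resolved from the root of the Web site,
      so they keep working when the calling template moves --->
<a href="/products/detail.cfm">Product detail</a>
<img src="/images/logo.gif" alt="Logo">

<!--- Hard-coded server name: ties the link to one machine in the cluster --->
<a href="http://server1/products/detail.cfm">Product detail</a>
```

Neither of the first two forms names a particular server, which is what lets the same template be deployed unchanged to every machine in a cluster.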
Hard-coding links will cause problems with clustered machines. Say that you have an upload facility on your Web site that allows users to upload documents. The code needs to know a physical path in order to upload the documents to the correct place. Server 1 contains the mapped drive E pointing to the central file server where all the documents are stored. The file server has an uploadedfiles directory located on its D drive, so the path can be set to e:\uploadedfiles. But Server 2 does not contain a mapped drive named E pointing to the file server. If you deploy your code from Server 1 to Server 2, the upload code will break because Server 2 does not know where e:\uploadedfiles is. It is better to use Universal Naming Convention (UNC) syntax in the upload path: \\servername\d\uploadedfiles. Note that having one file server in the configuration described creates a single point of failure for your Web site.
NOTE
Universal Naming Convention (UNC) is a standard method for identifying the server name and the network name of a resource. UNC names use one of the following formats:
\\servername\netname\path\filename
\\servername\netname\devicename
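A minimal sketch of the upload scenario described above follows. It keeps the UNC path in one place instead of scattering a drive letter through the templates. The application name docUpload, the variable request.uploadDir, and the form field uploadedDocument are hypothetical, and the sketch assumes the account that the ColdFusion service runs under has permission to write to the share.

```cfml
<!--- Application.cfm: define the upload location once, as a UNC path --->
<cfapplication name="docUpload">
<cfset request.uploadDir = "\\servername\d\uploadedfiles">

<!--- Upload action template: no mapped drive letter is needed on any server --->
<cfif isDefined("form.uploadedDocument") AND len(form.uploadedDocument)>
    <cffile action="upload"
            filefield="uploadedDocument"
            destination="#request.uploadDir#"
            nameconflict="makeunique">
    <cfoutput>Saved as #cffile.serverFile#</cfoutput>
</cfif>
```

Because the path names the file server directly, the same two templates can be copied from Server 1 to Server 2 without modification.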
Nesting Levels Too Deeply
Nesting is a valuable tool for building complex applications. Nesting too many levels, however, can make code unmanageable and virtually incomprehensible. A developer working on a Web site where nesting is deep may eventually stop trying to follow all of the levels and write new work-around code instead, which may affect how the Web site performs. Too many nested levels can also affect performance, because deeply nested code almost always attempts to perform too many functions at once. Simplify your applications to perform fewer functions with each call. Doing so will streamline the application, reduce nested layers, improve code readability, and increase performance.
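As one possible illustration, a set of checks that might otherwise be written as three cfif blocks nested inside one another can be flattened into a single chain, with each branch handing off to a template that does one job. All template names here are hypothetical, and the sketch assumes session management is enabled and that session.userID and session.securityLevel are both set at login.

```cfml
<!--- Each condition is tested at the same level; no branch is nested inside another --->
<cfif NOT isDefined("session.userID")>
    <!--- Not logged in --->
    <cflocation url="login.cfm" addtoken="no">
<cfelseif session.securityLevel NEQ 1>
    <!--- Logged in, but not an administrator --->
    <cfinclude template="notAuthorized.cfm">
<cfelseif NOT isDefined("form.docID")>
    <!--- Administrator, but no document was requested --->
    <cfinclude template="missingDocument.cfm">
<cfelse>
    <!--- All checks passed: do the one thing this request is for --->
    <cfinclude template="showDocument.cfm">
</cfif>
```

A reader can follow the chain from top to bottom without tracking which enclosing block is still open, and each included template stays small enough to test on its own.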