- Chapter 17: Working with XML and JavaScript
Reading and Showing XML Data with JavaScript
As noted previously, Version 6 JavaScript browsers seem to be coming together over the W3C DOM. Several key methods and properties in JavaScript can help in getting information from an XML file. In this section, a very simple XML file is used to demonstrate pulling data from XML into an HTML page, using JavaScript to parse (interpret) the XML file. Unfortunately, the examples are limited to IE5+ on Windows. (The same programs that worked fine in IE5+ on Windows bombed in IE5+ on the Mac under both OS 9+ and OS X.)
However, the great majority of keywords used in the scripts are W3C DOM compliant, and the only keywords required from the Microsoft-unique set are XMLDocument and document.all(). All of the other keywords are found in NN6+. Table 15.1 shows the W3C JavaScript keywords used with the XML file examples.
Table 15.1 Selected Element Keywords in JavaScript
Property | Meaning
documentElement | Returns the root element of the document
firstChild | Is the first element within another element (the first child of the current node)
lastChild | Is the last element within another element (the last child of the current node)
nextSibling | Is the next element in the same nested level as the current one
previousSibling | Is the previous element in the same nested level as the current one
nodeValue | Is the value of a document element
getElementsByTagName | Used to place all elements into an object
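As a rough illustration of how these properties relate to one another, the following sketch hand-builds a small node tree out of plain JavaScript objects and then walks it backward with lastChild and previousSibling. The makeElement() helper is hypothetical and stands in for a real parsed document; only the property names come from the DOM.

```javascript
// Hypothetical helper: builds an element node from a tag name and children.
// Strings become text nodes; child and sibling links mimic the DOM's.
function makeElement(nodeName, ...children) {
  const node = { nodeName: nodeName, nodeValue: null };
  const kids = children.map((child) =>
    typeof child === "string"
      ? { nodeName: "#text", nodeValue: child, firstChild: null, lastChild: null }
      : child
  );
  kids.forEach((kid, i) => {
    kid.previousSibling = kids[i - 1] || null;
    kid.nextSibling = kids[i + 1] || null;
  });
  node.firstChild = kids[0] || null;
  node.lastChild = kids[kids.length - 1] || null;
  return node;
}

const pen = makeElement(
  "pen",
  makeElement("name", "Jane Austin"),
  makeElement("name", "Rex Stout"),
  makeElement("name", "Dashiell Hammett")
);

// Start at the last child and step backward, reading each <name>'s text node.
const reversed = [];
for (let node = pen.lastChild; node !== null; node = node.previousSibling) {
  reversed.push(node.firstChild.nodeValue);
}
console.log(reversed.join(", "));
```

Note that the text inside each element is itself a node, which is why the value is read from firstChild.nodeValue rather than from the element directly.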
Finding Children
To see how to pull data from an XML file, all examples use the following XML file. The intentional simplicity of the XML file is to help clarify using JavaScript with XML and does not represent a sophisticated example of storing data in XML format.
writers.xml
<?xml version="1.0" ?>
<writers>
    <EnglishLanguage>
        <fiction>
            <pen>
                <name>Jane Austin</name>
                <name>Rex Stout</name>
                <name>Dashiell Hammett</name>
            </pen>
        </fiction>
    </EnglishLanguage>
</writers>
The XML file contains a typical arrangement of data using a level of categories that you might find in a bookstore or library arrangement. It is meant to be intuitively clear, as is all XML.
The trick in all of the following scripts is to understand how to find exactly what you want. The first three scripts that follow use slightly different functions to find the first child, last child, and sibling elements. The first script provides the entire listing, and the second two just show the key JavaScript function within the script. They all use the following common CSS file.
readXML.css
body {
    font-family:verdana;
    color:#ff4d00;
    font-size:14pt;
    font-weight:bold;
    background-color:#678395;
}
div {background-color:#c1d4cc;}
#blueBack {background-color:#c1d4cc}
To read the first child of an element, the reference is to document.firstChild. Given the simplicity of the sample XML file (writers.xml), the script just keeps adding .firstChild to each of the elements as it makes its way to the place in the XML file where the information with the data can be found.
However, before even going after the first child of the <name> element, the HTML page sets up a connection to the XML page using an <xml> container understood by Internet Explorer 5+ in a Windows context. (At the time of this writing, IE6 was available, and it worked fine with the following scripts, but only on a Windows PC.) The ID writersXML is defined as the XML object first, and then it becomes part of a document, myXML, in this line:
myXML= document.all("writersXML").XMLDocument
The document.all().XMLDocument reference is a Microsoft-specific IE extension to JavaScript. After this point, though, the JavaScript is pure W3C DOM and is consistent with NN6+. With this line, writersNode is defined as the root element of the XML file with the documentElement property:
writersNode = myXML.documentElement
Its first child is the <EnglishLanguage> node, so the variable languageNode is defined as writersNode.firstChild. Then the rest of the nodes in the XML document are defined until the first child of the <name> node is encountered and its node value is placed into a variable to be displayed in a text window. All of the processes are placed into the findWriter() user function. Figure 17.1 shows how the page looks when opened in a browser.
readFirstChild.html
<html>
<head>
<link rel="stylesheet" href="readXML.css" type="text/css">
<title>Read First Child</title>
<xml ID="writersXML" SRC="writers.xml"></xml>
<script language="JavaScript">
function findWriter() {
    var myXML, writersNode, languageNode, fictionNode;
    var penNode, nameNode, display;
    // IE-only: get the XML document from the <xml> data island.
    myXML = document.all("writersXML").XMLDocument;
    // From here on, step down the tree one first child at a time.
    writersNode = myXML.documentElement;
    languageNode = writersNode.firstChild;
    fictionNode = languageNode.firstChild;
    penNode = fictionNode.firstChild;
    nameNode = penNode.firstChild;
    // The text inside <name> is itself a node; read its value.
    display = nameNode.firstChild.nodeValue;
    document.show.me.value = display;
}
</script>
</head>
<body>
<span ID="blueBack">Read firstChild</span>
<div>
<form name="show">
<input type=text name="me">
<input type="button" value="Display Writer" onClick="findWriter()">
</form>
</div>
</body>
</html>
Figure 17.1 The first child of <pen> is displayed.
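Because every step in findWriter() is another .firstChild, the descent can also be written as a loop. The following is only a sketch: makeNode() is a hypothetical stand-in that mimics the shape of writers.xml with plain objects, so that the loop can run without the IE-only loading step.

```javascript
// Hypothetical stand-in for a parsed node; only the properties used here.
function makeNode(nodeName, firstChild = null, nodeValue = null) {
  return { nodeName: nodeName, firstChild: firstChild, nodeValue: nodeValue };
}

// Mirrors the first-child path through writers.xml.
const writersNode = makeNode("writers",
  makeNode("EnglishLanguage",
    makeNode("fiction",
      makeNode("pen",
        makeNode("name",
          makeNode("#text", null, "Jane Austin"))))));

// Keep following firstChild until a node has no children, then read its value.
function firstLeafValue(start) {
  let current = start;
  while (current.firstChild !== null) {
    current = current.firstChild;
  }
  return current.nodeValue;
}

console.log(firstLeafValue(writersNode));
```

The loop trades the named intermediate variables for brevity, so it suits files in which the levels between the root and the data are of no interest.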
Reading the last child uses an almost identical function. However, when the script comes to the parent element <pen> of the <name> node, it asks for the last child, or simply the one at the end of the list before the </pen> closing tag.
readLastChild.html (Function Only)
function findWriter() {
    var myXML, writersNode, languageNode, fictionNode;
    var penNode, nameNode, display;
    myXML = document.all("writersXML").XMLDocument;
    writersNode = myXML.documentElement;
    languageNode = writersNode.firstChild;
    fictionNode = languageNode.firstChild;
    penNode = fictionNode.firstChild;
    nameNode = penNode.lastChild; //Here is the key line
    display = nameNode.firstChild.nodeValue;
    document.show.me.value = display;
}
Because the DOM contains keywords for the first and last children, finding the beginning and end of an XML file is pretty simple. What about all of the data in between? To display the middle children, first you have to find the parent and start looking at the next or previous sibling until you find what you want. This next function shows how that is done using the nextSibling property.
readSibling.html (Function Only)
function findWriter() {
    var myXML, writersNode, languageNode, fictionNode;
    var penNode, nameNode, nextName, display;
    myXML = document.all("writersXML").XMLDocument;
    writersNode = myXML.documentElement;
    languageNode = writersNode.firstChild;
    fictionNode = languageNode.firstChild;
    penNode = fictionNode.firstChild;
    nameNode = penNode.firstChild;
    nextName = nameNode.nextSibling; //Not the first but the next!
    //The first child is the only child in the next node.
    display = nextName.firstChild.nodeValue;
    document.show.me.value = display;
}
The three functions differ little in what they do or how they do it. However, using this method to find a single name in a big XML file could take a lot of work. As you might have surmised, because the XML file is part of an object, you can extract it in an array-like fashion.
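The sibling walk can be wrapped in a small helper that counts its way to the child you want. The sketch below also skips any node that is not an element, because some parsers keep the whitespace between tags as text nodes, in which case firstChild and nextSibling do not always land on an element. The withChildren(), textNode(), and nameElement() helpers are hypothetical; they only build a test tree out of plain objects.

```javascript
// Hypothetical helper: links an array of nodes as the children of a parent.
function withChildren(parent, kids) {
  kids.forEach((kid, i) => {
    kid.nextSibling = kids[i + 1] || null;
  });
  parent.firstChild = kids[0] || null;
  return parent;
}

// nodeType 3 is a text node; nodeType 1 is an element.
const textNode = (value) => ({ nodeType: 3, nodeValue: value, firstChild: null });
const nameElement = (value) =>
  withChildren({ nodeType: 1, nodeName: "name" }, [textNode(value)]);

// A <pen> element whose children include whitespace-only text nodes.
const pen = withChildren({ nodeType: 1, nodeName: "pen" }, [
  textNode("\n  "), nameElement("Jane Austin"),
  textNode("\n  "), nameElement("Rex Stout"),
  textNode("\n  "), nameElement("Dashiell Hammett"),
  textNode("\n"),
]);

// Return the element child at the given 0-based position, or null if none.
function nthElementChild(parent, index) {
  let seen = 0;
  for (let node = parent.firstChild; node !== null; node = node.nextSibling) {
    if (node.nodeType !== 1) continue; // skip text nodes and other non-elements
    if (seen === index) return node;
    seen++;
  }
  return null;
}

console.log(nthElementChild(pen, 1).firstChild.nodeValue);
```

Even with the helper, every lookup starts again from the first child, which is part of why the tag-name approach in the next section is easier for anything beyond a short list.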
Reading Tag Names
Instead of tracing the XML tree through child and parent nodes, you can use the getElementsByTagName() method. By specifying the tag name that you're seeking, you can put all of the matching elements into an object and pull them out one at a time using that object's item() method; its length property tells you how many were found. The process is much easier than going after first and last children or siblings and, I believe, much more effective for setting up matching components. The following script is similar to the others and uses the same external Cascading Style Sheet. The form is slightly different at the bottom, so the whole program is listed rather than just the function. Figure 17.2 shows the output in the browser.
readNode.html
<html>
<head>
<link rel="stylesheet" href="readXML.css" type="text/css">
<title> Read the whole list </title>
<xml ID="writersXML" SRC="writers.xml"></xml>
<script language="JavaScript">
function findWriters() {
    var myXML, myNodes;
    var display="";
    myXML= document.all("writersXML").XMLDocument;
    //Put the <name> element into an object.
    myNodes=myXML.getElementsByTagName("name");
    //Extract the different values using a loop.
    for(var counter=0;counter<myNodes.length;counter++) {
        display += myNodes.item(counter).firstChild.nodeValue + "\n";
    }
    document.show.me.value=display;
}
</script>
</head>
<body>
<span ID="blueBack"> Read All Data </span>
<div>
<form name="show">
<textarea name="me" cols=30 rows=5></textarea><p>
<input type="button" value="Show all" onClick="findWriters()">
</form></div>
</body>
</html>
Figure 17.2 All of the data in the specified tag category are brought to the screen.
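To make clearer what the tag-name lookup does on your behalf, here is a sketch of a hand-written equivalent that visits every descendant and keeps the elements whose name matches. The collectByTagName() function and the textNode() and element() builders are hypothetical illustrations working on plain objects; in a real page, you would simply call getElementsByTagName().

```javascript
// Hypothetical stand-ins for parsed nodes: { nodeName, nodeValue, childNodes }.
const textNode = (value) => ({ nodeName: "#text", nodeValue: value, childNodes: [] });
const element = (nodeName, ...childNodes) => ({ nodeName, nodeValue: null, childNodes });

// Same shape as writers.xml.
const writers = element("writers",
  element("EnglishLanguage",
    element("fiction",
      element("pen",
        element("name", textNode("Jane Austin")),
        element("name", textNode("Rex Stout")),
        element("name", textNode("Dashiell Hammett"))))));

// Collect every descendant with the given tag name, in document order.
function collectByTagName(node, tagName, found = []) {
  for (const child of node.childNodes) {
    if (child.nodeName === tagName) found.push(child);
    collectByTagName(child, tagName, found); // keep searching below this child
  }
  return found;
}

const names = collectByTagName(writers, "name");
const display = names.map((n) => n.childNodes[0].nodeValue).join("\n");
console.log(display);
```

The search does not care how deeply the <name> elements are nested, which is why the script in readNode.html never mentions the intermediate elements at all.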
At this stage in browser development, the great majority of terms used in extracting data from an XML file are cross-browser compatible, especially when Version 6 of both browsers is compared side by side. In large measure, this is because the browser manufacturers are beginning to comply with the W3C DOM recommendations. The Microsoft extensions to the W3C DOM could be adopted as part of the DOM (as some have already), or the W3C DOM could develop functional equivalents. However, at the time of this writing, there might not actually be a W3C DOM-compliant method for the crucial first step of loading an XML document into an HTML page. So, in the meantime, which I hope is short, it is necessary to use the single-browser, single-platform techniques shown previously.
Well-Formed XML Pages
A valid XML page requires either a DTD or a schema (exclusively Microsoft). A page that is merely well formed, meaning that its tags are properly nested and closed, needs neither. The DTD tells the parser what kind of data is contained in the XML file. If XML pages were parsed only by JavaScript, no one would worry too much about the DTD. However, when a browser parses an XML file, it looks at the DTD to determine what kind of data is in the file and how that data is ordered. XML validators scan XML files and determine whether they are valid, but browsers do not validate XML files. (A good validator can be found at Brown University's site, http://www.stg.brown.edu/service/xmlvalid/.) If an XML file is not valid, problems are likely to crop up.
Validation takes a little extra work, but you will know that your XML file is valid as well as well formed, and it is less likely to run into problems somewhere down the line. A DTD has been added to the example XML file used previously to produce the following file, writersWF.xml.
All document type definitions begin with this line:
<!DOCTYPE rootName [
Because writers is the root element, it goes in as the root name. Next, the first child of the root is declared. In this case, the child is <EnglishLanguage>, so the !ELEMENT declaration is as follows:
<!ELEMENT writers (EnglishLanguage)>
You continue with !ELEMENT declarations until all of them are made. If an element can occur more than once within another element's container, a plus sign (+), meaning one or more, is added to the end of the element name in the parent's declaration. Because three nodes using <name> are within the <pen> element, the !ELEMENT declaration for <pen> lists name with a plus after it:
<!ELEMENT pen (name+)>
Finally, close up the !DOCTYPE declaration using this code:
]>
Your file is ready for validation. The complete listing follows.
writersWF.xml
<?xml version="1.0" ?>
<!DOCTYPE writers [
<!ELEMENT writers (EnglishLanguage)>
<!ELEMENT EnglishLanguage (fiction)>
<!ELEMENT fiction (pen)>
<!ELEMENT pen (name+)>
<!ELEMENT name (#PCDATA)>
]>
<writers>
    <EnglishLanguage>
        <fiction>
            <pen>
                <name>Jane Austin</name>
                <name>Rex Stout</name>
                <name>Dashiell Hammett</name>
            </pen>
        </fiction>
    </EnglishLanguage>
</writers>
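To give a feel for what a validator checks against these declarations, the following sketch restates the five content models as plain JavaScript data and tests a hand-built tree against them. It is a deliberately narrow illustration, not a real validator: the contentModels table, the conforms() function, and the { name, children } tree shape are all hypothetical, and only the three forms used in this DTD (a single child, a child with +, and #PCDATA) are handled.

```javascript
// The DTD's content models as data. "name+" means one or more <name>
// children; "#PCDATA" means text only.
const contentModels = {
  writers: "EnglishLanguage",
  EnglishLanguage: "fiction",
  fiction: "pen",
  pen: "name+",
  name: "#PCDATA",
};

// Nodes are { name, children }; text content is represented by strings.
function conforms(node) {
  const model = contentModels[node.name];
  if (model === undefined) return false; // element was never declared
  if (model === "#PCDATA") {
    return node.children.every((child) => typeof child === "string");
  }
  const repeatable = model.endsWith("+");
  const childName = repeatable ? model.slice(0, -1) : model;
  if (node.children.length === 0) return false;              // at least one required
  if (!repeatable && node.children.length !== 1) return false; // exactly one allowed
  return node.children.every(
    (child) => typeof child !== "string" && child.name === childName && conforms(child)
  );
}

const makePen = (...writerNames) => ({
  name: "pen",
  children: writerNames.map((text) => ({ name: "name", children: [text] })),
});
const wrap = (pen) => ({
  name: "writers",
  children: [{ name: "EnglishLanguage", children: [{ name: "fiction", children: [pen] }] }],
});

console.log(conforms(wrap(makePen("Jane Austin", "Rex Stout", "Dashiell Hammett"))));
console.log(conforms(wrap(makePen()))); // a <pen> with no <name> breaks name+
```

The second call fails because (name+) requires at least one <name>, which is the same rule a validator would report against an empty <pen> element.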
Will this new validated file work with the example scripts provided previously? You bet! In all of the previous files showing how JavaScript parses XML files, substitute writersWF.xml for the original writers.xml in this line:
<xml ID="writersXML" SRC="writers.xml"></xml>
When you re-run the script in IE5+ on your Windows PC, you will see exactly the same results. The only difference is that now your XML file carries a DTD and can be checked by a validator.
XHTML
Using XML, HTML, and JavaScript together can be a bit confusing. You might want to take a look at XHTML, where you will find better integration between XML and HTML. XHTML brings well-formed code to HTML. At the same time, you can insert JavaScript into the middle of XHTML pages for adding dynamic action. A good place to start is with XHTML, by Chelsea Valentine and Chris Minnick (New Riders, 2001).