What Does XML Look Like?
So what does XML look like and how does it work? Here's an example that mimics the HTML page just introduced:
Listing ch01_02.xml
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>
We'll see the parts of an XML document in detail in the next chapter, but in overview, here's how this one works. I start with the XML declaration <?xml version="1.0" encoding="UTF-8"?>, which uses the same <? and ?> delimiters as XML processing instructions. It indicates that I'm using XML version 1.0, by far the most widely used version, and the UTF-8 character encoding, a compact encoding of Unicode that stores each character in one or more 8-bit units (more on this later in the chapter):
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>
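To make the encoding attribute a little more concrete, here is a minimal Python sketch (Python is my choice for illustration, not something this chapter depends on). It encodes a few sample characters as UTF-8 and counts the bytes each one needs. The sample characters and the name byte_counts are assumptions picked for the demonstration.

```python
# Encode a few characters as UTF-8 and count the bytes each one needs.
# The sample characters are arbitrary choices for illustration.
samples = ["A", "é", "€"]

# Map each character to the length of its UTF-8 byte sequence.
byte_counts = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, count in byte_counts.items():
    print(f"{ch!r} needs {count} byte(s) in UTF-8")
```

Plain ASCII text such as Hello From XML takes one byte per character in UTF-8, while characters outside the ASCII range take two or more.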
Next, I create a new tag named <DOCUMENT>. As we'll see in the next chapter, you can use any name, not just DOCUMENT, for a tag, as long as the name starts with a letter or underscore (_) and the following characters consist of letters, digits, underscores, dots (.), or hyphens (-), but no spaces. In XML, tags always start with < and end with >.
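The naming rule just described can be sketched as a small Python check. This is a deliberately simplified, ASCII-only approximation of the rule as stated here, not the full name grammar from the XML specification (which also allows many non-ASCII letters and treats colons specially). The names NAME_PATTERN and is_simple_xml_name are hypothetical helpers invented for this example.

```python
import re

# Simplified, ASCII-only version of the rule in the text:
# first character is a letter or underscore; the rest are
# letters, digits, underscores, dots, or hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_simple_xml_name(name: str) -> bool:
    """Return True if name satisfies the simplified naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["DOCUMENT", "_note", "my-tag.v2", "2nd", "my tag"]:
    print(candidate, is_simple_xml_name(candidate))
```

The first three candidates pass the check; 2nd fails because it starts with a digit, and my tag fails because it contains a space.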
XML documents are made up of XML elements, and (much like HTML) you create an XML element with an opening tag (such as <DOCUMENT>), followed by the element's content, if any (such as text or other elements), and ending with the matching closing tag, which starts with </ (such as </DOCUMENT>). (There are additional rules, which we'll see in the next chapter, for elements that don't contain any content.) You must enclose the entire document, except for the <?xml ... ?> line and any processing instructions, in one element, called the root element, and that's the <DOCUMENT> element here:
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    .
    .
    .
</DOCUMENT>
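One way to see the single-root requirement in action is to hand text to an XML parser and watch what it accepts. The following is a minimal sketch using the xml.etree.ElementTree module from Python's standard library. The helper name is_well_formed and the two sample strings are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a single XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Everything enclosed in one root element: accepted.
one_root = "<DOCUMENT><GREETING>Hello From XML</GREETING></DOCUMENT>"

# Two top-level elements with no enclosing root: rejected.
two_roots = "<GREETING>Hello From XML</GREETING><MESSAGE>Welcome</MESSAGE>"

print(is_well_formed(one_root))
print(is_well_formed(two_roots))
```

The parser accepts the first string and raises a parse error on the second, because the second <MESSAGE> element sits outside any root element.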
Now I'll add a new element that I made up, <GREETING>, to this XML document. It encloses text content (in this case, Hello From XML), like this:
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    .
    .
    .
</DOCUMENT>
Next, I add another new element, <MESSAGE>, which also encloses text content:
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>
Now the <DOCUMENT> root element contains two elements, <GREETING> and <MESSAGE>, and each of them holds text. In this way, I've created a new XML document.
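The structure just described (one root element holding two child elements, each holding text) can be confirmed with a parser. Here is a minimal sketch, again assuming Python's standard xml.etree.ElementTree module. The variable names XML_BYTES, child_tags, and texts are made up for the example.

```python
import xml.etree.ElementTree as ET

# The complete document from the listing, as UTF-8 bytes.
# The <?xml ... ?> line must be the very first thing in the document.
XML_BYTES = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b"<DOCUMENT>\n"
    b"    <GREETING>\n"
    b"        Hello From XML\n"
    b"    </GREETING>\n"
    b"    <MESSAGE>\n"
    b"        Welcome to the wild and woolly world of XML.\n"
    b"    </MESSAGE>\n"
    b"</DOCUMENT>\n"
)

root = ET.fromstring(XML_BYTES)

# Names of the elements directly inside the root, in document order.
child_tags = [child.tag for child in root]

# Text content of each child, with the layout whitespace trimmed.
texts = {child.tag: child.text.strip() for child in root}

print(root.tag)
print(child_tags)
print(texts["GREETING"])
print(texts["MESSAGE"])
```

Note that the parser knows nothing special about <DOCUMENT>, <GREETING>, or <MESSAGE>. It simply reports the element names and text it finds, and it is up to the program to decide what they mean.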
Note the similarity of this document to the HTML page we saw earlier. Note also, however, that in the HTML document, all the tags were predefined, and a Web browser knows how to handle them. Here we've just created these tags, <DOCUMENT>, <GREETING>, and <MESSAGE>, from thin air. How can we use an XML document like this one? What would a browser make of these new tags?