Integrated Web Design: The Meaning of Semantics (Take I)
Surely by now you've heard or seen the term semantics being bandied about by web standards evangelists and document purists. But what does the term really signify in the context of markup, and what do you need to know about semantics to improve your markup practices? This article helps define semantics in HTML and XHTML, and gets you started using elements semantically.
Semantics Is Meaning
In English, the word semantic means "of or relating to meaning." In the science of linguistics, semantics is more explicit: It's the study of meaning based on the historical and psychological significance of words and terms. While the academic study of markup vocabularies can be thought of as a form of linguistics, the real-world practice of marking up documents semantically follows the first definition: in markup, semantics is concerned with the meaning of an element, and how that element describes the content it contains. This concern was always meant to be part of HTML, but the hacking of HTML for presentational purposes made short work of any semblance of semantic purity within the language.
However, with CSS now the primary means of managing presentation of documents in contemporary web design, and the influence of XML bringing rigor back to markup, the emphasis as we write our HTML or XHTML has moved from how our content looks to what our content means.
H1 Does Not Mean "Big, Bold, and Ugly"
There are many examples of how a markup element can be meaningful, but I've found none as crystal clear as the use of headings. If you've been designing web sites for a while, you've probably run across this problem: You want to use a heading, but you really dislike the font size that the h1 element produces. So you go with an h3 because it produces a much nicer look. (I used to do this all the time!) But markup was never really meant to be presentational. We hacked it to get presentational results because, even after CSS became available, browser support was so maddeningly inaccurate and incomplete that it was downright unreliable to use CSS. Those bad old days are gone now, and we can style h1 or h3 any way we like. This gets us back to the meaning of headings, which is as straightforward as it gets: An h1 signifies the most important heading on the page, h2 is a subheading of h1, and so on. This description has nothing at all to do with the way the heading will be styled—in fact, you can make your h1 headings appear to be visually smaller than h3 if you like. The point is this: The element you choose to use has to do with the significance of the content of that element.
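As a minimal sketch of that point, the page below picks headings by significance and leaves the look to CSS. The heading text and font sizes are illustrative assumptions, chosen only to show that an h1 can be made visually smaller than an h3.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Heading semantics sketch</title>
  <style>
    /* Sizes are arbitrary; the h1 is deliberately smaller than the h3 */
    h1 { font-size: 1.1em; font-weight: normal; }
    h2 { font-size: 1.3em; }
    h3 { font-size: 1.6em; }
  </style>
</head>
<body>
  <!-- Element choice follows importance, not appearance -->
  <h1>Most important heading on the page</h1>
  <h2>Subheading of the h1</h2>
  <h3>Subheading of the h2</h3>
</body>
</html>
```

The markup still says which heading matters most, whatever the style sheet does to its size.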
A Paragraph Is a Paragraph
As we study the elements we regularly use, we begin to see how choosing meaning over presentation creates a more ordered, logical document.
Some of the elements we've misused are also some of markup's most critical ones. Take the p element, which is used to denote paragraphs, and the br element, used to force a line break. Anyone ever commit this markup crime?
<p><br><br>
If you place the line above in a document between some text sections, you'll get some white space, but the markup has absolutely no meaning. A paragraph tag should be used to denote a paragraph, period. A line break should be used to force a break in a line, not to gain white space. I should be taken to markup prison and/or fined for having done this for years! Fortunately, I've got CSS by my side now, and can get back to cleaner living.
Oh, I'm sure you've seen this one, too:
<p>&nbsp;</p>
Many visual editors toss in that bit for the same reason: to get more space. Translated into literal terms, this means "a paragraph about a nonbreaking space character." If you see this line appear in your documents via a visual editor, best to get it out of there. It's far better to use CSS to gain that space, and you get more specific control to boot.
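Here is one way the CSS alternative might look, as a sketch rather than a prescription. The class name "intro" and the 2em value are assumptions made up for this example; any selector and any length would do.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Spacing with CSS sketch</title>
  <style>
    /* "intro" is a hypothetical class name; 2em is an arbitrary amount of space */
    p.intro { margin-bottom: 2em; }
  </style>
</head>
<body>
  <p class="intro">First section of text.</p>
  <!-- No empty paragraphs or stacked line breaks: the margin supplies the gap -->
  <p>Next section of text, pushed down by the margin on the paragraph above.</p>
</body>
</html>
```

Every paragraph here still contains an actual paragraph, and the amount of space can be changed in one place.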
List Mania
One of the areas where semantic markup takes off running is in the use of lists. Here's another markup bit I've been guilty of:
<p><a href="home.html">Home</a><br>
<a href="about.html">About Us</a><br>
<a href="products.html">Products</a><br>
<a href="services.html">Services</a><br>
<a href="contact.html">Contact Us</a></p>
In fact, I still have markup like the above example on my current web site. It's like the old sayings: "The cobbler's kids never have shoes," or "A painter never paints her house." As an instructor, I go a step further and hide behind the old "Do as I say, not as I do" approach.
I'm guilty of old-school markup here because the elements in use are simply not meaningful in reference to the content. Translated into plain speak, this markup means "Here's a paragraph. No, wait, here's a link. No, wait, here's a break." But, viewed in a browser, this markup displays a list of links.
Wait...a list of links? How meaningful is that? Extremely meaningful! So the semantically proper way to achieve the same goal is to place this information in a list. Most people use unordered lists for this purpose, but if you really want to go purist with a sequential navigation list, an ordered list is the most semantically correct approach:
<ol>
  <li><a href="home.html">Home</a></li>
  <li><a href="about.html">About Us</a></li>
  <li><a href="products.html">Products</a></li>
  <li><a href="services.html">Services</a></li>
  <li><a href="contact.html">Contact Us</a></li>
</ol>
You're probably thinking, "But what about the numbers that the ordered list generates?" The answer, of course, is CSS. We can turn off the numbers using the CSS list-style-type property:
ol {list-style-type: none;}
With a value of none, no numbers (or in the case of an unordered list, bullets) will appear. Then you can add other styles to create beautiful navigation schemes, including horizontal designs, using lists in combination with CSS.
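To make the horizontal case concrete, here is a minimal sketch of the same navigation list laid out in a row. The class name "nav" and the spacing values are assumptions invented for this example, and this is only one of several ways to get a horizontal layout.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Horizontal navigation list sketch</title>
  <style>
    /* "nav" is a hypothetical class name used only in this sketch */
    ol.nav { list-style-type: none; margin: 0; padding: 0; }
    /* Inline list items sit side by side instead of stacking */
    ol.nav li { display: inline; margin-right: 1em; }
  </style>
</head>
<body>
  <ol class="nav">
    <li><a href="home.html">Home</a></li>
    <li><a href="about.html">About Us</a></li>
    <li><a href="products.html">Products</a></li>
    <li><a href="services.html">Services</a></li>
    <li><a href="contact.html">Contact Us</a></li>
  </ol>
</body>
</html>
```

With the style sheet removed, the markup falls back to a plain numbered list of links, which is still a sensible reading of the content.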
A Few Other Suspects
Along with headings, paragraphs, and breaks, many of us have been guilty of using other elements without concern for semantics, paying little or no attention to how they could help define our content. Here are a few examples:
- blockquote. The blockquote element is best used for a quote that's at least a paragraph long: literally a block of quote. Because browsers typically indent this element, it's often used to hack documents or areas of documents to appear indented or padded with white space. Semantically, reserve the blockquote element for quotes, and use CSS to get margins and padding anywhere else.
- table. The biggest HTML hack of all time. Using the table element to create a positioning grid was once our only option for making our sites look great. But a table has semantic meaning: tabular data. In today's CSS-oriented design, the semantic approach is to restrict table and its related elements to describing tabular data. Any other use is presentational.
- address. An element of yore that hardly anyone uses. It is meant for contact information for a document or its author, rather than for any postal address that happens to appear on the page. You can then style to suit.
- dl. The definition list and its related elements dt and dd have been around for a long time, so support is practically ubiquitous. Add style, and you've got an extremely useful, semantic means of pairing terms with their descriptions that can look great, too.
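The four elements above might be used along these lines. This is a sketch under stated assumptions: the quote text, table figures, contact details, and the class name "indented" are all placeholders invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Semantic elements sketch</title>
  <style>
    /* Indentation for non-quote content comes from CSS, not from blockquote */
    .indented { margin-left: 2em; }   /* hypothetical class name */
    blockquote { margin: 1em 2em; font-style: italic; }
    th { text-align: left; }
    dt { font-weight: bold; }
  </style>
</head>
<body>
  <!-- blockquote: reserved for an actual block of quoted text -->
  <blockquote>
    <p>Placeholder for a quotation that runs at least a paragraph long.</p>
  </blockquote>
  <p class="indented">An indented paragraph that is not a quote.</p>

  <!-- table: tabular data only (figures are made up) -->
  <table>
    <caption>Example price list</caption>
    <tr><th scope="col">Product</th><th scope="col">Price</th></tr>
    <tr><td>Widget</td><td>10.00</td></tr>
    <tr><td>Gadget</td><td>15.00</td></tr>
  </table>

  <!-- dl: terms paired with their descriptions -->
  <dl>
    <dt>Semantics</dt>
    <dd>The meaning of an element and how it describes the content it contains.</dd>
    <dt>Presentation</dt>
    <dd>How the content looks, which is handled by CSS.</dd>
  </dl>

  <!-- address: contact information for the page's author (placeholder details) -->
  <address>
    Example Author, <a href="mailto:author@example.com">author@example.com</a>
  </address>
</body>
</html>
```

Each element describes what its content is, and the style block is the only place where indentation and emphasis are decided.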
Of course, these few elements are just the tip of the semantic iceberg. Many other elements in HTML and XHTML can be used semantically to enhance the meaning of your document's contents.
Take some time to study the available elements in HTML and XHTML. Begin thinking in terms of describing content in a meaningful way rather than making it look good. You can make it look almost any way you can imagine with CSS, so refining meaning is in your best interest for the long term.