
XHTML
XHTML is a Recommendation of the W3C. It was defined as a "Reformulation of HTML 4 in XML 1.0". It
reached Recommendation status in January 2000. In August 2002, updates were made to the specification to
correct errata.
A key difference between HTML 4 and XHTML is that the latter must conform to the rules for well-formed
XML.
In addition to the well-formedness constraints of XML, an XHTML document must also conform to a
DTD. There are three flavors of XHTML 1.0, each with its own DTD.
XHTML 1.0 Strict is the most demanding of the three. CSS must be used to define the layout, enforcing the
strict separation of content from presentation. If a device or browser is targeted without good support for
CSS, XHTML 1.0 Strict is not the way to go.
XHTML 1.0 Transitional is less strict. It allows the use of presentational tags and attributes to control
the look of your page. This is important, as it keeps content viewable in older browsers or devices
without good CSS support.
XHTML 1.0 Frameset allows for using frames. Please note there is no way to remove the visible frame
border when defining an XHTML 1.0 Frameset compliant document. Does anyone still use frames?
Here are some of the particular requirements for XHTML 1.0 Transitional. The root element must be html.
The root element must reference the XHTML namespace, http://www.w3.org/1999/xhtml. There must be a
doctype declaration that references the DTD for the flavor of XHTML in use.
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
</html>
Because the DTD is included, validating XML parsers may be used to check the XHTML for correctness.
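As a rough illustration of the root element and namespace requirements, the following Python sketch parses the skeleton above with the standard library's xml.etree.ElementTree. Note that this parser is non-validating: it reads the doctype declaration but does not check the document against the DTD. The names XHTML_NS, SKELETON, and check_root are made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"  # namespace required on the root element

# The skeleton document from the text (hypothetical constant name).
SKELETON = """<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
</html>"""

def check_root(document):
    """Return True if the root element is html in the XHTML namespace."""
    root = ET.fromstring(document)
    # ElementTree spells namespaced names as "{namespace}localname".
    return root.tag == "{%s}html" % XHTML_NS

print(check_root(SKELETON))
print(check_root("<html></html>"))  # no namespace declared on the root
```

Checking full DTD validity would need a validating parser, which this sketch does not attempt.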
To point out the ramifications of well-formed XML, let's take a look at some of the more common HTML
4.0 constructs that are no longer allowed. Elements must absolutely be properly nested.
<!-- Legal XHTML -->
<p>This is an <em>emphasized</em> word.</p>
<!-- Not legal -->
<p>This is an <em>emphasized word.</p></em>
Element and attribute names must be in lower case.
<!-- This is totally wrong -->
<HTML>
<HEAD></HEAD>
<BODY BGCOLOR="red">
</BODY>
</HTML>
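Because XML is case-sensitive, the upper-case document above is still well-formed XML; it fails only because its names do not match the lower-case names in the XHTML DTD. A minimal Python sketch of a lower-case check, assuming a well-formed input and using a hypothetical helper name, might look like this:

```python
import xml.etree.ElementTree as ET

def non_lowercase_names(document):
    """Collect element and attribute names that are not entirely lower case."""
    offenders = []
    for element in ET.fromstring(document).iter():
        names = [element.tag] + list(element.attrib)
        for name in names:
            # Drop any "{namespace}" prefix that ElementTree adds to names.
            local = name.rsplit("}", 1)[-1]
            if local != local.lower():
                offenders.append(local)
    return offenders

bad = '<HTML><HEAD></HEAD><BODY BGCOLOR="red"></BODY></HTML>'
good = '<html><head></head><body bgcolor="red"></body></html>'
print(non_lowercase_names(bad))   # ['HTML', 'HEAD', 'BODY', 'BGCOLOR']
print(non_lowercase_names(good))  # []
```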
For non-empty elements, end tags are required.
<ul>
<li>This is not correct
<li>But this one is </li>
</ul>
Attribute values must always be quoted.
<!-- This is correct -->
<td rowspan="3">Stuff</td>
<!-- This is not -->
<td rowspan=3>More Stuff</td>
Minimized attributes are not allowed.
<!-- This is correct -->
<dl compact="compact"/>
<!-- This is not correct -->
<dl compact/>
Empty elements must have an end tag, or use the abbreviated syntax.
<!-- Correct -->
<br />
<hr></hr>
<!-- Not correct -->
<hr>
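The well-formedness rules above can be checked mechanically with any XML parser. The following Python sketch, using only the standard library, runs several of the fragments from this section through a parser; is_well_formed is a hypothetical helper name. It tests well-formedness only, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

samples = [
    "<p>This is an <em>emphasized</em> word.</p>",                # properly nested
    "<p>This is an <em>emphasized word.</p></em>",                # overlapping elements
    "<ul><li>This is not correct<li>But this one is </li></ul>",  # missing </li>
    '<td rowspan="3">Stuff</td>',                                 # quoted attribute value
    "<td rowspan=3>More Stuff</td>",                              # unquoted attribute value
    "<dl compact/>",                                              # minimized attribute
    "<br />",                                                     # abbreviated empty element
    "<hr>",                                                       # empty element left open
]
for sample in samples:
    print(is_well_formed(sample), sample)
```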
To avoid issues with scripts in XHTML documents, wrap the script content in a CDATA section. Otherwise,
any less-than sign is interpreted as the beginning of an element tag, and any ampersand as the start of
an entity reference.
<script type="text/javascript">
<![CDATA[
script content
]]>
</script>
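To see why the CDATA section matters, the following Python sketch feeds the same script body to an XML parser with and without the wrapper. The script body and the names SCRIPT_BODY, bare, wrapped, and parses are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical script content containing both "<" and "&".
SCRIPT_BODY = "if (a < b && c) { run(); }"

bare = '<script type="text/javascript">%s</script>' % SCRIPT_BODY
wrapped = '<script type="text/javascript"><![CDATA[%s]]></script>' % SCRIPT_BODY

def parses(fragment):
    """Return True if the fragment is well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

print(parses(bare))     # the raw "<" and "&" break the parse
print(parses(wrapped))  # the CDATA section shields them
# The parser hands back the script text unchanged.
print(ET.fromstring(wrapped).text == SCRIPT_BODY)
```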
For media (MIME) types:
- XHTML documents can be served as "text/html".
- They may also be served as "application/xhtml+xml" or "application/xml".
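One way a server might choose between these media types is to look at the Accept header the browser sends. The Python sketch below is only an illustration of that idea: choose_media_type is a hypothetical function, and it deliberately ignores q-values, which a fuller implementation would need to honor.

```python
def choose_media_type(accept_header):
    """Pick a media type for an XHTML response from an HTTP Accept header.

    Simplification (an assumption of this sketch): q-values are ignored;
    only the presence of the media type in the header matters.
    """
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    # Fall back to the type every browser accepts.
    return "text/html"

print(choose_media_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(choose_media_type("text/html, */*"))
```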
Miscellanea
- Do not use HTML comments, in the old style, to hide script or style content from non-supporting
browsers.
- The use of the style attribute is not recommended.
- The use of document.write() is problematic in XHTML; DOM methods are the proper tool to
use.
- Take care when using JavaScript Ajax toolkits, such as Dojo, which may use special attributes
(that do not appear in the XHTML DTD).
- At the time of writing, support for XHTML in current web browsers is limited.
- However, the practice of following XHTML rules is a good one and encourages building cleaner
documents.
- Inclusion of the XML declaration puts the IE 6 browser into "quirks mode".
Quirks mode was implemented in browsers so that old HTML pages would continue to display as their
authors intended. Documents without a doctype declaration are assumed to be old. IE 6.0 has the additional
rule of switching to quirks mode whenever the XML declaration appears before the doctype.
<?xml version="1.0"?>
There is also an "almost strict" mode (often called "almost standards" mode), created because of a
change in default behavior. In standards mode, images are inline elements, which leaves a gap beneath
them that cannot be removed by adjusting margins. The solution is to declare them as block in the CSS.
The "almost strict" mode, which some doctypes trigger, does not display this standards-compliant gap.
It is somewhat unfortunate that rendering hinges on the doctype, since DTDs are giving way to XML Schema.
And, finally, while standards mode should enforce the standard, browsers vary in their level of
compliance.
In short, the doctype a document uses determines which rendering mode, and so which level of standards
compliance, the browser applies.
Copyright (c) 2008. Intertech, Inc. All Rights Reserved. This information is to be used exclusively as an
online learning aid. Any attempts to copy, reproduce, or use for training is strictly prohibited.