1. Introduction

This section is informative.

1.1. What is XHTML?

XHTML is the reformulation of HTML 4 as an application of XML. XHTML 1.0 [XHTML1] specifies three XML document types that correspond to the three HTML 4 DTDs: Strict, Transitional, and Frameset. XHTML 1.0 is the basis for a family of document types that subset and extend HTML.

1.2. What is XHTML Modularization?

XHTML Modularization is a decomposition of XHTML 1.0, and by reference HTML 4, into a collection of abstract modules that provide specific types of functionality. These abstract modules are implemented in this specification using the XML Document Type Definition language, but an implementation using XML Schemas is expected. The rules for defining the abstract modules, and for implementing them using XML DTDs, are also defined in this document.

These modules may be combined with each other and with other modules to create XHTML subset and extension document types that qualify as members of the XHTML-family of document types.

1.3. Why Modularize XHTML?

The modularization of XHTML refers to the task of specifying well-defined sets of XHTML elements that can be combined and extended by document authors, document type architects, other XML standards specifications, and application and product designers. The goal is to make it economically feasible for content developers to deliver content on a greater number and diversity of platforms.

Over the last couple of years, many specialized markets have begun looking to HTML as a content language. There is a great movement toward using HTML across increasingly diverse computing platforms. Currently there is activity to move HTML onto mobile devices (handheld computers, portable phones, etc.), television devices (digital televisions, TV-based Web browsers, etc.), and appliances (fixed-function devices). Each of these devices has different requirements and constraints.

Modularizing XHTML provides a means for product designers to specify which elements are supported by a device using standard building blocks and standard methods for specifying which building blocks are used. These modules serve as "points of conformance" for the content community. The content community can now target the installed base that supports a certain collection of modules, rather than worry about the installed base that supports this or that permutation of XHTML elements. The use of standards is critical for modularized XHTML to be successful on a large scale. It is not economically feasible for content developers to tailor content to each and every permutation of XHTML elements. By specifying a standard, either software processes can autonomously tailor content to a device, or the device can automatically load the software required to process a module.

Modularization also allows for the extension of XHTML's layout and presentation capabilities, using the extensibility of XML, without breaking the XHTML standard. This development path provides a stable, useful, and implementable framework for content developers and publishers to manage the rapid pace of technological change on the Web.

1.3.1. Abstract modules

An XHTML document type is defined as a set of abstract modules. An abstract module defines one kind of data that is semantically different from all others. Abstract modules can be combined into document types without a deep understanding of the underlying schemas that define the modules.

1.3.2. Module implementations

A module implementation consists of a set of element types, a set of attribute-list declarations, and a set of content model declarations, where any of these three sets may be empty. An attribute-list declaration in a module may modify an element type outside the element types defined in the module, and a content model declaration may modify an element type outside the element type set of the module.
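
The three sets described above can be pictured with a small data model. The following Python sketch is illustrative only: the class name `ModuleImplementation`, its field names, and the sample `events` module are hypothetical and are not defined by this specification. It shows a module whose element type set is empty and whose attribute-list declaration modifies an element type defined elsewhere.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ModuleImplementation:
    """Illustrative model of a module implementation (hypothetical names)."""

    # Element types the module itself defines.
    element_types: Set[str] = field(default_factory=set)
    # Attribute-list declarations: element name -> attribute names declared.
    attlists: Dict[str, Set[str]] = field(default_factory=dict)
    # Content model declarations: element name -> content model, as text.
    content_models: Dict[str, str] = field(default_factory=dict)

    def external_targets(self) -> Set[str]:
        """Return the elements this module modifies but does not define."""
        touched = set(self.attlists) | set(self.content_models)
        return touched - self.element_types


# A hypothetical module that defines no element types of its own and only
# adds an attribute to an element type defined in some other module.
events = ModuleImplementation(attlists={"a": {"onclick"}})
```

Here `events.external_targets()` reports the element `a`, because the module declares an attribute for it without defining the element type.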

One implementation mechanism is XML DTDs. An XML DTD is a means of describing the structure of a class of XML documents, collectively known as an XML document type. XML DTDs are described in the XML 1.0 Recommendation [XML]. Another implementation mechanism is XML Schema [XMLSCHEMA].

1.3.3. Hybrid document types

A hybrid document type is a document type composed from a collection of XML DTDs or DTD Modules. The primary purpose of the modularization framework described in this document is to allow a DTD author to combine elements from multiple abstract modules into a hybrid document type, to develop documents against that hybrid document type, and to validate those documents against the associated hybrid document type definition.
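
The idea of composing a hybrid document type from several modules can be sketched in a few lines of Python. This is a simplified model, not the DTD mechanism itself: each module is reduced to a mapping from element name to declared attribute names, and the function `compose`, the module variables, and the `price` element are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set

# Each hypothetical module is reduced to a mapping from element name to the
# set of attribute names it declares for that element.
Module = Dict[str, Set[str]]


def compose(modules: Iterable[Module]) -> Module:
    """Merge modules into one hybrid document type (illustrative only)."""
    hybrid: Module = {}
    for module in modules:
        for element, attributes in module.items():
            # A later module may extend an element declared by an earlier one.
            hybrid.setdefault(element, set()).update(attributes)
    return hybrid


text_module: Module = {"p": {"class"}, "span": {"class"}}
hypertext_module: Module = {"a": {"href"}}
# A hypothetical domain-specific module that adds its own element and also
# extends an element from the text module.
domain_module: Module = {"price": {"currency"}, "p": {"unit"}}

hybrid_doctype = compose([text_module, hypertext_module, domain_module])
```

The resulting `hybrid_doctype` contains the elements of all three modules, and its `p` entry carries the attributes contributed by both the text module and the domain module.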

One of the most valuable benefits of XML over SGML is that XML reduces the barrier to entry for standardization of element sets that allow communities to exchange data in an interoperable format. However, the relatively static nature of HTML as the content language for the Web has meant that these communities have previously held out little hope that their XML document types would see widespread adoption as part of Web standards. The modularization framework allows for the dynamic incorporation of these diverse document types within the XHTML-family of document types, further reducing the barriers to the incorporation of these domain-specific vocabularies in XHTML documents.

1.3.4. Validation

The use of well-formed, but not valid, documents is an important benefit of XML. In the process of developing a document type, however, the additional leverage provided by a validating parser for error checking is important. The same statement applies to XHTML document types with elements from multiple abstract modules.

A document is an instance of one particular document type defined by the DTD identified in the document's prologue. Validating the document is the process of checking that the document complies with the rules in the document type definition.
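
The difference between a well-formed document and a valid one can be illustrated with a short Python sketch. It assumes a much-simplified rule table, `ALLOWED_CHILDREN`, in place of a real document type definition. The function `complies` checks only which child elements may appear and ignores order, cardinality, and attributes, so it is not XML 1.0 validation. All names here are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set

# Hypothetical, simplified rules: element name -> permitted child elements.
ALLOWED_CHILDREN: Dict[str, Set[str]] = {
    "html": {"head", "body"},
    "head": {"title"},
    "title": set(),
    "body": {"p"},
    "p": set(),
}


def is_well_formed(source: str) -> bool:
    """Return True if the text parses as XML, without checking any rules."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True


def complies(source: str) -> bool:
    """Check each element against ALLOWED_CHILDREN (not real DTD validation)."""
    root = ET.fromstring(source)
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            return False  # element type unknown to the rule table
        if any(child.tag not in allowed for child in parent):
            return False  # child not permitted inside this parent
    return True


# Well-formed, but "blink" is not permitted by the rule table above.
document = (
    "<html><head><title>t</title></head>"
    "<body><blink>x</blink></body></html>"
)
```

For this sample, `is_well_formed(document)` succeeds while `complies(document)` fails, which mirrors a document that a non-validating parser accepts but a validating parser rejects.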

One document can consist of multiple document fragments. Validating only fragments of a document, where each fragment is of a different document type than the other fragments in the document, is beyond the scope of this framework, since it would require technology that is not yet defined.

However, the modularization framework allows multiple document type definitions to be integrated to form a new document type (e.g., SVG integrated into XHTML). The new document type definition can be used for normal XML 1.0 validation.

1.3.5. Formatting Model

Earlier versions of HTML attempted to define parts of the model that user agents are required to use when formatting a document. With the advent of HTML 4, the W3C started the process of divorcing presentation from structure. XHTML 1.0 maintained this separation, and this document continues moving HTML and its descendants down this path. Consequently, this document makes no requirements on the formatting model associated with the presentation of documents marked up with XHTML Family document types.

Instead, this document recommends that content authors rely upon style mechanisms such as CSS to define the formatting model for their content. When user agents support the style mechanisms, documents will format as expected. When user agents do not support the style mechanisms, documents will format as appropriate for that user agent. This permits XHTML Family user agents to support rich formatting models on devices where that is appropriate, and lean formatting models on devices where that is appropriate.