Mathematical Markup Language (MathML) Version 2.0
7 The MathML Interface
7.1 Embedding MathML in HTML
7.1.1 The Top-Level math Element
7.1.2 Requirements for a MathML Browser Interface
7.1.3 Invoking Embedded Objects as Renderers
7.1.4 Invoking Other Applications
7.1.5 Mixing and Linking MathML and HTML
7.2 Generating, Processing and Rendering MathML
7.2.1 MathML Compliance
7.2.2 Handling of Errors
7.2.3 An Attribute for Unspecified Data
7.3 Future Extensions
7.3.1 Macros and Style Sheets
7.3.2 XML Extensions to MathML
To be effective, MathML must work well with a wide variety of renderers, processors, translators and editors. This chapter addresses some of the interface issues involved in generating and rendering MathML. Since MathML exists primarily to encode mathematics in Web documents, perhaps the most important interface issues are related to embedding MathML in HTML.
There are three kinds of interface issues that arise in embedding MathML in HTML. First, MathML must be semantically integrated into HTML. Browsers must recognize MathML markup as embedded XML content, and not as an HTML syntax error. This is primarily a question of managing namespaces in XML.
Second, MathML rendering must be integrated into browser software. Some browsers already implement MathML rendering natively, and one can expect more browsers will do so in the future. At the same time, other browsers have developed infrastructure to facilitate the rendering of MathML and other embedded XML content by embedded elements. While substantial progress has been made, further improvement in coordination between browsers and embedded elements will be necessary. For example, better support for coordinating initialization and size negotiation is needed, as is better support for high-resolution printing.
Third, other tools for generating and processing MathML must be able to intercommunicate. A number of MathML tools have been or are being developed, including editors, translators, computer algebra systems, and other scientific software. However, since MathML expressions tend to be lengthy, and prone to error when entered by hand, special emphasis must be given to ensuring that MathML can be easily generated by user-friendly conversion and authoring tools, and that these tools work together in a dependable, platform- and vendor-independent way.
The W3C Math working group is committed to providing support to software vendors developing all kinds of MathML tools. The working group monitors the public www-math@w3.org mailing list, and will attempt to answer questions about the MathML specification. The working group also intends to try to stimulate the formation of MathML developer and user groups. For current information about MathML tools, applications and user support activities, consult the W3C Math home page.
MathML specifies a single top-level math element, which encapsulates each instance of MathML markup within an HTML page. As such, the math element provides an attachment point for information which affects a MathML expression as a whole.
In practice, the math element also serves as the interface for embedding MathML in HTML. In this capacity, the math element simultaneously signals the semantic inclusion of MathML (XML) content in HTML, and provides the necessary machinery for rendering its content in a browser either by invoking an embedded element, or by specifying parameters for a native renderer in the browser. Both semantic inclusion and rendering present a number of issues that extend beyond the scope of this specification.
In order to produce a complete and self-contained description of MathML, this document only specifies the attributes and usage of the math element as a top-level element for MathML, and not as an interface element. The W3C Math working group will continue working closely with other World Wide Web Consortium activities to ensure that emerging standards for embedding XML in HTML accommodate seamless integration of MathML in HTML. Section 7.1.2 lists requirements which an interface element for MathML would have to meet in order to fully integrate MathML into HTML. However, it is important to note that the MathML specification is independent of embedding mechanisms.
7.1.1 The Top-Level math Element
As stated above, MathML specifies a single top-level math element. All other MathML content must be contained in a math element; equivalently, every valid, complete MathML expression must be contained in <math> tags. The math element must always be the outermost element in a MathML expression; it is an error for one math element to contain another.
Applications which return subexpressions of other MathML expressions, for example as the result of a cut-and-paste operation, should always wrap them in <math> tags. The presence of enclosing <math> tags should be a reasonable heuristic test for MathML content. Similarly, applications which insert MathML expressions in other MathML expressions must take care to remove the <math> tags from the inner expressions.
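The wrapping and unwrapping rules above can be sketched in a few lines of Python. This is only an illustration using the standard xml.etree.ElementTree module; the function names wrap_in_math and insert_expression are hypothetical, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET


def wrap_in_math(fragment: ET.Element) -> ET.Element:
    """Return the fragment enclosed in a top-level math element."""
    if fragment.tag == "math":
        return fragment  # already a complete expression
    math = ET.Element("math")
    math.append(fragment)
    return math


def insert_expression(parent: ET.Element, expr: ET.Element) -> None:
    """Append expr to parent, dropping an outer math element if present."""
    if expr.tag == "math":
        # A math element may not contain another math element,
        # so only its children are carried over.
        parent.extend(list(expr))
    else:
        parent.append(expr)


# A subexpression copied out of a larger expression.
copied = ET.fromstring("<msup><mi>x</mi><mn>2</mn></msup>")
wrapped = wrap_in_math(copied)
print(ET.tostring(wrapped, encoding="unicode"))
# <math><msup><mi>x</mi><mn>2</mn></msup></math>

# Pasting a complete expression into an mrow of another expression.
target = ET.fromstring("<math><mrow><mn>1</mn><mo>+</mo></mrow></math>")
insert_expression(target.find("mrow"), wrapped)
print(ET.tostring(target, encoding="unicode"))
# <math><mrow><mn>1</mn><mo>+</mo><msup><mi>x</mi><mn>2</mn></msup></mrow></math>
```

An application that exports to the clipboard would call something like wrap_in_math, while one that pastes into an existing expression would call something like insert_expression.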
The math element can contain an arbitrary number of children schemata. The children schemata render by default as if they were contained in an mrow element.
The attributes of the math element are:
macros: The macros attribute is provided to make possible future development of more streamlined, MathML-specific macro mechanisms.
mode: The mode attribute specifies whether the enclosed MathML expression should be rendered in a display style or an in-line style. The default is mode="inline". This attribute is deprecated in favor of the standard CSS2 `display' property with the analogous block and inline values.
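As a rough illustration of how the deprecated mode attribute relates to the CSS2 `display' property, the following Python sketch maps one to the other. It assumes the two mode values are "display" and "inline"; the helper name css_display is hypothetical.

```python
import xml.etree.ElementTree as ET

# mode value -> analogous CSS2 display value
MODE_TO_DISPLAY = {"display": "block", "inline": "inline"}


def css_display(math: ET.Element) -> str:
    """Return the CSS display value implied by the math element's mode attribute."""
    mode = math.get("mode", "inline")  # inline is the stated default
    return MODE_TO_DISPLAY[mode]


print(css_display(ET.fromstring('<math mode="display"><mi>x</mi></math>')))  # block
print(css_display(ET.fromstring("<math><mi>x</mi></math>")))  # inline
```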
The top-level math element described in the preceding section is concerned with encapsulating MathML content and defining attributes which affect the entire enclosed expression. It is, in a sense, `inward looking'. However, to render MathML properly in a browser, and to integrate it properly into an HTML document, an `outward looking' interface element is also required. This interface element must be aware of its surrounding environment, and provide a mechanism for passing information between the browser and the MathML renderer.
As noted above, the MathML interface element and the MathML top-level element are in practice one and the same. The math element must serve both to encapsulate MathML content, and admit additional attributes for controlling how a MathML renderer should interact with the surrounding context, typically a browser.
While general mechanisms for embedding XML in HTML are beginning to be deployed, wide variations in strategy and level of implementation remain between vendors. Consequently, the remainder of this section describes attributes and functionality that would be highly desirable in a MathML interface element. In the near term, implementors attempting to provide interim solutions for rendering MathML in browsers should try to give authors some way of passing the following interface attributes to the renderer:
Attributes which apply to the MathML interface element necessarily take effect when the document is first loaded, and therefore suffer the limitation that they cannot change in response to reader interaction unless they are exposed in the Document Object Model (http://www.w3.org/TR/WD-DOM-Level-2) and subject to programmatic control. The height and width attributes are good examples; if the reader changes the current font size, the height and width of the embedded mathematical fragments also need to change.
At present, browser support for the DOM, and embedded element access to the DOM, is too limited to provide acceptable rendering for MathML. The W3C Math working group is working closely with the Document Object Model working group in an effort to provide better communication between embedded MathML renderers and browsers (see appendix E [Document Object Model for MathML]).
The basic requirements for communication between an embedded MathML renderer and a browser include:
In browsers where MathML is not natively supported, we anticipate that MathML rendering will be carried out via embedded objects such as plug-ins, applets, or helper applications. In the near term, the W3C Math working group advocates the use of MIME types to bind embedded MathML to renderers. Mechanisms for assigning MIME types already exist in HTML, and mechanisms for registering and automatically invoking embedded elements such as plug-ins based on MIME type already exist in Web browsers.
The type attribute, described in the previous section as a requirement for the MathML interface element, is intended to associate a MIME type with its content. The HTML element META is proposed as a means of specifying document-wide default MIME types for an element.
We propose a simple MIME type naming convention which is flexible enough to accommodate several common situations:
We propose that generic MathML be assigned the MIME type text/mathml, and for browser registry, we suggest the standard file extension .mml be used. To invoke specific renderers, we suggest assigning a MIME type of the following format:
text/mathml-renderer
A user downloads and installs renderer A, and registers it with the browser for the text/mathml MIME type to process generic MathML. However, renderer A also accepts TeX as an input syntax, and therefore during the installation process, it requests to be registered for application/x-tex as well. Later, the user discovers renderer B provides additional features, such as cut and paste capability. Therefore, the user downloads, installs and registers renderer B for the text/mathml-rendererB MIME type.
An author then creates a document that contains the following line in the document header:
<META Content-math-Type="text/mathml">
Later, the document contains the following expressions:
<math>
  <msup><mi>x</mi><mn>2</mn></msup>
</math>
<math type="text/mathml-rendererB">
  <mi>α</mi><mo>=</mo><mn>0.4</mn>
</math>
When our hypothetical reader views this document, renderer A is invoked to process the first expression, while renderer B is invoked for the second. Later, when our hypothetical reader views a document with MIME type application/x-tex, renderer A is again invoked, this time in TeX processing mode.
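The scenario above amounts to a lookup from MIME type to registered renderer, with the type attribute of an individual math element overriding the document-wide default. The Python sketch below illustrates that lookup. The REGISTRY table and the choose_renderer function are hypothetical, and the fallback to the generic text/mathml renderer for unregistered types is an assumption, not something the proposal specifies.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry built up as the reader installs renderers.
REGISTRY = {
    "text/mathml": "renderer A",
    "application/x-tex": "renderer A",
    "text/mathml-rendererB": "renderer B",
}


def choose_renderer(math: ET.Element, document_default: str) -> str:
    """Pick a renderer from the element's type attribute, else the document default."""
    mime_type = math.get("type", document_default)
    # Assumption: fall back to the generic renderer if the type is not registered.
    return REGISTRY.get(mime_type, REGISTRY["text/mathml"])


first = ET.fromstring("<math><msup><mi>x</mi><mn>2</mn></msup></math>")
second = ET.fromstring(
    '<math type="text/mathml-rendererB"><mi>α</mi><mo>=</mo><mn>0.4</mn></math>'
)

# The document header declared text/mathml as the default MIME type.
print(choose_renderer(first, "text/mathml"))   # renderer A
print(choose_renderer(second, "text/mathml"))  # renderer B
```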
Although rendering MathML expressions typically occurs in place in a Web browser, other MathML processing functions take place more naturally in other applications. Particularly common tasks include opening a MathML expression in an equation editor or computer algebra system.
At present, there is no standard way of specifying that embedded content should be rendered with one application, edited in another, and evaluated by a third. As work progresses on coordination between browsers and embedded elements and the Document Object Model (DOM), providing this kind of functionality should be a priority. Both authors and readers should be able to indicate a preference about what MathML application to use in a given context. For example, one might imagine that some mouse gesture over a MathML expression would cause a browser to present the reader with a pop-up menu, showing the various kinds of MathML processing available on the system, and the MathML processors recommended by the author.
Since MathML will probably be widely generated by authoring tools, it is particularly important that opening a MathML expression in an editor should be easy to do and to implement. In many cases, it will be desirable for an authoring tool to record some information about its internal state along with a MathML expression, so that an author can pick up editing where he or she left off. The MathML specification does not explicitly contain provisions for recording authoring tool information. In some circumstances, it may be possible to include authoring tool information which applies to an entire document as metadata; interested readers are encouraged to consult the W3C Metadata Activity for current information about metadata and resource definition. For encoding authoring tool state information that applies to a particular MathML instance, readers are referred to the possible use of the semantics element for this purpose.
In order to be fully integrated into HTML, it should be possible not only to embed MathML in HTML, but also to embed HTML in MathML. However, the problem of supporting HTML in MathML presents many difficulties. Moreover, the problems are not specific to MathML; they are problems for XML applications in HTML generally. Therefore, at present, the MathML specification does not permit any HTML elements within a MathML expression, although this may be subject to change in a future revision of MathML, when mechanisms for embedding XML in HTML have been further developed.
In most cases, HTML elements either do not apply in mathematical contexts (headings, paragraphs, lists, etcetera), or MathML already provides equivalent or better functionality specifically tailored to mathematical content (tables, style changes, etcetera). However, there are two notable exceptions.
MathML has no element which corresponds to the HTML anchor element a. In HTML, anchors are used both to make links, and to provide locations to link to. MathML, as an XML application, defines links by the use of the XLink mechanism. However, MathML at present does not provide a way for other documents to make links into a MathML expression. One reason for this omission is that linking into embedded XML content is better addressed as part of a general mechanism for embedding XML in HTML. Moreover, until browsers either natively implement MathML rendering, or substantially better coordination between embedded elements and browsers becomes possible, there is no reasonable way of implementing links into MathML expressions.
MathML linking elements are generic XML linking elements as described in the XML Linking Language (XLink) working draft. The reader is cautioned that this is at present still a working draft, and is therefore subject to future revision. Since the MathML linking mechanism is defined in terms of the XML linking specification, the same proviso holds for it as well.
A MathML element is designated as a link by the presence of the xlink:href attribute. To use the xlink:href attribute, it is also necessary to declare the xlink namespace. Thus, a typical MathML link might look like:
<mrow xmlns:xlink="http://www.w3.org/XML/XLink/0.9" xlink:href="sample.xml"> ... </mrow>
Issue (add-xlink-to-DTD): If we say this, we ought to add these attributes to all linkable elements in the DTD. See section 5.1 of the XLink working draft.
MathML designates that almost all elements can be used as an XML linking element. The only elements which cannot serve as linking elements are those such as the <sep/> element which exist primarily to disambiguate other MathML constructs and in general do not correspond to any part of a typical visual rendering. The full list of exceptional elements which cannot be used as linking elements is given below in table 7.1.5.1.
<mprescripts/>
<none/>
<sep/>
<power/>
<malignmark/>
<maligngroup/>
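A processor might use the exclusion list in table 7.1.5.1 to separate acceptable links from misplaced ones. The following Python sketch is one possible way to do so; the function name find_links and the sample file names are hypothetical, and the namespace URI is the working-draft one used in the example above.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/XML/XLink/0.9"
HREF = "{" + XLINK_NS + "}href"  # ElementTree's expanded attribute name

# Elements listed in table 7.1.5.1 as unable to serve as links.
NON_LINKING = {"mprescripts", "none", "sep", "power", "malignmark", "maligngroup"}


def find_links(root):
    """Return (valid, invalid) lists of (tag, href) pairs found under root."""
    valid, invalid = [], []
    for element in root.iter():
        href = element.get(HREF)
        if href is None:
            continue
        bucket = invalid if element.tag in NON_LINKING else valid
        bucket.append((element.tag, href))
    return valid, invalid


doc = ET.fromstring(
    '<math xmlns:xlink="http://www.w3.org/XML/XLink/0.9">'
    '<mrow xlink:href="sample.xml"><mi>x</mi></mrow>'
    '<sep xlink:href="bad.xml"/>'
    "</math>"
)
valid, invalid = find_links(doc)
print(valid)    # [('mrow', 'sample.xml')]
print(invalid)  # [('sep', 'bad.xml')]
```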
The IMG element has no MathML equivalent. The decision to omit a general image inclusion mechanism in MathML was based on several factors. First, a simple mechanism for including images in MathML along the lines of the IMG element would not be more closely tied to mathematical content or notation than the HTML IMG element itself. Therefore, such an element would likely be superseded by the IMG element if it becomes possible to mix XML and HTML generally.
Another reason for not providing an image facility is that MathML takes great pains to make the notational structure and mathematical content it encodes easily available to processors while information contained in images is only available to a human reader looking at a visual representation. Thus, for example, in the MathML paradigm, it would be preferable to introduce new glyphs by the creation of special symbol fonts, rather than simply including them as images.
Finally, apart from the introduction of new glyphs, many of the situations where one might be inclined to use an image amount to some sort of labeled diagram. For example, knot diagrams, Venn diagrams, Dynkin diagrams, Feynman diagrams and complicated commutative diagrams all fall into this category. As such, their content would be better encoded via some combination of structured graphics and MathML markup. Because of the generality of the `labeled diagram' construction, the definition of a markup language to encode such constructions extends beyond the scope of the W3C Math activity. (See http://www.w3.org/Graphics for further W3C activity in this area.)
Information is increasingly generated, processed and rendered by software tools. The exponential growth of the Web is fueling the development of advanced systems for automatically searching, categorizing, and interconnecting information. Thus, although MathML can be written by hand and read by humans, the future of MathML is also tied to the ability to process it with software tools.
There are many different kinds of MathML editors, translators, processors and renderers. What it means to support MathML varies widely between applications. For example, the issues that arise with a MathML-compliant validating parser are very different from those for a MathML-compliant equation editor.
In this section, guidelines are given for describing different types of MathML support, and for quantifying the extent of MathML support in a given application. Developers, users and reviewers are encouraged to use these guidelines in characterizing products. The intention behind these guidelines is to facilitate reuse and interoperability between MathML applications by accurately characterizing their capabilities in quantifiable terms.
A well-formed MathML expression is an XML construct determined by the MathML DTD together with the additional requirements given in the specifications of the MathML document.
We define a `MathML processor' to mean any application that can accept, produce, or `round-trip' a well-formed MathML expression. An example of an application that round-trips a MathML expression is an editor that writes a new file even though no modifications are made.
We specify three forms of MathML compliance:
Beyond the above definitions, the MathML specification makes no demands of individual processors. In order to guide developers, the MathML specification includes advisory material; for example, there are suggested rendering rules included in Chapter 3. However, in general, developers are given wide latitude in interpreting what kind of MathML implementation is meaningful for their own particular application.
To clarify the difference between compliance and interpretation of what is meaningful, consider some examples:
As the previous examples show, to be useful, the concept of MathML compliance frequently involves a judgment about what parts of the language are meaningfully implemented, as opposed to parts that are merely processed in a technically correct way with respect to the definitions of compliance. This requires some mechanism for giving a quantitative statement about which parts of MathML are meaningfully implemented by a given application. To this end, the W3C Math Working Group has provided a test suite of MathML expressions at http://www.w3.org/Math/testsuite.
The test suite consists of a large number of MathML expressions categorized by markup category and dominant MathML element being tested. The existence of this test suite makes it possible, for example, to characterize quantitatively the hypothetical computer algebra interface mentioned above by saying that it is a MathML-input compliant processor which meaningfully implements MathML content markup, including all of the expressions given under http://www.w3.org/testsuite/tests/4.
Developers who choose not to implement parts of the MathML specification in a meaningful way are encouraged to itemize the parts they leave out by referring to specific categories in the test suite.
For MathML-output-compliant processors, there is also a MathML validator online at http://www.w3.org/Math/validator. Developers of MathML-output-compliant processors are encouraged to verify their output using this validator.
Customers of MathML applications who wish to verify claims as to which parts of the MathML specification are implemented by an application are encouraged to use the test suites as a part of their decision processes.
If a MathML-input-compliant application receives input containing one or more elements with an illegal number or type of attributes or children schemata, it should nonetheless attempt to render all the input in an intelligible way, i.e. to render normally those parts of the input which were well-formed, and to render error messages (rendered as if enclosed in an <merror> element) in place of ill-formed expressions.
MathML-output-compliant applications such as editors and translators may choose to generate <merror> expressions to signal errors in their input. This is usually preferable to generating well-formed, but possibly erroneous, MathML.
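To make the error-handling behavior concrete, here is a minimal Python sketch that copies an expression and substitutes an <merror> element for any subexpression with the wrong number of children. The REQUIRED_CHILDREN table covers only a few elements and is illustrative; the function name mark_errors and the wording of the error message are hypothetical.

```python
import xml.etree.ElementTree as ET

# Required child counts for a few presentation elements (partial, illustrative).
REQUIRED_CHILDREN = {"msup": 2, "msub": 2, "mfrac": 2, "msubsup": 3}


def mark_errors(element: ET.Element) -> ET.Element:
    """Return a copy of element in which ill-formed subexpressions become merror."""
    expected = REQUIRED_CHILDREN.get(element.tag)
    if expected is not None and len(element) != expected:
        error = ET.Element("merror")
        error.tail = element.tail
        message = ET.SubElement(error, "mtext")
        message.text = (
            f"{element.tag}: expected {expected} children, found {len(element)}"
        )
        return error
    # Well-formed at this level: copy the element and check its children.
    result = ET.Element(element.tag, dict(element.attrib))
    result.text = element.text
    result.tail = element.tail
    for child in element:
        result.append(mark_errors(child))
    return result


source = ET.fromstring(
    "<math><mrow><msup><mi>x</mi></msup><mo>+</mo><mn>1</mn></mrow></math>"
)
print(ET.tostring(mark_errors(source), encoding="unicode"))
# <math><mrow><merror><mtext>msup: expected 2 children, found 1</mtext></merror><mo>+</mo><mn>1</mn></mrow></math>
```

The well-formed parts (the operator and the number) pass through unchanged, while only the faulty msup is replaced.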
The MathML attributes described in the MathML specification are necessary for display and content markup. Ideally, the MathML attributes should be an open-ended list so that users could add specific attributes for specific renderers. However, this can't be done within the confines of a single XML DTD. Although it can be done using extensions of the standard DTD, some authors will wish to use nonstandard attributes while remaining strictly in compliance with the standard DTD.
To allow this, this specification also allows the attribute other="..." for all elements, for use as a hook to pass on renderer-specific information. In particular, it can be used as a hook for passing information to audio renderers, computer algebra systems, and for pattern matching in any future macro/extension mechanism. This idea is used in other languages. For example, PostScript comments are widely used to pass information that is not part of PostScript.
At the same time, the intent of the other attribute is not to encourage software developers to use this as a loophole for circumventing the MathML core markup conventions. We trust both authors and applications will use the other attribute judiciously.
The value of the other attribute should be a string containing an attribute list in valid XML format (i.e. attr1="val1" attr2="val2" ..., with appropriate escaping of the double quotes). Renderers which accept nonstandard attributes directly should also accept them when they occur within the string value of the other attribute. This is not required for attributes specifically documented by the MathML standard.
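Because the value of other is itself an XML attribute list, a renderer can recover the packed attributes by handing the string back to an XML parser. The Python sketch below shows one way this might be done. The attribute names speech-rate and cas-name are invented for illustration, the function names are hypothetical, and letting directly written attributes take precedence is an assumption.

```python
import xml.etree.ElementTree as ET


def parse_other(element: ET.Element) -> dict:
    """Return the attributes packed into the element's other attribute."""
    packed = element.get("other")
    if not packed:
        return {}
    # Reuse the XML parser by placing the list on a throwaway element.
    holder = ET.fromstring("<holder " + packed + "/>")
    return dict(holder.attrib)


def effective_attributes(element: ET.Element) -> dict:
    """Merge directly written attributes with those carried in other."""
    merged = parse_other(element)
    for name, value in element.attrib.items():
        if name != "other":
            merged[name] = value  # assumption: direct attributes win
    return merged


# The double quotes inside the value are escaped as &quot; in the source.
mi = ET.fromstring(
    '<mi other="speech-rate=&quot;slow&quot; cas-name=&quot;x1&quot;">x</mi>'
)
print(parse_other(mi))  # {'speech-rate': 'slow', 'cas-name': 'x1'}
```

A malformed attribute list makes ET.fromstring raise a ParseError, which a renderer could report using the error-handling conventions of the previous section.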
MathML is in its infancy; it is to be expected that MathML will need to be extended and revised in various ways. Some of these extensions can be easily foreseen; as noted repeatedly in this chapter, the mechanisms for fully integrating MathML into HTML are not yet developed, and these mechanisms may have a significant impact on some aspects of MathML.
Similarly, there are several kinds of functionality that are fairly obvious candidates for future MathML extensions. These include macros, style sheets, and perhaps a general `labeled diagram' facility. However, there will also no doubt be other desirable extensions to MathML which will only emerge as MathML is widely used. For these extensions, the W3C Math working group relies on the extensible architecture of XML, and the common sense of the larger Web community.
The development of style sheet mechanisms for XML is part of the ongoing XML activity at the World Wide Web Consortium. Both XSL and CSS are working to incorporate greater support for mathematics. Further, XSL can be used to provide basic macro capability as well.
Macros, however, play a very important and useful role in encoding mathematical content and meaning. Moreover, it is difficult to devise a coherent, general macro system for MathML, because there are so many distinct applications for MathML macros. Therefore, a good direction for further work is the definition of a macro mechanism specifically tailored to MathML, in addition to participating in general ongoing XML style sheet and macro facility activities.
Some of the possible uses of MathML macros include:
<msubsup> element as `second derivative with respect to x of f'.
The set of elements and attributes specified in the MathML specification is necessary for rendering common mathematical expressions. It is recognized that not all mathematical notation is covered by this set of elements, that new notations are continually invented, and that sub-communities within mathematics often have specialized notations. Furthermore, the explicit extension of a standard is a necessarily slow and conservative process. This implies that the MathML standard could never explicitly cover all the presentational forms used by every sub-community of authors and readers of mathematics, much less encode all mathematical content.
In order to facilitate the use of MathML by the widest possible audience, and to enable its smooth evolution to encompass more notational forms and more mathematical content (perhaps eventually covered by explicit extensions to the standard), the set of tags and attributes is open-ended, in the sense described in this section.
MathML is described by an XML-compliant DTD, which necessarily limits the elements and attributes to those which occur in the DTD. Renderers desiring to accept nonstandard elements or attributes, and authors desiring to include these in documents, should accept or produce documents which conform to an appropriately extended XML-compliant DTD which has the standard MathML DTD as a subset.
MathML compliant renderers are allowed, but not required, to accept nonstandard elements and attributes, and to render them in any way. If a renderer does not accept some or all nonstandard tags, it is encouraged to either handle them as errors as described above for elements with the wrong number of arguments, or to render their arguments as if they were arguments to an mrow, in either case rendering all standard parts of the input normally.
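The second option, rendering the arguments of a nonstandard element as if they were arguments to an mrow, can be sketched as a tree rewrite performed before rendering. In this Python illustration the STANDARD set is a small, deliberately incomplete sample of element names, the mybox element is invented, and dropping the attributes of a renamed element is an assumption.

```python
import xml.etree.ElementTree as ET

# Small illustrative subset of standard MathML element names.
STANDARD = {
    "math", "mrow", "mi", "mn", "mo", "mtext",
    "msup", "msub", "mfrac", "merror",
}


def normalize(element: ET.Element) -> ET.Element:
    """Copy element, renaming nonstandard elements to mrow."""
    known = element.tag in STANDARD
    result = ET.Element(
        element.tag if known else "mrow",
        dict(element.attrib) if known else {},  # assumption: drop unknown attributes
    )
    result.text = element.text
    result.tail = element.tail
    for child in element:
        result.append(normalize(child))
    return result


source = ET.fromstring(
    '<math><mybox color="red"><mi>x</mi><mo>+</mo><mn>1</mn></mybox></math>'
)
print(ET.tostring(normalize(source), encoding="unicode"))
# <math><mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow></math>
```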