WD-DOM/level-one-core-971209

Document Object Model (Core)
Level 1

W3C Working Draft 9-December-1997

This version: http://www.w3.org/TR/WD-DOM/level-one-core-971209
Latest version: http://www.w3.org/TR/WD-DOM/level-one-core
Previous version: http://www.w3.org/TR/WD-DOM/level-one-core-971009
WG Chair: Lauren Wood, SoftQuad, Inc.
Editor: Steve Byrne, JavaSoft (until November 19, 1997); Mike Champion, ArborText (from November 20, 1997)
Principal contributors: Vidur Apparao, Netscape; Steve Byrne, JavaSoft (until November 19, 1997); Bill Smith, Sun (after November 20, 1997); Mike Champion, ArborText, Inc.; Scott Isaacs, Microsoft; Arnaud Le Hors, W3C; Gavin Nicol, INSO; Peter Sharpe, SoftQuad, Inc.; Jared Sorensen, Novell; Bob Sutor, IBM

Abstract

The Document Object Model (DOM) level one provides a mechanism for software developers and web script authors to access and manipulate parsed HTML and XML content. All markup as well as any document type declarations are made available. Level one also allows creation "from scratch" of entire web documents in memory; saving those documents persistently is left to the programmer. DOM Level one is intentionally limited in scope to content representation and manipulation; rendering, validation, externalization etc. are deferred to higher levels of the DOM.

Status of this document

This document is part of the Document Object Model Specification.

Introduction

This specification defines a minimal set of objects and interfaces for accessing and manipulating document objects. The functionality specified in this draft (the "core" functionality) should be sufficient to implement higher level operations, such as querying and filtering a document. Future drafts will add "utility" operations which can be implemented in terms of the core operations, but which may be implemented more efficiently using implementation-specific mechanisms that fall outside of the scope of this specification.

Intended Audience

The intended audience for this document is web script authors, software developers, and DOM implementation providers. This document presumes familiarity with concepts and terminology from HTML and XML, as well as object oriented programming. The actual DOM specification is provided in the Object Management Group's Interface Definition Language (IDL); experience with Java or C++ syntax should be sufficient to allow comprehension.

Design goals for the DOM Level One specification

The DOM objects and interfaces are designed to be:

sufficient for representing the content of parsed HTML and XML documents without loss of [significant] information. The supported HTML version is 4.0; the supported XML version is 1.0.
sufficient to construct an entirely new document instance programmatically that is identical to the parsed form of a given HTML or XML document. This means that it has sufficient constructive power to build any useful document object hierarchy, and that an implementation could be written such that the external document parser merely calls the methods specified in the level one specification to build the object hierarchy.
the foundation for the rest of the document object model levels, which means it must be simple, flexible, and extensible.
thread-safe: The operations supported by the DOM will not corrupt the document object or return corrupted state (as far as this API is concerned). Higher level consistency support mechanisms such as explicit locks or transactions are outside of the scope of the level one specification. For level one of the DOM, the assumption is that only one thread operates on the document at a time.

Note: In the current specification, some operations can modify the document tree, but there is no model for handling concurrent access. The WG also recognises that in some situations, a document, or some of its components, will not be modifiable, and a method for dealing with such situations needs to be defined.

Overall type hierarchy

The Document Object Model defines a representation of a hierarchy of objects (i.e. a tree, or "structure model"), called "nodes". The object hierarchy is typically created from a source representation such as HTML or XML via some implementation-specific mechanism that falls outside the scope of this specification.

Primary object model types

Node
  |
  +--Document
  |
  +--Element
  |
  +--Attribute
  |
  +--Text
  |
  +--Comment 
  |
  +--PI

Auxiliary types

These types are "helpers" which can appear in various parts of the DOM. Some of them occur quite frequently in common usage; others are limited to the Document Type Definition section of the document, and thus may be of little interest to typical DOM users. Those types are marked accordingly.

NodeList: Represents a (possibly) lazily evaluated set of nodes.
EditableNodeList: A subtype of NodeList which allows for its set of nodes to be changed.
NodeEnumerator: Used for iterating over (enumerating) a set of nodes.
AttributeList: Represents a collection of Attribute objects, indexed by attribute name.
DocumentContext: A repository for meta-data about a document, such as source, creation date, and other creation context information. This object will be fully specified in the level two DOM specification.
DOM: Provides document object model independent meta-operations such as retrieving the object factory or making inquiries about specific supported versions of HTML and XML within a particular document object model implementation.
DOMFactory: The mechanism for creating new DOM objects to populate a specific object model with.
Range: [Being discussed for possible inclusion in a later draft of level one] Ranges represent a part of the document, potentially including markup.

Entities and the DOM Core

In the DOM core, there are no objects representing entities. Numeric character references and references to the pre-defined entities in HTML and XML are replaced by the single character that makes up the entity's replacement. For example, in:

   <p>This is a dog &amp; a cat</p>

the "&amp;" will be replaced by the character "&", and the text in the <p> element will form a single continuous sequence of characters. The representation of general entities, both internal and external, is defined within the XML-specific portion of the level one specification.

Note: When a DOM representation of a document is converted to its textual form as XML or HTML, applications will need to check each character in text data to see if it needs to be escaped using a numeric or pre-defined entity. Failing to do so could result in invalid HTML or XML.
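
The check described in the note above might be sketched in Java as follows. The class and method names (TextEscaper, escapeText) are hypothetical and not part of the DOM; the sketch escapes only "&", "<" and ">" and assumes that this is sufficient for text appearing as element content.

```java
// Hypothetical helper, not part of the DOM: escapes character data
// before it is written out as XML or HTML element content.
final class TextEscaper {
    private TextEscaper() {}

    // Replaces each markup-significant character with a pre-defined entity.
    static String escapeText(String data) {
        StringBuilder out = new StringBuilder(data.length());
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("<p>" + escapeText("This is a dog & a cat") + "</p>");
    }
}
```

Attribute values would need additional care (for example, escaping the quote character used as the delimiter), which this sketch does not attempt.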

IDL Issues

The primary Document Object Model type definitions are presented using the Object Management Group's Interface Definition Language (IDL, ISO standard 14750). While a complete tutorial on IDL is beyond the scope of this document, a few key items deserve explicit mention:

In the DOM's IDL definition, within each interface there are primarily two kinds of things:
1. Attribute definitions
2. Method (IDL calls them "operation") definitions
Attribute definitions are a shorthand notation for a pair of "get/set" or accessor/mutator methods, and should not be thought of, nor mapped in a particular language binding to, data members directly.
The "constructor" methods for the various IDL objects are not specified explicitly. Specific programming language implementations of the DOM will provide suitable object creation methods; typically these will be simple no-argument constructors, although this is not a requirement.
IDL's long datatype represents 32-bit signed integers. In other language bindings, for example Java, this would be mapped to the Java int datatype.

More information on IDL is available from OMG and in chapter 3 of the CORBA 2.0 specification.

Note: The Object Management Group Interface Definition Language (OMG IDL) was chosen as it was designed for specifying language and implementation-neutral interfaces. Various other IDLs could be used; the use of OMG IDL does not imply a requirement to use a specific object binding runtime.
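
As an illustration of the attribute mapping described above, an IDL attribute such as "attribute wstring data" could surface in a Java binding as an accessor/mutator pair. This is a sketch under assumptions: the names CommentSketch and SimpleComment are hypothetical, and an actual language binding may choose different names or signatures.

```java
// Assumed IDL fragment (illustrative only):
//   interface Comment { attribute wstring data; };
// A Java binding would typically expose the attribute as an
// accessor/mutator pair rather than as a public data member.
interface CommentSketch {
    String getData();
    void setData(String data);
}

// Hypothetical implementation; how the value is stored is an
// implementation detail hidden behind the two methods.
final class SimpleComment implements CommentSketch {
    private String data;

    SimpleComment(String data) { this.data = data; }

    @Override public String getData() { return data; }
    @Override public void setData(String data) { this.data = data; }
}

final class IdlMappingDemo {
    public static void main(String[] args) {
        CommentSketch c = new SimpleComment("draft");
        c.setData("final");
        System.out.println(c.getData());
    }
}
```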

Primary API Types

The types described in this section are those that application programmers using the DOM will encounter most frequently. A good working knowledge of these types will be sufficient to accomplish most tasks.

Node

Node is the base type of most objects in the Document Object Model. It may have an arbitrary number (including zero) of sequentially ordered child nodes. It usually has a parent Node; the exception being that the root Node in a document hierarchy has no parent.

Element

Element objects represent the elements in the HTML or XML document. Elements contain, as child nodes, all the content between the start tag and the end tag of an element. Additionally, Element objects have a list of Attribute objects which represent the combination of those attributes explicitly specified in the document, and those defined in the document type definition which have default values.

Document

The Document object represents the root node of a document. It typically¹ has no parent; the getParentNode() method will return null.

Typical operation

Normally, a DOM-compliant implementation will make the main Document instance available to the application² through some implementation-specific mechanism. For example, a typical implementation would pass the application a reference to a DocumentContext object. From the DocumentContext, the application may retrieve the Document object, which is the root of the document object hierarchy.

Once the application has access to the root of the document object hierarchy, it can use the methods defined herein for accessing individual nodes, selecting specific node types such as all images, and so on.

Document Object Model APIs

This section describes the complete set of objects and methods defined by the Document Object Model. The general structure of these object definitions is:

A brief overview of the semantics of the object and how it relates to other objects in the DOM.
A sequence of <method signature, description> pairs for each method that is defined for the object.

Document

The Document object represents the entire HTML or XML document. Conceptually, it is the root of the document tree, and provides the primary access to the document's data.

Node documentType: For XML, this provides access to the Document Type Definition (see DocumentType) associated with this XML document. For HTML documents and XML documents without a document type definition this returns the value null.
Element documentElement: The element that's the root element for the given document. For HTML, this will be an Element instance whose tagName is "HTML"; for XML this is the outermost element, i.e. the element non-terminal in production [41] in Section 3 of the XML-lang specification.

NodeEnumerator getElementsByTagName(wstring name): Produces an enumerator which iterates over all of the Element nodes that are contained within the document whose tagName matches the given name. The iteration order is a depth first enumeration of the elements as they occurred in the original document.
Note: a future version of the DOM will provide a more generalized querying mechanism for Nodes. One such query involves obtaining all the Elements in a subtree with a given tagName. A convenience method for this query has been included in the core document. This method might be removed at a later date in favor of a more comprehensive querying mechanism.

DOM

The "DOM" interface provides a number of methods for performing operations that are independent of any particular instance of the document object model. The only operation currently supported is retrieving the factory object. It is expected, however, that other operations, such as querying for the version number of a particular DOM implementation or asking about the versions of HTML or XML supported by a particular DOM implementation, will also be present on this interface. Although IDL does not provide a mechanism for expressing the concept, the methods supplied by the DOM interface will be implemented as "static", or instance independent, methods. This means that a client application using the DOM does not have to locate a specific instance of the DOM object; rather, the methods will be available directly on the DOM class itself and so are directly accessible from any execution context.

DOMFactory getFactory(): Returns an object that implements the DOMFactory interface. Note that by providing an accessor function for retrieving the factory object, DOM implementations are empowered to return different factory instances under different conditions.
Note that in the future it is expected that there will be additional static methods on the DOM itself to allow for specification of which factory object to be returned.

DOMFactory

The methods on the DOMFactory interface allow DOM clients to create new DOM objects. An application developer who needed to create an entire document object model programmatically would use the methods on a DOMFactory object to build the individual objects that comprise the object model, and use the operations on the objects themselves to connect the objects into an overall document object model.

Document createDocument(): Create and return a new empty Document object.
DocumentContext createDocumentContext(): Create and return a new DocumentContext.
Element createElement(in wstring tagName, in AttributeList attributes): Create an element based on the tagName. Note that the instance returned may implement an interface derived from Element. The attributes parameter can be null if no attributes are specified for the new Element.
Text createTextNode(in wstring data): Create a Text node given the specified string.
Comment createComment(in wstring data): Create a Comment node given the specified string.
PI createPI(in wstring name, in wstring data): Create a PI node with the specified name and data string.
Attribute createAttribute(in wstring name, in NodeList value): Create an Attribute of the given name and specified value. Note that the Attribute instance can then be set on an Element using the setAttribute method.
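
The following Java sketch illustrates the factory-based construction described above: objects are obtained from a factory and then connected into a hierarchy. All types here (FNode, FDocument, FElement, FText, SketchFactory) are hypothetical, heavily reduced stand-ins; in particular, createElement omits the attributes parameter, and append stands in for insertBefore with a null refChild.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, reduced stand-ins for DOM types; only the pieces
// needed to show factory-based construction are modelled.
class FNode {
    final List<FNode> children = new ArrayList<>();
    FNode parent;

    // Appends newChild; stands in for insertBefore(newChild, null).
    FNode append(FNode newChild) {
        newChild.parent = this;
        children.add(newChild);
        return newChild;
    }
}

class FDocument extends FNode {}

class FElement extends FNode {
    final String tagName;
    FElement(String tagName) { this.tagName = tagName; }
}

class FText extends FNode {
    final String data;
    FText(String data) { this.data = data; }
}

// Sketch of a factory: callers never invoke constructors directly.
class SketchFactory {
    FDocument createDocument() { return new FDocument(); }
    FElement createElement(String tagName) { return new FElement(tagName); }
    FText createTextNode(String data) { return new FText(data); }
}

class FactoryDemo {
    // Builds the equivalent of <p>This is a dog &amp; a cat</p>.
    static FDocument build(SketchFactory f) {
        FDocument doc = f.createDocument();
        FElement p = f.createElement("p");
        p.append(f.createTextNode("This is a dog & a cat"));
        doc.append(p);
        return doc;
    }

    public static void main(String[] args) {
        FDocument doc = build(new SketchFactory());
        System.out.println(((FElement) doc.children.get(0)).tagName);
    }
}
```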

DocumentContext

The DocumentContext object represents information that is not strictly related to a document's content; rather, it provides the information about where the document came from, and any additional meta-data about the document. For example, the DocumentContext for a document retrieved using HTTP would provide access to the HTTP headers which were retrieved with the document, the URL that the document came from, etc.

For documents which were not retrieved via HTTP, or for those which were created directly in memory, there may be no DocumentContext.

NOTE: The DocumentContext interface described here is expected to be significantly expanded in the level two specification of the Document Object Model.

Document document: This is the root node of a Document Object Model. Any iteration, enumeration or other traversal of the entire document's content should begin with this node.

Node

The Node object is the primary datatype for the entire Document Object Model. It represents a single node in the document tree. Nodes may have, but are not required to have, an arbitrary number of child nodes.

NodeType getNodeType(): Returns an indication of the underlying Node object's type. The actual type of the returned data is language binding dependent; the IDL specification uses an enum, and it is expected that most language bindings will represent this runtime-queryable Node type using an integral data type. The names of the node type enumeration literals are straightforwardly derived from the names of the actual Node subtypes, and are fully specified in the IDL definition of Node in Appendix A.
Node getParentNode(): Returns the parent of the given Node instance. If this node is the root of the document object tree, null is returned. [Note: because in ECMAScript get/set method pairs are surfaced as properties, Parent would conflict with the pre-defined Parent property, so we disambiguate this with "ParentNode" even though it is inconsistent with the naming convention of the other methods that do not include "Node"].
NodeList getChildren(): Returns a NodeList object containing the children of this node. If there are no children, null is returned. The content of the returned NodeList is "live" in the sense that changes to the children of the Node object that it was created from will be immediately reflected in the set of Nodes the NodeList contains; it is not a static snapshot of the content of the Node. Similarly, changes made to the NodeList will be immediately reflected in the set of children of the Node that the NodeList was created from.
boolean hasChildren(): Returns true if the node has any children, false if the node has no children at all. This method exists both for convenience as well as to allow implementations to be able to bypass object allocation, which may be required for implementing getChildren().
Node getFirstChild(): Returns the first child of a node. If there is no such node, null is returned.
Node getPreviousSibling(): Returns the node immediately preceding the current node in a breadth-first traversal of the tree. If there is no such node, null is returned.
Node getNextSibling(): Returns the node immediately following the current node in a breadth-first traversal of the tree. If there is no such node, null is returned.
Node insertBefore(in Node newChild, in Node refChild) raises (NotMyChildException): Inserts a child node (newChild) before the existing child node refChild. If refChild is null, newChild is inserted at the end of the list of children. If refChild is not a child of the Node that insertBefore is being invoked on, a NotMyChildException is thrown.
Node replaceChild(in Node oldChild, in Node newChild) raises (NotMyChildException): Replaces the child node oldChild with newChild in the set of children of the given node, and returns the oldChild node. If oldChild was not already a child of the node that the replaceChild method is being invoked on, a NotMyChildException is thrown.
Node removeChild(in Node oldChild) raises (NotMyChildException): Removes the child node indicated by oldChild from the list of children and returns it. If oldChild was not a child of the given node, a NotMyChildException is thrown.
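
The child manipulation semantics described above can be sketched in Java as follows. TreeNode is a hypothetical, minimal class rather than an actual binding. Returning newChild from insertBefore is an assumption, since the text above does not say what is returned, and the sketch does not detach newChild from any previous parent.

```java
import java.util.ArrayList;
import java.util.List;

// Name taken from the draft; modelled here as an unchecked exception.
class NotMyChildException extends RuntimeException {}

// Hypothetical minimal node, not an actual DOM binding.
class TreeNode {
    private final List<TreeNode> children = new ArrayList<>();
    private TreeNode parent;
    final String label;

    TreeNode(String label) { this.label = label; }

    TreeNode getParentNode() { return parent; }
    boolean hasChildren() { return !children.isEmpty(); }
    TreeNode getFirstChild() { return children.isEmpty() ? null : children.get(0); }

    TreeNode getNextSibling() {
        if (parent == null) return null;
        int i = parent.children.indexOf(this);
        return i + 1 < parent.children.size() ? parent.children.get(i + 1) : null;
    }

    // A null refChild appends; otherwise refChild must be a child of this node.
    TreeNode insertBefore(TreeNode newChild, TreeNode refChild) {
        int at = children.size();
        if (refChild != null) {
            at = children.indexOf(refChild);
            if (at < 0) throw new NotMyChildException();
        }
        children.add(at, newChild);
        newChild.parent = this;
        return newChild; // assumption: the draft does not specify the result
    }

    // Swaps newChild into oldChild's position and returns oldChild.
    TreeNode replaceChild(TreeNode oldChild, TreeNode newChild) {
        int at = children.indexOf(oldChild);
        if (at < 0) throw new NotMyChildException();
        children.set(at, newChild);
        newChild.parent = this;
        oldChild.parent = null;
        return oldChild;
    }

    // Removes oldChild and returns it.
    TreeNode removeChild(TreeNode oldChild) {
        if (!children.remove(oldChild)) throw new NotMyChildException();
        oldChild.parent = null;
        return oldChild;
    }
}
```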

NodeList

The NodeList object provides the abstraction of an immutable ordered collection of Nodes, without defining or constraining how this collection is implemented, allowing different DOM implementations to be tuned for their specific environments.

The items in the NodeList are accessible via an integral index, starting from 0. A NodeEnumerator object may be created to allow simple sequential traversal over the members of the list.

NodeEnumerator getEnumerator(): Creates and returns an object which allows traversal of the nodes in the list in an iterative fashion. Note this method may be very efficient in some implementations; that is, they can return the enumerator instance even before the first node in the set has been located.
Node item(in unsigned long index) raises(NoSuchNodeException): Returns the indexth item in the collection. If index is greater than or equal to the number of nodes in the list, a NoSuchNodeException is thrown.
unsigned long getLength(): Returns the number of nodes in the NodeList instance. The range of valid child node indices is 0 to getLength()-1 inclusive.

EditableNodeList

EditableNodeList is a subtype of NodeList that adds operations that modify the list of nodes, such as adding, deleting and replacing Node instances in the list.

Node replace(in unsigned long index, in Node replacedNode) raises (NoSuchNodeException): Replaces the indexth item in the list with replacedNode, and returns the old node object at that index (null is returned if the index is equal to the previous number of nodes in the list). If index is greater than the number of nodes in the list, a NoSuchNodeException is thrown.
void insert(in unsigned long index, in Node newNode) raises (NoSuchNodeException): Inserts a child node into the list BEFORE zero-based location index. Nodes from index to the end of list are moved up by one. If index is 0, the node is added at the beginning of the list; if index is self.getLength(), the node is added at the end of the list.
Node remove(in unsigned long index) raises (NoSuchNodeException): Removes the node at index from the list and returns it. The indices of the members of the list which followed this node are decremented by one following the removal. If the index provided is larger than the number of nodes in the list, a NoSuchNodeException is thrown.
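
The index rules for EditableNodeList might be sketched in Java as shown below. SketchNodeList is hypothetical and stores plain Objects for brevity. Two points are interpretations rather than statements from the text: replace at an index equal to the length is treated as an append, and remove at an index equal to the length is treated as an error.

```java
import java.util.ArrayList;
import java.util.List;

// Name taken from the draft; modelled here as an unchecked exception.
class NoSuchNodeException extends RuntimeException {}

// Hypothetical array-backed list; items are plain Objects for brevity.
class SketchNodeList {
    private final List<Object> items = new ArrayList<>();

    int getLength() { return items.size(); }

    // Valid indices run from 0 to getLength()-1 inclusive.
    Object item(int index) {
        if (index < 0 || index >= items.size()) throw new NoSuchNodeException();
        return items.get(index);
    }

    // Inserts before index; index == getLength() adds at the end.
    void insert(int index, Object newNode) {
        if (index < 0 || index > items.size()) throw new NoSuchNodeException();
        items.add(index, newNode);
    }

    // Interpretation: index == getLength() appends and returns null.
    Object replace(int index, Object replacedNode) {
        if (index < 0 || index > items.size()) throw new NoSuchNodeException();
        if (index == items.size()) {
            items.add(replacedNode);
            return null;
        }
        return items.set(index, replacedNode);
    }

    // Interpretation: index == getLength() is treated as out of range.
    Object remove(int index) {
        if (index < 0 || index >= items.size()) throw new NoSuchNodeException();
        return items.remove(index);
    }
}
```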

NodeEnumerator

This class provides a generic iteration mechanism over an arbitrary collection of nodes. The nodes may be enumerated in either forward or reverse order, and the direction of enumeration may be changed at any time. The enumerator behaves as though it had an internal "pointer" to the current node, and provides methods for abstractly changing the notion of what the current node is.

Typical usage (in some C++-like language) might look like:

    NodeEnumerator nodeEnum = document.getChildren().getEnumerator();

    for (Node node = nodeEnum.getFirst(); node != null; node = nodeEnum.getNext()) {
        // ... do some computation on that node
    }

Node getFirst(): Returns the first node that the enumeration refers to, and resets the enumerator to reference the first node. If there are no nodes in the enumeration, null is returned.
NOTE: in some implementations this may or may not be a fast operation; it may be the case that the enumeration finds the requested node on demand, and for very large document object, this may take some time.
Node getNext(): Returns the next node in the enumeration, and advances the enumeration. Returns null after the last node in the list has been passed, and leaves the current pointer at the last node.
Node getPrevious(): Returns the previous node in the enumeration, and regresses the enumeration. Returns null after the first node in the enumeration has been returned, and leaves the current pointer at the first node.
Node getLast(): Returns the last node in the enumeration, and sets the enumerator to reference the last node in the enumeration. If the enumeration is empty, this method will return null. Doing a getNext() immediately after this operation will return null.
Node getCurrent(): This returns the node that the enumeration is currently referring to, without affecting the state of the enumeration object in any way. When invoked before any of the enumeration positioning methods above, the node returned will be the first node in the enumeration, or null if the enumeration is empty.
boolean atStart(): Returns true if the enumeration's "pointer" is positioned at the start of the set of nodes, i.e. if getCurrent() will return the same node as getFirst() would return. For empty enumerations, true is always returned. Does not affect the state of the enumeration in any way.
boolean atEnd(): Returns true if the enumeration's "pointer" is positioned at the end of the set of nodes, i.e. if getCurrent() will return the same node as getLast() would return. For empty enumerations, true is always returned. Does not affect the state of the enumeration in any way.
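
The pointer behaviour described for NodeEnumerator can be sketched in Java over a fixed list. SketchEnumerator is hypothetical, and strings stand in for nodes; a real implementation might locate nodes lazily rather than hold them all in a list.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical enumerator over a fixed list of labels standing in for nodes.
class SketchEnumerator {
    private final List<String> nodes;
    private int pos = 0; // index of the current node

    SketchEnumerator(List<String> nodes) { this.nodes = nodes; }

    // Resets to the first node; null if the enumeration is empty.
    String getFirst() { pos = 0; return getCurrent(); }

    // Moves to the last node; null if the enumeration is empty.
    String getLast() { pos = Math.max(nodes.size() - 1, 0); return getCurrent(); }

    // Returns null past the last node and leaves the pointer on the last node.
    String getNext() {
        if (pos + 1 >= nodes.size()) return null;
        return nodes.get(++pos);
    }

    // Returns null before the first node and leaves the pointer on the first node.
    String getPrevious() {
        if (pos == 0) return null;
        return nodes.get(--pos);
    }

    // Does not change the enumerator's state.
    String getCurrent() { return nodes.isEmpty() ? null : nodes.get(pos); }

    boolean atStart() { return nodes.isEmpty() || pos == 0; }
    boolean atEnd() { return nodes.isEmpty() || pos == nodes.size() - 1; }
}

class EnumeratorDemo {
    public static void main(String[] args) {
        SketchEnumerator e = new SketchEnumerator(Arrays.asList("a", "b", "c"));
        for (String n = e.getFirst(); n != null; n = e.getNext()) {
            System.out.println(n);
        }
    }
}
```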

AttributeList

AttributeList objects are used to represent collections of Attribute objects which can be accessed by name. The Attribute objects contained in an AttributeList may also be accessed by ordinal index. In most cases, AttributeList objects are created from Element objects.

Attribute getAttribute(in wstring attrName): Retrieve an Attribute instance from the list by its name. If it's not present, null is returned.
Attribute setAttribute(in wstring attrName, in Attribute attr): Add a new attribute to the end of the list and associate it with the given name. If the name already exists, the previous Attribute object is replaced, and returned. If no object of the same name exists, null is returned, and the named Attribute is added to the end of the AttributeList object; that is, it is accessible via the item method using the index one less than the value returned by getLength().
Attribute remove(in wstring attrName) raises (NoSuchAttributeException): Removes the Attribute instance named attrName from the list and returns it. If the name provided does not exist, a NoSuchAttributeException is thrown.
Attribute item(in unsigned long index) raises(NoSuchAttributeException): Returns the (zero-based) indexth Attribute item in the collection. If index is greater than or equal to the number of nodes in the list, a NoSuchAttributeException is thrown.
unsigned long getLength(): Returns the number of Attributes in the AttributeList instance.
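
A collection that is accessible both by name and by ordinal index might be sketched in Java as follows. SketchAttribute and SketchAttributeList are hypothetical. The sketch assumes that attrName always equals the name carried by the Attribute, and that a replaced attribute keeps its position in the list; the text above specifies neither.

```java
import java.util.ArrayList;
import java.util.List;

// Name taken from the draft; modelled here as an unchecked exception.
class NoSuchAttributeException extends RuntimeException {}

// Hypothetical attribute: a name plus a string value.
class SketchAttribute {
    final String name;
    final String value;
    SketchAttribute(String name, String value) { this.name = name; this.value = value; }
}

// Keeps insertion order so attributes are reachable by name and by index.
class SketchAttributeList {
    private final List<SketchAttribute> items = new ArrayList<>();

    private int indexOf(String attrName) {
        for (int i = 0; i < items.size(); i++) {
            if (items.get(i).name.equals(attrName)) return i;
        }
        return -1;
    }

    SketchAttribute getAttribute(String attrName) {
        int i = indexOf(attrName);
        return i < 0 ? null : items.get(i);
    }

    // Replaces in place and returns the old attribute, or appends and returns null.
    SketchAttribute setAttribute(String attrName, SketchAttribute attr) {
        int i = indexOf(attrName);
        if (i >= 0) return items.set(i, attr);
        items.add(attr);
        return null;
    }

    SketchAttribute remove(String attrName) {
        int i = indexOf(attrName);
        if (i < 0) throw new NoSuchAttributeException();
        return items.remove(i);
    }

    SketchAttribute item(int index) {
        if (index < 0 || index >= items.size()) throw new NoSuchAttributeException();
        return items.get(index);
    }

    int getLength() { return items.size(); }
}
```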

Element

Apart from text, the vast majority of nodes that authors will encounter when traversing a document will be Element nodes. These objects represent both the element itself and any contained nodes.

For example (in XML):

<elementExample id="demo">
    <subelement1/>
    <subelement2>
        <subsubelement/>
    </subelement2>
</elementExample>

When represented using DOM, the top node would be "elementExample", which contains two child Element nodes (and some whitespace), one for "subelement1" and one for "subelement2". "subelement1" contains no child nodes of its own.

wstring getTagName()

This method returns the string that is the element's name. For example, in:

<elementExample id="demo">
    ...
</elementExample>

This would have the value "elementExample". Note that this is case-preserving, as are all of the operations of the DOM. See Name case in the DOM for a description of why the DOM preserves case.

AttributeList attributes

The attributes for this element. In the elementExample example above, the attributes list would consist of the id attribute, as well as any attributes which were defined by the document type definition for this element which have default values.

void setAttribute(in Attribute newAttr)

Adds a new attribute/value pair to an Element node object. If an attribute by that name is already present in the element, its value is changed to be that of the Attribute instance.

NodeEnumerator getElementsByTagName(wstring name)

Produces an enumerator which iterates over all of the Element nodes that are descendants of the current node whose tagName matches the given name. The iteration order is a depth first enumeration of the elements as they occurred in the original document.

Note: a future version of the DOM will provide a more generalized querying mechanism for Nodes. One such query involves obtaining all the Elements in a subtree with a given tagName. A convenience method for this query has been included in the core document. This method might be removed at a later date in favor of a more comprehensive querying mechanism.
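
The depth-first iteration order described for getElementsByTagName can be sketched in Java as below. TagNode is a hypothetical element type with child elements only, and the sketch returns an eagerly built list where the draft specifies a NodeEnumerator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element with a tag name and child elements only.
class TagNode {
    final String tagName;
    final List<TagNode> children = new ArrayList<>();

    TagNode(String tagName, TagNode... kids) {
        this.tagName = tagName;
        for (TagNode k : kids) children.add(k);
    }

    // Collects matching descendants in depth-first document order.
    // The node itself is not considered, only its descendants.
    List<TagNode> getElementsByTagName(String name) {
        List<TagNode> found = new ArrayList<>();
        collect(this, name, found);
        return found;
    }

    // Visits each child before that child's following siblings.
    private static void collect(TagNode node, String name, List<TagNode> found) {
        for (TagNode child : node.children) {
            if (child.tagName.equals(name)) found.add(child);
            collect(child, name, found);
        }
    }
}
```

The comparison here is case-sensitive; for HTML documents a caller may prefer a case-insensitive match, as discussed under "Name case in the DOM".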

Attribute

The Attribute object represents an attribute in an Element object. Typically the allowable values for the attribute are defined in a document type definition.

wstring getName(): Returns the name of this attribute.
NodeList value: The effective value of this attribute. (The attribute's effective value is determined as follows: if this attribute has been explicitly assigned any value, that value is the attribute's effective value; otherwise, if there is a declaration for this attribute, and that declaration includes a default value, then that default value is the attribute's effective value; otherwise, the attribute has no effective value.) Note, in particular, that an effective value of the null string would be returned as a Text node instance whose toString() method will return a zero length string (as will toString() invoked directly on this Attribute instance). If the attribute has no effective value, then this method will return null. Note the toString() method on the Attribute instance can also be used to retrieve the string version of the attribute's value(s).
boolean specified: If this attribute was explicitly given a value in the original document, this will be true; otherwise, it will be false.
wstring toString(): Returns the value of the attribute as a string. Character and general entity references will have been replaced with their values in the returned string.
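
The rule for determining an attribute's effective value can be sketched in Java as follows. EffectiveValue is hypothetical, and the sketch simplifies the value to a String (the draft uses a NodeList); null stands for "no value", which is distinct from the empty string.

```java
// Hypothetical model of an attribute's effective value.
class EffectiveValue {
    // Returns the explicit value if present, else the declared default, else null.
    static String effectiveValue(String explicitValue, String declaredDefault) {
        if (explicitValue != null) return explicitValue;
        if (declaredDefault != null) return declaredDefault;
        return null;
    }

    // Mirrors the "specified" flag: true only for explicitly assigned values.
    static boolean specified(String explicitValue) {
        return explicitValue != null;
    }
}
```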

Comment

Represents the content of a comment, i.e. all the characters between the starting '<!--' and ending '-->'. Note that this is the definition of a comment in XML, and, in practice, HTML, although some HTML tools may implement the full SGML comment structure.

wstring data: The content of the comment, exclusive of the comment begin and end sequence.

PI (Processing Instruction)

A PI node is a "processing instruction". The content of the PI node is the entire content between the delimiters of the processing instruction.

wstring name: XML defines a name as the first token following the markup that begins the processing instruction, and this attribute returns that name. For HTML, the returned value is null.
wstring data: The content of the processing instruction, from the character immediately after the <? (after the name in XML) to the character immediately preceding the ?>.

Text

The Text object contains the non-markup portion of a document. For XML documents, all whitespace between markup results in Text nodes being created.

wstring data: This holds the actual content of the text node. Text nodes contain just plain text, without markup and without entities, both of which are manifest as separate objects in the DOM.
boolean isIgnorableWhitespace: This is true if the Text node contains only whitespace, and if the whitespace is ignorable by the application. Only XML processors will make use of this, as HTML abides by SGML's rules for whitespace handling.

Footnotes

The term "Application"

This document uses the term "application" to mean the set of code that is using the DOM to inspect and manipulate the document object; for example scripts and/or full-scale applications.

Nested Documents

Sometimes it may make sense to have a document node be stored as a child of another node. For example, at some point during the creation of a document that's representing XML links, it may be valuable to be able to have the target document(s) directly accessible in the node hierarchy.

Whitespace in XML

Parsed XML includes text nodes for white space between elements, even if there is nothing but whitespace present. The text node contains an indication of whether or not the author of the document intended for the whitespace to be ignored, but, according to the XML specification, white space must be passed to the DOM verbatim.

Name case in the DOM

The Document Object Model does not change the case of any identifiers present in a parsed document. XML preserves the case of identifiers (and indeed recognizes upper and lower case versions of the same identifier as distinct). The HTML specification, by contrast, says that markup is handled case-insensitively, and many implementations of HTML tools interpret this to mean that element and attribute names are lowercased. So, in order to not lose case information, the methods in the Document Object Model do not alter the case of returned identifiers.

Application developers using the DOM for HTML would be wise to use case-insensitive comparisons when testing for equality.
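
In Java, for instance, such a comparison could be written with String.equalsIgnoreCase. The helper name sameHtmlTag is hypothetical.

```java
// Hypothetical helper for comparing HTML element names.
class TagNames {
    // Compares two HTML element names without regard to case.
    static boolean sameHtmlTag(String a, String b) {
        return a.equalsIgnoreCase(b);
    }
}
```

For XML documents, where case is significant, an exact comparison with String.equals remains the appropriate test.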

Appendix A: IDL Interface definitions

This section contains the IDL definitions for the objects in the core Document Object Model. The HTML IDL definition is here, and the XML IDL definition, including the types to represent the document type definition is here.

//hb////-*-Mode: C++-*-////////////////////////////////////////////////////////
//                                                                           //
// NAME        :                                                             //
// DESCRIPTION :                                                             //
// HISTORY     :                                                             //
//                                                                           //
//he///////////////////////////////////////////////////////////////////////////

exception NoSuchNodeException {
};
exception NotMyChildException {
};

// Enumerator class for a node list
interface NodeEnumerator {
  Node getFirst();
  Node getNext();
  Node getPrevious();
  Node getLast();

  Node getCurrent();

  // atStart() and atEnd() report the enumerator's position.  The rationale
  // for their existence is that the enumerator may be used internally to a
  // method, which may return some interesting value, and therefore cannot
  // also indicate whether the start or end of enumeration was reached.
  // Each of the traversal methods affects the state, and so is not
  // suitable for use as a predicate (unless possible state manipulation
  // is acceptable).
  boolean atStart();
  boolean atEnd();
};

// Define the type for a sequence of nodes
interface NodeList {
  NodeEnumerator getEnumerator();

  Node item(in unsigned long index)
    raises(NoSuchNodeException);

  // This may be expensive to compute
  unsigned long getLength();
};

// Define the type for an editable sequence of nodes
interface EditableNodeList : NodeList {
  void replace(in unsigned long index, in Node replacedNode) 
    raises (NoSuchNodeException);

  void insert(in unsigned long index, in Node newNode) 
    raises (NoSuchNodeException);

  Node remove(in unsigned long index)
    raises (NoSuchNodeException);
};

// Interface to a node in a grove
interface Node {
  enum NodeType {
    DOCUMENT,
    ELEMENT,
    ATTRIBUTE,
    PI,
    COMMENT,
    TEXT
    };

  NodeType getNodeType();

  // Simple traversal interface
  Node     getParentNode();
  NodeList getChildren();
  boolean  hasChildren();
  Node     getFirstChild();
  Node     getPreviousSibling();
  Node     getNextSibling();

  void insertBefore(in Node newChild, in Node refChild)
    raises (NotMyChildException);

  Node replaceChild(in Node oldChild, in Node newChild)
    raises (NotMyChildException);

  Node removeChild(in Node oldChild)
    raises (NotMyChildException);
};

// Named node list
interface NamedNodeList {
  // Core get and set interface. Note that implementations may
  // build the list lazily
  Node getNode(in wstring name);
  Node setNode(in wstring name, in Node node);
 
  Node remove(in wstring name) raises (NoSuchNodeException);
 
  Node item(in unsigned long index)
    raises(NoSuchNodeException);
 
  unsigned long getLength();
 
  NodeEnumerator getEnumerator();
};

//////////////////////////////////////////////////////////////////////////
//                                                                      //
// OBJECTS RELATED TO THE DOM ITSELF                                    //
//                                                                      //
//////////////////////////////////////////////////////////////////////////

interface DOM {
  DOMFactory  getFactory();
};

interface DOMFactory {
  Document          createDocument();
  DocumentContext   createDocumentContext();
  Element           createElement(in wstring tagName, 
				  in AttributeList attributes);
  Text              createTextNode(in wstring data);
  Comment           createComment(in wstring data);
  PI                createPI(in wstring name, in wstring data);
  Attribute         createAttribute(in wstring name, in NodeList value);
};

interface DocumentContext {
  attribute Document	document;
};

interface Document : Node {
  attribute Node        documentType;
  attribute Element 	documentElement;
  NodeEnumerator        getElementsByTagName(in wstring name);
};

//////////////////////////////////////////////////////////////////////////
//                                                                      //
// OBJECTS RELATED TO THE INSTANCE                                      //
//                                                                      //
//////////////////////////////////////////////////////////////////////////

interface Attribute : Node {
  // 
  attribute wstring   name;
  attribute NodeList  value;
  attribute boolean	specified;

  // provides a connection to the DTD 
  // attribute Node  	definition;

  wstring toString();
};

// Attribute list
interface AttributeList {
  Attribute getAttribute(in wstring name);
  Attribute setAttribute(in wstring name, in Attribute attr);

  Attribute remove(in wstring name) 
    raises (NoSuchNodeException);

  Node item(in unsigned long index)
    raises(NoSuchNodeException);

  unsigned long getLength();
};

// Processing Instruction
interface PI : Node {
  attribute wstring 	name;
  attribute wstring 	data;
};

interface Element : Node {
  // 
  attribute wstring	tagName;

  attribute AttributeList  attributes;

  void setAttribute(in Attribute newAttr);

  NodeEnumerator getElementsByTagName(in wstring name);
};

// Represents the content of <!-- ... -->
interface Comment : Node {
  attribute wstring	data;
};

interface Text : Node {
  attribute wstring    data;

  attribute boolean isIgnorableWhitespace;
};
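The comment on atStart() and atEnd() in NodeEnumerator can be illustrated with a small Java sketch. SketchEnumerator and Search are hypothetical names, and the semantics shown are assumptions that this draft does not state: getFirst() and getNext() are assumed to return null once no item exists at the new position, and atEnd() is assumed to report that the position is past the last item.

```java
import java.util.List;

// Hypothetical enumerator over strings, modelling only getFirst(),
// getNext() and atEnd(). The null-at-end behaviour is an assumption.
final class SketchEnumerator {
  private final List<String> items;
  private int pos = 0;

  SketchEnumerator(List<String> items) { this.items = items; }

  String getFirst() { pos = 0; return current(); }

  String getNext() {
    if (pos < items.size()) {
      pos++;
    }
    return current();
  }

  // Reports the position without changing it.
  boolean atEnd() { return pos >= items.size(); }

  private String current() {
    return pos < items.size() ? items.get(pos) : null;
  }
}

final class Search {
  // Advances the enumerator to the first item equal to target and
  // returns how many items were skipped. Because the return value is
  // the count, it cannot also say whether the search ran off the end;
  // the caller asks the enumerator via atEnd().
  static int skipTo(SketchEnumerator e, String target) {
    int skipped = 0;
    for (String s = e.getFirst(); s != null && !s.equals(target); s = e.getNext()) {
      skipped++;
    }
    return skipped;
  }
}
```

After calling Search.skipTo, the caller can test e.atEnd() to learn whether the target was found, without disturbing the position that the method left behind.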

Appendix B: Java Core API definitions

// Nodes, node lists, and enumerators for node lists

public class NoSuchNodeException extends Exception {

};

public class NotMyChildException extends Exception {

};


// NodeEnumerator traverses the nodes in a NodeList.

public interface NodeEnumerator {

  Node getFirst();
  Node getNext();
  Node getPrevious();
  Node getLast();

  Node getCurrent();

  // atStart() and atEnd() report the enumerator's position.  The rationale
  // for their existence is that the enumerator may be used internally to a
  // method, which may return some interesting value, and therefore cannot
  // also indicate whether the start or end of enumeration was reached.
  // Each of the traversal methods affects the state, and so is not
  // suitable for use as a predicate (unless possible state manipulation
  // is acceptable).

  boolean atStart();
  boolean atEnd();

};


public interface NodeList {

  NodeEnumerator getEnumerator();

  Node item(long index)
    throws NoSuchNodeException;

  long getLength();

};


public interface EditableNodeList extends NodeList {

  void replace(long index,Node replacedNode) 
    throws NoSuchNodeException;

  void insert(long index,Node newNode) 
    throws NoSuchNodeException;

  Node remove(long index)
    throws NoSuchNodeException;

};


public interface Node {

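  public final class NodeType {
    // Declared static so that they can be used as constants,
    // e.g. Node.NodeType.ELEMENT, without an instance.
    public static final int DOCUMENT 	= 0;
    public static final int ELEMENT 	= 1;
    public static final int ATTRIBUTE 	= 2;
    public static final int PI 		= 3;
    public static final int COMMENT 	= 4;
    public static final int TEXT 		= 5;
  };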

  // getNodeType() returns one of the NodeType constants
  // defined above.

  int getNodeType();

  Node     getParentNode();
  NodeList getChildren();
  boolean  hasChildren();
  Node     getFirstChild();
  Node     getPreviousSibling();
  Node     getNextSibling();

  void insertBefore(Node newChild, Node refChild)
    throws NotMyChildException;

  Node replaceChild(Node oldChild, Node newChild)
    throws NotMyChildException;

  Node removeChild(Node oldChild)
    throws NotMyChildException;
};



public interface NamedNodeList {

  // Core get and set public interface. Note that implementations may
  // build the list lazily

  Node getNode(String name);
  Node setNode(String name, Node node);
 
  Node remove(String  name) 
    throws NoSuchNodeException;
 
  Node item(long index)
    throws NoSuchNodeException;
 
  long getLength();
 
  NodeEnumerator getEnumerator();

};


public interface DOM {
  DOMFactory  getFactory();
};


public interface DOMFactory {

  Document          createDocument();
  DocumentContext   createDocumentContext();
  Element           createElement(String tagName, AttributeList attributes);
  Text              createTextNode(String data);
  Comment           createComment(String data);
  PI                createPI(String name, String data);
  Attribute         createAttribute(String name, NodeList value);

};


public interface DocumentContext {

  void setDocument(Document document);
  Document getDocument();

};


public interface Document extends Node {

  void setDocumentType(Node documentType);
  Node getDocumentType();

  void setDocumentElement(Element documentElement);
  Element getDocumentElement();

  NodeEnumerator getElementsByTagName(String name);

};


public interface Attribute extends Node {

  void setName(String name);
  String getName();

  void setValue(NodeList value);
  NodeList getValue();

  void setSpecified(boolean specified);
  boolean getSpecified();

  // provides a connection to the DTD 
  // attribute Node definition;

  String toString();

};


public interface AttributeList {

  Attribute getAttribute(String name);
  Attribute setAttribute(String name, Attribute attr);

  Attribute remove(String name) 
    throws NoSuchNodeException;

  Node item(long index)
    throws NoSuchNodeException;

  long getLength();

};


// Processing Instruction

public interface PI extends Node {

  void setName(String name);
  String getName();

  void setData(String data);
  String getData();

};

public interface Element extends Node {

  void setTagName(String tagName);
  String getTagName();

  void setAttributes(AttributeList attributes);
  AttributeList getAttributes();

  void setAttribute(Attribute newAttr);

  NodeEnumerator getElementsByTagName(String name);

};


// Represents the content of <!-- ... -->
public interface Comment extends Node {

  void setData(String data);
  String getData();

};


public interface Text extends Node {

  void setData(String data);
  String getData();

  void setIsIgnorableWhitespace(boolean isIgnorableWhitespace);

  boolean getIsIgnorableWhitespace();

};
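As an informal illustration of the simple traversal methods on Node, the following sketch visits a tree in document order using only getFirstChild() and getNextSibling(). SketchNode, its name field, its add() helper, and TreeWalk are hypothetical and exist only for this example. The sketch also assumes that both traversal methods return null when there is no such node, which this draft does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory node. It models only getFirstChild() and
// getNextSibling(); the name field and add() helper are assumptions.
final class SketchNode {
  final String name;
  private SketchNode firstChild;
  private SketchNode lastChild;
  private SketchNode nextSibling;

  SketchNode(String name) { this.name = name; }

  SketchNode getFirstChild() { return firstChild; }
  SketchNode getNextSibling() { return nextSibling; }

  // Appends a child and returns this node so that calls can be chained.
  SketchNode add(SketchNode child) {
    if (firstChild == null) {
      firstChild = child;
    } else {
      lastChild.nextSibling = child;
    }
    lastChild = child;
    return this;
  }
}

final class TreeWalk {
  // Collects node names in document order (depth-first, pre-order).
  static List<String> namesInDocumentOrder(SketchNode root) {
    List<String> names = new ArrayList<>();
    visit(root, names);
    return names;
  }

  private static void visit(SketchNode node, List<String> names) {
    names.add(node.name);
    for (SketchNode c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
      visit(c, names);
    }
  }
}
```

The same loop shape would apply to any implementation of the Node interface, provided that the end of a sibling chain is signalled as assumed here.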

Appendix C: ECMAScript Core API definitions

Note: This section will contain the complete DOM core bindings for ECMAScript when they become available. We expect this to occur in the very near future as the level one core specification reaches maturity.

Appendix D: Glossary

The DOM uses a large number of terms that may be unfamiliar to many readers. We suggest that you review the glossary if you encounter a term that isn't familiar.