This document is a submission to the World Wide Web Consortium (see Submission Request, W3C Staff Comment). It is the initial draft of the specification of the DCD facility. It is intended for review and comment by W3C members and is subject to change.
This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by the NOTE.
This document proposes a structural schema facility, Document Content Description (DCD), for specifying rules covering the structure and content of XML documents. The DCD proposal incorporates a subset of the XML-Data Submission [XML-Data] and expresses it in a way which is consistent with the ongoing W3C RDF (Resource Description Framework) [RDF] effort; in particular, DCD is an RDF vocabulary. DCD is intended to define document constraints in an XML syntax; these constraints may be used in the same fashion as traditional XML DTDs. DCD also provides additional properties, such as basic datatypes.
1. Introduction
1.1 Motivating Examples
1.2 Design Principles
1.3 Future Work
2. The DCD Framework
2.1 A Note on Syntax
2.1.1 Proposed Simplification of RDF Syntax
2.1.2 Interchangeability of Elements and Attributes
2.2 DCD Nodes and Resource Types
2.3 Referring to Elements and Attributes
3. The DCD Vocabulary
3.1 Properties Which Apply to DCDs
3.1.1 AttributeDef
3.1.2 Description
3.1.3 InternalEntityDef and ExternalEntityDef
3.1.4 Contents
3.1.5 Namespace
3.2 Properties Which Apply to Element Definitions
3.2.1 Attribute and AttributeDef
3.2.2 Contents
3.2.3 Datatype
3.2.4 Default and Fixed
3.2.5 Description
3.2.6 Groups, Occurs and Order
3.2.7 Max, Min, MaxExclusive, MinExclusive
3.2.8 Model
3.2.9 Root
3.2.10 Type
3.3 Properties Which Apply to Attribute Definitions
3.3.1 Global
3.3.2 ID-Role
3.3.3 Name
3.3.4 Occurs
3.4 Properties Which Apply to Internal Entity Definitions
3.4.1 Name
3.4.2 Value
3.5 Properties Which Apply to External Entity Definitions
3.5.1 Name
3.5.2 PublicID
3.5.3 SystemID
4. Datatypes
4.1 Datatype Specifications
4.2 Datatypes in instances
4.3 Picture Constraints
A. Local Element Definitions
B. Inheritance and Subclassing
C. Null Values
D. Unique Values
E. References
F. Acknowledgements
The Document Content Description facility for XML (abbreviated DCD) is an RDF vocabulary designed for describing constraints to be applied to the structure and content of XML documents. The abbreviation "DCD" is used to describe both the general facility described in this document and individual schema instances that conform to it.
The following example is a DCD which describes the important characteristics of the DL element from HTML:
<DCD> |
The example above is very document-oriented and in many respects isomorphic to what can be done with an XML DTD. The following example, less document-oriented, provides constraints for an airline booking:
<DCD> |
Here is a booking record that conforms to the schema:
<Booking> |
DCD is based on the following design principles:
It is anticipated that for DCD to realize its full potential, several types of constraint are required beyond those described in this note. These include:
Bag as another legal value for the RDF:Order property. This would support the concept that a relational database table is an unordered collection of columns. But this would bring back the SGML &-connector and so, on balance, it was decided (for this release) to disallow Bag as a legal value for the RDF:Order property. This decision may need to be revisited in the future.
A Document Content Description (DCD) is a set of properties used to constrain the types of elements and names of attributes that may appear in an XML document, the contents of the elements, and the values of the attributes.
As stated earlier, it is intended that DCD be conformant to the RDF Model and Syntax Specification [RDF]. However, it assumes certain simplifications in the RDF syntax which we intend to propose to the RDF working group. This syntax will be adopted only if ratified by the RDF working group. These syntactic simplifications are:
The RDF syntax document allows non-repeatable properties to be expressed as attributes of the parent element. Thus, properties such as Name, Content and Model can be expressed either as elements or as attributes. The following are, therefore, equivalent:
<DCD> |
<DCD> |
As shown in the above example, an optional processing instruction (PI) may be added to a DCD to specify the alternative "explicit" syntax form. The examples are equivalent and legal even without the PI. When the DCD PI is present with syntax="explicit" specified, then throughout the schema, the following properties must be specified using attribute syntax as shown below:
Type
Model
Occurs
RDF:Order
Content
Root
Fixed
Datatype
and all other properties must be specified using element syntax.
<?DCD syntax="explicit"?> |
The examples in this document, for the most part, use the attribute form for properties.
The namespace which describes DCD properties and resources is identified by the URI http://w3.org/Schemas/DCD. It contains the following types: DCD, ElementDef, Group, AttributeDef, ExternalEntityDef and InternalEntityDef.
In the XML form of a DCD, the types of the elements correspond to RDF's property types. In the interests of brevity, we refer, for example, to "objects of type Namespace", which in the XML syntax are elements whose type is "Namespace" representing RDF properties where the property type is "Namespace".
A resource of type DCD is a document structure description that constrains the structure and contents of any document that identifies itself as falling under that DCD's constraints. An XML document can be identified as falling under the constraints of more than one DCD, in which event the properties applying to each such DCD are taken as constraints on the XML document. This provides two benefits: first, a single DCD can be used to provide constraints for large numbers of separate documents. Second, the DCD object provides a convenient level of granularity for applying namespace mechanisms.
The resources of type ElementDef and AttributeDef are more detailed structure descriptors. The properties of these resources provide constraints governing elements and attributes in the XML document. Implicitly, any node which is the value of an ElementDef or AttributeDef property is of respective type ElementDef or AttributeDef; however, there is typically no value in indicating this explicitly with an RDF:InstanceOf property.
Most DCD declarations constrain the content and attributes of elements in document instances. This is done by assigning properties to objects of type ElementDef. These assignments may be seen as element type declarations. Element definitions declare that elements may have other elements as children, or may have attributes provided with certain names and properties. Child elements must be collected together into Groups which have Order and Occurs properties. See "3.2.6 Groups, Occurs and Order". Each ElementDef must have a Type property. This must be unique within the DCD. But, see Appendix "A. Local Element Definitions".
The attributes and the elements referred to in a particular DCD may come from the same DCD or from other DCDs identified by namespaces. Element definitions from within the same DCD are referred to by their Type property. If the element definition comes from another namespace, the value of the Type property may be a qualified name, where the prefix identifies the namespace.
For example, in the following, FirstName, MI and LastName are defined elsewhere in the DCD but Address comes from a namespace declared with the common prefix.
<ElementDef Type="person" Model="Elements"> |
Attributes are declared in DCDs using objects of type AttributeDef. An attribute definition may occur on its own, as a property of the DCD, or it may occur within an element definition. In either case it may have a Global property whose value may be True or False. The default is False.
Every attribute definition must have a Name property. If the value of the Global property is True, the Name property must be unique in the DCD.
Global attributes can be referred to by their names in any element definition within the DCD. Global attributes in other namespaces can be referred to by the use of qualified names.
In the following example, Hidden is a global attribute in the DCD, while schemas:CLASS is a global attribute from another namespace.
<DCD> |
In the following, the SRC attribute is defined locally within the IMG element definition.
<DCD> |
Attributes defined with Global="False" can be referred to in other element definitions in the DCD by a resource identifier. For example:
<DCD> |
The (roughly alphabetical) order in which the property descriptions appear is not intended to have any significance.
In the following descriptions, the phrase "such documents" signifies documents which have been identified as falling under the constraints of the DCD.
Declares an attribute type which may be provided for one or more elements in such documents. This property does not assert that the attribute is provided for any individual element type; this can only be done with Attribute and AttributeDef properties of ElementDef. However, this property can be used to create an AttributeDef node which can serve as the value of Attribute properties. See discussion above.
An example of the use of AttributeDef:
<DCD> |
Provides a (presumably human-readable) description of the semantics and usage of this DCD. The value of this property must match the production labeled Content in the XML specification; that is to say, it may contain markup, and is well-formed.
Identify an entity which may be invoked via reference within such documents. The value of these properties must be a Node (in RDF terms), provided in the RDF syntax with subelement or URI. The resource which is the property value must be identified by the class mechanism as an InternalEntityDef or ExternalEntityDef.
An example of the use of InternalEntityDef and ExternalEntityDef:
<InternalEntityDef |
Signals whether elements of types not explicitly declared via ElementDef properties may appear in such documents. The value of this property must be a string whose value is Open or Closed. Closed means that such documents may contain only elements whose types have been declared via ElementDef properties. Open means that such documents may contain elements which have not been so declared.
Provides the namespace of this DCD. The value of this property must be a URI which identifies a namespace. This property is required to exist for every DCD.
The namespace of a DCD applies to all elements and attributes attached by properties to this DCD. The idea is that in an instance, the prefix part of a qualified name is used to locate the namespace and schema, and the local name part used to locate the applicable properties in the schema.
An example of the use of Namespace:
<DCD> |
This declares the namespace for this DCD to be http://www.w3.org/TR/REC-html40. If some XML document indicates that the prefix H refers to the namespace whose namespace name is http://www.w3.org/TR/REC-html40, then references to an element H:B in that document refer to the element defined in the above example using the local name B.
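The two-step lookup described above (prefix to namespace, then local name to definition) can be sketched in Python. This is an illustration only: the function `resolve_qualified_name` and the dictionary layouts are hypothetical and are not defined by DCD.

```python
def resolve_qualified_name(qname, prefixes, schemas):
    """Return the definition that a qualified name such as 'H:B' refers to.

    prefixes maps a prefix to a namespace name (a URI); schemas maps a
    namespace name to a dict of local name -> definition. Both layouts
    are assumptions made for this sketch.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError(f"{qname!r} has no prefix")
    namespace = prefixes[prefix]      # the prefix part locates the namespace
    return schemas[namespace][local]  # the local part locates the properties


# Hypothetical data mirroring the H:B example in the text.
prefixes = {"H": "http://www.w3.org/TR/REC-html40"}
schemas = {"http://www.w3.org/TR/REC-html40": {"B": {"Type": "B"}}}
```

With these tables, `resolve_qualified_name("H:B", prefixes, schemas)` returns the entry stored under the local name `B`.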
[Definition:] In the descriptions, the phrase "this type" signifies the element definition to which the properties apply.
Identify attributes which may be provided for elements of this type. No element definition may have two Attribute or AttributeDef properties referencing attributes that have the same name.
An example of the use of Attribute and AttributeDef:
<ElementDef Type="IMG"> |
In this example, the properties of the Attribute whose name is SRC are declared within the declaration of the IMG element. This would make sense if IMG is the only element to which the SRC attribute applies.
The second attribute, BORDER, has a declaration stored separately, referenced by its name. This declaration style is suitable when such an attribute is applicable to multiple elements; it allows maintaining the declaration in one location.
Finally, the declaration for the third attribute, HUE, uses a qualified name and refers to a declaration found in another DCD, whose namespace is identified by the prefix SiteMap. BORDER and HUE must be defined as global attributes in their respective DCDs.
Signals whether elements of types not explicitly declared via the Group property may appear as children of elements of this type. The value of this property must be a string whose value is Open or Closed. Closed means that this element type is allowed to have children only of types which are declared via the Group property. Open means that this element type may have children of types not declared via the Group property.
Examples of the use of Content:
<ElementDef Type="DT" Model="Data" Content="Closed"/> |
Identifies a specific datatype (in the [XML-Data] sense) which constrains the content of elements of this type. The value of this property must be a string which matches one of an enumerated list of datatypes. See section "4. Datatypes".
The Datatype property is only meaningful if the value of the Model property is Data. That is to say, it is not meaningful to provide a lexical datatype for content which contains substructures.
Examples of the use of Datatype:
<ElementDef Type="Loan"> |
Provides default values for the content of elements of this type, and signals whether any value other than the default is allowed. The value of the Default property must be a string which provides a default value. The only allowed values of the Fixed property are the strings True and False.
The Default value is used in the case that this element type appears as the value of an Element property of some other element type, but an element of that type fails to contain a child of this type.
The Default property is only meaningful if the value of the Model property is Data. That is to say, it is not meaningful to provide a default value for content which contains substructures.
When the Default property is used to give an element type a default value, the presence of the Fixed property with a value of True means that the default value is the only one allowed for this element type. If the Fixed property is not specified it is assumed to have a value of False.
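One possible reading of the interaction between Default and Fixed is sketched below in Python. The function name `effective_content` and its signature are hypothetical, and the choice to raise an error on a Fixed violation is an assumption; this note does not say how a processor should report it.

```python
def effective_content(supplied, default=None, fixed=False):
    """Return the content to use for an element whose Model is Data.

    supplied is the content found in the instance, or None when the
    child element is absent from its parent.
    """
    if supplied is None:
        # Absent child: fall back to the Default value, if any.
        return default
    if fixed and default is not None and supplied != default:
        # Fixed="True": the default is the only value allowed.
        raise ValueError("Fixed is True: only the default value is allowed")
    return supplied
```

For instance, `effective_content(None, default="Y")` yields `"Y"`, while a supplied value that differs from a fixed default is rejected.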
An example of the use of Default:
<ElementDef Type="AirTicketClass" Model="Data" Datatype="char"> |
An example of the use of Fixed:
<ElementDef Type="Namespace" Model="Data" Fixed="True"> |
Provides a (presumably human-readable) description of the semantics and usage of elements of this type. The value of this property must match the production labeled Content in the XML specification; that is to say, it may contain markup, and is well-formed.
An example of the use of Description:
<ElementDef Type="BLINK"> |
An ElementDef whose Model property has the value Elements must also have a single property named Group, containing a specification of the elements and groups which can appear as children of elements of this type. Groups in turn may have an Occurs property. This can take one of four values.
The default is Required.
A group declares individual elements and other groups which may occur as children of groups of this type. The order of occurrence of the children is declared using the RDF collection ordering facility via the proposed RDF:Collection attribute. Legal values are Seq, in which case children must occur in the specified order, or Alt, in which case only one of the specified children may appear. The default is Seq. See section "1.3 Future Work".
An example of a simple element declaration:
<ElementDef Type="person" Model="Elements" > |
Here is a more complete example with attribute and element specifications:
<ElementDef Type="employee" Model="Elements" Content="Closed"> |
Provide, respectively, upper and lower bounds on the content of elements of this type. Max and Min allow values up to and including the bound while MaxExclusive and MinExclusive allow values less than and greater than the bound, respectively. The semantics of upper and lower bounding are highly dependent on the element's Datatype; for some datatypes (e.g. uri), this property has no meaning.
If an element has no Datatype, then Max, Min, MaxExclusive and MinExclusive values are treated as strings, and tests for upper and lower bounding are performed according to the language specification collation rules defined in Chapter 5.15 of the Unicode standard [Unicode].
The Max, Min, MaxExclusive and MinExclusive properties are only meaningful if the value of the Model property is Data. That is to say, it is not meaningful to provide upper or lower bounds for content which contains substructures.
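The difference between the inclusive and exclusive bounds can be made concrete with a small Python sketch. The function `within_bounds` and its parameter names are hypothetical; it assumes the value and the bounds have already been converted according to the element's Datatype, and it does not implement the Unicode collation rules used when no Datatype is given.

```python
def within_bounds(value, minimum=None, maximum=None,
                  min_exclusive=None, max_exclusive=None):
    """Check a converted value against whichever bounds are supplied."""
    if minimum is not None and value < minimum:
        return False  # Min: the bound itself is allowed
    if maximum is not None and value > maximum:
        return False  # Max: the bound itself is allowed
    if min_exclusive is not None and value <= min_exclusive:
        return False  # MinExclusive: value must be greater than the bound
    if max_exclusive is not None and value >= max_exclusive:
        return False  # MaxExclusive: value must be less than the bound
    return True
```

As an assumed example, an integer month constrained with `minimum=1, maximum=12` accepts 12, whereas `max_exclusive=12` rejects it.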
Examples of the use of Max and Min:
<ElementDef Type="MonthOfYear" Model="Data" Datatype="int" |
Indicates which of five broad classes of constraints apply to the content of elements of this type. The value of this property must be a string whose value is one of Empty, Any, Data, Elements, or Mixed. The meanings are:
Group and Element properties.
Element property.
The default is Data.
Examples of the use of Model:
<ElementDef Type='IMG' Model='Empty' /> |
Element definitions can have a Root property that indicates whether an element of that type can serve as the root of a conforming document. Allowed values are True and False. The default is False.
If no element definition in a DCD has a Root="True" property, then an element of any type that is allowed to appear in such documents may serve as the root element. If multiple element definitions have Root="True" then any element of one of those types can appear as the root of a conforming document.
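The rule for choosing permissible root types can be sketched as follows. The function `allowed_roots` and the dictionary representation of element definitions are hypothetical. The sketch considers only declared element types, so it does not cover undeclared types admitted by an Open content setting.

```python
def allowed_roots(element_defs):
    """Return the element types that may serve as a document root.

    element_defs is a list of dicts with a 'Type' key and an optional
    'Root' key holding the string 'True' or 'False' (default 'False').
    """
    flagged = [d["Type"] for d in element_defs
               if d.get("Root", "False") == "True"]
    # With no Root="True" anywhere, any declared type may be the root.
    return flagged or [d["Type"] for d in element_defs]
```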
An example of the use of Root:
<DCD> |
Gives the type of the element. This property is required to be present for every Element resource in DCD. The value of this property must be a Name in the XML sense. Furthermore, it must be an NCName as defined in [XML Namespaces]; that is to say, it may not contain a prefix or a colon.
As discussed earlier, the Type property for element definitions must be unique within the DCD. But, see Appendix "A. Local Element Definitions".
The following properties which apply to attribute definitions or attribute types have the same names as, and are identical in effect to, the corresponding properties of element types: Datatype, Default, Description, Max, Min, MaxExclusive, MinExclusive and Fixed.
Indicates whether the Name property of this attribute must be unique in the DCD, and thus can serve as an address for this attribute definition. The possible values are True and False. The default is False.
An example of the use of Global:
<DCD> |
Signals that the attribute has unique identifier or unique ID pointer semantics. The value of this property must be a string whose value is one of ID, IDREF, or IDREFS. The effect of each of these values is the same as if the attribute had been declared, in an XML DTD, with the attribute type of the same name.
An example of the use of ID-Role:
<ElementDef Type="A"> |
Gives the name of the attribute. This property is required to be present for every Attribute resource in DCD. The value of this property must be a Name in the XML sense. Furthermore, it must be an NCName as defined in [XML Namespaces]; that is to say, it may not contain a prefix or a colon.
As discussed earlier, the Name property for attribute definitions that have Global="True" must be unique within the DCD.
Indicates whether the presence of the Attribute is required. This can take one of two values.
The default is Optional.
Gives the name by which the entity may be invoked. This property is required to be present for every InternalEntity definition resource in DCD. The value of this property must be a Name in the XML sense. Furthermore, it must be an NCName as defined in [XML Namespaces]; that is to say, it may not contain a prefix or a colon.
Provides the replacement text for the internal entity. The value of this property must match the production labeled Content in the XML specification; that is to say, it may contain markup, and is well-formed.
An example of the use of Value:
<InternalEntityDef> |
Gives the name by which the entity may be invoked. This property is required to be present for every ExternalEntity definition resource in DCD. The value of this property must be a Name in the XML sense. Furthermore, it must be an NCName as defined in [XML Namespaces]; that is to say, it may not contain a prefix or a colon.
Provides a public identifier for the entity. This is a string whose syntax (see PublicID) and semantics are exactly as described in the XML specification.
Provides a system identifier for the entity. This is a string whose syntax and semantics are exactly as described in the XML specification.
The SystemID property must be provided for every ExternalEntity resource in DCD.
A number of datatypes are specified in this section. These are modeled after the datatypes supported by [SQL] and modern programming languages. Attributes and element types whose Model property has the value Data can constrain their values/contents to be instances of a particular datatype. XML 1.0 defines about 10 datatypes, which may only be used to constrain attribute values, and essentially one datatype, PCDATA, that can be used for element content. Here we propose a much richer set of datatypes, applicable equally to attribute and element content.
The specifications in this section serve a number of purposes:
Datatypes are referenced from the datatype namespace. In order to use this namespace in a schema, it must be declared. Some datatypes require that additional properties be specified: for example, length and precision for decimal, length for char, and legal values for enumeration. These should be specified as additional properties of the element or attribute being defined. See the final example in "3.2.6 Groups, Occurs and Order".
The DCD primitive datatypes are tabulated below.
Name | Examples | Parse type |
id | X | XML ID |
idref | X | XML IDREF |
idrefs | X Y Z | XML IDREFS |
entity | Foo | XML ENTITY |
entities | Foo Bar | XML ENTITIES |
nmtoken | Name | XML NMTOKEN |
nmtokens | Name1 Name2 | XML NMTOKENS |
enumeration (legal values must be specified) | Red Blue Green | XML ENUMERATION |
notation | GIF | XML NOTATION |
string | Give me liberty or give me death! | pcdata |
number | 15, 3.14, -123.456E+10 | A number, with up to 31 digits. May optionally have a leading sign, fractional digits, and exponent. Punctuation as in US English. Leading and trailing blanks are removed before converting a number specified as a string. Similarly, leading and trailing zeroes are removed. |
int | 1, 58502, -13 | A number, with optional sign, no fractions, no exponent. |
fixed or decimal (precision and scale must be specified) | 12.0044 | Precision is the total number of digits. It may range from 1 to 31. Scale is the number of digits to the right of the decimal point and must be less than or equal to the precision. |
boolean | 0, 1 (1=="true") | "1" or "0" |
dateTime | 2088-04-07T18:39:09 | A date in a subset of ISO 8601 format, with optional time and no optional zone. Fractional seconds may be as precise as nanoseconds. |
dateTime.tz | 2088-04-07T18:39:09-08:00 | A date in a subset of ISO 8601 format, with optional time and optional zone. Fractional seconds may be as precise as nanoseconds. |
date | 2094-11-05 | A date in a subset of ISO 8601 format (no time). |
time | 08:15:27 | A time in a subset of ISO 8601 format, with no date and no time zone. Fractional seconds may be as precise as nanoseconds. |
time.tz | 08:15:27-05:00 | A time in a subset of ISO 8601 format, with no date but optional time zone. Fractional seconds may be as precise as nanoseconds. |
interval | 2088-04-07T18:39:09 | A time interval which may have year, month, day, hour, minute and second fields. Fractional seconds may be as precise as nanoseconds. |
i1, byte (1-byte integer) | 1, 127, -128 | A number, with optional sign, no fractions, no exponent. |
i2 (2-byte integer) | 1, 703, -32768 | A number, with optional sign, no fractions, no exponent. |
i4, int (4-byte integer) | 1, 703, -32768, 148343, -1000000000 | A number, with optional sign, no fractions, no exponent. |
i8 (8-byte integer) | 1, 703, -32768, 1483433434334, -1000000000000000 | A number, with optional sign, no fractions, no exponent. |
ui1 (unsigned 1-byte integer) | 1, 255 | A number, unsigned, no fractions, no exponent. |
ui2 (unsigned 2-byte integer) | 1, 255, 65535 | A number, unsigned, no fractions, no exponent. |
ui4 (unsigned 4-byte integer) | 1, 703, 3000000000 | A number, unsigned, no fractions, no exponent. |
ui8 (unsigned 8-byte integer) | 1483433434334 | A number, unsigned, no fractions, no exponent. |
r4 | .31415E+1 | Real number ranging from -3.402E+38 to -1.175E-37 or from 1.175E-37 to 3.402E+38 |
r8 | .314159265358979E+1 | Real number ranging from -1.79769E+308 to -2.225E-307 or from 2.225E-307 to 1.79769E+308 |
fixed.14.4 | 1.95 | A number with 14 digits to the left of the decimal point and 4 digits to the right of the decimal point. Convenient for representing monetary values. |
uuid | 333C4-460F-11D0-BC04-0080CA83 | Hexadecimal digits representing octets. Optional embedded hyphens are allowed but ignored during conversion. |
uri | urn:schemas-microsoft-com:Office9 http://www.ics.uci.edu/pub/ietf/uri/ | Universal Resource Identifier |
bin.hex (length may be specified; default is unlimited) | | Hexadecimal digits representing octets |
bin.base64 (length may be specified; default is unlimited) | | MIME style Base64 encoded binary blob. |
char (length may be specified; default is 1) | char | Character string, n characters long |
picture (picture must be specified) | 999-99-9999 | Constraint for validating strings. See note below. |
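The sized integer rows of the table above can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the helper `fits_integer_type` is hypothetical, and the ranges assume the usual two's-complement limits, which agree with the table's examples (such as -128 for i1 and 65535 for ui2) but are not spelled out in this note.

```python
# Width in bytes for each sized integer datatype name in the table.
INTEGER_BYTES = {"i1": 1, "i2": 2, "i4": 4, "i8": 8,
                 "ui1": 1, "ui2": 2, "ui4": 4, "ui8": 8}


def fits_integer_type(text, datatype):
    """Return True if the decimal string text fits the named datatype."""
    value = int(text)  # raises ValueError for fractions or exponents
    bits = 8 * INTEGER_BYTES[datatype]
    if datatype.startswith("u"):
        return 0 <= value < 2 ** bits
    return -(2 ** (bits - 1)) <= value < 2 ** (bits - 1)
```

For example, the value 3000000000 listed for ui4 fits that type but would not fit i4.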
The datatypes defined in "4. Datatypes" can also be used in instance datatype specifications as described in XML-Data [XML-Data]. For example:
<conversionRate DCD:dt="float">1.4172</conversionRate>
This provides the benefit of datatype support to well-formed documents that may not have an associated DTD or DCD. It is expected that XML parsers would provide assistance in encoding and decoding these datatypes.
"Pictures", similar to those in [COBOL] picture clauses, can be used to constrain the format of strings and in some cases control their conversion to numbers. A picture is an alphanumeric string consisting of character symbols. Each symbol, which is usually one character but may be two characters, is a placeholder that stands for a set of characters. For example, the picture "A" stands for a single alphabetic character.
The following is a list of picture symbols and their meanings.
Here are some examples of picture constraints:
$123,45.90 satisfies picture $999,99.99 |
The specifications in this document only allow elements to be defined as properties of the DCD. A useful future direction may be to allow element definitions within the context of another element definition. Element definitions may be local or global. Global element definitions must have a Type property that is unique in the DCD and can be referred to by name in other definitions. Local element definitions can be used within the containing definition and can be referred to in other definitions by a resource identifier as described for attribute definitions in "2.3 Referring to Elements and Attributes".
For example, in the following, FirstName, MI and LastName are defined elsewhere in the DCD but Address comes from a namespace declared with the common prefix. The Telephone element is defined locally within the person definition.
<ElementDef Type="person" Model="Elements" > |
An element type may be declared to re-use the content model declarations of other element types through the use of the extends property. This property effectively replaces itself with the entire content model of the element type it names. For example:
<ElementDef Type="polygon" Model="Elements"> |
A legal instance of regularPolygon (in this case an empty equilateral triangle 3mm on a side) might be:
<regularPolygon n="3"> |
Using extends also allows instances of the extending element type to occur anywhere the extended type is allowed. In the above example this means that any content model that allows polygon will also now allow regularPolygon. Furthermore, attributes declared on the extended element type may also occur on the extending element type, so in the example n can, in fact must, now appear on regularPolygon. For example, if in addition to the above example we have:
<ElementDef Type="picture"> |
then the following is a valid schema:
<picture> |
Note that in the above examples, Element declarations occur directly within an ElementDef without an enclosing Group. We allow this to facilitate inheritance. The Element declaration opens a default Group. In fact, Element extends Group and inherits its properties.
We restrict the use of extends to cases where the merger of the two content models involved is straightforward.
Content="Open" or the extending element type must have no content at all, either explicit or inherited.
order attribute must be consistent. The following table shows all the allowed values (if the extended element type has order with value Alt, no extension is possible):
Extended | Extending |
Seq | Seq |
Bag | Bag; Seq |
Alt | Alt |
Extended | Extending |
Empty | Empty |
Data | Data; Empty |
Elements | Elements |
Any; Mixed | Any; Mixed; Data; Elements |
Consistent with the above remark about the extending element type being allowed anywhere the extended one is, the guiding principle is that anything allowed by the extending declaration would also be allowed by the extended one if the tag was changed. That is, the extending type is polymorphic to the extended type. Thus, if we rename regularPolygon to polygon in the first example above, we get a schema-valid polygon:
<polygon n="3"> |
It's legal as a polygon, because it has everything a polygon requires (n attribute, diagonals sub-element), and the side sub-element is permissible because polygon has, by default, open Content.
Note that a single ElementDef can contain multiple extends. This does not cause ambiguity -- effectively, the extended content model is dropped in as a group in the relevant place in the extending model.
For several situations, especially in mapping data from a database into XML, we need to handle the case where the value is not specified. This is different from a numeric value being zero or a string being empty.
If the element or attribute is not Required then it can just be omitted. If it is Required or if it has a default value then it is desirable to be able to indicate that its value in the database was undefined. This can be done by defining a special attribute to signal this condition. If an element is involved then the special attribute is an attribute of the element. If an attribute is involved it is another attribute of the parent element. In either case, the special attribute takes one of two values, "True" or "False".
Consider the case of a required Salary element. A missing Salary element would appear as:
<Employee> |
If Salary was a required attribute on, say, an Employee element then we would need to define another attribute on Employee called, say, Salary_null.
If the element or attribute had a default value the value would appear along with the null attribute with a True value.
Similarly, special attributes can be defined to indicate errors in data conversion.
In current XML, the ID attribute type is unique within a document. Unique attribute and element types are very important and should be extended to any named attribute and element type with the ability to specify the scope of the uniqueness. For elements, uniqueness specification applies only if the model type is Data, i.e. it does not apply to elements that have structure. Particular implementations can use unique element and attribute types to define keys to speed up searches.
Essentially, when defining an attribute type we can specify that its value is unique within a particular element type.
<AttributeDef UniqueIn="Company" Global="True" |
Company is the name of an element type defined within the DCD. This specifies that the SerialNumber attribute is unique within Company elements in documents conformant with this DCD. The default value of the UniqueIn attribute is "null" which signifies the entire document. Thus, the default behavior is the current XML behavior.
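A scoped uniqueness check of this kind might be implemented along the following lines. This Python sketch uses the standard library's ElementTree; the function `unique_within`, the sample document, and the `Group` and `Part` element names are hypothetical and serve only to illustrate the proposed UniqueIn behavior.

```python
import xml.etree.ElementTree as ET


def unique_within(root, scope_tag, attr_name):
    """True if attr_name values are unique inside each scope_tag element.

    scope_tag=None stands for the "null" default: the whole document.
    """
    scopes = [root] if scope_tag is None else list(root.iter(scope_tag))
    for scope in scopes:
        seen = set()
        for elem in scope.iter():  # the scope element and its descendants
            value = elem.get(attr_name)
            if value is None:
                continue
            if value in seen:
                return False
            seen.add(value)
    return True


# Hypothetical instance: serial number 1 recurs, but in different companies.
doc = ET.fromstring(
    "<Group>"
    '<Company><Part SerialNumber="1"/><Part SerialNumber="2"/></Company>'
    '<Company><Part SerialNumber="1"/></Company>'
    "</Group>"
)
```

Here the serial numbers are unique within each Company element, yet the same document fails the check when the scope is the entire document.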
This work is totally dependent on the whole lineage of metadata thinking in the World Wide Web Consortium. This specification has benefited greatly as a result of input from David Fallside and David Singer, both of IBM, Andrew Layman and Jean Paoli, both of Microsoft, and from Lauren Wood of SoftQuad. We also wish to thank Henry Thompson of the University of Edinburgh and all the authors of the XML-Data specification [XML-Data].