Copyright ©1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document is a revision of the RDF Schema working draft dated 9 April 1998. The major difference between this version and the previous version is that this version adopts a "property-centric" approach whereas the previous version was "class-centric". In the previous version, Classes could be defined in a manner similar to an OO programming language like Java. A new class would have a number of "allowedPropertyType" arcs that pointed to property types which would be expected to occur on all instances of the class (modulo optionality constraints). For example, if we defined a class "Book", we might define it to have allowed property types of "author", "title", and "publisher". Unless all three of those were defined, we did not have a legal occurrence of a "Book" node. This approach is familiar to many, because of its similarity to programming. It works well if things can be designed in advance. However, our direction is to allow a very free-flowing annotation style, and we believe that may not fit in with heavily pre-designed class hierarchies.
This version of the specification adopts a property-centric approach. Instead of defining a Class in terms of the Properties it has, we define Properties in terms of the Classes they may connect. That is the role of the RDFS:domain and RDFS:range constraints. For example, we could define the "author" property to have a domain of "Book" and a range of "String". The benefit of the property-centric approach is that it is very easy for anyone to say anything they want about existing resources, which is one of the axioms of the web. Feedback on this point is particularly encouraged.
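As a rough illustration of the property-centric idea, the following Python sketch describes each property type by the classes it connects, rather than listing properties inside a class definition. The `PropertyType` structure, the helper `properties_for`, and the `reviewedBy` and `Person` names are assumptions made for this example only; they are not part of the RDF Schema vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyType:
    """A property type described by the classes it may connect."""
    name: str
    domain_class: str  # class of resources the property may be attached to
    range_class: str   # class of values the property may take


# The example from the text: "author" connects a Book to a String.
author = PropertyType("author", domain_class="Book", range_class="String")

# A third party can later describe Books with a new property type
# without touching any definition of "Book" itself.
# "reviewedBy" and "Person" are hypothetical names.
reviewed_by = PropertyType("reviewedBy", domain_class="Book", range_class="Person")


def properties_for(class_name, property_types):
    """Return the names of property types whose domain is class_name."""
    return sorted(p.name for p in property_types if p.domain_class == class_name)


print(properties_for("Book", [author, reviewed_by]))
```

Note that nothing in the sketch ever enumerates the properties of "Book" in one place; the set of applicable property types is derived from whatever property definitions happen to be available.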
This draft specification is a work in progress representing the current consensus of the W3C RDF Schema Working Group. This is a W3C Working Draft for review by W3C members and other interested parties. Publication as a working draft does not imply endorsement by the W3C membership. We caution that further changes are possible and therefore we recommend that only experimental software or software that can be easily field-upgraded be implemented to this specification at this time. The RDF Schema Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release.
The Resource Description Framework is part of the W3C Metadata Activity. The goal of this activity, and of RDF specifically, is to produce a language for the exchange of machine-understandable descriptions of resources on the Web. A separate specification describes the data model and syntax for the interchange of metadata using RDF.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".
Note: As working drafts are subject to frequent change, you are advised to reference the above URL for "Latest version" rather than the URLs for working draft versions themselves. The latest version URL will always point to the most current version of this draft.
Note: The HTML source of this document contains embedded RDF and will therefore not validate by the HTML4.0 DTD. A solution for those requiring DTD-style validation services may come from future W3C work.
Comments may be sent to www-rdf-comments@w3.org. The public archive of these comments is available at http://lists.w3.org/Archives/Public/www-rdf-comments/
The Resource Description Framework (RDF) is targeted at supporting applications such as:
The descriptions used by these applications are, in essence, models of web resources and their interrelationships. Basic RDF, as specified in [RDFMS], has the ability to represent the information needed by those applications. However, it does not provide facilities to make it easy to build those descriptions, nor to know if a particular description meets the needs of a particular application. That is the void this specification seeks to fill.
Each of the applications above needs to say certain things about certain kinds of resources. In other words, each of those applications has a descriptive schema it needs to implement. For bibliographic descriptions a schema would include things like author, title, and subject. A schema should define the kinds of resources we need to talk about (web pages, people, companies), define how the resources can be related (Author, EmployedBy, PublishedBy), and define certain things we want to say about those resources (publicationDate, givenName, fileType).
This document does not specify a vocabulary of descriptive elements such as Author or ResourceSize. Instead, it specifies the mechanisms needed to define such elements, to define the classes of resources they connect, to restrict possible combinations of classes and relationships, and to detect violations of those restrictions. Thus, this document defines a schema specification language. More succinctly, the RDF Schema mechanism provides a basic type system for use in RDF models. It defines nodes and arcs such as Class and subClassOf that are used in specifying descriptive schemas.
The typing system is specified in terms of the basic RDF data model - as nodes and arcs. Thus, the nodes and arcs of this typing system become part of the graph of any description that uses them. The schema specification language is a declarative representation language influenced by ideas from knowledge representation (e.g. semantic nets, frames, predicate logic) as well as database schema specification languages (e.g. NIAM) and graph data models. The RDF schema specification language is less expressive, but much simpler to implement, than full predicate calculus languages such as CycL [CycL] and KIF [KIF].
RDF and the RDF Schema language were also based on metadata research in the Digital Library community. In particular, RDF adopts a modular approach to metadata along the lines of the Warwick Framework [WF]. RDF represents an evolution of the Warwick Framework model in that the Warwick Framework allowed each metadata vocabulary to be represented in a different syntax. Within RDF, all vocabularies are expressed within a single well defined model and syntax. This allows for a finer grained mixing of machine-processable vocabularies, and addresses the need [EXTWEB] to create metadata in which statements can draw upon multiple vocabularies that are managed in a decentralised fashion by various communities of expertise.
The RDF Schema mechanism aims not at theoretical issues, but at solving a small number of immediate problems. Its creators expect that other problems (listed in appendix B) will share similar characteristics and that they also may be able to use the basic classes described in this paper.
This paper was directly influenced by consideration of the following problems:
The RDF Model and Syntax is adequate to represent most of PICS [PICS]; however, it requires augmentation by features such as the ability to specify a default rating for any page with a certain URL prefix. This draft does not yet provide such a mechanism.
One obvious application for RDF is in the description of web pages. This is one of the basic functions of the Dublin Core [DC] initiative. The Dublin Core is a set of 15 elements believed to be broadly applicable to describing web resources to enable their discovery. The Dublin Core has been a major influence on the development of RDF. An important consideration in the development of the Dublin Core was to allow simple descriptions, but also to provide the ability to qualify descriptions in order to provide both domain specific elaboration and descriptive precision.
The RDF Schema mechanism proposed in this paper provides a machine-understandable system for defining 'schemas' for descriptive vocabularies like the Dublin Core. It allows designers to specify classes of Resource types, property types to convey descriptions of those classes, and constraints on the allowed combinations of classes, property types, and values.
An initial schema for the simple Dublin Core is provided in Appendix D. This schema defines the 15 elements as property types, and gives a description of their purpose. Despite the simplicity of the definition, it is believed that this schema serves as the foundation for more elaborate definitions. Future extensions are likely to specify the structure of the values of the properties, which will involve defining classes, the property types that apply to those classes, and some constraints on the property values. In order for browsers and authoring tools to understand and enforce these constraints, this information should be machine understandable. This document provides a machine understandable schema language for expressing such definitions and constraints.
A sitemap is a hierarchical description of a site. A subject taxonomy is a classification scheme such as those used by Yahoo!, Excite, etc. RDF Schema aims to provide a mechanism by which sitemaps and taxonomies using the RDF Model can express the classes of nodes, properties, relations etc. they use.
The W3C Platform for Privacy Preferences Project (P3P) requires a grammar for constructing statements about a site's data collection practices, personal preferences as exercised over those practices, as well as a syntax for exchanging structured data. The ability for third party assurances (signed statements) regarding P3P practices is also important. For instance, entities may wish to certify that P3P practice statements were properly generated in accordance with industry guidelines, have been audited, or are compliant with the relevant privacy regulations.
These problems have influenced the design of this first version of the RDF Schema. RDF is tailored for easy extensibility. If RDF succeeds, we expect that future versions of RDF will be more ambitious and include more problems in their design space.
An RDF Schema can be expressed by the data model described in the RDF Model and Syntax [RDFMS] document. The schema description language is simply a set of resources and property types defined by this paper and implicitly part of every RDF graph using this schema machinery.
This document specifies the RDF Schema mechanism as a set of RDF resources, property types, and constraints on their relationships. We especially solicit comment on the understandability of the names we have chosen for these.
The Resource Description Framework is intended to be flexible and easily extensible; this suggests that a great variety of schemas will be created and that new and improved versions of these schemas will be a common occurrence on the Web. Since changing a schema risks breaking other RDF graphs which depend on that schema, this specification requires that a new URI be used whenever an RDF schema is changed. In effect, changing a schema creates a new one; each new schema namespace should have its own URI to avoid ambiguity. Since an RDF Schema URI unambiguously identifies a single version of a schema, RDF processors (and Web caches) should be able to safely store copies of RDF schema graphs for an indefinite period. The problems of RDF schema evolution share many characteristics with XML DTD version management and the general problem of Web resource versioning. It is expected that a general approach to these issues will be presented in a future version of this document, in co-ordination with other W3C activities. Future versions of this document may also offer RDF specific guidelines: for example, describing how a schema could document its relationship to preceding versions.
Since each RDF schema has its own unchanging URI, these can be used to construct unique URI references for the resources defined in a schema. This is achieved by combining the local identifier for a resource with the URI associated with that schema namespace. The XML representation of RDF uses the XML namespace mechanism for associating elements and attributes with URI references for each vocabulary item used.
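The combination of a schema URI with a local identifier can be sketched in Python as follows. Plain string concatenation is an assumption of this sketch, as are the `example.org` namespace URIs and the function name `resource_uri`; the draft itself does not prescribe how the two parts are joined.

```python
def resource_uri(schema_namespace: str, local_id: str) -> str:
    """Combine a schema namespace URI with a local identifier.

    Simple concatenation is assumed here for illustration.
    """
    return schema_namespace + local_id


# Hypothetical namespace URIs. A changed schema is given a new URI,
# so the two versions never share identifiers.
BOOKS_V1 = "http://example.org/schemas/books-v1#"
BOOKS_V2 = "http://example.org/schemas/books-v2#"

author_v1 = resource_uri(BOOKS_V1, "author")
author_v2 = resource_uri(BOOKS_V2, "author")

print(author_v1)
print(author_v1 != author_v2)
```

Because the two schema versions have different URIs, the two "author" resources are distinct, and a graph written against the first version is unaffected by the second.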
The following resources belong to the core RDF Schema. We first define the type system and then introduce property types for expressing various kinds of constraints.
The RDF Schema defined in this paper is a set of RDF resources and property types that can be used to describe classes of RDF nodes, including properties and relations. They are defined in a namespace informally called 'RDFS' here, which will be more formally defined and given a URI in the future.
As described in the RDF Model and Syntax paper, nodes may be RDF:instanceOf one or more classes. However, classes themselves are often used in a hierarchical fashion, for example considering the class 'dog' to be a subclass of 'animal,' which is a subclass of 'organism' etc., meaning that any node which is an RDF:instanceOf 'dog' is ipso facto an RDF:instanceOf 'animal' and so on. This specification describes a property type, RDFS:subClassOf, to denote such relationships.
In addition to the RDFS:subClassOf relation, this paper defines a small number of other resources and properties that express constraints on instances of classes, such as statements that all instances of a class have certain properties or relations, or limitations on the types of values that are valid for a property or relation. This paper gives a mechanism for describing such constraints, but does not say whether or how an application must process the constraint information. For example, while an RDF schema may express that a 'Book' may have an 'author' property, it does not say whether or how an application should act in processing that information. We expect that different applications will use these constraints in different ways: for example, a validator will look for errors, while an editor might suggest legal values.
We anticipate the development of a set of classes corresponding to a set of "datatypes." This paper does not define datatypes, but does note that datatypes may be used as the value of the RDFS:range property.
The following resources are core classes that are defined as part of the RDF Schema machinery. Every RDF graph (implicitly) includes these.
All resources, i.e., all the elements of the set Nodes defined in section 5.1 of the RDF Model and Syntax document, are instances of RDF:Resource. This roughly corresponds to the concept of Object in Java.
All the property types, i.e., all the elements of the set PropertyTypes defined in section 5.1 of the RDF Model and Syntax document, are instances of RDF:PropertyType. Conversely, every instance of RDF:PropertyType is an element of PropertyTypes.
This corresponds to the generic concept of a Type or Category, similar to the notion of a Class in object-oriented programming languages such as Java. When a schema defines a new class, the node representing that class must have an RDF:instanceOf arc to this node. Classes can be defined to represent almost anything, such as web pages, people, cars, rabbits, and animals.
Every RDF graph which uses the schema mechanism also (implicitly) includes the following core property types. These are instances of the PropertyType class and provide a mechanism for expressing relationships between classes and their instances or superclasses.
This indicates that a resource is a member of a class, and thus has all the characteristics that are to be expected of a member of that class. It is a relation between a resource and another resource which must be an instance of Class. A resource may be an instance of more than one class.
This indicates the subset/superset relation between classes. RDFS:subClassOf is transitive. If class A is a sub-class of class B, and B is a sub-class of C, then A is also implicitly a sub-class of C. Consequently, resources that are instances of class A will also be instances of C, since A is a sub-set of both B and C. Only instances of Class can have the RDFS:subClassOf property type and the property value is always an instanceOf Class. A Class may be a subClassOf more than one Class.
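The transitivity rule can be illustrated with a small Python sketch. Representing the subClassOf and instanceOf arcs as dictionaries of sets is an assumption made for this example, as are the function names and the extra classes D and E (used to show a class with more than one superclass).

```python
def superclasses(cls, sub_class_of):
    """Return every class reachable from cls by following subClassOf arcs."""
    seen = set()
    stack = list(sub_class_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(sub_class_of.get(parent, ()))
    return seen


def is_instance_of(resource, cls, instance_of, sub_class_of):
    """True if resource is a direct or implied instance of cls."""
    direct = instance_of.get(resource, ())
    return any(c == cls or cls in superclasses(c, sub_class_of) for c in direct)


# A is a sub-class of B, and B of C; D has two superclasses (hypothetical).
sub_class_of = {"A": {"B"}, "B": {"C"}, "D": {"B", "E"}}
instance_of = {"x": {"A"}}

print(sorted(superclasses("A", sub_class_of)))
print(is_instance_of("x", "C", instance_of, sub_class_of))
```

Only the direct arcs are stored; the implied memberships (here, that x is an instance of B and of C) are computed on demand.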
A class can never be declared to be a sub-class of itself, nor of any of its own sub-classes. Note that this constraint is not expressible using the RDF Schema constraint facilities provided below, and so does not appear in the RDF version of this specification given in Appendix A.
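Since this restriction cannot be stated in the schema language itself, an application that wants to enforce it has to check for it procedurally. A minimal sketch of such a check in Python, assuming the subClassOf arcs are held in a dictionary of sets (the representation and the function name `find_subclass_loop` are illustrative only):

```python
def find_subclass_loop(sub_class_of):
    """Return a class that is, directly or transitively, a sub-class of
    itself, or None if the subClassOf arcs contain no loop."""
    for start in sub_class_of:
        seen = set()
        stack = list(sub_class_of[start])
        while stack:
            cls = stack.pop()
            if cls == start:
                return start
            if cls not in seen:
                seen.add(cls)
                stack.extend(sub_class_of.get(cls, ()))
    return None


legal = {"A": {"B"}, "B": {"C"}}
illegal = {"A": {"B"}, "B": {"A"}}  # A is declared a sub-class of its own sub-class

print(find_subclass_loop(legal))
print(find_subclass_loop(illegal) is not None)
```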
An RDF schema can declare constraints associated with classes and property types. In particular, the concepts of domain and range are used in RDF schemas to make statements about the contexts in which certain property types "make sense".
Although the RDF data model does not allow for explicit arcs (such as instanceOf) from an atomic value to a resource (e.g. class), we nevertheless consider these entities to be members of classes (e.g. "John Smith" is considered to be a member of the class rdfs:string.) We expect future work in RDF and XML data-typing to provide clarifications in this area.
A model that violates a constraint is an inconsistent model. Different applications may exhibit different behaviours in the face of an inconsistent model. Some examples of constraints include:
RDF schemas can express constraints that relate vocabulary items from multiple independently developed schemas. Since URI references are used to identify classes and property types, it is possible to create new property types whose domain or range is constrained to be a class defined in another namespace.
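To make the idea of a constraint violation concrete, the following Python sketch checks statements against domain and range constraints in which the range class comes from a different namespace than the property type. All URIs, the dictionary layout, and the function name `violations` are assumptions of this example. For brevity the check looks only at direct class membership and ignores subClassOf inference.

```python
# Hypothetical namespace URIs for two independently developed schemas.
LIB = "http://example.org/library#"
PEOPLE = "http://example.org/people#"

# property type URI -> (domain class URI, range class URI)
constraints = {
    LIB + "author": (LIB + "Book", PEOPLE + "Person"),
}

# resource URI -> set of class URIs it is directly an instance of
instance_of = {
    "http://example.org/doc/1": {LIB + "Book"},
    "http://example.org/staff/42": {PEOPLE + "Person"},
    "http://example.org/img/7": {LIB + "Image"},
}


def violations(statements):
    """Return the statements that break a domain or range constraint."""
    bad = []
    for subject, prop, value in statements:
        if prop not in constraints:
            continue  # no constraint declared for this property type
        domain, range_ = constraints[prop]
        domain_ok = domain in instance_of.get(subject, set())
        range_ok = range_ in instance_of.get(value, set())
        if not (domain_ok and range_ok):
            bad.append((subject, prop, value))
    return bad


good = ("http://example.org/doc/1", LIB + "author", "http://example.org/staff/42")
bad = ("http://example.org/img/7", LIB + "author", "http://example.org/staff/42")
print(violations([good, bad]) == [bad])
```

A model for which `violations` returns a non-empty list would be an inconsistent model in the sense used above; what to do about it remains an application decision.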
The following property types are provided to support simple documentation and user-interface related annotations within RDF schemas. Multilingual documentation of schemas is supported by the use of the xml:lang language tagging facility. Since RDF schemas are expressed within the RDF data model, other namespaces may be used to provide richer documentation.
The RDF Model and Syntax specification introduces certain concepts. These are defined formally in another namespace identified with the URI http://www.w3.org/TR/WD-rdf-syntax#. Here we describe resources corresponding to a number of these concepts.
The RDF Schema namespace includes a resource known as RDF:Collection. This is a class representing the set of collections. Collection is the super-class of the collection types Bag, Seq and Alt as defined in the RDF Model and Syntax specification.
The following will be described and added to the figures and Appendix A in a future draft.
Figure 1: Classes and Resources as Sets and Elements
Figure 2 shows the same information about the class hierarchy as in figure 1, but does so in terms of the RDF data model. If a class is a subset of another, then there is an RDFS:subClassOf arc from the node representing the first class to the node representing the second. Similarly, if a Resource is an instance of a Class, then there is an RDF:instanceOf arc from the resource to the node representing the class. (Not all such arcs are shown. We only show the arc to the most tightly encompassing class, and rely on the transitivity of the 'subClassOf' relation to provide the rest).
Figure 2: Class Hierarchy for the RDF Schema
The RDF Schema uses the constraint property types to constrain how its own PropertyTypes can be used. These constraints are shown below in figure 3. Nodes with bold outlines are instances of RDFS:Class.
Figure 3: Constraints in the RDF Schema
Note: This document was prepared and approved for publication by the W3C RDF Schema Working Group (WG). WG approval of this document does not necessarily imply that all WG members voted for its approval.
David Singer of IBM is the chair of the group; we thank David for his efforts and thank IBM for supporting him and us in this endeavor.
Ron Daniel produced all the graphics for this document.
The working group membership has included:
Nick Arnett (Verity), Dan Brickley (ILRT/University of Bristol), Walter Chang (Adobe), Sailesh Chutani (Oracle), Ron Daniel (DATAFUSION), Joe Lapp (webMethods Inc.), Patrick Gannon (CommerceNet), RV Guha (Netscape), Tom Hill (Apple Computer), Renato Iannella (DSTC), Sandeep Jain (Oracle), Kevin Jones (InterMind), Emiko Kezuka (Digital Vision Laboratories), Ora Lassila (Nokia Research Center), Andrew Layman (Microsoft), John McCarthy (Lawrence Berkeley National Laboratory), Michael Mealling (Network Solutions), Norbert Mikula (DataChannel), Eric Miller (OCLC), Frank Olken (Lawrence Berkeley National Laboratory), Sri Raghavan (Digital), Lisa Rein (webMethods Inc.), Tsuyoshi Sakata (Digital Vision Laboratories), Leon Shklar (Pencom Web Works), David Singer (IBM), Wei (William) Song (SISU), Neel Sundaresan (IBM), Ralph Swick (W3C), Naohiko Uramoto (IBM), Charles Wicksteed (Reuters Ltd.), Misha Wolf (Reuters Ltd.)
Not all of the people listed above have been members throughout the entire duration of the working group, but all have contributed to the evolution of this document.
The RDF specification of the above is given below in the serialization syntax. Please note that the namespace URIs listed are examples only; formal identifiers have not yet been assigned for these schemas.
Note that there are some constraints (such as those given in 2.3.2 above) on certain RDF Schema constructs which are themselves not fully expressible in the RDF Schema specification language. For example, the RDF below does not tell us that subClassOf arcs should not form loops in any RDF graph.
This section gives some brief examples of using the RDF Schema machinery to define classes and property types for some likely purposes.
In this example, Person is a class with a corresponding description of "Class for representing people. Instances correspond to a single person." A Person is a subclass of Animal. A Person may have an age. The value of age is an integer. A Person may also have an ssn ("Social Security Number"). The value of ssn is an integer. A Person's marital status is one of {Married, Divorced, Single, Widowed}. This is achieved through use of the range constraint: we define both a property type maritalStatus and a class MaritalStatus (adopting the convention of using lower case letters to begin the names of property types, and capitals for classes). We then use RDFS:range to state that a maritalStatus property only 'makes sense' when it has a value which is an instance of the class MaritalStatus. The schema then defines a number of instances of this class. Whether resources declared to be instanceOf MaritalStatus in another graph are trusted is an application level decision; such decisions will be aided by the provisions in RDF for digital signatures.
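The Person example can be sketched as a set of (subject, property type, value) triples in Python. This is an informal illustration of the description above, not the RDF serialization; the triple layout, the unqualified names, and the helper `allowed_values` are assumptions of the sketch.

```python
# (subject, property type, value) triples following the prose description.
schema = {
    ("Person", "instanceOf", "Class"),
    ("Person", "subClassOf", "Animal"),
    ("age", "domain", "Person"),
    ("age", "range", "Integer"),
    ("ssn", "domain", "Person"),
    ("ssn", "range", "Integer"),
    ("MaritalStatus", "instanceOf", "Class"),
    ("maritalStatus", "domain", "Person"),
    ("maritalStatus", "range", "MaritalStatus"),
    # The schema itself declares the instances of MaritalStatus.
    ("Married", "instanceOf", "MaritalStatus"),
    ("Divorced", "instanceOf", "MaritalStatus"),
    ("Single", "instanceOf", "MaritalStatus"),
    ("Widowed", "instanceOf", "MaritalStatus"),
}


def allowed_values(prop):
    """Resources declared to be instances of the range class of prop."""
    ranges = {o for s, p, o in schema if s == prop and p == "range"}
    return {s for s, p, o in schema if p == "instanceOf" and o in ranges}


print(sorted(allowed_values("maritalStatus")))
```

The enumeration is open in the sense described above: a triple from another graph declaring a further instance of MaritalStatus would extend the set, and whether to trust such a declaration is left to the application.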
In this example we sketch an outline of an RDF vocabulary for use with searchable Internet services. SearchQuery is declared to be a class. Every SearchQuery must have both a queryString whose value is a String and a queryService whose value is a SearchService. A SearchService is a subclass of InternetService (which is defined elsewhere). A SearchQuery has some number of results (whose value is SearchResult). Each SearchResult has a title (value is a string), a rating (value is between 0 and 1) and, of course, the page itself.
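The two kinds of requirement in this example (properties that every SearchQuery must carry, and a rating restricted to the interval from 0 to 1) can be sketched in Python as follows. The dictionary-based description of a query, the `REQUIRED` table, and both function names are assumptions made for illustration; the sample query string and service name are hypothetical.

```python
# Property types every instance of a class must carry, per the description.
REQUIRED = {"SearchQuery": {"queryString", "queryService"}}


def missing_properties(class_name, description):
    """Return the required property types absent from a description."""
    return sorted(REQUIRED.get(class_name, set()) - set(description))


def rating_ok(rating):
    """A SearchResult rating must lie between 0 and 1 inclusive."""
    return 0 <= rating <= 1


complete = {"queryString": "rdf schema", "queryService": "ExampleSearchService"}
incomplete = {"queryString": "rdf schema"}

print(missing_properties("SearchQuery", complete))
print(missing_properties("SearchQuery", incomplete))
print(rating_ok(0.8), rating_ok(1.5))
```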
The modularity of RDF allows other vocabularies to be combined with simple schemas such as this to characterise more fully the properties of networked resources. For example, Dublin Core or a library-based classification vocabulary might be used to describe the subject coverage or collections-level properties for each SearchService, while an independently managed "search protocols" vocabulary could be used to describe connection details for (say) LDAP, WHOIS++ or Z39.50 search interfaces offered by the service. By allowing the creation of statements which draw upon specialised schemas from various domains, RDF makes it possible for communities of expertise to contribute to a decentralised web of machine-readable vocabularies.
In particular: should it be possible to avoid requiring RDF aware user-agents to download the entire large schema across the network when dereferencing a schema URI? Also: how, if at all, might a schema namespace URI be modified to obtain alternate forms or fragments of the schema? (One possible scenario is the use of the XML-Link [XML-Link] '|' fragment identifier.)
The RDF Schema mechanism will need to interact with many externally developed typing systems. There are two broad categories of such systems. The first are externally defined "primitive data types", such as IEEE floating point numbers, Integers, Boolean values, Dates and Times, etc. The second category are external "type systems", which provide features such as inheritance, type inferencing, etc.
At this time we have not even begun to consider the second category. Several factors make it difficult to decide on the appropriate interactions with the first category. RDF models are exchanged as XML document instances. The XML Working group has expressed an interest in working on the problem of data typing, to provide the ability to specify that element content should be interpreted as an integer, a date, a float, left as a string, etc. The interactions between data typing efforts in XML and RDF are currently being discussed by the W3C staff, so this document does not provide a specification for those interactions that is as firm as the specification for elements such as RDFS:Class, RDFS:subClassOf, etc.
However, it is the rough consensus of the RDF Schema WG that it would be useful to show that the current schema system can actually accommodate externally defined primitive data types. Therefore, figure 1 and the relevant portions of the text of the specification were modified to give a provisional indication of how external types might be handled. The reader is advised that those portions of the specification are highly subject to change, even more so than the rest of this specification. All of those sections have been explicitly marked to refer to this open issue.
The following represents an initial schema for the simple Dublin Core Element Set [DC]. The following schema is for illustration purposes only; the authoritative Dublin Core Schema will be made available by the Dublin Core Initiative.
This schema is provided as the foundation for the Dublin Core semantics. It is believed to be all that is needed to serve as the foundation for future, more elaborate definitions. Future extensions are likely to specify the structure of the values of the properties, which will involve defining classes, the property types that apply to those classes, and some constraints on the property values.