W3C NOTE-CCPP-19981130

Composite Capability/Preference Profiles (CC/PP): A user side framework for content negotiation

W3C NOTE 30 November 1998

This Version:
http://www.w3.org/TR/1998/NOTE-CCPP-19981130
Latest Version:
http://www.w3.org/TR/NOTE-CCPP
Previous version (member-only):
http://www.w3.org/Mobile/Group/IG/ccpp-paper
Editors
Franklin Reynolds franklin.reynolds@research.nokia.com, Nokia Research Center
Johan Hjelm hjelm@w3.org, W3C/Ericsson
Spencer Dawkins sdawkins@nt.com, Nortel
Sandeep Singhal singhal@us.ibm.com, IBM

Status of this document

This document is a work in progress, representing a revision of the working draft dated 1998-10-05 incorporating suggestions received in review comments and further deliberations of the W3C Mobile Access Interest Group. It also incorporates suggestions resulting from reviews by members of the IETF CONNEG working group and the WAPForum. It is the first public review draft of this document. Publication as a working draft does not imply endorsement by the W3C membership.

All RDF code has been validated with SiRPAC, the W3C RDF validator.

Review comments from the public on this document should be sent to www-mobile@w3.org which is an automatically archived email list. Information on how to subscribe to public W3C email lists can be found at http://www.w3.org/Mail/Request.

This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by this NOTE.

Copyright © 1997, 1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.

Table of Contents

Abstract

1. Introduction

1.1 Wireless networks

1.2 Goals of this work

2. Metadata and profiles

3. Composite Capability/Preferences Profiles (CC/PP)

3.1 Inline example

3.2 Indirect references

3.3 Runtime changes

4. Protocol considerations

5. Summary

References

Abstract

In this note we describe a method for using RDF, the Resource Description Framework of the W3C, to create a general yet extensible framework for describing user preferences and device capabilities. This information can be provided by the user to servers and content providers. The servers can use this information describing the user's preferences to customize the service or content provided. The ability of RDF to reference profile information via URLs assists in minimizing the number of network transactions required to adapt content to a device, while the framework fits well into the current and future protocols being developed at the W3C and the WAP Forum.

1. Introduction

This document describes the rationale and design of a profile service to describe the capabilities and preferences of Web enabled applications. A Composite Capability/Preference Profile (CC/PP) is a collection of the capabilities and preferences associated with a user and the agents used by the user to access the World Wide Web. These user agents include the hardware platform, system software and applications used by the user. User agent capabilities and preferences can be thought of as metadata or properties and descriptions of the user agent hardware and software.

A description of the user's capabilities and preferences is necessary but insufficient to provide a general content negotiation solution. A general framework for content negotiation requires a means for describing the metadata or attributes and preferences of the user and the user's agents, the attributes of the content, and the rules for adapting content to the capabilities and preferences of the user. The current mechanisms, such as accept headers and <alt> tags, are somewhat limited. Future services will be more demanding. For example: the content might be authored in multiple languages with different levels of confidence in the translation and the user might be able to understand multiple languages with different levels of proficiency. To complete the negotiation some rule is needed for selecting a version of the document based on weighing the user's proficiency in different languages against the quality of the document's various translations.
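As a minimal sketch of such a selection rule, the following Python fragment assumes that language proficiency and translation quality are both expressed as scores between 0 and 1 and that the two are combined by multiplication. The function name, the scoring scale and the sample values are illustrative assumptions and are not defined by this proposal.

```python
def choose_variant(user_proficiency, translation_quality):
    """Pick the language with the highest proficiency * quality product.

    Both arguments map a language tag to a score between 0.0 and 1.0.
    Returns None when the user understands none of the available languages.
    """
    best_language, best_score = None, 0.0
    for language, quality in translation_quality.items():
        # A language the user does not list counts as proficiency 0.
        score = user_proficiency.get(language, 0.0) * quality
        if score > best_score:
            best_language, best_score = language, score
    return best_language


# Hypothetical scores: the user reads Swedish natively and English fairly well;
# the document was authored in English and translated with varying confidence.
user = {"en": 0.6, "sv": 1.0}
document = {"en": 1.0, "sv": 0.5, "fi": 0.9}
chosen = choose_variant(user, document)
```

With these sample values the English original is chosen over the weaker Swedish translation. Other weighting rules are equally possible; the multiplication used here is only one choice.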

This proposal focuses on the design of a user agent profile service based on XML/RDF. RDF, the Resource Description Framework [RDF][RDF-Schema], was designed by the W3C. There is a specification that describes how to encode RDF using XML. RDF was designed to describe the machine understandable properties of web content. In this proposal we explore the use of RDF to describe capabilities and preferences associated with a user and the hardware and software agents used to access the web. We expect the use of a common technology to encode metadata describing both content and a user's preferences will encourage the adoption of the technology and simplify the use of metadata in the Web. Hopefully, powerful tools for dealing with XML and RDF, some of which are already under development, will be available.

Some potentially complex negotiation may have to take place between the content or the server of the content and the user of the content. For example: the content might be authored in multiple languages with different levels of confidence in the translation and the user might be able to understand multiple languages with different levels of proficiency. Though we hope that the use of RDF to encode the metadata describing the content and the user's preferences will facilitate the development of solutions to these kinds of complex negotiations, the implementation of appropriate rules for the negotiation is left to application developers.

Alternate methods for describing the attributes or metadata of documents are under investigation by other organizations such as the IETF Content Negotiation [CONNEG] working group. Though this proposal is not directly compatible with the IETF CONNEG proposals currently under development, RDF allows the use of multiple vocabularies. Hopefully, this will provide a means for interoperability, at least at the level of attribute vocabularies. The CONNEG working group is also developing a media feature matching algebra. Efforts are underway to ensure that the CONNEG algebra and RDF are complementary technologies. In addition to the IETF, we are particularly concerned about the WAPForum and ETSI. The success of the CC/PP effort will undoubtedly hinge on our ability to cooperate with those organizations.

1.1 Wireless Networks

Compared to the typical wireline data networks available to corporate desktop users, wireless networks are more expensive and provide less bandwidth, with higher latency and less reliability. SMS data service on GSM networks provides 22 bytes (!) per second to a typical mobile host. The situation is rapidly changing. Emerging packet oriented cellular networks, such as CDPD and CDMA, and packet oriented bearer technologies, such as GPRS and EDGE, are providing higher bandwidth and lower latency. Within the next decade we should see the deployment of "third generation" cellular networks that provide low latency and megabit bandwidth to mobile hosts.

But today's wireless networks are slow and tomorrow's wireless networks will be slow compared to tomorrow's wireline networks. Protocols designed for wireline networks without regard for the limitations of wireless networks often exhibit undesirable behavior when deployed on wireless networks.

CC/PPs are intended to provide information necessary to adapt the content and the content delivery mechanisms to best fit the capabilities and preferences of the user and its agents. Protocol design is beyond the scope of this group; however, the use of CC/PPs does have some impact on web protocols, and in this section some of those issues are discussed. The design and implementation of HTTP-NG is being actively carried out by another group. In this section we limit our discussion to some of the issues that may need to be considered in HTTPng or similar protocols:

Profiles can be quite verbose.
We need ways to reduce the overhead for low bandwidth networks like the cell phone network.
CC/PPs should be cacheable on gateways/proxies.
Components used to construct CC/PPs, such as vendor default profiles, should be independently cacheable.
Changes to the active profile should be very lightweight.
We don't want to have to resend the whole profile to turn off sound.
The protocols must be able to exploit gateways and proxies if they exist.
Though vendors may be able to supply URLs that name default profiles, the client devices may store this information in case the vendor site is unreachable for some reason.

1.2 Goals of this work

The goals of this work are to:

Enhance content negotiation speed through a standardized format for user agent profiles.
Minimize content negotiation transactions through the use of standardized formats and referencing URLs.
Recognize and support the composition of preferences and profiles originating from multiple sources (e.g. hardware vendors, software vendors, users, etc.).
Enable user control over user agent information (e.g. personal preferences, etc.).
Enable the use of compact data formats, such as tokenized XML [TokenXML], for content negotiation.
Support the presence of multiple network elements (proxies, servers, etc.) between the user agent and the origin server.

The data model for the capability and preferences profile is similar to a table of tables. Each individual table roughly corresponds to a significant hardware or software component. The primary goal is to be able to describe the desired table of tables in an unambiguous and interoperable fashion. Secondary goals include general applicability and good performance.

2. Metadata and profiles

In most documents on 3rd generation networks, scenarios are presented in which users will want to assert several preferences [IMT-2000]. Mechanisms for doing so already exist [Agent-attrib]. Such preferences include:

preferred language
sound on/off
images on/off
privacy preferences (like P3P)
scripting on/off
cookies on/off
etc.

They will also want to assert hardware platform attributes, like:

vendor
model
class of device {phone, pda, printer, etc.}
screen size
colors
available bandwidth
CPU
memory
input device
secondary storage
loudspeaker
etc.

We also expect them to want to assert software defined variables, such as:

application brand and version
level of HTML support
supported XML vocabularies
level of CSS support
supported RDF vocabularies
level of WAP support
supported scripting language(s)
etc.

It is interesting to note that metadata (capabilities and preferences) associated with the device, the software used to access the web and the user of the device could originate from different sources created at different times. The hardware vendor might have profile information available for its products, the software vendor might supply a default profile, and the user's preferences might apply across multiple applications (preferred language) or change during a session (sound on/off). If the mechanism is too complex people won't use it, and if it is too slow people won't use it. The challenge is to provide an efficient mechanism for communicating the profiles for constrained devices, such as smart phones, using slow networks, such as GSM SMS.

3. Composite Capability/Preferences Profiles (CC/PP)

The CC/PP proposal describes an interoperable encoding for capabilities and preferences of user agents, specifically web browsers. The proposal is also intended to support applications other than browsers, including email, calendars, etc. Support for peripherals like printers and fax machines will require other types of attributes such as type of printer, location, Postscript support, color, etc. We believe an XML/RDF based approach would be suitable. However, metadata descriptions of devices like printers or fax machines may use a different scheme. Every reasonable effort will be made to provide interoperability with other important proposals.

The basic data model for a CC/PP is a collection of tables. Though RDF makes modeling a wide range of data structures possible, it is unlikely that this flexibility will be used in the creation of complex data models for profiles. In the simplest form each table in the CC/PP is a collection of RDF statements with simple, atomic properties. These tables may be constructed from default settings, persistent local changes or temporary changes made by a user. One extension to the simple table of properties data model is the notion of a separate, subordinate collection of default properties. Default settings might be properties defined by the vendor. In the case of hardware the vendor often has a very good idea of the physical properties of any given model of product. However, the current owner of the product may be able to add options, such as memory or persistent store or additional I/O devices, that add new properties or change the values of some original properties. These would be persistent local changes. An example of a temporary change would be turning sound on or off.
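The layering of default settings, persistent local changes and temporary changes can be sketched with ordinary lookup tables. The Python fragment below is only an illustration of the data model: the three layer names, the attribute names and the values are assumptions chosen for the example, and the use of `collections.ChainMap` is one convenient way to express "most specific layer wins", not something this proposal prescribes.

```python
from collections import ChainMap

# One table per layer for a single component (here, a software platform).
vendor_defaults = {"OS": "EPOC1.0", "HTMLVersion": "4.0", "Sound": "On"}
persistent_changes = {"Images": "Off"}   # e.g. a setting the owner saved
temporary_changes = {"Sound": "Off"}     # e.g. muted for this session only

# ChainMap searches its maps from left to right, so the most specific
# layer is listed first and the vendor defaults come last.
layered = ChainMap(temporary_changes, persistent_changes, vendor_defaults)

# Flatten the layers into the single table that describes the current state.
current = dict(layered)
```

Here `current` holds `Sound` as `Off` because the temporary change shadows the vendor default, while `OS` and `HTMLVersion` fall through to the defaults unchanged.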

The profile is associated with the current network session or transaction. Each major component may have a collection of attributes or preferences. Examples of major components are the hardware platform upon which all the software is executing, the software platform upon which all the applications are hosted and each of the applications. The following is a simplified example of the sort of data expected to be encoded in these profiles.

Hardware Platform
    Memory = 64mb
    CPU = PPC
    Screen = 640*400*8
    BlueTooth = Yes

Software Platform
    OS version = 1.0
    HTML version = 4.0
    WML version = 1.0
    Sound = ON
    Images = Yes

Email
    Language = English

...

Some collections of properties and property values may be common to a particular component. For example: a specific model of a smart phone may come with a specific CPU, screen size and amount of memory by default. Gathering these "default" properties together as a distinct RDF resource makes it possible to independently retrieve and cache those properties. A collection of "default" properties is not mandatory, but it may improve network performance, especially the performance of relatively slow wireless networks.

Any RDF graph consists of nodes, arcs and leafs. Nodes are resources, arcs are properties and leafs are property values. An RDF graph based on the previous example that includes "Default" properties for each major component is relatively straightforward.

The introduction of "Defaults" makes the graph of each major component more of a simple tree than a table. In this example the major components are associated with the current network session. In this case, the network session is serving as the root of a tree that includes the trees of each major component. RDF was originally intended to describe metadata associated with documents or other objects that can be named via a URI. The closest thing to a "document" associated with a CC/PP is the current network session.

From the point of view of any particular network transaction the only property or capability information that is important is whatever is "current". The network transaction does not care about the differences between defaults or persistent local changes, it only cares about the capabilities and preferences that apply to the current network transaction. Because this information may originate from multiple sources and because different parts of the capability profile may be differentially cached, the various components must be explicitly described in the network transaction.

The CC/PP is the encoding of profile information that needs to be shared between a client and a server, gateway or proxy. The persistent encoding of profile information and the encoding for the purposes of interoperability (communication) need not be the same. In this document we consider the use of XML/RDF as the interoperability encoding. Persistent storage of profile information is left to the individual applications.

3.1 Inline example

Consider a more realistic example of inline encoding of a CC/PP for a hypothetical smart phone. This is an example of the type of information a phone might provide to a gateway/proxy/server. Note that we do not explicitly name the "current network session". Instead, the profiles of the major components are collected in a "Bag". This is probably not necessary since the document in question, the network session, is unlikely to contain additional RDF.

<?xml version="1.0"?>
<RDF
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
<RDF:Bag>
  <RDF:Description about="HardwarePlatform">
    <PRF:Defaults>
      <Description
        PRF:Vendor="Nokia"
        PRF:Model="2160"
        PRF:Type="PDA"
        PRF:ScreenSize="800x600x24"
        PRF:CPU="PPC"
        PRF:Keyboard="Yes"
        PRF:Memory="16mB"
        PRF:Bluetooth="YES"
        PRF:Speaker="Yes" />
    </PRF:Defaults>
    <PRF:Modifications>
      <Description PRF:Memory="32mB" />
    </PRF:Modifications>
  </RDF:Description>

  <RDF:Description about="SoftwarePlatform">
    <PRF:Defaults>
      <Description
        PRF:OS="EPOC1.0"
        PRF:HTMLVersion="4.0"
        PRF:JavaScriptVersion="4.0"
        PRF:WAPVersion="1.0"
        PRF:WMLScript="1.0" />
    </PRF:Defaults>
    <PRF:Modifications>
      <Description
        PRF:Sound="Off"
        PRF:Images="Off" />
    </PRF:Modifications>
  </RDF:Description>

  <RDF:Description about="EpocEmail1.0">
    <PRF:Defaults>
      <Description PRF:HTMLVersion="4.0" />
    </PRF:Defaults>
  </RDF:Description>

  <RDF:Description about="EpocCalendar1.0">
    <PRF:Defaults>
      <Description PRF:HTMLVersion="4.0" />
    </PRF:Defaults>
  </RDF:Description>

  <RDF:Description about="UserPreferences">
    <PRF:Defaults>
      <Description PRF:Language="English" />
    </PRF:Defaults>
  </RDF:Description>
</RDF:Bag>
</RDF>

This sample profile is a collection of the capabilities and preferences associated with either a user or the hardware platform or a software component. Each collection of capabilities and preferences is organized within a description block. These description blocks may contain subordinate description blocks to describe default attributes or other collections of attributes. One new namespace, "PRF", was introduced which contains the vocabulary used for all attributes and preferences. The PRF namespace is intended to provide a well defined collection of attributes of general utility. There is nothing that prevents the use of multiple namespaces. This might be useful either to define experimental or non-standard attributes or to define application specific capabilities and preferences.

Delivering all of the CC/PP at one time, inline, makes some simplifications possible. If the user has overridden some default property, then there is no reason to send the default; all that is needed is to send the current value for that attribute. In the example above, there is no reason to send the hardware platform's default setting of "Memory=16mB" since the user has upgraded the memory to 32mB.
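A receiver of an inline profile might collapse each component's defaults and modifications into one table of current values. The following Python sketch does this with the standard `xml.etree.ElementTree` parser on an abbreviated copy of the example above. The function names and the rule that modifications are applied after defaults are assumptions made for this illustration; the proposal does not define a processing algorithm.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3C.org/TR/WD-rdf-syntax#"
PRF_NS = "http://www.w3C.org/TR/WD-profile-vocabulary#"

# Abbreviated version of the inline example profile.
PROFILE = """<?xml version="1.0"?>
<RDF xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
     xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
<RDF:Bag>
  <RDF:Description about="HardwarePlatform">
    <PRF:Defaults>
      <Description PRF:Vendor="Nokia" PRF:Memory="16mB" />
    </PRF:Defaults>
    <PRF:Modifications>
      <Description PRF:Memory="32mB" />
    </PRF:Modifications>
  </RDF:Description>
  <RDF:Description about="SoftwarePlatform">
    <PRF:Defaults>
      <Description PRF:OS="EPOC1.0" PRF:HTMLVersion="4.0" />
    </PRF:Defaults>
    <PRF:Modifications>
      <Description PRF:Sound="Off" PRF:Images="Off" />
    </PRF:Modifications>
  </RDF:Description>
</RDF:Bag>
</RDF>"""


def prf_attributes(element):
    """Return the PRF-namespace attributes of an element, without the namespace."""
    prefix = "{%s}" % PRF_NS
    return {key[len(prefix):]: value
            for key, value in element.attrib.items()
            if key.startswith(prefix)}


def current_profile(xml_text):
    """Collapse Defaults and Modifications into one table per component."""
    root = ET.fromstring(xml_text)
    profile = {}
    for component in root.iter("{%s}Description" % RDF_NS):
        table = {}
        # Defaults are read first so that Modifications overwrite them.
        for block in ("Defaults", "Modifications"):
            for holder in component.findall("{%s}%s" % (PRF_NS, block)):
                for description in holder.findall("Description"):
                    table.update(prf_attributes(description))
        profile[component.get("about")] = table
    return profile


profile = current_profile(PROFILE)
```

The resulting `profile` maps each component name to a flat table, in which the hardware platform's `Memory` carries the modified value rather than the vendor default.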

The significance of an attribute is generally limited to the component it is describing. For example, each software application can define a value for a "Version" attribute. This indicates the version of the particular application being described. In general, side effects that extend beyond the bounds of a particular component are not defined in this document. The relationship between components is system and application dependent.

The major disadvantage of this format is that it is verbose. Some networks are very slow and this would be a moderately expensive way to handle metadata. There are several optimizations possible to help deal with network performance issues. One strategy is to use a compressed form of XML [TokenXML] and a complementary strategy is to use indirect references.

3.2 Indirect References

Instead of enumerating each set of attributes, a remote reference can be used to name a collection of attributes such as the hardware platform defaults. This has the advantage of enabling the separate fetching and caching of functional subsets. This is particularly useful when the link between the gateway or proxy and the client agent is slow and the link between the gateway or proxy and the site named by the remote reference is fast, a typical case when the user agent is a smart phone. Another advantage is the simplification of the development of different vocabularies for hardware vendors and software vendors (assuming this is a good thing).

The following example uses indirect references. First comes the profile provided by the user agent, which refers to default profiles provided by the hardware and software platform vendors:

-----------------------------------

<?xml version="1.0"?>

<RDF
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
<RDF:Bag>
  <RDF:Description about="HardwarePlatform"
    PRF:Default="http://www.nokia.com/profiles/2160"
    PRF:Memory="32mB" />

  <RDF:Description about="SoftwarePlatform"
    PRF:Default="http://www.symbian.com/profiles/pda"
    PRF:Sound="Off"
    PRF:Images="Off" />

  <RDF:Description about="EpocEmail"
    PRF:Default="http://www.symbian.com/epoc/profiles/epocemail" />

  <RDF:Description about="EpocCalendar"
    PRF:Default="http://www.symbian.com/epoc/profiles/epoccal" />

  <RDF:Description about="UserPreferences"
    PRF:Language="English" />

 </RDF:Bag>
</RDF>

-----------------------------------------------------

Next, the profile provided by the hardware vendor.

-----------------------------------------------------

<?xml version="1.0"?>
<RDF 
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
<Description
    PRF:Vendor="Nokia"
    PRF:Model="2160"
    PRF:Type="PDA"
    PRF:ScreenSize="800x600x24"
    PRF:CPU="PPC"
    PRF:Keyboard="Yes"
    PRF:Speaker="Yes"
    PRF:Memory="16mb" />
</RDF>

-----------------------------------------------------

Finally, the profiles provided by the software platform and application vendors.

-----------------------------------------------------

<?xml version="1.0"?>
<RDF 
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">

<Description
    PRF:OS="EPOC1.0"
    PRF:HTMLVersion="4.0"
    PRF:JavaScriptVersion="4.0"
    PRF:WAPVersion="1.0"
    PRF:WMLScript="1.0" />

</RDF>

-----------------------------------------------------------------------

<?xml version="1.0"?>
<RDF 
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
<Description
    PRF:Version="EpocEmail1.0"
    PRF:HTMLVersion="4.0" />
</RDF>

------------------------------------------------------------------------

<?xml version="1.0"?>
<RDF 
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
<Description
    PRF:Version="EpocCalendar1.0"
    PRF:HTMLVersion="4.0" />
</RDF>

-----------------------------------------------------

All we did in the second example was group different collections of default attributes together in such a way that they could be named by a URL. Since the hardware and software platform default profiles were independently described using a URL, they can be separately fetched and cached. When an application in the server/gateway/proxy uses RDF to process the CC/PP it may encounter attributes with default values and user specified values. It is up to the application to enforce the rule that user specified attributes override default values. RDF does not provide a convenient mechanism for implementing that rule.
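One way an application in a server, gateway or proxy might enforce the override rule while caching the referenced defaults is sketched below in Python. The `ProfileResolver` class, its method names and the `fake_fetch` stand-in for a real HTTP retrieval are all hypothetical; only the element and attribute names are taken from the examples in this section.

```python
import xml.etree.ElementTree as ET

PRF_NS = "http://www.w3C.org/TR/WD-profile-vocabulary#"


def prf_attributes(element):
    """Return the PRF-namespace attributes of an element, without the namespace."""
    prefix = "{%s}" % PRF_NS
    return {key[len(prefix):]: value
            for key, value in element.attrib.items()
            if key.startswith(prefix)}


class ProfileResolver:
    """Resolve PRF:Default references and overlay the client's own values."""

    def __init__(self, fetch):
        self.fetch = fetch   # callable taking a URL and returning XML text
        self.cache = {}      # URL -> table of default attributes

    def defaults_for(self, url):
        # Each referenced default profile is fetched at most once.
        if url not in self.cache:
            root = ET.fromstring(self.fetch(url))
            self.cache[url] = prf_attributes(root.find("Description"))
        return self.cache[url]

    def resolve(self, description):
        own = prf_attributes(description)
        url = own.pop("Default", None)
        merged = dict(self.defaults_for(url)) if url else {}
        merged.update(own)   # user specified values override the defaults
        return merged


# Stand-in for the vendor site; a real resolver would retrieve this over HTTP.
VENDOR_DOCS = {
    "http://www.nokia.com/profiles/2160":
        '<RDF xmlns:PRF="%s"><Description PRF:Vendor="Nokia" '
        'PRF:Memory="16mb" /></RDF>' % PRF_NS,
}
fetched = []


def fake_fetch(url):
    fetched.append(url)
    return VENDOR_DOCS[url]


client_description = ET.fromstring(
    '<Description xmlns:PRF="%s" about="HardwarePlatform" '
    'PRF:Default="http://www.nokia.com/profiles/2160" '
    'PRF:Memory="32mB" />' % PRF_NS)

resolver = ProfileResolver(fake_fetch)
hardware = resolver.resolve(client_description)
hardware_again = resolver.resolve(client_description)
```

Resolving the same description twice contacts the stand-in vendor site only once, which illustrates why naming the defaults by URL makes them independently cacheable.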

3.3 Runtime Changes

It is worth noting again that the information we are most concerned with is the current profile. Default properties might have some importance, for example, they may be worth caching independently of any particular session or user. However, the key is for the client and the server/gateway/proxy to have a consistent view of the current profile.

It is important to be able to add to and modify attributes associated with the current CC/PP. We need to be able to modify the value of certain attributes, such as turning sound on and off and we need to make persistent changes to reflect things like a memory upgrade. We need to be able to override the default profile provided by the vendor. However, we only need to concern ourselves with changes to the current profile. Reflecting changes to preferences or capabilities in persistent storage is beyond the scope of this document.

Our problem is to propagate changes to the current CC/PP to the server/gateway/proxy. One solution is to transmit the entire CC/PP with each change. It would replace the previous profile. This is not ideal for slow networks. An alternative is to send only the changes. Thus if Sound were to be changed from "Off" to "On" the only data that would need to be sent would be:

<?xml version="1.0"?>
<RDF 
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
  <Description ID="SoftwarePlatform"
    PRF:Sound="On" />
</RDF> 

Alternatively, the <PRF:Modifications> element could be used to communicate changes.
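On the receiving side, applying such a change amounts to updating one entry in the table kept for the current session. The Python sketch below assumes the server/gateway/proxy stores the current profile as a table of tables keyed by component name, and that the `ID` attribute of the change message names the component to update. Both the `apply_delta` function and that interpretation of `ID` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

PRF_NS = "http://www.w3C.org/TR/WD-profile-vocabulary#"

# The change message from the example above.
DELTA = """<?xml version="1.0"?>
<RDF
xmlns:RDF="http://www.w3C.org/TR/WD-rdf-syntax#"
xmlns:PRF="http://www.w3C.org/TR/WD-profile-vocabulary#">
  <Description ID="SoftwarePlatform"
    PRF:Sound="On" />
</RDF>"""


def apply_delta(current, delta_xml):
    """Update the stored current profile in place from a change message."""
    prefix = "{%s}" % PRF_NS
    root = ET.fromstring(delta_xml)
    for description in root.findall("Description"):
        # Create the component's table if this is the first time it is seen.
        table = current.setdefault(description.get("ID"), {})
        for key, value in description.attrib.items():
            if key.startswith(prefix):
                table[key[len(prefix):]] = value
    return current


# Hypothetical session state held by the server/gateway/proxy.
session_profile = {
    "SoftwarePlatform": {"OS": "EPOC1.0", "Sound": "Off", "Images": "Off"},
}
apply_delta(session_profile, DELTA)
```

Only the `Sound` entry changes; every other attribute of the session's profile is left as it was, so nothing else needs to be retransmitted.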

4. Protocol considerations

When used in the context of a web browsing application, a CC/PP should be associated with a notion of a current session rather than a user or a node. HTTP and WSP (the WAP session protocol) define different session semantics. The client, server, gateways and proxies may already have their own, well defined notions of what constitutes a connection or a session. Our protocol strategy is to send as little information as possible and if anyone is missing something, they have to ask for it. If there is good reason to believe that someone is going to ask for a profile, the client can elect to send the most efficient form of the profile that makes sense.

Consider the following possible interaction between a server and a client. When the client begins a session it sends a minimal profile using as much indirection as possible. If the server/gateway/proxy does not have a CC/PP for this session, then it asks for one. When a profile is sent the client tries a minimal form, i.e., it uses as much indirection as possible and only names the non default attributes of the profile. The server/gateway/proxy can try to fill in the profile using the indirect HTTP references (which may be independently cached). If any of these fail, a request for additional data can be sent to the client, which can reply with a fully enumerated profile. If the client changes the value of an attribute, such as turning sound off, only that change needs to be sent.
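The fallback behaviour in this interaction can be sketched as follows. The Python fragment models the minimal profile as a table of tables in which a `Default` entry holds the indirect reference; `build_session_profile`, the two callables passed to it and the sample data are hypothetical names introduced for this illustration, and no actual protocol messages are modelled.

```python
def build_session_profile(minimal, fetch_defaults, request_full_profile):
    """Fill in a minimal profile, falling back to the client if a reference fails."""
    resolved = {}
    for component, attributes in minimal.items():
        overrides = dict(attributes)
        url = overrides.pop("Default", None)
        try:
            defaults = fetch_defaults(url) if url else {}
        except LookupError:
            # An indirect reference could not be resolved, so ask the
            # client for a fully enumerated profile instead.
            return request_full_profile()
        # Values named by the client take precedence over fetched defaults.
        resolved[component] = {**defaults, **overrides}
    return resolved


# Stand-ins for the vendor site and the client; all values are illustrative.
VENDOR_SITE = {
    "http://www.nokia.com/profiles/2160": {"Vendor": "Nokia", "Memory": "16mb"},
}
MINIMAL = {
    "HardwarePlatform": {"Default": "http://www.nokia.com/profiles/2160",
                         "Memory": "32mB"},
    "UserPreferences": {"Language": "English"},
}
FULL = {
    "HardwarePlatform": {"Vendor": "Nokia", "Memory": "32mB"},
    "UserPreferences": {"Language": "English"},
}
full_profile_requests = []


def reachable(url):
    return VENDOR_SITE[url]


def unreachable(url):
    raise LookupError(url)


def ask_client():
    full_profile_requests.append("full profile requested")
    return FULL


from_references = build_session_profile(MINIMAL, reachable, ask_client)
requests_after_success = len(full_profile_requests)
from_client = build_session_profile(MINIMAL, unreachable, ask_client)
```

When the vendor site is reachable, the profile is completed without any extra round trip to the client; when it is not, exactly one request for the full profile is made.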

It is likely that servers and gateways/proxies will be concerned with different preferences. For example, the server may need to know which language the user prefers and the gateway may have responsibility to trim images to 8 bits of color (to save bandwidth). However, the exact use of profile information by each server/gateway/proxy is hard to predict. Therefore gateways/proxies should forward all profile information to the server. Any requests for profile information that the gateway/proxy cannot satisfy should be forwarded to the client.

The ability to compose a profile from sources provided by third parties at run-time exposes the system to a new type of attack. For example, if the URL that named the hardware platform defaults were to be compromised via an attack on DNS it would be possible to load incorrect profile information. If cached within a server/gateway/proxy this could be a serious denial of service attack. If this is a serious enough problem it may be worth adding digital signatures to the URLs used to refer to profile components.

New versions of HTTP such as HTTPng should be able to support the CC/PP framework without difficulty. HTTP 1.0 servers and proxies may not be able to handle CC/PPs. Clients need to be able to detect communication with old servers and adapt the protocol accordingly. HTTP 1.1, perhaps via the Mandatory/Optional Extension Framework, should be able to support sessions and profiles. At the least, 1.1 proxy servers should pass requests that include CC/PPs on to servers in the hope that the servers will understand the requests. New versions of 1.1 proxies and servers should be able to use CC/PPs.

This protocol discussion is not a specific proposal for HTTP or WSP. Its intent is merely to illustrate how the design allows us to exploit the cachability of both the current session state and the default profiles.

5. Summary

In this document, we have described a proposal for the use of XML/RDF to describe user preferences and the capabilities of the device and software used to access the Web. Encodings of hypothetical user profiles were used to illustrate some of the benefits of RDF. Some of the possible ramifications for Web protocol design were discussed.

References

[Agent-attrib] Client-Specific Web Services by Using User Agent Attributes. Tomihisa Kamada, Tomohiko Miyazaki. W3C Note.

[CONNEG] IETF working group on content negotiation

[IMT-2000] Ericsson in Wideband Wireless Multimedia.

[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, Borenstein N., Freed N., 1996/11/27

[RDF] Resource Description Framework, (RDF) Model and Syntax Specification. Lassila O., Swick R. W3C Working Draft.

[RDF-Schema] Resource Description Framework (RDF) Schema Specification. Brickley, D., Guha, R.V. , Layman, A., W3C Working Draft.

[TokenXML] Binary XML Content Format Specification