HTTP-NG-Binary-Wire-Protocol-971016

w3ng: Binary Wire Protocol for HTTP-NG


Authors:
     Bill Janssen, Xerox PARC, <janssen@parc.xerox.com>
This version:
     http://www.parc.xerox.com/http-ng/wire-encoding.html
     $Id: wire-encoding.html,v 1.3 1997/10/28 00:45:56 janssen Exp $
Latest version:
     http://www.parc.xerox.com/http-ng/wire-encoding.html
Previous versions:
     None

This document describes a binary `on-the-wire' protocol to be used when sending HTTP-NG operation invocations or terminations across a network connection.

1. Syntax Used in this Document

Two data description languages are used in this document. The first, called ISL, is an abstract language for defining data types and interfaces. It is described in ftp://ftp.parc.xerox.com/pub/ilu/2.0a11/manual-html/manual_2.html#SEC21. The second is a pseudo-C syntax. It should be interpreted as C data structure layouts without any automatic padding to size boundaries, and allowing arbitrary bit-size limits on structs and unions as well as on ints and enums. Pseudo-C enums are extensible rather than closed. Each use of ISL and pseudo-C is marked as to which language is being used.

2. Global Issues

2.1. Byte Order

All values use `network standard' byte order, i.e. big-endian, the byte order conventionally used by Internet protocols. If in the future this becomes a problem for the Internet, this protocol will be affected by whatever solution is used to solve the problem in the wider Internet context.

2.2. Alignment and Padding

All values are padded-after to a 32-bit alignment boundary, to avoid marshalling history problems.
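
As an illustration of the padding rule, the following is a minimal sketch, assuming that pad bytes are zero (as XDR requires for opaque data). The helper name pad4 is hypothetical and is not defined by this protocol.

```python
def pad4(data: bytes) -> bytes:
    """Append zero bytes so that the result length is a multiple of 4."""
    # -len(data) % 4 is the number of bytes missing to the next 32-bit boundary.
    return data + b"\x00" * (-len(data) % 4)
```

A 5-byte value is therefore followed by 3 pad bytes, while a 4-byte value is passed unchanged.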

2.3. Marshalling Format

Marshalling is via the XDR format specified in Internet RFC 1832. It could be argued that this format is inexcusably wasteful with certain value types, such as boolean (32 bits) or byte (32 bits), and that a 16-bit or 8-bit oriented format should be designed and used in its place. However, the argument of using an existing Internet standard for this purpose, rather than inventing a new one, is a strong one; a new format should only be defined if measurement of the overhead shows gross waste.
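
The overhead discussed above can be seen in a short sketch. It assumes that an ISL BOOLEAN is carried as an XDR bool and an ISL BYTE as an XDR unsigned integer, which matches the 32-bit sizes quoted above; the function names are hypothetical.

```python
import struct


def xdr_boolean(value: bool) -> bytes:
    """Encode a boolean as a 4-byte big-endian XDR bool (0 or 1)."""
    return struct.pack(">I", 1 if value else 0)


def xdr_byte(value: int) -> bytes:
    """Encode one byte as a 4-byte XDR unsigned integer (assumed mapping)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    return struct.pack(">I", value)
```

In both cases one bit or one byte of information occupies four bytes on the wire.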

2.4. Transport Requirements

This protocol is assumed to work over a reliable sequenced message transport; that is, over a transport that reliably conveys messages of arbitrary sizes from one peer to another and preserves the sequencing of the messages, in that they are received in the same order in which they are sent. An example of this would be the record marking defined in Internet RFC 1831 used with TCP/IP. In addition, this protocol is defined in the scope of a session, which is defined as a communications context between client and server during the extent of which they can build up state about each other. An example of a session might be a TCP/IP connection.
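
As an example of such a transport, the sketch below frames one message using the record marking of RFC 1831: each fragment is preceded by a 4-byte big-endian header whose high bit marks the last fragment and whose low 31 bits give the fragment length. The function name and the max_fragment parameter are hypothetical.

```python
import struct

LAST_FRAGMENT = 0x80000000  # high bit of the fragment header


def frame_record(message: bytes, max_fragment: int = 0x7FFFFFFF) -> bytes:
    """Split a message into record-marked fragments and return the byte stream."""
    out = bytearray()
    offset = 0
    while True:
        chunk = message[offset:offset + max_fragment]
        offset += len(chunk)
        last = offset >= len(message)
        header = len(chunk) | (LAST_FRAGMENT if last else 0)
        out += struct.pack(">I", header) + chunk
        if last:
            return bytes(out)
```

The receiver reassembles fragments until it sees one with the high bit set, which recovers the message boundaries that TCP/IP alone does not preserve.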

2.5. Security

This protocol assumes that security provisions are made either at some level above it, typically in the application interfaces, or at some level below it, typically by use of a secure transport mechanism. It contains no protocol-level mechanisms for providing or assuring any of the concerns normally related to security.

2.6. Encapsulation

It is sometimes useful to allow the marshalling protocol to use itself recursively, as when PICKLE values contain the marshalled form of the encapsulated value. For this protocol, encapsulation is performed by using the data marshalling rules of the protocol, but marshalling the values into a buffer instead of onto a transport layer.

2.7. Extension Headers

This protocol uses a feature called an extension header to provide for extensibility and tailorability. Features such as serialization contexts or global thread identifiers may be implemented via this feature. An extension header is an encapsulated value of the ISL type ExtensionHeader. Each request message and reply message may contain a value of type ExtensionHeaderList, which contains a number of extension headers. The following ISL fragment describes the types ExtensionHeaderList and ExtensionHeader:
INTERFACE HTTP-NG-w3ng;
...
TYPE SimpleString = SEQUENCE OF SHORT CHARACTER LIMIT 0xFFFF;
TYPE ExtensionHeader = RECORD
     name : SimpleString,
     value : PICKLE
END;
TYPE ExtensionHeaderList = SEQUENCE OF ExtensionHeader;
...

3. Utility types

The following data structures are defined in pseudo-C:
typedef enum {
 False = 0,
 True = 1
} Boolean;

typedef enum {
 Request = 0,
 Reply = 1, 
 CancelRequest = 2,
 TerminateSession = 3,
 VerifyServer = 4,
 LoadContext = 5,
 LoadContextAck = 6
} MsgType;

typedef enum {
 Success = 0,
 UserException = 1,             /* occurred during operation */
 SystemExceptionBefore = 2,     /* occurred before beginning operation */
 SystemExceptionAfter = 3       /* occurred after beginning operation */
} ReplyStatus;

typedef struct {
 Boolean fixed_size_key : 1;    /* True if fixed size key */
 Boolean cache_key : 1;         /* True if both sides cache it */
 union {
  unsigned key : 14;            /* key if fixed size */
  unsigned key_len : 14;        /* key length if variable */
 } v;
} DiscriminantID;

typedef struct {
 Boolean cached_op : 1;         /* True if cached id */
 Boolean cache_operation : 1;   /* True if should be cached */
 union {
  unsigned cache_index : 14;    /* cache index if "cached_op" set */
  unsigned method_id : 14;      /* method id if "cached_op" not set */
 } v;
} OperationID;

typedef enum {
 MangledMessage = 0,
 ProcessFinished = 1,
 ResourceManagement = 2,
 WrongCallee = 3
} TerminationCause;

typedef struct {
 unsigned major : 4;     
 unsigned minor : 4;
} ProtocolVersion;

typedef unsigned Unused;
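
The OperationID and DiscriminantID structs share one 16-bit layout: two flag bits followed by a 14-bit value. The sketch below packs and unpacks that layout. It assumes that fields are laid out in declaration order starting from the most significant bit, which this document does not state explicitly; the names pack_id and unpack_id are hypothetical.

```python
FLAG_HI = 1 << 15            # cached_op / fixed_size_key
FLAG_LO = 1 << 14            # cache_operation / cache_key
VALUE_MASK = (1 << 14) - 1   # 14-bit key, key_len, cache_index or method_id


def pack_id(first_flag: bool, second_flag: bool, value: int) -> int:
    """Pack two flags and a 14-bit value into one 16-bit word."""
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value must fit in 14 bits")
    return (FLAG_HI if first_flag else 0) | (FLAG_LO if second_flag else 0) | value


def unpack_id(word: int) -> tuple:
    """Return (first_flag, second_flag, value) from a 16-bit word."""
    return bool(word & FLAG_HI), bool(word & FLAG_LO), word & VALUE_MASK
```

For example, pack_id(True, False, 5) describes a cached operation with cache index 5.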

4. Messages

Only a few messages are defined. The VerifyServer message is used by the caller to verify that it has connected to the right server. The Request message causes an operation to be started on the remote server. The CancelRequest message can be sent to cause an active operation to be cancelled. The Reply message is sent from the server to the client to inform it of the completion status of the operation, and to convey any result values. There are three subtypes of Reply: Success indicates normal termination; UserException signals that the operation raised one of its declared exceptions; SystemException signals that the operation or the remote server encountered an unexpected problem. The TerminateSession message allows either side to indicate graceful shutdown of a session. The LoadContext message allows the caller to instruct the callee to load a new session context; the callee replies with the LoadContextAck message.

4.1. Request

Request header (pseudo-C):

typedef struct {
  ProtocolVersion version : 8;
  MsgType msg_type : 5;           /* == Request */
  Boolean ext_hdr_present : 1;    /* True if ext hdr list present */
  Unused request_1 : 2;
  unsigned serial_no : 16;        /* session-relative serial number */
  OperationID operation_id : 16;  /* identifies operation */
  DiscriminantID object_key : 16; /* identifies discriminant */
} RequestMsgHeader;               /* 8 bytes total */

The actual message consists of the following sections:

[ RequestMsgHeader ]
[ extension header list, if any ]
[ XDR string containing object type ID of object type defining operation, if not cached ]
[ bytes of object_key, if not cached, padded to 4 byte boundary ]
[ explicit input parameter values, if any, padded to a 4 byte boundary ]

The operation_id contains either a fixed-length session-specific 14-bit cache index, or the method id (the zero-based ordinal position of the method in the ISL declaration of the object type in which the operation is defined) of the operation. If the method id is given, an additional value, an XDR string value containing the object type ID of the object type in which the operation is defined, is also passed. Since a 14-bit method id can take the values 0 through 16383, this protocol will not support interfaces in which object types have more than 16384 methods directly defined.

The object_key is either a fixed-length 14-bit session-specific cache index, or the length of a variable length octet sequence of 16383 or fewer bytes containing the callee-relative name for the object (the OBJ-ID of the URL). The object key value of { False, False, 0 }, normally a zero byte variable length object key, is reserved for use by the protocol. The object_key is marshalled onto the transport as an XDR value of type fixed-length opaque data, where the length is that specified in the v.key_len field of the object_key.
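
The following sketch assembles an uncached Request message with no extension headers from the sections listed above. It is illustrative only: it assumes bit fields are packed from the most significant bit downward, that the version byte is supplied by the caller (this document does not fix a value), and that the parameters are already XDR-marshalled. The names build_request, pad4 and xdr_string are hypothetical.

```python
import struct

MSG_REQUEST = 0  # MsgType Request


def pad4(data: bytes) -> bytes:
    """Pad with zero bytes to a 4-byte boundary."""
    return data + b"\x00" * (-len(data) % 4)


def xdr_string(text: str) -> bytes:
    """XDR string: 4-byte length, then the bytes, padded to 4 bytes."""
    raw = text.encode("ascii")
    return struct.pack(">I", len(raw)) + pad4(raw)


def build_request(version: int, serial_no: int, type_id: str,
                  method_id: int, object_key: bytes, params: bytes = b"") -> bytes:
    """Build a Request with an uncached operation and an uncached object key."""
    if not 0 <= method_id < (1 << 14):
        raise ValueError("method id must fit in 14 bits")
    # A zero-length uncached key is { False, False, 0 }, which is reserved.
    if not 0 < len(object_key) < (1 << 14):
        raise ValueError("object key must be 1..16383 bytes")
    word0 = (version << 8) | (MSG_REQUEST << 3)  # ext_hdr_present = 0, unused = 0
    # operation_id = { False, False, method_id }; object_key = { False, False, key_len }
    header = struct.pack(">HHHH", word0, serial_no, method_id, len(object_key))
    return header + xdr_string(type_id) + pad4(object_key) + pad4(params)
```

With both flag bits clear, the two 16-bit identifier fields reduce to the plain method id and key length, so the header is followed by the type ID string and then the padded key bytes.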

4.1.1 Operation and Object Memoizing

Callers may reduce the size of messages by memoizing operation IDs and object IDs that are passed in the session. This is done by the caller setting the cache_key (for object IDs) or cache_operation (for operation IDs) bit in the DiscriminantID or OperationID struct when the object key or operation ID is first sent. Each side must then assign the next available index to that object or operation. The space of operations is separate from the space of object ids, so that a total of 16383 possible values is available for each type of memoized value.

Note that the index is passed implicitly, so both sides of the session must synchronize their use of indices.

A shared set of indices may be loaded into the session by some mechanism before any messages are sent. This specification does not define a mechanism for doing so.
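
Because indices are never transmitted, caller and callee must run the same assignment procedure over the same sequence of messages. A minimal sketch of such a table follows; it assumes indices start at 0 and are assigned in order of first use, which this document implies but does not spell out. The class name MemoTable is hypothetical.

```python
class MemoTable:
    """Assigns the next available index to each newly memoized value."""

    def __init__(self, limit: int = 1 << 14):
        self._index_of = {}
        self._values = []
        self._limit = limit  # a 14-bit field cannot address more entries

    def remember(self, value) -> int:
        """Return the index for value, assigning the next free one if new."""
        if value in self._index_of:
            return self._index_of[value]
        if len(self._values) >= self._limit:
            raise OverflowError("no cache indices left in this session")
        index = len(self._values)
        self._index_of[value] = index
        self._values.append(value)
        return index

    def lookup(self, index: int):
        """Return the value previously memoized under index."""
        return self._values[index]
```

Each side would keep one table for operations and a separate one for object keys, calling remember whenever a message with the cache bit set is sent or received.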

4.2. Reply

Reply header (pseudo-C):

typedef struct {
  ProtocolVersion version: 8;
  MsgType msg_type : 5;           /* == Reply */
  Boolean ext_hdr_present : 1;    /* True if ext hdr list present */
  ReplyStatus status : 2;         /* completion status of the operation */
  unsigned serial_no : 16;        /* serial # from Request */
} ReplyMsgHeader;                 /* 4 bytes total */

The actual message consists of the following fields:

[ ReplyMsgHeader ]
[ extension header list, if any ]
[ exception ID (32-bit unsigned), if any ]
[ explicit output parameter values, if any, padded to 4 byte boundary ]
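
A receiver might decode the 4-byte reply header as sketched below. This assumes the bit fields are packed from the most significant bit downward; the function name and the dictionary keys are hypothetical.

```python
import struct

MSG_REPLY = 1
REPLY_STATUS = ("Success", "UserException",
                "SystemExceptionBefore", "SystemExceptionAfter")


def parse_reply_header(data: bytes) -> dict:
    """Decode the first 4 bytes of a Reply message."""
    word0, serial_no = struct.unpack(">HH", data[:4])
    version = word0 >> 8
    msg_type = (word0 >> 3) & 0x1F
    if msg_type != MSG_REPLY:
        raise ValueError("not a Reply message")
    return {
        "major": version >> 4,
        "minor": version & 0x0F,
        "ext_hdr_present": bool((word0 >> 2) & 1),
        "status": REPLY_STATUS[word0 & 0x03],
        "serial_no": serial_no,
    }
```

The serial_no value is what lets the caller match this reply to an outstanding request when replies arrive out of order.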

4.3. CancelRequest

CancelRequest header (pseudo-C):

typedef struct {
  ProtocolVersion version: 8;
  MsgType msg_type : 5;           /* == CancelRequest */
  Unused cancel_1 : 3;
  unsigned serial_no : 16;        /* id of request to cancel */
} CancelRequestMsgHeader;         /* 4 bytes total */

The actual message consists simply of the header. It is sent from the caller to the callee, and instructs the callee to cancel processing of the specified operation. The callee may still return a reply message for a cancelled operation; the caller will ignore it.

4.4. TerminateSession

TerminateSession header (pseudo-C):

typedef struct {
  ProtocolVersion version: 8;
  MsgType msg_type : 5;           /* == TerminateSession */
  TerminationCause cause: 3;      /* why session terminated */
  unsigned serial_no : 16;        /* last request processed/sent */
} TerminateSessionMsgHeader;

The actual message consists simply of the header; it provides for graceful session shutdown. It is sent either from the caller to the callee, or from the callee to the caller, and informs the other party that it is cancelling the session, for one of these reasons:

  1. A badly formatted message has arrived from the other party, and protocol synchronization is believed lost;
  2. This party is going away, and the other party should not attempt to reconnect to it;
  3. This session is being terminated due to active resource management; the other party should attempt to reconnect if it needs to. This reason is typically only useful from caller to callee;
  4. The caller has sent a VerifyServer message with the wrong UUID.
The serial_no field contains the serial number of the last message processed by the caller (when TerminateSession is sent from caller to callee), or the serial number of the last message sent by the callee (when sent from callee to caller).
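
The four reasons above correspond, in order, to the TerminationCause values 0 through 3. A sketch of building the message follows, assuming most-significant-bit-first field packing; the function name is hypothetical.

```python
import struct

MSG_TERMINATE_SESSION = 3
TERMINATION_CAUSE = {
    "MangledMessage": 0,
    "ProcessFinished": 1,
    "ResourceManagement": 2,
    "WrongCallee": 3,
}


def pack_terminate_session(version: int, cause: str, serial_no: int) -> bytes:
    """Build the 4-byte TerminateSession message."""
    word0 = (version << 8) | (MSG_TERMINATE_SESSION << 3) | TERMINATION_CAUSE[cause]
    return struct.pack(">HH", word0, serial_no)
```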

4.5. VerifyServer

VerifyServer header (pseudo-C):

typedef struct {
  ProtocolVersion version: 8;
  MsgType msg_type : 5;           /* == VerifyServer */
  Unused verify_1 : 3;
  unsigned server_id_len : 16;    /* length of server ID */
} VerifyServerMsgHeader;

The actual message consists of the following fields:

[ VerifyServerMsgHeader ]
[ server_id_len-length UUID for supposed callee, padded to 4-byte boundary ]

This message is sent from caller to callee as the first message of the session. It is used to verify that the caller has connected to the correct service, by passing the UUID of the service. If the UUID received by the callee is not the correct UUID for the callee, the callee should terminate the session, with the appropriate reason (WrongCallee). The UUID is passed as an XDR fixed-length opaque data value of the length specified in server_id_len.
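
A caller could build this first message as sketched below, again assuming most-significant-bit-first field packing and zero pad bytes. The function name is hypothetical, and the UUID is treated as an opaque byte string.

```python
import struct

MSG_VERIFY_SERVER = 4


def pack_verify_server(version: int, server_uuid: bytes) -> bytes:
    """Build a VerifyServer message: header, then the UUID padded to 4 bytes."""
    if len(server_uuid) > 0xFFFF:
        raise ValueError("server ID length must fit in 16 bits")
    word0 = (version << 8) | (MSG_VERIFY_SERVER << 3)  # verify_1 unused bits = 0
    header = struct.pack(">HH", word0, len(server_uuid))
    return header + server_uuid + b"\x00" * (-len(server_uuid) % 4)
```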

4.6. LoadContext

LoadContext header (pseudo-C):

typedef struct {
  ProtocolVersion version: 8;
  MsgType msg_type : 5;           /* == LoadContext */
  Boolean reset_context : 1;      /* 1 if start fresh, 0 if incremental */
  Unused load_1 : 2;
  unsigned context_id_len : 16;   /* length of following context_id URN */
} LoadContextMsgHeader;

The actual message consists of the following fields:

[ LoadContextMsgHeader ]
[ context object to load context from ]

This message is sent from caller to callee. It instructs the callee to add session context from the context object of type HTTP-NG-w3ng.SessionContextObject specified by this message. If the reset_context bit is set, the callee should discard all current session context before loading the specified context store. The callee should respond with a LoadContextAck message when it has finished loading the context. The caller will not send additional messages until after it has processed the callee's LoadContextAck message.

4.6.1. The HTTP-NG-w3ng.SessionContextObject Type

The following ISL fragment describes the interface to the SessionContextObject type.
INTERFACE HTTP-NG-w3ng;
...
TYPE URI = SimpleString;
TYPE Type-ID = URI;
TYPE UUID = URI;
TYPE CachedOperation = RECORD
  cache-value : SHORT CARDINAL,
  object-type : Type-ID,
  method-name : SimpleString
END;
TYPE CachedOperationList = SEQUENCE OF CachedOperation;
TYPE CachedDiscriminant = RECORD
  cache-value : SHORT CARDINAL,
  server-uuid : UUID,
  object-id   : SimpleString
END;
TYPE CachedDiscriminantList = SEQUENCE OF CachedDiscriminant;
TYPE SessionContextObject = OBJECT
  METHODS
    GetCachedOperations() : CachedOperationList;
    GetCachedDiscriminants() : CachedDiscriminantList;
  END;
...

4.7. LoadContextAck

LoadContextAck header (pseudo-C):

typedef struct {
  ProtocolVersion version: 8;
  MsgType msg_type : 5;           /* == LoadContextAck */
  Boolean success : 1;            /* 1 if able to perform load */
  Unused loadack_1 : 18;
} LoadContextAckMsgHeader;

The actual message consists of the header. It is sent from the callee to the caller to acknowledge an instruction to load a new session context, and reports on the success or failure of its attempt to load that context. The caller should not assume that the context has been loaded until after receiving a successful LoadContextAck message from the callee.

5. Data Marshalling

The data value format used for parameters is the XDR format specified in Internet RFC 1832.

5.1. Simple Types

The following ISL->XDR data type mappings for non-constructed types are assumed:

5.2. Constructed Types

5.3. PICKLE Types

A pickle is passed as an XDR variable-length opaque data value, containing the type ID of the pickled value's type, followed by the XDR-marshalled pickled value. To save pickle space for common value types used in metadata, we define a packed format for the type ID marshalling. A type ID is marshalled into a pickle as a 32-bit header, in an XDR unsigned integer, possibly followed by an XDR fixed-length opaque data value containing the string form of the type ID of the pickled type. The header has the following internal structure:

typedef struct {
  unsigned              version : 8;
  PickleTypeKind        type_kind : 8;
  unsigned              type_id_len : 16;
} TypeIDHeader;
The version field gives the version number of the pickle format; the type_kind field contains a value from the enum
typedef enum {
  TypeKind_unconstrained = 0,
  TypeKind_boolean = 1,
  TypeKind_byte = 2,
  TypeKind_short_integer = 3,
  TypeKind_integer = 4,
  TypeKind_long_integer = 5,
  TypeKind_short_cardinal = 6,
  TypeKind_cardinal = 7,
  TypeKind_long_cardinal = 8,
  TypeKind_short_real = 9,
  TypeKind_real = 10,
  TypeKind_long_real = 11,
  TypeKind_short_character = 12,
  TypeKind_character = 13,
  TypeKind_ascii_string = 14,
  TypeKind_object = 15,
  /* other types like Date, etc, should be added here... */
  ...
} PickleTypeKind;
If the value of type_kind is TypeKind_unconstrained, the value of type_id_len is the length of a value of XDR type fixed-length opaque data, containing the full string type ID of the type, which immediately follows the header. Otherwise, no opaque data is marshalled.
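
The packed type ID can be sketched as follows. The sketch assumes that the header fields are packed from the most significant byte downward, that the type_id_len field of TypeIDHeader is zero when no opaque data follows, and that the pickle format version is supplied by the caller; the function name is hypothetical.

```python
import struct

TYPEKIND_UNCONSTRAINED = 0


def pack_type_id(version: int, type_kind: int, type_id: str = "") -> bytes:
    """Marshal a pickle type ID: 32-bit header, plus the ID string if unconstrained."""
    if type_kind != TYPEKIND_UNCONSTRAINED:
        # Well-known kinds are identified by type_kind alone.
        return struct.pack(">I", (version << 24) | (type_kind << 16))
    raw = type_id.encode("ascii")
    if len(raw) > 0xFFFF:
        raise ValueError("type ID length must fit in 16 bits")
    header = (version << 24) | (type_kind << 16) | len(raw)
    return struct.pack(">I", header) + raw + b"\x00" * (-len(raw) % 4)
```

A pickled integer (type_kind 4) thus costs only 4 bytes of type information, whereas an unconstrained type carries its full type ID string.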

5.4. Object Types and URLs

Values of object types are passed as a value of XDR string, which contains the URL for the object.

URLs for HTTP-NG objects will be of the form

w3ng:SERVER-UUID/OBJ-ID[;type=TYPE][;cinfo=CINFO]
where SERVER-UUID is a UUID for the server which supports the desired object; OBJ-ID is a server-relative name for the object; TYPE is the type ID for the most derived type of the object; and CINFO is information about the way in which the object needs to be contacted, including information such as whether various transport layers are involved. This form has the virtue of becoming a URI if the optional CINFO field is omitted.
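
A naive parser for this URL form is sketched below. It assumes that neither TYPE nor CINFO contains a semicolon and that SERVER-UUID contains no slash, since this document defines no escaping rules; the function name and the returned keys are hypothetical.

```python
def parse_w3ng_url(url: str) -> dict:
    """Split a w3ng URL into server UUID, object ID and optional parameters."""
    scheme, colon, rest = url.partition(":")
    if scheme != "w3ng" or not colon:
        raise ValueError("not a w3ng URL")
    path, *params = rest.split(";")
    server_uuid, slash, obj_id = path.partition("/")
    if not slash:
        raise ValueError("missing OBJ-ID")
    fields = {"server_uuid": server_uuid, "obj_id": obj_id,
              "type": None, "cinfo": None}
    for param in params:
        key, _, value = param.partition("=")
        if key in ("type", "cinfo"):
            fields[key] = value
    return fields
```

The obj_id component is the callee-relative name that is sent as the object key in a Request message.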

6. System Exceptions

This section is very rough, and should be taken with a grain of salt...

6.1. UnknownProblem

Exception Code: 0
ISL Values: None

An unknown problem occurred.

6.2. ImplementationLimit

Exception Code: 1
ISL Values: None

The request could not be properly addressed because of some implementation resource limit on the callee side.

6.3. SwitchSessionCinfo

Exception Code: 2
ISL Values: NEW-CINFO : HTTP-NG-w3ng.String

This exception requests the caller to upgrade the session protocol and transport information to the cinfo specified as the argument, and re-try the call. This is the equivalent of the UPGRADE message in HTTP 1.1, and the RELOCATE_REPLY message in CORBA GIOP.

6.4. Marshal

Exception Code: 3
ISL Values: None

A marshalling problem was encountered.

6.5. NoSuchObjectType

Exception Code: 4
ISL Values: None

The object type of the operation was unknown at the server.

6.6. NoSuchMethod

Exception Code: 5
ISL Values: None

The object type of the operation was known at the server, but did not contain the indicated method.

6.7. Rejected

Exception Code: 6
ISL Values: REASON : OPTIONAL SimpleString

The server refused to process the request. It may return a string giving a reason for the rejection.
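
The exception codes above can be collected into a single enumeration, as in the sketch below. It assumes that the 32-bit exception ID carried in a Reply message with a system exception status is the exception code listed in this section; that mapping, like the names used here, is an assumption and not something this document states.

```python
import struct
from enum import IntEnum


class SystemException(IntEnum):
    UnknownProblem = 0
    ImplementationLimit = 1
    SwitchSessionCinfo = 2   # carries NEW-CINFO
    Marshal = 3
    NoSuchObjectType = 4
    NoSuchMethod = 5
    Rejected = 6             # carries an optional REASON string


def decode_exception_id(data: bytes) -> SystemException:
    """Decode a 32-bit big-endian exception ID into a SystemException."""
    (code,) = struct.unpack(">I", data[:4])
    return SystemException(code)
```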

7. Discussion

7.1. Serial Numbers

Does this protocol need to assign serial numbers to requests and replies? We do so in order to be able to cancel operations by serial number, and to be able to return reply messages out of order. The first problem, that of cancelling operations, could be dealt with by keeping track of serial numbers implicitly, and using an explicit serial number only in the CancelRequest message. Doing this would imply that the replies would have to be returned in the order in which the requests were passed, but would allow us to have 6 byte request messages (4 bytes if we count the discriminant as part of the arguments, instead of part of the header), and 4 byte reply messages. Thus the only real purpose for serial numbers is to allow replies to be returned out of order (and possibly to make debugging the protocol easier). There are other deeper unanswered questions here about the serialization semantics of the protocol. For instance, should the callee wait until it has dispatched the reply to one request before beginning to process the next one?

The current answer to these questions is that it is highly useful to allow a threaded callee to process multiple requests in parallel, and to allow it to return replies out of order. Thus serial numbers are useful. We assume that higher-level protocols desiring serialization will provide a serialization context as part of the context of the call, and that serialization will be handled at either a higher or lower level.

7.2. Memoizing of PICKLE and Object Types?

A great deal of the traffic over this protocol may consist of values of type PICKLE (the equivalent of object-by-value, or of HTTP's MIME-encapsulated body type) or of some object type. It is tempting to introduce a form of memoizing for these value types, similar to that used for request discriminants. There are two reasons not to do so:
  1. XDR provides no support for this, which means that we would have to provide a marshalling format for these types which has no clean layering onto XDR. For instance, it might be possible to pass an object value as an XDR 32-bit unsigned integer with the following (private) pseudo-C structure
    struct {
      boolean   use_cached_value : 1;
      boolean   cache_this_value : 1;
      union {
        unsigned int url_len : 30;
        unsigned int cache_key : 30;
      } v;
    };
    
    either by itself (if use_cached_value is set), or followed by an XDR fixed length opaque value containing the URL for the object (if use_cached_value is not set). This type of variable structure has no equivalent in XDR. On the other hand, it could well be argued that, since we are marshalling an object type, something not explicitly covered by XDR, we are simply providing an extension to XDR in the spirit of its marshalling rules.
  2. A more powerful argument is that allowing arbitrary memoizing of large items can let the caller place almost arbitrary loads on the storage requirements of the callee. It could be argued that the callee can reset the session at any time if the load becomes too onerous via TerminateSession.
Neither of these arguments seems overwhelmingly powerful.

7.3. URL Forms

Open issues:

8. References

XDR [RFC 1832]: http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1832.txt

ISL: ftp://ftp.parc.xerox.com/pub/ilu/2.0a11/manual-html/manual_2.html#SEC21

9. Address of Author

Bill Janssen
Xerox Palo Alto Research Center
3333 Coyote Hill Rd
Palo Alto, CA 94304

Phone: (650) 812-4763
FAX: (650) 812-4777
Email: janssen@parc.xerox.com
HTTP: http://www.parc.xerox.com/istl/members/janssen/
