This section defines the SMIL media object module. This module contains elements and attributes used to describe media objects. Since these elements and attributes are defined in a module, designers of other markup languages can reuse the SMIL media module when they need to include media objects in their language.
Changes with respect to the media object elements in SMIL 1.0 provide additional functionality that was brought up as Requirements of the Working Group, and those differences are explained in Appendix A.
ref, animation, audio, img, video, text and textstream elements
The media object elements allow the inclusion of media objects into a SMIL presentation. Media objects are included by reference (using a URI).
There are two types of media objects: media objects with an intrinsic duration (e.g. video, audio), also called "continuous media", and media objects without an intrinsic duration (e.g. text, image), also called "discrete media".
Anchors and links can be attached to visual media objects, i.e. media objects rendered on a visual abstract rendering surface.
When playing back a media object, the player must not derive the exact type of the media object from the name of the media object element. Instead, it must rely solely on other sources about the type, such as type information contained in the "type" attribute, or the type information communicated by a server or the operating system.
Authors, however, should make sure that the group into which the media object falls (animation, audio, img, video, text or textstream) is reflected in the element name. This increases the readability of the SMIL document. When in doubt about the group of a media object, authors should use the generic "ref" element.
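The rule that a player must not derive the media type from the element name can be illustrated with a minimal Python sketch. The function name `resolve_media_type`, its parameters, and the precedence of the "type" attribute over server-supplied information are assumptions made for illustration; this section does not define an order among the permitted sources.

```python
def resolve_media_type(element_name, type_attr=None, server_type=None):
    """Return the media type to use for playback, or None if it is unknown.

    element_name is accepted only to make the point explicit: it is never
    consulted, because the element name is a readability hint for authors.
    """
    # Assumed precedence: "type" attribute first, then server/OS information.
    for candidate in (type_attr, server_type):
        if candidate:
            return candidate
    return None
```

For example, `resolve_media_type("video", type_attr="audio/x-wav")` yields `"audio/x-wav"` even though the element is named "video".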
Element Attributes
Media object elements can have the following attributes:
If the content of these attributes is read by a screen-reader, the presentation should be paused while the text is read out, and resumed afterwards.
Clip-value   ::= [ Metric ] "=" ( Clock-val | Smpte-val ) |
                 "marker" "=" name-val
Metric       ::= Smpte-type | "npt"
Smpte-type   ::= "smpte" | "smpte-30-drop" | "smpte-25"
Smpte-val    ::= Hours ":" Minutes ":" Seconds
                 [ ":" Frames [ "." Subframes ]]
Hours        ::= Digit Digit
                 /* see XML 1.0 for a definition of 'Digit' */
Minutes      ::= Digit Digit
Seconds      ::= Digit Digit
Frames       ::= Digit Digit
Subframes    ::= Digit Digit
name-val     ::= ([^<&"] | [^<&'])*
                 /* Derived from BNF rule [10] in [XML10]
                    Whether single or double quotes are
                    allowed in a name value depends on which
                    type of quotes is used to quote the
                    clip attribute value */
The value of this attribute consists of a metric specifier, followed by a time value whose syntax and semantics depend on the metric specifier. The following formats are allowed:
The time value has the format hours:minutes:seconds:frames.subframes. If the frame value is zero, it may be omitted. Subframes are measured in hundredths of a frame.
Examples:
clipBegin="smpte=10:12:33:20"
clipBegin="npt=123.45s"
clipBegin="npt=12:05:35.3"
Example: Assume that a recorded radio transmission consists of a sequence of songs, which are separated by announcements by a disk jockey. The audio format supports marked time points, and the beginning of each song or announcement with number X is marked as songX or djX respectively. To extract the first song using the "marker" metric, the following audio media element can be used:
<audio clipBegin="marker=song1" clipEnd="marker=dj1" />
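The clip value grammar above can be exercised with a small Python sketch. It is not a normative parser: the names `parse_clip_value`, `SMPTE_VAL` and `SMPTE_FIELDS` are hypothetical, Clock-val is defined outside this section and is therefore returned as an uninterpreted string, and a value without a metric specifier (such as "5s") is assumed to use the default metric and is reported with a metric of `None`.

```python
import re

SMPTE_TYPES = ("smpte", "smpte-30-drop", "smpte-25")

# Smpte-val ::= Hours ":" Minutes ":" Seconds [ ":" Frames [ "." Subframes ]]
SMPTE_VAL = re.compile(
    r"([0-9]{2}):([0-9]{2}):([0-9]{2})(?::([0-9]{2})(?:\.([0-9]{2}))?)?"
)
SMPTE_FIELDS = ("hours", "minutes", "seconds", "frames", "subframes")


def parse_clip_value(value):
    """Split a clip attribute value into (metric, payload).

    metric is None when no metric specifier is present (assumed here to
    mean the default metric). Clock values are not interpreted.
    """
    metric, sep, rest = value.partition("=")
    if not sep:
        return (None, value)              # e.g. "5s": no metric specifier
    if metric == "marker":
        return ("marker", rest)           # name-val, used as-is
    if metric in SMPTE_TYPES:
        match = SMPTE_VAL.fullmatch(rest)
        if match is None:
            raise ValueError(f"not a valid Smpte-val: {rest!r}")
        # Omitted frames and subframes are treated as zero.
        numbers = [int(g) if g is not None else 0 for g in match.groups()]
        return (metric, dict(zip(SMPTE_FIELDS, numbers)))
    if metric in ("npt", ""):
        return ("npt" if metric else None, rest)
    raise ValueError(f"unknown metric: {metric!r}")
```

Applied to the examples in this section, `parse_clip_value("marker=song1")` gives `("marker", "song1")`, and `parse_clip_value("smpte=10:12:33:20")` gives the metric "smpte" with frames set to 20 and subframes set to 0.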
"clipBegin" may also be expressed as "clip-begin" for compatibility with SMIL 1.0. Software supporting SMIL Boston must be able to handle both "clipBegin" and "clip-begin", whereas software supporting only the SMIL media object module only needs to support "clipBegin". If an element contains both the old and the new version of a clipping attribute, the the attribute that occurs later in the text is ignored.
Example:
<audio src="radio.wav" clip-begin="5s" clipBegin="10s" />
The clip begins at second 5 of the audio, and not at second 10, since the "clipBegin" attribute is ignored. See Changes to SMIL 1.0 Attributes for more discussion on this topic.
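The "later attribute is ignored" rule depends on the order of attributes in the document text, which many XML APIs do not expose. The following Python sketch uses the `ordered_attributes` option of the standard `xml.parsers.expat` module to recover that order. The helper names `first_element_attributes` and `effective_clip` are hypothetical, and the sketch only inspects the first element of the input.

```python
import xml.parsers.expat


def first_element_attributes(xml_text):
    """Return the first element's attributes as (name, value) pairs
    in the order in which they appear in the document text."""
    found = []

    def on_start(name, attrs):
        if not found:
            # With ordered_attributes set, attrs is a flat list:
            # [name1, value1, name2, value2, ...]
            found.append(list(zip(attrs[0::2], attrs[1::2])))

    parser = xml.parsers.expat.ParserCreate()
    parser.ordered_attributes = True
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return found[0]


def effective_clip(attrs, new_name="clipBegin", old_name="clip-begin"):
    """Return the (name, value) pair that takes effect, or None.

    When both spellings are present, the one occurring earlier in the
    text wins, because the later one is ignored.
    """
    for name, value in attrs:
        if name in (new_name, old_name):
            return (name, value)
    return None
```

For the element in the example above, `effective_clip(first_element_attributes(...))` selects `("clip-begin", "5s")`, matching the stated result that the clip begins at second 5.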
If the content of these attributes is read by a screen-reader, the presentation should be paused while the text is read out, and resumed afterwards.
longdesc and alt text are read out by a screen reader for the current document. This value must be a number between 0 and 32767. User agents should ignore leading zeros. The default value is 0.
Elements that contain alt or longdesc attributes are read by a screen reader according to the following rules:
@@ this may be better to derive from the "src" parameter, which could optionally be rtp://___. This would mean that an RTP URL format would need to be defined.
xml:lang differs from the system-language test attribute in one important respect. xml:lang provides information about the content's language independent of what implementations do with the information, whereas system-language is a test attribute with specific associated behavior (see system-language in SMIL Content Control Module for details).
Element Content
Media object elements can contain the following elements:
rtpmap element
If the media object is transferred using the RTP protocol, and uses a dynamic payload type, SDP requires the use of the "rtpmap" attribute field. In this specification, this is mapped onto the "rtpmap" element, which is contained in the content of the media object element. If the media object is not transferred using RTP, this element is ignored.
Attributes
encoding-val    ::= ( short-encoding | long-encoding )
short-encoding  ::= encoding-name "/" clock-rate
long-encoding   ::= encoding-name "/" clock-rate "/" encoding-params
encoding-name   ::= name-val
clock-rate      ::= +Digit
encoding-params ::= ??
Legal values for "encoding-name" are payload names defined in [RFC1890], and RTP payload names registered as MIME types [draft-ietf-avt-rtp-mime-00].
For audio streams, "encoding parameters" may specify the number of audio channels. This parameter may be omitted if the number of channels is one, provided no additional parameters are needed. For video streams, no encoding parameters are currently specified. Additional parameters may be defined in the future, but codec-specific parameters should not be added here; they should be defined as separate rtpmap attributes.
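The encoding value syntax and the audio channel default can be sketched in Python as follows. The function names `parse_encoding` and `audio_channels` are hypothetical. Because "encoding-params" is not yet defined in this draft, the sketch keeps it as an uninterpreted string, and `audio_channels` assumes that for audio it is a bare channel count.

```python
import re


def parse_encoding(value):
    """Split an rtpmap encoding value into (name, clock_rate, params).

    params is None for the short form (name/clock-rate) and the raw,
    uninterpreted string for the long form (name/clock-rate/params).
    """
    parts = value.split("/", 2)
    if len(parts) < 2 or not parts[0]:
        raise ValueError(f"expected name/clock-rate: {value!r}")
    if re.fullmatch(r"[0-9]+", parts[1]) is None:
        raise ValueError(f"clock rate must be digits: {parts[1]!r}")
    params = parts[2] if len(parts) == 3 else None
    return (parts[0], int(parts[1]), params)


def audio_channels(value):
    """Number of audio channels; one when the parameter is omitted.

    Assumes the encoding parameters consist only of a channel count.
    """
    _, _, params = parse_encoding(value)
    return 1 if params is None else int(params)
```

With these helpers, "L16/8000" is read as a single-channel stream at 8000 Hz, and "L16/11025/2" as a two-channel stream at 11025 Hz.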
Element Content
"rtpmap" is an empty element
Example
<audio src="rtsp://www.w3.org/foo.rtp" port="49170"
       transport="RTP/AVP" rtpformat="96,97,98">
  <rtpmap payload="96" encoding="L8/8000" />
  <rtpmap payload="97" encoding="L16/8000" />
  <rtpmap payload="98" encoding="L16/11025/2" />
</audio>
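Since a dynamic payload type needs an rtpmap, an authoring tool might check that every dynamic payload type listed in "rtpformat" has a matching "rtpmap" child. The Python sketch below is one way to do this, under stated assumptions: the function name `missing_rtpmaps` is hypothetical, "rtpformat" is taken to be a comma-separated list of payload type numbers as in the example above, and the dynamic range 96 to 127 comes from the RTP audio/video profile [RFC1890] rather than from this section.

```python
import xml.etree.ElementTree as ET

# Dynamic payload type range of the RTP audio/video profile (assumption
# taken from RFC 1890, not stated in this section).
DYNAMIC_PAYLOADS = range(96, 128)


def missing_rtpmaps(media_xml):
    """Return the sorted dynamic payload types in "rtpformat" that have
    no corresponding rtpmap child element."""
    elem = ET.fromstring(media_xml)
    declared = {
        int(pt) for pt in elem.get("rtpformat", "").split(",") if pt.strip()
    }
    mapped = {
        int(child.get("payload"))
        for child in elem.findall("rtpmap")
        if child.get("payload") is not None
    }
    return sorted(
        pt for pt in declared if pt in DYNAMIC_PAYLOADS and pt not in mapped
    )
```

For the example above the function returns an empty list, since payload types 96, 97 and 98 each have an rtpmap element.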
A media object referenced by a media object element is often rendered by software modules referred to as media players that are separate from the software module providing the synchronization between different media objects in a presentation (referred to as synchronization engine).
Media players generally support varying levels of control, depending on the constraints of the underlying renderer as well as media delivery, streaming etc. This specification defines four levels of support, allowing for increasingly tight integration and broader functionality. The details of the interface will be presented in a separate document.
With regard to the clipBegin/clip-begin and clipEnd/clip-end attributes, SMIL Boston defines the following changes to the syntax defined in SMIL 1.0:
Using attribute names with hyphens such as "clip-begin" and "clip-end" is problematic when using a scripting language and the DOM to manipulate these attributes. Therefore, this specification adds the attribute names "clipBegin" and "clipEnd" as an equivalent alternative to the SMIL 1.0 "clip-begin" and "clip-end" attributes. The attribute names with hyphens are deprecated.
Authors can use two approaches for writing SMIL Boston presentations that use the new clipping syntax and functionality ("marker", default metric) defined in this specification, but can still be handled by SMIL 1.0 software. First, authors can use the non-hyphenated versions of the attributes for the new functionality, and add SMIL 1.0 conformant clipping attributes later in the text.
Example:
<audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1" clip-begin="0s" clip-end="3:50" />
SMIL 1.0 players implementing the recommended extensibility rules of SMIL 1.0 [SMIL10] will ignore the clip attributes using the new functionality, since they are not part of SMIL 1.0. SMIL Boston players, in contrast, will ignore the clip attributes using SMIL 1.0 syntax, since they occur later in the text.
The second approach is to use the following steps:
Example:
<switch>
  <audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1"
         system-required="@@http://www.w3.org/AudioVideo/Group/Media/extended-media-object19990707" />
  <audio src="radio.wav" clip-begin="0s" clip-end="3:50" />
</switch>
alt, longdesc
Added the recommendation that if the content of these attributes is read by a screen-reader, the presentation should be paused while the text is read out, and resumed afterwards.
New Accessibility Attributes
When using SMIL in conjunction with the Real Time Transport Protocol (RTP, [RFC1889]), which is designed for real-time delivery of media streams, a media client is required to have initialization parameters in order to interpret the RTP data. In the typical RTP implementation, these initialization parameters are described in the Session Description Protocol (SDP, [RFC2327]). The SDP description can be delivered in the DESCRIBE portion of the Real Time Streaming Protocol (RTSP, [RFC2326]), or can be delivered as a file via HTTP.
Since SMIL provides a media description language which often references SDP via RTSP and can also reference SDP files via HTTP, a very useful optimization can be realized by merging parameters typically delivered via SDP into the SMIL document. Since retrieving a SMIL document constitutes one round trip, and retrieving the SDP descriptions referenced in the SMIL document constitutes another round trip, merging the media description into the SMIL document itself can save a round trip in a typical media exchange. This round-trip savings can result in a noticeably faster start-up over a slow network link.
This applies particularly well to two primary usage scenarios:
The following attributes were added to SMIL Boston:
Example
<audio src="rtsp://www.w3.org/test.rtp" port="49170-49171" transport="RTP/AVP" rtpformat="96,97,98" />
In addition to these new attributes, the "rtpmap" element was added to complete the SDP functionality.
SMIL 1.0 only allowed "anchor" as a child element of a media element. In addition to "anchor", the following elements are now allowed as children of a SMIL media object:
rtpmap element
A new section describes the "rtpmap" element, which provides functionality needed to use SMIL as a replacement for SDP.
SMIL Boston introduces the concept of levels of functionality, which is explained in this section.
Listed below are the features that haven't been integrated yet, and may not make it into the final version of SMIL Boston: