P1L1 P1L2 P1L3 P1L4 Network Working Group H. Schulzrinne P1L5 Request for Comments: 3550 Columbia University P1L6 Obsoletes: 1889 S. Casner P1L7 Category: Standards Track Packet Design P1L8 R. Frederick P1L9 Blue Coat Systems Inc. P1L10 V. Jacobson P1L11 Packet Design P1L12 July 2003 P1L13 P1L14 P1L15 RTP: A Transport Protocol for Real-Time Applications P1L16 P1L17 Status of this Memo P1L18 P1L19 This document specifies an Internet standards track protocol for the P1L20 Internet community, and requests discussion and suggestions for P1L21 improvements. Please refer to the current edition of the "Internet P1L22 Official Protocol Standards" (STD 1) for the standardization state P1L23 and status of this protocol. Distribution of this memo is unlimited. P1L24 P1L25 Copyright Notice P1L26 P1L27 Copyright (C) The Internet Society (2003). All Rights Reserved. P1L28 P1L29 Abstract P1L30 P1L31 This memorandum describes RTP, the real-time transport protocol. RTP P1L32 provides end-to-end network transport functions suitable for P1L33 applications transmitting real-time data, such as audio, video or P1L34 simulation data, over multicast or unicast network services. RTP P1L35 does not address resource reservation and does not guarantee P1L36 quality-of-service for real-time services. The data transport is P1L37 augmented by a control protocol (RTCP) to allow monitoring of the P1L38 data delivery in a manner scalable to large multicast networks, and P1L39 to provide minimal control and identification functionality. RTP and P1L40 RTCP are designed to be independent of the underlying transport and P1L41 network layers. The protocol supports the use of RTP-level P1L42 translators and mixers. P1L43 P1L44 Most of the text in this memorandum is identical to RFC 1889 which it P1L45 obsoletes. There are no changes in the packet formats on the wire, P1L46 only changes to the rules and algorithms governing how the protocol P1L47 is used. The biggest change is an enhancement to the scalable timer P1L48 algorithm for calculating when to send RTCP packets in order to P1L49 minimize transmission in excess of the intended rate when many P1L50 participants join a session simultaneously. P2L1 Table of Contents P2L2 P2L3 1. Introduction ................................................ 4 P2L4 1.1 Terminology ............................................ 5 P2L5 2. RTP Use Scenarios ........................................... 5 P2L6 2.1 Simple Multicast Audio Conference ...................... 6 P2L7 2.2 Audio and Video Conference ............................. 7 P2L8 2.3 Mixers and Translators ................................. 7 P2L9 2.4 Layered Encodings ...................................... 8 P2L10 3. Definitions ................................................. 8 P2L11 4. Byte Order, Alignment, and Time Format ...................... 12 P2L12 5. RTP Data Transfer Protocol .................................. 13 P2L13 5.1 RTP Fixed Header Fields ................................ 13 P2L14 5.2 Multiplexing RTP Sessions .............................. 16 P2L15 5.3 Profile-Specific Modifications to the RTP Header ....... 18 P2L16 5.3.1 RTP Header Extension ............................ 18 P2L17 6. RTP Control Protocol -- RTCP ................................ 19 P2L18 6.1 RTCP Packet Format ..................................... 21 P2L19 6.2 RTCP Transmission Interval ............................. 24 P2L20 6.2.1 Maintaining the Number of Session Members ....... 28 P2L21 6.3 RTCP Packet Send and Receive Rules ..................... 28 P2L22 6.3.1 Computing the RTCP Transmission Interval ........ 29 P2L23 6.3.2 Initialization .................................. 30 P2L24 6.3.3 Receiving an RTP or Non-BYE RTCP Packet ......... 31 P2L25 6.3.4 Receiving an RTCP BYE Packet .................... 31 P2L26 6.3.5 Timing Out an SSRC .............................. 32 P2L27 6.3.6 Expiration of Transmission Timer ................ 32 P2L28 6.3.7 Transmitting a BYE Packet ....................... 33 P2L29 6.3.8 Updating we_sent ................................ 34 P2L30 6.3.9 Allocation of Source Description Bandwidth ...... 34 P2L31 6.4 Sender and Receiver Reports ............................ 35 P2L32 6.4.1 SR: Sender Report RTCP Packet ................... 36 P2L33 6.4.2 RR: Receiver Report RTCP Packet ................. 42 P2L34 6.4.3 Extending the Sender and Receiver Reports ....... 42 P2L35 6.4.4 Analyzing Sender and Receiver Reports ........... 43 P2L36 6.5 SDES: Source Description RTCP Packet ................... 45 P2L37 6.5.1 CNAME: Canonical End-Point Identifier SDES Item . 46 P2L38 6.5.2 NAME: User Name SDES Item ....................... 48 P2L39 6.5.3 EMAIL: Electronic Mail Address SDES Item ........ 48 P2L40 6.5.4 PHONE: Phone Number SDES Item ................... 49 P2L41 6.5.5 LOC: Geographic User Location SDES Item ......... 49 P2L42 6.5.6 TOOL: Application or Tool Name SDES Item ........ 49 P2L43 6.5.7 NOTE: Notice/Status SDES Item ................... 50 P2L44 6.5.8 PRIV: Private Extensions SDES Item .............. 50 P2L45 6.6 BYE: Goodbye RTCP Packet ............................... 51 P2L46 6.7 APP: Application-Defined RTCP Packet ................... 52 P2L47 7. RTP Translators and Mixers .................................. 53 P2L48 7.1 General Description .................................... 53 P3L1 7.2 RTCP Processing in Translators ......................... 55 P3L2 7.3 RTCP Processing in Mixers .............................. 57 P3L3 7.4 Cascaded Mixers ........................................ 58 P3L4 8. SSRC Identifier Allocation and Use .......................... 59 P3L5 8.1 Probability of Collision ............................... 59 P3L6 8.2 Collision Resolution and Loop Detection ................ 60 P3L7 8.3 Use with Layered Encodings ............................. 64 P3L8 9. Security .................................................... 65 P3L9 9.1 Confidentiality ........................................ 65 P3L10 9.2 Authentication and Message Integrity ................... 67 P3L11 10. Congestion Control .......................................... 67 P3L12 11. RTP over Network and Transport Protocols .................... 68 P3L13 12. Summary of Protocol Constants ............................... 69 P3L14 12.1 RTCP Packet Types ...................................... 70 P3L15 12.2 SDES Types ............................................. 70 P3L16 13. RTP Profiles and Payload Format Specifications .............. 71 P3L17 14. Security Considerations ..................................... 73 P3L18 15. IANA Considerations ......................................... 73 P3L19 16. Intellectual Property Rights Statement ...................... 74 P3L20 17. Acknowledgments ............................................. 74 P3L21 Appendix A. Algorithms ........................................ 75 P3L22 Appendix A.1 RTP Data Header Validity Checks ................... 78 P3L23 Appendix A.2 RTCP Header Validity Checks ....................... 82 P3L24 Appendix A.3 Determining Number of Packets Expected and Lost ... 83 P3L25 Appendix A.4 Generating RTCP SDES Packets ...................... 84 P3L26 Appendix A.5 Parsing RTCP SDES Packets ......................... 85 P3L27 Appendix A.6 Generating a Random 32-bit Identifier ............. 85 P3L28 Appendix A.7 Computing the RTCP Transmission Interval .......... 87 P3L29 Appendix A.8 Estimating the Interarrival Jitter ................ 94 P3L30 Appendix B. Changes from RFC 1889 ............................. 95 P3L31 References ...................................................... 100 P3L32 Normative References ............................................ 100 P3L33 Informative References .......................................... 100 P3L34 Authors' Addresses .............................................. 103 P3L35 Full Copyright Statement ........................................ 104 P3L36 P3L37 P3L38 P3L39 P3L40 P3L41 P3L42 P3L43 P3L44 P3L45 P3L46 P3L47 P3L48 P4L1 1. Introduction P4L2 P4L3 This memorandum specifies the real-time transport protocol (RTP), P4L4 which provides end-to-end delivery services for data with real-time P4L5 characteristics, such as interactive audio and video. Those services P4L6 include payload type identification, sequence numbering, timestamping P4L7 and delivery monitoring. Applications typically run RTP on top of P4L8 UDP to make use of its multiplexing and checksum services; both P4L9 protocols contribute parts of the transport protocol functionality. P4L10 However, RTP may be used with other suitable underlying network or P4L11 transport protocols (see Section 11). RTP supports data transfer to P4L12 multiple destinations using multicast distribution if provided by the P4L13 underlying network. P4L14 P4L15 Note that RTP itself does not provide any mechanism to ensure timely P4L16 delivery or provide other quality-of-service guarantees, but relies P4L17 on lower-layer services to do so. It does not guarantee delivery or P4L18 prevent out-of-order delivery, nor does it assume that the underlying P4L19 network is reliable and delivers packets in sequence. The sequence P4L20 numbers included in RTP allow the receiver to reconstruct the P4L21 sender's packet sequence, but sequence numbers might also be used to P4L22 determine the proper location of a packet, for example in video P4L23 decoding, without necessarily decoding packets in sequence. P4L24 P4L25 While RTP is primarily designed to satisfy the needs of multi- P4L26 participant multimedia conferences, it is not limited to that P4L27 particular application. Storage of continuous data, interactive P4L28 distributed simulation, active badge, and control and measurement P4L29 applications may also find RTP applicable. P4L30 P4L31 This document defines RTP, consisting of two closely-linked parts: P4L32 P4L33 o the real-time transport protocol (RTP), to carry data that has P4L34 real-time properties. P4L35 P4L36 o the RTP control protocol (RTCP), to monitor the quality of service P4L37 and to convey information about the participants in an on-going P4L38 session. The latter aspect of RTCP may be sufficient for "loosely P4L39 controlled" sessions, i.e., where there is no explicit membership P4L40 control and set-up, but it is not necessarily intended to support P4L41 all of an application's control communication requirements. This P4L42 functionality may be fully or partially subsumed by a separate P4L43 session control protocol, which is beyond the scope of this P4L44 document. P4L45 P4L46 RTP represents a new style of protocol following the principles of P4L47 application level framing and integrated layer processing proposed by P4L48 Clark and Tennenhouse [10]. That is, RTP is intended to be malleable P5L1 to provide the information required by a particular application and P5L2 will often be integrated into the application processing rather than P5L3 being implemented as a separate layer. RTP is a protocol framework P5L4 that is deliberately not complete. This document specifies those P5L5 functions expected to be common across all the applications for which P5L6 RTP would be appropriate. Unlike conventional protocols in which P5L7 additional functions might be accommodated by making the protocol P5L8 more general or by adding an option mechanism that would require P5L9 parsing, RTP is intended to be tailored through modifications and/or P5L10 additions to the headers as needed. Examples are given in Sections P5L11 5.3 and 6.4.3. P5L12 P5L13 Therefore, in addition to this document, a complete specification of P5L14 RTP for a particular application will require one or more companion P5L15 documents (see Section 13): P5L16 P5L17 o a profile specification document, which defines a set of payload P5L18 type codes and their mapping to payload formats (e.g., media P5L19 encodings). A profile may also define extensions or modifications P5L20 to RTP that are specific to a particular class of applications. P5L21 Typically an application will operate under only one profile. A P5L22 profile for audio and video data may be found in the companion RFC P5L23 3551 [1]. P5L24 P5L25 o payload format specification documents, which define how a P5L26 particular payload, such as an audio or video encoding, is to be P5L27 carried in RTP. P5L28 P5L29 A discussion of real-time services and algorithms for their P5L30 implementation as well as background discussion on some of the RTP P5L31 design decisions can be found in [11]. P5L32 P5L33 1.1 Terminology P5L34 P5L35 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", P5L36 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this P5L37 document are to be interpreted as described in BCP 14, RFC 2119 [2] P5L38 and indicate requirement levels for compliant RTP implementations. P5L39 P5L40 2. RTP Use Scenarios P5L41 P5L42 The following sections describe some aspects of the use of RTP. The P5L43 examples were chosen to illustrate the basic operation of P5L44 applications using RTP, not to limit what RTP may be used for. In P5L45 these examples, RTP is carried on top of IP and UDP, and follows the P5L46 conventions established by the profile for audio and video specified P5L47 in the companion RFC 3551. P5L48 P6L1 2.1 Simple Multicast Audio Conference P6L2 P6L3 A working group of the IETF meets to discuss the latest protocol P6L4 document, using the IP multicast services of the Internet for voice P6L5 communications. Through some allocation mechanism the working group P6L6 chair obtains a multicast group address and pair of ports. One port P6L7 is used for audio data, and the other is used for control (RTCP) P6L8 packets. This address and port information is distributed to the P6L9 intended participants. If privacy is desired, the data and control P6L10 packets may be encrypted as specified in Section 9.1, in which case P6L11 an encryption key must also be generated and distributed. The exact P6L12 details of these allocation and distribution mechanisms are beyond P6L13 the scope of RTP. P6L14 P6L15 The audio conferencing application used by each conference P6L16 participant sends audio data in small chunks of, say, 20 ms duration. P6L17 Each chunk of audio data is preceded by an RTP header; RTP header and P6L18 data are in turn contained in a UDP packet. The RTP header indicates P6L19 what type of audio encoding (such as PCM, ADPCM or LPC) is contained P6L20 in each packet so that senders can change the encoding during a P6L21 conference, for example, to accommodate a new participant that is P6L22 connected through a low-bandwidth link or react to indications of P6L23 network congestion. P6L24 P6L25 The Internet, like other packet networks, occasionally loses and P6L26 reorders packets and delays them by variable amounts of time. To P6L27 cope with these impairments, the RTP header contains timing P6L28 information and a sequence number that allow the receivers to P6L29 reconstruct the timing produced by the source, so that in this P6L30 example, chunks of audio are contiguously played out the speaker P6L31 every 20 ms. This timing reconstruction is performed separately for P6L32 each source of RTP packets in the conference. The sequence number P6L33 can also be used by the receiver to estimate how many packets are P6L34 being lost. P6L35 P6L36 Since members of the working group join and leave during the P6L37 conference, it is useful to know who is participating at any moment P6L38 and how well they are receiving the audio data. For that purpose, P6L39 each instance of the audio application in the conference periodically P6L40 multicasts a reception report plus the name of its user on the RTCP P6L41 (control) port. The reception report indicates how well the current P6L42 speaker is being received and may be used to control adaptive P6L43 encodings. In addition to the user name, other identifying P6L44 information may also be included subject to control bandwidth limits. P6L45 A site sends the RTCP BYE packet (Section 6.6) when it leaves the P6L46 conference. P6L47 P6L48 P7L1 2.2 Audio and Video Conference P7L2 P7L3 If both audio and video media are used in a conference, they are P7L4 transmitted as separate RTP sessions. That is, separate RTP and RTCP P7L5 packets are transmitted for each medium using two different UDP port P7L6 pairs and/or multicast addresses. There is no direct coupling at the P7L7 RTP level between the audio and video sessions, except that a user P7L8 participating in both sessions should use the same distinguished P7L9 (canonical) name in the RTCP packets for both so that the sessions P7L10 can be associated. P7L11 P7L12 One motivation for this separation is to allow some participants in P7L13 the conference to receive only one medium if they choose. Further P7L14 explanation is given in Section 5.2. Despite the separation, P7L15 synchronized playback of a source's audio and video can be achieved P7L16 using timing information carried in the RTCP packets for both P7L17 sessions. P7L18 P7L19 2.3 Mixers and Translators P7L20 P7L21 So far, we have assumed that all sites want to receive media data in P7L22 the same format. However, this may not always be appropriate. P7L23 Consider the case where participants in one area are connected P7L24 through a low-speed link to the majority of the conference P7L25 participants who enjoy high-speed network access. Instead of forcing P7L26 everyone to use a lower-bandwidth, reduced-quality audio encoding, an P7L27 RTP-level relay called a mixer may be placed near the low-bandwidth P7L28 area. This mixer resynchronizes incoming audio packets to P7L29 reconstruct the constant 20 ms spacing generated by the sender, mixes P7L30 these reconstructed audio streams into a single stream, translates P7L31 the audio encoding to a lower-bandwidth one and forwards the lower- P7L32 bandwidth packet stream across the low-speed link. These packets P7L33 might be unicast to a single recipient or multicast on a different P7L34 address to multiple recipients. The RTP header includes a means for P7L35 mixers to identify the sources that contributed to a mixed packet so P7L36 that correct talker indication can be provided at the receivers. P7L37 P7L38 Some of the intended participants in the audio conference may be P7L39 connected with high bandwidth links but might not be directly P7L40 reachable via IP multicast. For example, they might be behind an P7L41 application-level firewall that will not let any IP packets pass. P7L42 For these sites, mixing may not be necessary, in which case another P7L43 type of RTP-level relay called a translator may be used. Two P7L44 translators are installed, one on either side of the firewall, with P7L45 the outside one funneling all multicast packets received through a P7L46 secure connection to the translator inside the firewall. The P7L47 translator inside the firewall sends them again as multicast packets P7L48 to a multicast group restricted to the site's internal network. P8L1 Mixers and translators may be designed for a variety of purposes. An P8L2 example is a video mixer that scales the images of individual people P8L3 in separate video streams and composites them into one video stream P8L4 to simulate a group scene. Other examples of translation include the P8L5 connection of a group of hosts speaking only IP/UDP to a group of P8L6 hosts that understand only ST-II, or the packet-by-packet encoding P8L7 translation of video streams from individual sources without P8L8 resynchronization or mixing. Details of the operation of mixers and P8L9 translators are given in Section 7. P8L10 P8L11 2.4 Layered Encodings P8L12 P8L13 Multimedia applications should be able to adjust the transmission P8L14 rate to match the capacity of the receiver or to adapt to network P8L15 congestion. Many implementations place the responsibility of rate- P8L16 adaptivity at the source. This does not work well with multicast P8L17 transmission because of the conflicting bandwidth requirements of P8L18 heterogeneous receivers. The result is often a least-common P8L19 denominator scenario, where the smallest pipe in the network mesh P8L20 dictates the quality and fidelity of the overall live multimedia P8L21 "broadcast". P8L22 P8L23 Instead, responsibility for rate-adaptation can be placed at the P8L24 receivers by combining a layered encoding with a layered transmission P8L25 system. In the context of RTP over IP multicast, the source can P8L26 stripe the progressive layers of a hierarchically represented signal P8L27 across multiple RTP sessions each carried on its own multicast group. P8L28 Receivers can then adapt to network heterogeneity and control their P8L29 reception bandwidth by joining only the appropriate subset of the P8L30 multicast groups. P8L31 P8L32 Details of the use of RTP with layered encodings are given in P8L33 Sections 6.3.9, 8.3 and 11. P8L34 P8L35 3. Definitions P8L36 P8L37 RTP payload: The data transported by RTP in a packet, for P8L38 example audio samples or compressed video data. The payload P8L39 format and interpretation are beyond the scope of this document. P8L40 P8L41 RTP packet: A data packet consisting of the fixed RTP header, a P8L42 possibly empty list of contributing sources (see below), and the P8L43 payload data. Some underlying protocols may require an P8L44 encapsulation of the RTP packet to be defined. Typically one P8L45 packet of the underlying protocol contains a single RTP packet, P8L46 but several RTP packets MAY be contained if permitted by the P8L47 encapsulation method (see Section 11). P8L48 P9L1 RTCP packet: A control packet consisting of a fixed header part P9L2 similar to that of RTP data packets, followed by structured P9L3 elements that vary depending upon the RTCP packet type. The P9L4 formats are defined in Section 6. Typically, multiple RTCP P9L5 packets are sent together as a compound RTCP packet in a single P9L6 packet of the underlying protocol; this is enabled by the length P9L7 field in the fixed header of each RTCP packet. P9L8 P9L9 Port: The "abstraction that transport protocols use to P9L10 distinguish among multiple destinations within a given host P9L11 computer. TCP/IP protocols identify ports using small positive P9L12 integers." [12] The transport selectors (TSEL) used by the OSI P9L13 transport layer are equivalent to ports. RTP depends upon the P9L14 lower-layer protocol to provide some mechanism such as ports to P9L15 multiplex the RTP and RTCP packets of a session. P9L16 P9L17 Transport address: The combination of a network address and port P9L18 that identifies a transport-level endpoint, for example an IP P9L19 address and a UDP port. Packets are transmitted from a source P9L20 transport address to a destination transport address. P9L21 P9L22 RTP media type: An RTP media type is the collection of payload P9L23 types which can be carried within a single RTP session. The RTP P9L24 Profile assigns RTP media types to RTP payload types. P9L25 P9L26 Multimedia session: A set of concurrent RTP sessions among a P9L27 common group of participants. For example, a videoconference P9L28 (which is a multimedia session) may contain an audio RTP session P9L29 and a video RTP session. P9L30 P9L31 RTP session: An association among a set of participants P9L32 communicating with RTP. A participant may be involved in multiple P9L33 RTP sessions at the same time. In a multimedia session, each P9L34 medium is typically carried in a separate RTP session with its own P9L35 RTCP packets unless the the encoding itself multiplexes multiple P9L36 media into a single data stream. A participant distinguishes P9L37 multiple RTP sessions by reception of different sessions using P9L38 different pairs of destination transport addresses, where a pair P9L39 of transport addresses comprises one network address plus a pair P9L40 of ports for RTP and RTCP. All participants in an RTP session may P9L41 share a common destination transport address pair, as in the case P9L42 of IP multicast, or the pairs may be different for each P9L43 participant, as in the case of individual unicast network P9L44 addresses and port pairs. In the unicast case, a participant may P9L45 receive from all other participants in the session using the same P9L46 pair of ports, or may use a distinct pair of ports for each. P9L47 P9L48 P10L1 The distinguishing feature of an RTP session is that each P10L2 maintains a full, separate space of SSRC identifiers (defined P10L3 next). The set of participants included in one RTP session P10L4 consists of those that can receive an SSRC identifier transmitted P10L5 by any one of the participants either in RTP as the SSRC or a CSRC P10L6 (also defined below) or in RTCP. For example, consider a three- P10L7 party conference implemented using unicast UDP with each P10L8 participant receiving from the other two on separate port pairs. P10L9 If each participant sends RTCP feedback about data received from P10L10 one other participant only back to that participant, then the P10L11 conference is composed of three separate point-to-point RTP P10L12 sessions. If each participant provides RTCP feedback about its P10L13 reception of one other participant to both of the other P10L14 participants, then the conference is composed of one multi-party P10L15 RTP session. The latter case simulates the behavior that would P10L16 occur with IP multicast communication among the three P10L17 participants. P10L18 P10L19 The RTP framework allows the variations defined here, but a P10L20 particular control protocol or application design will usually P10L21 impose constraints on these variations. P10L22 P10L23 Synchronization source (SSRC): The source of a stream of RTP P10L24 packets, identified by a 32-bit numeric SSRC identifier carried in P10L25 the RTP header so as not to be dependent upon the network address. P10L26 All packets from a synchronization source form part of the same P10L27 timing and sequence number space, so a receiver groups packets by P10L28 synchronization source for playback. Examples of synchronization P10L29 sources include the sender of a stream of packets derived from a P10L30 signal source such as a microphone or a camera, or an RTP mixer P10L31 (see below). A synchronization source may change its data format, P10L32 e.g., audio encoding, over time. The SSRC identifier is a P10L33 randomly chosen value meant to be globally unique within a P10L34 particular RTP session (see Section 8). A participant need not P10L35 use the same SSRC identifier for all the RTP sessions in a P10L36 multimedia session; the binding of the SSRC identifiers is P10L37 provided through RTCP (see Section 6.5.1). If a participant P10L38 generates multiple streams in one RTP session, for example from P10L39 separate video cameras, each MUST be identified as a different P10L40 SSRC. P10L41 P10L42 Contributing source (CSRC): A source of a stream of RTP packets P10L43 that has contributed to the combined stream produced by an RTP P10L44 mixer (see below). The mixer inserts a list of the SSRC P10L45 identifiers of the sources that contributed to the generation of a P10L46 particular packet into the RTP header of that packet. This list P10L47 is called the CSRC list. An example application is audio P10L48 conferencing where a mixer indicates all the talkers whose speech P11L1 was combined to produce the outgoing packet, allowing the receiver P11L2 to indicate the current talker, even though all the audio packets P11L3 contain the same SSRC identifier (that of the mixer). P11L4 P11L5 End system: An application that generates the content to be sent P11L6 in RTP packets and/or consumes the content of received RTP P11L7 packets. An end system can act as one or more synchronization P11L8 sources in a particular RTP session, but typically only one. P11L9 P11L10 Mixer: An intermediate system that receives RTP packets from one P11L11 or more sources, possibly changes the data format, combines the P11L12 packets in some manner and then forwards a new RTP packet. Since P11L13 the timing among multiple input sources will not generally be P11L14 synchronized, the mixer will make timing adjustments among the P11L15 streams and generate its own timing for the combined stream. P11L16 Thus, all data packets originating from a mixer will be identified P11L17 as having the mixer as their synchronization source. P11L18 P11L19 Translator: An intermediate system that forwards RTP packets P11L20 with their synchronization source identifier intact. Examples of P11L21 translators include devices that convert encodings without mixing, P11L22 replicators from multicast to unicast, and application-level P11L23 filters in firewalls. P11L24 P11L25 Monitor: An application that receives RTCP packets sent by P11L26 participants in an RTP session, in particular the reception P11L27 reports, and estimates the current quality of service for P11L28 distribution monitoring, fault diagnosis and long-term statistics. P11L29 The monitor function is likely to be built into the application(s) P11L30 participating in the session, but may also be a separate P11L31 application that does not otherwise participate and does not send P11L32 or receive the RTP data packets (since they are on a separate P11L33 port). These are called third-party monitors. It is also P11L34 acceptable for a third-party monitor to receive the RTP data P11L35 packets but not send RTCP packets or otherwise be counted in the P11L36 session. P11L37 P11L38 Non-RTP means: Protocols and mechanisms that may be needed in P11L39 addition to RTP to provide a usable service. In particular, for P11L40 multimedia conferences, a control protocol may distribute P11L41 multicast addresses and keys for encryption, negotiate the P11L42 encryption algorithm to be used, and define dynamic mappings P11L43 between RTP payload type values and the payload formats they P11L44 represent for formats that do not have a predefined payload type P11L45 value. Examples of such protocols include the Session Initiation P11L46 Protocol (SIP) (RFC 3261 [13]), ITU Recommendation H.323 [14] and P11L47 applications using SDP (RFC 2327 [15]), such as RTSP (RFC 2326 P11L48 [16]). For simple P12L1 applications, electronic mail or a conference database may also be P12L2 used. The specification of such protocols and mechanisms is P12L3 outside the scope of this document. P12L4 P12L5 4. Byte Order, Alignment, and Time Format P12L6 P12L7 All integer fields are carried in network byte order, that is, most P12L8 significant byte (octet) first. This byte order is commonly known as P12L9 big-endian. The transmission order is described in detail in [3]. P12L10 Unless otherwise noted, numeric constants are in decimal (base 10). P12L11 P12L12 All header data is aligned to its natural length, i.e., 16-bit fields P12L13 are aligned on even offsets, 32-bit fields are aligned at offsets P12L14 divisible by four, etc. Octets designated as padding have the value P12L15 zero. P12L16 P12L17 Wallclock time (absolute date and time) is represented using the P12L18 timestamp format of the Network Time Protocol (NTP), which is in P12L19 seconds relative to 0h UTC on 1 January 1900 [4]. The full P12L20 resolution NTP timestamp is a 64-bit unsigned fixed-point number with P12L21 the integer part in the first 32 bits and the fractional part in the P12L22 last 32 bits. In some fields where a more compact representation is P12L23 appropriate, only the middle 32 bits are used; that is, the low 16 P12L24 bits of the integer part and the high 16 bits of the fractional part. P12L25 The high 16 bits of the integer part must be determined P12L26 independently. P12L27 P12L28 An implementation is not required to run the Network Time Protocol in P12L29 order to use RTP. Other time sources, or none at all, may be used P12L30 (see the description of the NTP timestamp field in Section 6.4.1). P12L31 However, running NTP may be useful for synchronizing streams P12L32 transmitted from separate hosts. P12L33 P12L34 The NTP timestamp will wrap around to zero some time in the year P12L35 2036, but for RTP purposes, only differences between pairs of NTP P12L36 timestamps are used. So long as the pairs of timestamps can be P12L37 assumed to be within 68 years of each other, using modular arithmetic P12L38 for subtractions and comparisons makes the wraparound irrelevant. P12L39 P12L40 P12L41 P12L42 P12L43 P12L44 P12L45 P12L46 P12L47 P12L48 P13L1 5. RTP Data Transfer Protocol P13L2 P13L3 5.1 RTP Fixed Header Fields P13L4 P13L5 The RTP header has the following format: P13L6 P13L7 0 1 2 3 P13L8 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P13L9 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P13L10 |V=2|P|X| CC |M| PT | sequence number | P13L11 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P13L12 | timestamp | P13L13 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P13L14 | synchronization source (SSRC) identifier | P13L15 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P13L16 | contributing source (CSRC) identifiers | P13L17 | .... | P13L18 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P13L19 P13L20 The first twelve octets are present in every RTP packet, while the P13L21 list of CSRC identifiers is present only when inserted by a mixer. P13L22 The fields have the following meaning: P13L23 P13L24 version (V): 2 bits P13L25 This field identifies the version of RTP. The version defined by P13L26 this specification is two (2). (The value 1 is used by the first P13L27 draft version of RTP and the value 0 is used by the protocol P13L28 initially implemented in the "vat" audio tool.) P13L29 P13L30 padding (P): 1 bit P13L31 If the padding bit is set, the packet contains one or more P13L32 additional padding octets at the end which are not part of the P13L33 payload. The last octet of the padding contains a count of how P13L34 many padding octets should be ignored, including itself. Padding P13L35 may be needed by some encryption algorithms with fixed block sizes P13L36 or for carrying several RTP packets in a lower-layer protocol data P13L37 unit. P13L38 P13L39 extension (X): 1 bit P13L40 If the extension bit is set, the fixed header MUST be followed by P13L41 exactly one header extension, with a format defined in Section P13L42 5.3.1. P13L43 P13L44 CSRC count (CC): 4 bits P13L45 The CSRC count contains the number of CSRC identifiers that follow P13L46 the fixed header. P13L47 P13L48 P14L1 marker (M): 1 bit P14L2 The interpretation of the marker is defined by a profile. It is P14L3 intended to allow significant events such as frame boundaries to P14L4 be marked in the packet stream. A profile MAY define additional P14L5 marker bits or specify that there is no marker bit by changing the P14L6 number of bits in the payload type field (see Section 5.3). P14L7 P14L8 payload type (PT): 7 bits P14L9 This field identifies the format of the RTP payload and determines P14L10 its interpretation by the application. A profile MAY specify a P14L11 default static mapping of payload type codes to payload formats. P14L12 Additional payload type codes MAY be defined dynamically through P14L13 non-RTP means (see Section 3). A set of default mappings for P14L14 audio and video is specified in the companion RFC 3551 [1]. An P14L15 RTP source MAY change the payload type during a session, but this P14L16 field SHOULD NOT be used for multiplexing separate media streams P14L17 (see Section 5.2). P14L18 P14L19 A receiver MUST ignore packets with payload types that it does not P14L20 understand. P14L21 P14L22 sequence number: 16 bits P14L23 The sequence number increments by one for each RTP data packet P14L24 sent, and may be used by the receiver to detect packet loss and to P14L25 restore packet sequence. The initial value of the sequence number P14L26 SHOULD be random (unpredictable) to make known-plaintext attacks P14L27 on encryption more difficult, even if the source itself does not P14L28 encrypt according to the method in Section 9.1, because the P14L29 packets may flow through a translator that does. Techniques for P14L30 choosing unpredictable numbers are discussed in [17]. P14L31 P14L32 timestamp: 32 bits P14L33 The timestamp reflects the sampling instant of the first octet in P14L34 the RTP data packet. The sampling instant MUST be derived from a P14L35 clock that increments monotonically and linearly in time to allow P14L36 synchronization and jitter calculations (see Section 6.4.1). The P14L37 resolution of the clock MUST be sufficient for the desired P14L38 synchronization accuracy and for measuring packet arrival jitter P14L39 (one tick per video frame is typically not sufficient). The clock P14L40 frequency is dependent on the format of data carried as payload P14L41 and is specified statically in the profile or payload format P14L42 specification that defines the format, or MAY be specified P14L43 dynamically for payload formats defined through non-RTP means. If P14L44 RTP packets are generated periodically, the nominal sampling P14L45 instant as determined from the sampling clock is to be used, not a P14L46 reading of the system clock. As an example, for fixed-rate audio P14L47 the timestamp clock would likely increment by one for each P14L48 sampling period. If an audio application reads blocks covering P15L1 160 sampling periods from the input device, the timestamp would be P15L2 increased by 160 for each such block, regardless of whether the P15L3 block is transmitted in a packet or dropped as silent. P15L4 P15L5 The initial value of the timestamp SHOULD be random, as for the P15L6 sequence number. Several consecutive RTP packets will have equal P15L7 timestamps if they are (logically) generated at once, e.g., belong P15L8 to the same video frame. Consecutive RTP packets MAY contain P15L9 timestamps that are not monotonic if the data is not transmitted P15L10 in the order it was sampled, as in the case of MPEG interpolated P15L11 video frames. (The sequence numbers of the packets as transmitted P15L12 will still be monotonic.) P15L13 P15L14 RTP timestamps from different media streams may advance at P15L15 different rates and usually have independent, random offsets. P15L16 Therefore, although these timestamps are sufficient to reconstruct P15L17 the timing of a single stream, directly comparing RTP timestamps P15L18 from different media is not effective for synchronization. P15L19 Instead, for each medium the RTP timestamp is related to the P15L20 sampling instant by pairing it with a timestamp from a reference P15L21 clock (wallclock) that represents the time when the data P15L22 corresponding to the RTP timestamp was sampled. The reference P15L23 clock is shared by all media to be synchronized. The timestamp P15L24 pairs are not transmitted in every data packet, but at a lower P15L25 rate in RTCP SR packets as described in Section 6.4. P15L26 P15L27 The sampling instant is chosen as the point of reference for the P15L28 RTP timestamp because it is known to the transmitting endpoint and P15L29 has a common definition for all media, independent of encoding P15L30 delays or other processing. The purpose is to allow synchronized P15L31 presentation of all media sampled at the same time. P15L32 P15L33 Applications transmitting stored data rather than data sampled in P15L34 real time typically use a virtual presentation timeline derived P15L35 from wallclock time to determine when the next frame or other unit P15L36 of each medium in the stored data should be presented. In this P15L37 case, the RTP timestamp would reflect the presentation time for P15L38 each unit. That is, the RTP timestamp for each unit would be P15L39 related to the wallclock time at which the unit becomes current on P15L40 the virtual presentation timeline. Actual presentation occurs P15L41 some time later as determined by the receiver. P15L42 P15L43 An example describing live audio narration of prerecorded video P15L44 illustrates the significance of choosing the sampling instant as P15L45 the reference point. In this scenario, the video would be P15L46 presented locally for the narrator to view and would be P15L47 simultaneously transmitted using RTP. The "sampling instant" of a P15L48 video frame transmitted in RTP would be established by referencing P16L1 its timestamp to the wallclock time when that video frame was P16L2 presented to the narrator. The sampling instant for the audio RTP P16L3 packets containing the narrator's speech would be established by P16L4 referencing the same wallclock time when the audio was sampled. P16L5 The audio and video may even be transmitted by different hosts if P16L6 the reference clocks on the two hosts are synchronized by some P16L7 means such as NTP. A receiver can then synchronize presentation P16L8 of the audio and video packets by relating their RTP timestamps P16L9 using the timestamp pairs in RTCP SR packets. P16L10 P16L11 SSRC: 32 bits P16L12 The SSRC field identifies the synchronization source. This P16L13 identifier SHOULD be chosen randomly, with the intent that no two P16L14 synchronization sources within the same RTP session will have the P16L15 same SSRC identifier. An example algorithm for generating a P16L16 random identifier is presented in Appendix A.6. Although the P16L17 probability of multiple sources choosing the same identifier is P16L18 low, all RTP implementations must be prepared to detect and P16L19 resolve collisions. Section 8 describes the probability of P16L20 collision along with a mechanism for resolving collisions and P16L21 detecting RTP-level forwarding loops based on the uniqueness of P16L22 the SSRC identifier. If a source changes its source transport P16L23 address, it must also choose a new SSRC identifier to avoid being P16L24 interpreted as a looped source (see Section 8.2). P16L25 P16L26 CSRC list: 0 to 15 items, 32 bits each P16L27 The CSRC list identifies the contributing sources for the payload P16L28 contained in this packet. The number of identifiers is given by P16L29 the CC field. If there are more than 15 contributing sources, P16L30 only 15 can be identified. CSRC identifiers are inserted by P16L31 mixers (see Section 7.1), using the SSRC identifiers of P16L32 contributing sources. For example, for audio packets the SSRC P16L33 identifiers of all sources that were mixed together to create a P16L34 packet are listed, allowing correct talker indication at the P16L35 receiver. P16L36 P16L37 5.2 Multiplexing RTP Sessions P16L38 P16L39 For efficient protocol processing, the number of multiplexing points P16L40 should be minimized, as described in the integrated layer processing P16L41 design principle [10]. In RTP, multiplexing is provided by the P16L42 destination transport address (network address and port number) which P16L43 is different for each RTP session. For example, in a teleconference P16L44 composed of audio and video media encoded separately, each medium P16L45 SHOULD be carried in a separate RTP session with its own destination P16L46 transport address. P16L47 P16L48 P17L1 Separate audio and video streams SHOULD NOT be carried in a single P17L2 RTP session and demultiplexed based on the payload type or SSRC P17L3 fields. Interleaving packets with different RTP media types but P17L4 using the same SSRC would introduce several problems: P17L5 P17L6 1. If, say, two audio streams shared the same RTP session and the P17L7 same SSRC value, and one were to change encodings and thus acquire P17L8 a different RTP payload type, there would be no general way of P17L9 identifying which stream had changed encodings. P17L10 P17L11 2. An SSRC is defined to identify a single timing and sequence number P17L12 space. Interleaving multiple payload types would require P17L13 different timing spaces if the media clock rates differ and would P17L14 require different sequence number spaces to tell which payload P17L15 type suffered packet loss. P17L16 P17L17 3. The RTCP sender and receiver reports (see Section 6.4) can only P17L18 describe one timing and sequence number space per SSRC and do not P17L19 carry a payload type field. P17L20 P17L21 4. An RTP mixer would not be able to combine interleaved streams of P17L22 incompatible media into one stream. P17L23 P17L24 5. Carrying multiple media in one RTP session precludes: the use of P17L25 different network paths or network resource allocations if P17L26 appropriate; reception of a subset of the media if desired, for P17L27 example just audio if video would exceed the available bandwidth; P17L28 and receiver implementations that use separate processes for the P17L29 different media, whereas using separate RTP sessions permits P17L30 either single- or multiple-process implementations. P17L31 P17L32 Using a different SSRC for each medium but sending them in the same P17L33 RTP session would avoid the first three problems but not the last P17L34 two. P17L35 P17L36 On the other hand, multiplexing multiple related sources of the same P17L37 medium in one RTP session using different SSRC values is the norm for P17L38 multicast sessions. The problems listed above don't apply: an RTP P17L39 mixer can combine multiple audio sources, for example, and the same P17L40 treatment is applicable for all of them. It may also be appropriate P17L41 to multiplex streams of the same medium using different SSRC values P17L42 in other scenarios where the last two problems do not apply. P17L43 P17L44 P17L45 P17L46 P17L47 P17L48 P18L1 5.3 Profile-Specific Modifications to the RTP Header P18L2 P18L3 The existing RTP data packet header is believed to be complete for P18L4 the set of functions required in common across all the application P18L5 classes that RTP might support. However, in keeping with the ALF P18L6 design principle, the header MAY be tailored through modifications or P18L7 additions defined in a profile specification while still allowing P18L8 profile-independent monitoring and recording tools to function. P18L9 P18L10 o The marker bit and payload type field carry profile-specific P18L11 information, but they are allocated in the fixed header since many P18L12 applications are expected to need them and might otherwise have to P18L13 add another 32-bit word just to hold them. The octet containing P18L14 these fields MAY be redefined by a profile to suit different P18L15 requirements, for example with more or fewer marker bits. If P18L16 there are any marker bits, one SHOULD be located in the most P18L17 significant bit of the octet since profile-independent monitors P18L18 may be able to observe a correlation between packet loss patterns P18L19 and the marker bit. P18L20 P18L21 o Additional information that is required for a particular payload P18L22 format, such as a video encoding, SHOULD be carried in the payload P18L23 section of the packet. This might be in a header that is always P18L24 present at the start of the payload section, or might be indicated P18L25 by a reserved value in the data pattern. P18L26 P18L27 o If a particular class of applications needs additional P18L28 functionality independent of payload format, the profile under P18L29 which those applications operate SHOULD define additional fixed P18L30 fields to follow immediately after the SSRC field of the existing P18L31 fixed header. Those applications will be able to quickly and P18L32 directly access the additional fields while profile-independent P18L33 monitors or recorders can still process the RTP packets by P18L34 interpreting only the first twelve octets. P18L35 P18L36 If it turns out that additional functionality is needed in common P18L37 across all profiles, then a new version of RTP should be defined to P18L38 make a permanent change to the fixed header. P18L39 P18L40 5.3.1 RTP Header Extension P18L41 P18L42 An extension mechanism is provided to allow individual P18L43 implementations to experiment with new payload-format-independent P18L44 functions that require additional information to be carried in the P18L45 RTP data packet header. This mechanism is designed so that the P18L46 header extension may be ignored by other interoperating P18L47 implementations that have not been extended. P18L48 P19L1 Note that this header extension is intended only for limited use. P19L2 Most potential uses of this mechanism would be better done another P19L3 way, using the methods described in the previous section. For P19L4 example, a profile-specific extension to the fixed header is less P19L5 expensive to process because it is not conditional nor in a variable P19L6 location. Additional information required for a particular payload P19L7 format SHOULD NOT use this header extension, but SHOULD be carried in P19L8 the payload section of the packet. P19L9 P19L10 0 1 2 3 P19L11 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P19L12 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P19L13 | defined by profile | length | P19L14 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P19L15 | header extension | P19L16 | .... | P19L17 P19L18 If the X bit in the RTP header is one, a variable-length header P19L19 extension MUST be appended to the RTP header, following the CSRC list P19L20 if present. The header extension contains a 16-bit length field that P19L21 counts the number of 32-bit words in the extension, excluding the P19L22 four-octet extension header (therefore zero is a valid length). Only P19L23 a single extension can be appended to the RTP data header. To allow P19L24 multiple interoperating implementations to each experiment P19L25 independently with different header extensions, or to allow a P19L26 particular implementation to experiment with more than one type of P19L27 header extension, the first 16 bits of the header extension are left P19L28 open for distinguishing identifiers or parameters. The format of P19L29 these 16 bits is to be defined by the profile specification under P19L30 which the implementations are operating. This RTP specification does P19L31 not define any header extensions itself. P19L32 P19L33 6. RTP Control Protocol -- RTCP P19L34 P19L35 The RTP control protocol (RTCP) is based on the periodic transmission P19L36 of control packets to all participants in the session, using the same P19L37 distribution mechanism as the data packets. The underlying protocol P19L38 MUST provide multiplexing of the data and control packets, for P19L39 example using separate port numbers with UDP. RTCP performs four P19L40 functions: P19L41 P19L42 1. The primary function is to provide feedback on the quality of the P19L43 data distribution. This is an integral part of the RTP's role as P19L44 a transport protocol and is related to the flow and congestion P19L45 control functions of other transport protocols (see Section 10 on P19L46 the requirement for congestion control). The feedback may be P19L47 directly useful for control of adaptive encodings [18,19], but P19L48 experiments with IP multicasting have shown that it is also P20L1 critical to get feedback from the receivers to diagnose faults in P20L2 the distribution. Sending reception feedback reports to all P20L3 participants allows one who is observing problems to evaluate P20L4 whether those problems are local or global. With a distribution P20L5 mechanism like IP multicast, it is also possible for an entity P20L6 such as a network service provider who is not otherwise involved P20L7 in the session to receive the feedback information and act as a P20L8 third-party monitor to diagnose network problems. This feedback P20L9 function is performed by the RTCP sender and receiver reports, P20L10 described below in Section 6.4. P20L11 P20L12 2. RTCP carries a persistent transport-level identifier for an RTP P20L13 source called the canonical name or CNAME, Section 6.5.1. Since P20L14 the SSRC identifier may change if a conflict is discovered or a P20L15 program is restarted, receivers require the CNAME to keep track of P20L16 each participant. Receivers may also require the CNAME to P20L17 associate multiple data streams from a given participant in a set P20L18 of related RTP sessions, for example to synchronize audio and P20L19 video. Inter-media synchronization also requires the NTP and RTP P20L20 timestamps included in RTCP packets by data senders. P20L21 P20L22 3. The first two functions require that all participants send RTCP P20L23 packets, therefore the rate must be controlled in order for RTP to P20L24 scale up to a large number of participants. By having each P20L25 participant send its control packets to all the others, each can P20L26 independently observe the number of participants. This number is P20L27 used to calculate the rate at which the packets are sent, as P20L28 explained in Section 6.2. P20L29 P20L30 4. A fourth, OPTIONAL function is to convey minimal session control P20L31 information, for example participant identification to be P20L32 displayed in the user interface. This is most likely to be useful P20L33 in "loosely controlled" sessions where participants enter and P20L34 leave without membership control or parameter negotiation. RTCP P20L35 serves as a convenient channel to reach all the participants, but P20L36 it is not necessarily expected to support all the control P20L37 communication requirements of an application. A higher-level P20L38 session control protocol, which is beyond the scope of this P20L39 document, may be needed. P20L40 P20L41 Functions 1-3 SHOULD be used in all environments, but particularly in P20L42 the IP multicast environment. RTP application designers SHOULD avoid P20L43 mechanisms that can only work in unicast mode and will not scale to P20L44 larger numbers. Transmission of RTCP MAY be controlled separately P20L45 for senders and receivers, as described in Section 6.2, for cases P20L46 such as unidirectional links where feedback from receivers is not P20L47 possible. P20L48 P21L1 Non-normative note: In the multicast routing approach P21L2 called Source-Specific Multicast (SSM), there is only one sender P21L3 per "channel" (a source address, group address pair), and P21L4 receivers (except for the channel source) cannot use multicast to P21L5 communicate directly with other channel members. The P21L6 recommendations here accommodate SSM only through Section 6.2's P21L7 option of turning off receivers' RTCP entirely. Future work will P21L8 specify adaptation of RTCP for SSM so that feedback from receivers P21L9 can be maintained. P21L10 P21L11 6.1 RTCP Packet Format P21L12 P21L13 This specification defines several RTCP packet types to carry a P21L14 variety of control information: P21L15 P21L16 SR: Sender report, for transmission and reception statistics from P21L17 participants that are active senders P21L18 P21L19 RR: Receiver report, for reception statistics from participants P21L20 that are not active senders and in combination with SR for P21L21 active senders reporting on more than 31 sources P21L22 P21L23 SDES: Source description items, including CNAME P21L24 P21L25 BYE: Indicates end of participation P21L26 P21L27 APP: Application-specific functions P21L28 P21L29 Each RTCP packet begins with a fixed part similar to that of RTP data P21L30 packets, followed by structured elements that MAY be of variable P21L31 length according to the packet type but MUST end on a 32-bit P21L32 boundary. The alignment requirement and a length field in the fixed P21L33 part of each packet are included to make RTCP packets "stackable". P21L34 Multiple RTCP packets can be concatenated without any intervening P21L35 separators to form a compound RTCP packet that is sent in a single P21L36 packet of the lower layer protocol, for example UDP. There is no P21L37 explicit count of individual RTCP packets in the compound packet P21L38 since the lower layer protocols are expected to provide an overall P21L39 length to determine the end of the compound packet. P21L40 P21L41 Each individual RTCP packet in the compound packet may be processed P21L42 independently with no requirements upon the order or combination of P21L43 packets. However, in order to perform the functions of the protocol, P21L44 the following constraints are imposed: P21L45 P21L46 P21L47 P21L48 P22L1 o Reception statistics (in SR or RR) should be sent as often as P22L2 bandwidth constraints will allow to maximize the resolution of the P22L3 statistics, therefore each periodically transmitted compound RTCP P22L4 packet MUST include a report packet. P22L5 P22L6 o New receivers need to receive the CNAME for a source as soon as P22L7 possible to identify the source and to begin associating media for P22L8 purposes such as lip-sync, so each compound RTCP packet MUST also P22L9 include the SDES CNAME except when the compound RTCP packet is P22L10 split for partial encryption as described in Section 9.1. P22L11 P22L12 o The number of packet types that may appear first in the compound P22L13 packet needs to be limited to increase the number of constant bits P22L14 in the first word and the probability of successfully validating P22L15 RTCP packets against misaddressed RTP data packets or other P22L16 unrelated packets. P22L17 P22L18 Thus, all RTCP packets MUST be sent in a compound packet of at least P22L19 two individual packets, with the following format: P22L20 P22L21 Encryption prefix: If and only if the compound packet is to be P22L22 encrypted according to the method in Section 9.1, it MUST be P22L23 prefixed by a random 32-bit quantity redrawn for every compound P22L24 packet transmitted. If padding is required for the encryption, it P22L25 MUST be added to the last packet of the compound packet. P22L26 P22L27 SR or RR: The first RTCP packet in the compound packet MUST P22L28 always be a report packet to facilitate header validation as P22L29 described in Appendix A.2. This is true even if no data has been P22L30 sent or received, in which case an empty RR MUST be sent, and even P22L31 if the only other RTCP packet in the compound packet is a BYE. P22L32 P22L33 Additional RRs: If the number of sources for which reception P22L34 statistics are being reported exceeds 31, the number that will fit P22L35 into one SR or RR packet, then additional RR packets SHOULD follow P22L36 the initial report packet. P22L37 P22L38 SDES: An SDES packet containing a CNAME item MUST be included P22L39 in each compound RTCP packet, except as noted in Section 9.1. P22L40 Other source description items MAY optionally be included if P22L41 required by a particular application, subject to bandwidth P22L42 constraints (see Section 6.3.9). P22L43 P22L44 BYE or APP: Other RTCP packet types, including those yet to be P22L45 defined, MAY follow in any order, except that BYE SHOULD be the P22L46 last packet sent with a given SSRC/CSRC. Packet types MAY appear P22L47 more than once. P22L48 P23L1 An individual RTP participant SHOULD send only one compound RTCP P23L2 packet per report interval in order for the RTCP bandwidth per P23L3 participant to be estimated correctly (see Section 6.2), except when P23L4 the compound RTCP packet is split for partial encryption as described P23L5 in Section 9.1. If there are too many sources to fit all the P23L6 necessary RR packets into one compound RTCP packet without exceeding P23L7 the maximum transmission unit (MTU) of the network path, then only P23L8 the subset that will fit into one MTU SHOULD be included in each P23L9 interval. The subsets SHOULD be selected round-robin across multiple P23L10 intervals so that all sources are reported. P23L11 P23L12 It is RECOMMENDED that translators and mixers combine individual RTCP P23L13 packets from the multiple sources they are forwarding into one P23L14 compound packet whenever feasible in order to amortize the packet P23L15 overhead (see Section 7). An example RTCP compound packet as might P23L16 be produced by a mixer is shown in Fig. 1. If the overall length of P23L17 a compound packet would exceed the MTU of the network path, it SHOULD P23L18 be segmented into multiple shorter compound packets to be transmitted P23L19 in separate packets of the underlying protocol. This does not impair P23L20 the RTCP bandwidth estimation because each compound packet represents P23L21 at least one distinct participant. Note that each of the compound P23L22 packets MUST begin with an SR or RR packet. P23L23 P23L24 An implementation SHOULD ignore incoming RTCP packets with types P23L25 unknown to it. Additional RTCP packet types may be registered with P23L26 the Internet Assigned Numbers Authority (IANA) as described in P23L27 Section 15. P23L28 P23L29 if encrypted: random 32-bit integer P23L30 | P23L31 |[--------- packet --------][---------- packet ----------][-packet-] P23L32 | P23L33 | receiver chunk chunk P23L34 V reports item item item item P23L35 -------------------------------------------------------------------- P23L36 R[SR #sendinfo #site1#site2][SDES #CNAME PHONE #CNAME LOC][BYE##why] P23L37 -------------------------------------------------------------------- P23L38 | | P23L39 |<----------------------- compound packet ----------------------->| P23L40 |<-------------------------- UDP packet ------------------------->| P23L41 P23L42 #: SSRC/CSRC identifier P23L43 P23L44 Figure 1: Example of an RTCP compound packet P23L45 P23L46 P23L47 P23L48 P24L1 6.2 RTCP Transmission Interval P24L2 P24L3 RTP is designed to allow an application to scale automatically over P24L4 session sizes ranging from a few participants to thousands. For P24L5 example, in an audio conference the data traffic is inherently self- P24L6 limiting because only one or two people will speak at a time, so with P24L7 multicast distribution the data rate on any given link remains P24L8 relatively constant independent of the number of participants. P24L9 However, the control traffic is not self-limiting. If the reception P24L10 reports from each participant were sent at a constant rate, the P24L11 control traffic would grow linearly with the number of participants. P24L12 Therefore, the rate must be scaled down by dynamically calculating P24L13 the interval between RTCP packet transmissions. P24L14 P24L15 For each session, it is assumed that the data traffic is subject to P24L16 an aggregate limit called the "session bandwidth" to be divided among P24L17 the participants. This bandwidth might be reserved and the limit P24L18 enforced by the network. If there is no reservation, there may be P24L19 other constraints, depending on the environment, that establish the P24L20 "reasonable" maximum for the session to use, and that would be the P24L21 session bandwidth. The session bandwidth may be chosen based on some P24L22 cost or a priori knowledge of the available network bandwidth for the P24L23 session. It is somewhat independent of the media encoding, but the P24L24 encoding choice may be limited by the session bandwidth. Often, the P24L25 session bandwidth is the sum of the nominal bandwidths of the senders P24L26 expected to be concurrently active. For teleconference audio, this P24L27 number would typically be one sender's bandwidth. For layered P24L28 encodings, each layer is a separate RTP session with its own session P24L29 bandwidth parameter. P24L30 P24L31 The session bandwidth parameter is expected to be supplied by a P24L32 session management application when it invokes a media application, P24L33 but media applications MAY set a default based on the single-sender P24L34 data bandwidth for the encoding selected for the session. The P24L35 application MAY also enforce bandwidth limits based on multicast P24L36 scope rules or other criteria. All participants MUST use the same P24L37 value for the session bandwidth so that the same RTCP interval will P24L38 be calculated. P24L39 P24L40 Bandwidth calculations for control and data traffic include lower- P24L41 layer transport and network protocols (e.g., UDP and IP) since that P24L42 is what the resource reservation system would need to know. The P24L43 application can also be expected to know which of these protocols are P24L44 in use. Link level headers are not included in the calculation since P24L45 the packet will be encapsulated with different link level headers as P24L46 it travels. P24L47 P24L48 P25L1 The control traffic should be limited to a small and known fraction P25L2 of the session bandwidth: small so that the primary function of the P25L3 transport protocol to carry data is not impaired; known so that the P25L4 control traffic can be included in the bandwidth specification given P25L5 to a resource reservation protocol, and so that each participant can P25L6 independently calculate its share. The control traffic bandwidth is P25L7 in addition to the session bandwidth for the data traffic. It is P25L8 RECOMMENDED that the fraction of the session bandwidth added for RTCP P25L9 be fixed at 5%. It is also RECOMMENDED that 1/4 of the RTCP P25L10 bandwidth be dedicated to participants that are sending data so that P25L11 in sessions with a large number of receivers but a small number of P25L12 senders, newly joining participants will more quickly receive the P25L13 CNAME for the sending sites. When the proportion of senders is P25L14 greater than 1/4 of the participants, the senders get their P25L15 proportion of the full RTCP bandwidth. While the values of these and P25L16 other constants in the interval calculation are not critical, all P25L17 participants in the session MUST use the same values so the same P25L18 interval will be calculated. Therefore, these constants SHOULD be P25L19 fixed for a particular profile. P25L20 P25L21 A profile MAY specify that the control traffic bandwidth may be a P25L22 separate parameter of the session rather than a strict percentage of P25L23 the session bandwidth. Using a separate parameter allows rate- P25L24 adaptive applications to set an RTCP bandwidth consistent with a P25L25 "typical" data bandwidth that is lower than the maximum bandwidth P25L26 specified by the session bandwidth parameter. P25L27 P25L28 The profile MAY further specify that the control traffic bandwidth P25L29 may be divided into two separate session parameters for those P25L30 participants which are active data senders and those which are not; P25L31 let us call the parameters S and R. Following the recommendation P25L32 that 1/4 of the RTCP bandwidth be dedicated to data senders, the P25L33 RECOMMENDED default values for these two parameters would be 1.25% P25L34 and 3.75%, respectively. When the proportion of senders is greater P25L35 than S/(S+R) of the participants, the senders get their proportion of P25L36 the sum of these parameters. Using two parameters allows RTCP P25L37 reception reports to be turned off entirely for a particular session P25L38 by setting the RTCP bandwidth for non-data-senders to zero while P25L39 keeping the RTCP bandwidth for data senders non-zero so that sender P25L40 reports can still be sent for inter-media synchronization. Turning P25L41 off RTCP reception reports is NOT RECOMMENDED because they are needed P25L42 for the functions listed at the beginning of Section 6, particularly P25L43 reception quality feedback and congestion control. However, doing so P25L44 may be appropriate for systems operating on unidirectional links or P25L45 for sessions that don't require feedback on the quality of reception P25L46 or liveness of receivers and that have other means to avoid P25L47 congestion. P25L48 P26L1 The calculated interval between transmissions of compound RTCP P26L2 packets SHOULD also have a lower bound to avoid having bursts of P26L3 packets exceed the allowed bandwidth when the number of participants P26L4 is small and the traffic isn't smoothed according to the law of large P26L5 numbers. It also keeps the report interval from becoming too small P26L6 during transient outages like a network partition such that P26L7 adaptation is delayed when the partition heals. At application P26L8 startup, a delay SHOULD be imposed before the first compound RTCP P26L9 packet is sent to allow time for RTCP packets to be received from P26L10 other participants so the report interval will converge to the P26L11 correct value more quickly. This delay MAY be set to half the P26L12 minimum interval to allow quicker notification that the new P26L13 participant is present. The RECOMMENDED value for a fixed minimum P26L14 interval is 5 seconds. P26L15 P26L16 An implementation MAY scale the minimum RTCP interval to a smaller P26L17 value inversely proportional to the session bandwidth parameter with P26L18 the following limitations: P26L19 P26L20 o For multicast sessions, only active data senders MAY use the P26L21 reduced minimum value to calculate the interval for transmission P26L22 of compound RTCP packets. P26L23 P26L24 o For unicast sessions, the reduced value MAY be used by P26L25 participants that are not active data senders as well, and the P26L26 delay before sending the initial compound RTCP packet MAY be zero. P26L27 P26L28 o For all sessions, the fixed minimum SHOULD be used when P26L29 calculating the participant timeout interval (see Section 6.3.5) P26L30 so that implementations which do not use the reduced value for P26L31 transmitting RTCP packets are not timed out by other participants P26L32 prematurely. P26L33 P26L34 o The RECOMMENDED value for the reduced minimum in seconds is 360 P26L35 divided by the session bandwidth in kilobits/second. This minimum P26L36 is smaller than 5 seconds for bandwidths greater than 72 kb/s. P26L37 P26L38 The algorithm described in Section 6.3 and Appendix A.7 was designed P26L39 to meet the goals outlined in this section. It calculates the P26L40 interval between sending compound RTCP packets to divide the allowed P26L41 control traffic bandwidth among the participants. This allows an P26L42 application to provide fast response for small sessions where, for P26L43 example, identification of all participants is important, yet P26L44 automatically adapt to large sessions. The algorithm incorporates P26L45 the following characteristics: P26L46 P26L47 P26L48 P27L1 o The calculated interval between RTCP packets scales linearly with P27L2 the number of members in the group. It is this linear factor P27L3 which allows for a constant amount of control traffic when summed P27L4 across all members. P27L5 P27L6 o The interval between RTCP packets is varied randomly over the P27L7 range [0.5,1.5] times the calculated interval to avoid unintended P27L8 synchronization of all participants [20]. The first RTCP packet P27L9 sent after joining a session is also delayed by a random variation P27L10 of half the minimum RTCP interval. P27L11 P27L12 o A dynamic estimate of the average compound RTCP packet size is P27L13 calculated, including all those packets received and sent, to P27L14 automatically adapt to changes in the amount of control P27L15 information carried. P27L16 P27L17 o Since the calculated interval is dependent on the number of P27L18 observed group members, there may be undesirable startup effects P27L19 when a new user joins an existing session, or many users P27L20 simultaneously join a new session. These new users will initially P27L21 have incorrect estimates of the group membership, and thus their P27L22 RTCP transmission interval will be too short. This problem can be P27L23 significant if many users join the session simultaneously. To P27L24 deal with this, an algorithm called "timer reconsideration" is P27L25 employed. This algorithm implements a simple back-off mechanism P27L26 which causes users to hold back RTCP packet transmission if the P27L27 group sizes are increasing. P27L28 P27L29 o When users leave a session, either with a BYE or by timeout, the P27L30 group membership decreases, and thus the calculated interval P27L31 should decrease. A "reverse reconsideration" algorithm is used to P27L32 allow members to more quickly reduce their intervals in response P27L33 to group membership decreases. P27L34 P27L35 o BYE packets are given different treatment than other RTCP packets. P27L36 When a user leaves a group, and wishes to send a BYE packet, it P27L37 may do so before its next scheduled RTCP packet. However, P27L38 transmission of BYEs follows a back-off algorithm which avoids P27L39 floods of BYE packets should a large number of members P27L40 simultaneously leave the session. P27L41 P27L42 This algorithm may be used for sessions in which all participants are P27L43 allowed to send. In that case, the session bandwidth parameter is P27L44 the product of the individual sender's bandwidth times the number of P27L45 participants, and the RTCP bandwidth is 5% of that. P27L46 P27L47 Details of the algorithm's operation are given in the sections that P27L48 follow. Appendix A.7 gives an example implementation. P28L1 6.2.1 Maintaining the Number of Session Members P28L2 P28L3 Calculation of the RTCP packet interval depends upon an estimate of P28L4 the number of sites participating in the session. New sites are P28L5 added to the count when they are heard, and an entry for each SHOULD P28L6 be created in a table indexed by the SSRC or CSRC identifier (see P28L7 Section 8.2) to keep track of them. New entries MAY be considered P28L8 not valid until multiple packets carrying the new SSRC have been P28L9 received (see Appendix A.1), or until an SDES RTCP packet containing P28L10 a CNAME for that SSRC has been received. Entries MAY be deleted from P28L11 the table when an RTCP BYE packet with the corresponding SSRC P28L12 identifier is received, except that some straggler data packets might P28L13 arrive after the BYE and cause the entry to be recreated. Instead, P28L14 the entry SHOULD be marked as having received a BYE and then deleted P28L15 after an appropriate delay. P28L16 P28L17 A participant MAY mark another site inactive, or delete it if not yet P28L18 valid, if no RTP or RTCP packet has been received for a small number P28L19 of RTCP report intervals (5 is RECOMMENDED). This provides some P28L20 robustness against packet loss. All sites must have the same value P28L21 for this multiplier and must calculate roughly the same value for the P28L22 RTCP report interval in order for this timeout to work properly. P28L23 Therefore, this multiplier SHOULD be fixed for a particular profile. P28L24 P28L25 For sessions with a very large number of participants, it may be P28L26 impractical to maintain a table to store the SSRC identifier and P28L27 state information for all of them. An implementation MAY use SSRC P28L28 sampling, as described in [21], to reduce the storage requirements. P28L29 An implementation MAY use any other algorithm with similar P28L30 performance. A key requirement is that any algorithm considered P28L31 SHOULD NOT substantially underestimate the group size, although it P28L32 MAY overestimate. P28L33 P28L34 6.3 RTCP Packet Send and Receive Rules P28L35 P28L36 The rules for how to send, and what to do when receiving an RTCP P28L37 packet are outlined here. An implementation that allows operation in P28L38 a multicast environment or a multipoint unicast environment MUST meet P28L39 the requirements in Section 6.2. Such an implementation MAY use the P28L40 algorithm defined in this section to meet those requirements, or MAY P28L41 use some other algorithm so long as it provides equivalent or better P28L42 performance. An implementation which is constrained to two-party P28L43 unicast operation SHOULD still use randomization of the RTCP P28L44 transmission interval to avoid unintended synchronization of multiple P28L45 instances operating in the same environment, but MAY omit the "timer P28L46 reconsideration" and "reverse reconsideration" algorithms in Sections P28L47 6.3.3, 6.3.6 and 6.3.7. P28L48 P29L1 To execute these rules, a session participant must maintain several P29L2 pieces of state: P29L3 P29L4 tp: the last time an RTCP packet was transmitted; P29L5 P29L6 tc: the current time; P29L7 P29L8 tn: the next scheduled transmission time of an RTCP packet; P29L9 P29L10 pmembers: the estimated number of session members at the time tn P29L11 was last recomputed; P29L12 P29L13 members: the most current estimate for the number of session P29L14 members; P29L15 P29L16 senders: the most current estimate for the number of senders in P29L17 the session; P29L18 P29L19 rtcp_bw: The target RTCP bandwidth, i.e., the total bandwidth P29L20 that will be used for RTCP packets by all members of this session, P29L21 in octets per second. This will be a specified fraction of the P29L22 "session bandwidth" parameter supplied to the application at P29L23 startup. P29L24 P29L25 we_sent: Flag that is true if the application has sent data P29L26 since the 2nd previous RTCP report was transmitted. P29L27 P29L28 avg_rtcp_size: The average compound RTCP packet size, in octets, P29L29 over all RTCP packets sent and received by this participant. The P29L30 size includes lower-layer transport and network protocol headers P29L31 (e.g., UDP and IP) as explained in Section 6.2. P29L32 P29L33 initial: Flag that is true if the application has not yet sent P29L34 an RTCP packet. P29L35 P29L36 Many of these rules make use of the "calculated interval" between P29L37 packet transmissions. This interval is described in the following P29L38 section. P29L39 P29L40 6.3.1 Computing the RTCP Transmission Interval P29L41 P29L42 To maintain scalability, the average interval between packets from a P29L43 session participant should scale with the group size. This interval P29L44 is called the calculated interval. It is obtained by combining a P29L45 number of the pieces of state described above. The calculated P29L46 interval T is then determined as follows: P29L47 P29L48 P30L1 1. If the number of senders is less than or equal to 25% of the P30L2 membership (members), the interval depends on whether the P30L3 participant is a sender or not (based on the value of we_sent). P30L4 If the participant is a sender (we_sent true), the constant C is P30L5 set to the average RTCP packet size (avg_rtcp_size) divided by 25% P30L6 of the RTCP bandwidth (rtcp_bw), and the constant n is set to the P30L7 number of senders. If we_sent is not true, the constant C is set P30L8 to the average RTCP packet size divided by 75% of the RTCP P30L9 bandwidth. The constant n is set to the number of receivers P30L10 (members - senders). If the number of senders is greater than P30L11 25%, senders and receivers are treated together. The constant C P30L12 is set to the average RTCP packet size divided by the total RTCP P30L13 bandwidth and n is set to the total number of members. As stated P30L14 in Section 6.2, an RTP profile MAY specify that the RTCP bandwidth P30L15 may be explicitly defined by two separate parameters (call them S P30L16 and R) for those participants which are senders and those which P30L17 are not. In that case, the 25% fraction becomes S/(S+R) and the P30L18 75% fraction becomes R/(S+R). Note that if R is zero, the P30L19 percentage of senders is never greater than S/(S+R), and the P30L20 implementation must avoid division by zero. P30L21 P30L22 2. If the participant has not yet sent an RTCP packet (the variable P30L23 initial is true), the constant Tmin is set to 2.5 seconds, else it P30L24 is set to 5 seconds. P30L25 P30L26 3. The deterministic calculated interval Td is set to max(Tmin, n*C). P30L27 P30L28 4. The calculated interval T is set to a number uniformly distributed P30L29 between 0.5 and 1.5 times the deterministic calculated interval. P30L30 P30L31 5. The resulting value of T is divided by e-3/2=1.21828 to compensate P30L32 for the fact that the timer reconsideration algorithm converges to P30L33 a value of the RTCP bandwidth below the intended average. P30L34 P30L35 This procedure results in an interval which is random, but which, on P30L36 average, gives at least 25% of the RTCP bandwidth to senders and the P30L37 rest to receivers. If the senders constitute more than one quarter P30L38 of the membership, this procedure splits the bandwidth equally among P30L39 all participants, on average. P30L40 P30L41 6.3.2 Initialization P30L42 P30L43 Upon joining the session, the participant initializes tp to 0, tc to P30L44 0, senders to 0, pmembers to 1, members to 1, we_sent to false, P30L45 rtcp_bw to the specified fraction of the session bandwidth, initial P30L46 to true, and avg_rtcp_size to the probable size of the first RTCP P30L47 packet that the application will later construct. The calculated P30L48 interval T is then computed, and the first packet is scheduled for P31L1 time tn = T. This means that a transmission timer is set which P31L2 expires at time T. Note that an application MAY use any desired P31L3 approach for implementing this timer. P31L4 P31L5 The participant adds its own SSRC to the member table. P31L6 P31L7 6.3.3 Receiving an RTP or Non-BYE RTCP Packet P31L8 P31L9 When an RTP or RTCP packet is received from a participant whose SSRC P31L10 is not in the member table, the SSRC is added to the table, and the P31L11 value for members is updated once the participant has been validated P31L12 as described in Section 6.2.1. The same processing occurs for each P31L13 CSRC in a validated RTP packet. P31L14 P31L15 When an RTP packet is received from a participant whose SSRC is not P31L16 in the sender table, the SSRC is added to the table, and the value P31L17 for senders is updated. P31L18 P31L19 For each compound RTCP packet received, the value of avg_rtcp_size is P31L20 updated: P31L21 P31L22 avg_rtcp_size = (1/16) * packet_size + (15/16) * avg_rtcp_size P31L23 P31L24 where packet_size is the size of the RTCP packet just received. P31L25 P31L26 6.3.4 Receiving an RTCP BYE Packet P31L27 P31L28 Except as described in Section 6.3.7 for the case when an RTCP BYE is P31L29 to be transmitted, if the received packet is an RTCP BYE packet, the P31L30 SSRC is checked against the member table. If present, the entry is P31L31 removed from the table, and the value for members is updated. The P31L32 SSRC is then checked against the sender table. If present, the entry P31L33 is removed from the table, and the value for senders is updated. P31L34 P31L35 Furthermore, to make the transmission rate of RTCP packets more P31L36 adaptive to changes in group membership, the following "reverse P31L37 reconsideration" algorithm SHOULD be executed when a BYE packet is P31L38 received that reduces members to a value less than pmembers: P31L39 P31L40 o The value for tn is updated according to the following formula: P31L41 P31L42 tn = tc + (members/pmembers) * (tn - tc) P31L43 P31L44 o The value for tp is updated according the following formula: P31L45 P31L46 tp = tc - (members/pmembers) * (tc - tp). P31L47 P31L48 P32L1 o The next RTCP packet is rescheduled for transmission at time tn, P32L2 which is now earlier. P32L3 P32L4 o The value of pmembers is set equal to members. P32L5 P32L6 This algorithm does not prevent the group size estimate from P32L7 incorrectly dropping to zero for a short time due to premature P32L8 timeouts when most participants of a large session leave at once but P32L9 some remain. The algorithm does make the estimate return to the P32L10 correct value more rapidly. This situation is unusual enough and the P32L11 consequences are sufficiently harmless that this problem is deemed P32L12 only a secondary concern. P32L13 P32L14 6.3.5 Timing Out an SSRC P32L15 P32L16 At occasional intervals, the participant MUST check to see if any of P32L17 the other participants time out. To do this, the participant P32L18 computes the deterministic (without the randomization factor) P32L19 calculated interval Td for a receiver, that is, with we_sent false. P32L20 Any other session member who has not sent an RTP or RTCP packet since P32L21 time tc - MTd (M is the timeout multiplier, and defaults to 5) is P32L22 timed out. This means that its SSRC is removed from the member list, P32L23 and members is updated. A similar check is performed on the sender P32L24 list. Any member on the sender list who has not sent an RTP packet P32L25 since time tc - 2T (within the last two RTCP report intervals) is P32L26 removed from the sender list, and senders is updated. P32L27 P32L28 If any members time out, the reverse reconsideration algorithm P32L29 described in Section 6.3.4 SHOULD be performed. P32L30 P32L31 The participant MUST perform this check at least once per RTCP P32L32 transmission interval. P32L33 P32L34 6.3.6 Expiration of Transmission Timer P32L35 P32L36 When the packet transmission timer expires, the participant performs P32L37 the following operations: P32L38 P32L39 o The transmission interval T is computed as described in Section P32L40 6.3.1, including the randomization factor. P32L41 P32L42 o If tp + T is less than or equal to tc, an RTCP packet is P32L43 transmitted. tp is set to tc, then another value for T is P32L44 calculated as in the previous step and tn is set to tc + T. The P32L45 transmission timer is set to expire again at time tn. If tp + T P32L46 is greater than tc, tn is set to tp + T. No RTCP packet is P32L47 transmitted. The transmission timer is set to expire at time tn. P32L48 P33L1 o pmembers is set to members. P33L2 P33L3 If an RTCP packet is transmitted, the value of initial is set to P33L4 FALSE. Furthermore, the value of avg_rtcp_size is updated: P33L5 P33L6 avg_rtcp_size = (1/16) * packet_size + (15/16) * avg_rtcp_size P33L7 P33L8 where packet_size is the size of the RTCP packet just transmitted. P33L9 P33L10 6.3.7 Transmitting a BYE Packet P33L11 P33L12 When a participant wishes to leave a session, a BYE packet is P33L13 transmitted to inform the other participants of the event. In order P33L14 to avoid a flood of BYE packets when many participants leave the P33L15 system, a participant MUST execute the following algorithm if the P33L16 number of members is more than 50 when the participant chooses to P33L17 leave. This algorithm usurps the normal role of the members variable P33L18 to count BYE packets instead: P33L19 P33L20 o When the participant decides to leave the system, tp is reset to P33L21 tc, the current time, members and pmembers are initialized to 1, P33L22 initial is set to 1, we_sent is set to false, senders is set to 0, P33L23 and avg_rtcp_size is set to the size of the compound BYE packet. P33L24 The calculated interval T is computed. The BYE packet is then P33L25 scheduled for time tn = tc + T. P33L26 P33L27 o Every time a BYE packet from another participant is received, P33L28 members is incremented by 1 regardless of whether that participant P33L29 exists in the member table or not, and when SSRC sampling is in P33L30 use, regardless of whether or not the BYE SSRC would be included P33L31 in the sample. members is NOT incremented when other RTCP packets P33L32 or RTP packets are received, but only for BYE packets. Similarly, P33L33 avg_rtcp_size is updated only for received BYE packets. senders P33L34 is NOT updated when RTP packets arrive; it remains 0. P33L35 P33L36 o Transmission of the BYE packet then follows the rules for P33L37 transmitting a regular RTCP packet, as above. P33L38 P33L39 This allows BYE packets to be sent right away, yet controls their P33L40 total bandwidth usage. In the worst case, this could cause RTCP P33L41 control packets to use twice the bandwidth as normal (10%) -- 5% for P33L42 non-BYE RTCP packets and 5% for BYE. P33L43 P33L44 A participant that does not want to wait for the above mechanism to P33L45 allow transmission of a BYE packet MAY leave the group without P33L46 sending a BYE at all. That participant will eventually be timed out P33L47 by the other group members. P33L48 P34L1 If the group size estimate members is less than 50 when the P34L2 participant decides to leave, the participant MAY send a BYE packet P34L3 immediately. Alternatively, the participant MAY choose to execute P34L4 the above BYE backoff algorithm. P34L5 P34L6 In either case, a participant which never sent an RTP or RTCP packet P34L7 MUST NOT send a BYE packet when they leave the group. P34L8 P34L9 6.3.8 Updating we_sent P34L10 P34L11 The variable we_sent contains true if the participant has sent an RTP P34L12 packet recently, false otherwise. This determination is made by P34L13 using the same mechanisms as for managing the set of other P34L14 participants listed in the senders table. If the participant sends P34L15 an RTP packet when we_sent is false, it adds itself to the sender P34L16 table and sets we_sent to true. The reverse reconsideration P34L17 algorithm described in Section 6.3.4 SHOULD be performed to possibly P34L18 reduce the delay before sending an SR packet. Every time another RTP P34L19 packet is sent, the time of transmission of that packet is maintained P34L20 in the table. The normal sender timeout algorithm is then applied to P34L21 the participant -- if an RTP packet has not been transmitted since P34L22 time tc - 2T, the participant removes itself from the sender table, P34L23 decrements the sender count, and sets we_sent to false. P34L24 P34L25 6.3.9 Allocation of Source Description Bandwidth P34L26 P34L27 This specification defines several source description (SDES) items in P34L28 addition to the mandatory CNAME item, such as NAME (personal name) P34L29 and EMAIL (email address). It also provides a means to define new P34L30 application-specific RTCP packet types. Applications should exercise P34L31 caution in allocating control bandwidth to this additional P34L32 information because it will slow down the rate at which reception P34L33 reports and CNAME are sent, thus impairing the performance of the P34L34 protocol. It is RECOMMENDED that no more than 20% of the RTCP P34L35 bandwidth allocated to a single participant be used to carry the P34L36 additional information. Furthermore, it is not intended that all P34L37 SDES items will be included in every application. Those that are P34L38 included SHOULD be assigned a fraction of the bandwidth according to P34L39 their utility. Rather than estimate these fractions dynamically, it P34L40 is recommended that the percentages be translated statically into P34L41 report interval counts based on the typical length of an item. P34L42 P34L43 For example, an application may be designed to send only CNAME, NAME P34L44 and EMAIL and not any others. NAME might be given much higher P34L45 priority than EMAIL because the NAME would be displayed continuously P34L46 in the application's user interface, whereas EMAIL would be displayed P34L47 only when requested. At every RTCP interval, an RR packet and an P34L48 SDES packet with the CNAME item would be sent. For a small session P35L1 operating at the minimum interval, that would be every 5 seconds on P35L2 the average. Every third interval (15 seconds), one extra item would P35L3 be included in the SDES packet. Seven out of eight times this would P35L4 be the NAME item, and every eighth time (2 minutes) it would be the P35L5 EMAIL item. P35L6 P35L7 When multiple applications operate in concert using cross-application P35L8 binding through a common CNAME for each participant, for example in a P35L9 multimedia conference composed of an RTP session for each medium, the P35L10 additional SDES information MAY be sent in only one RTP session. The P35L11 other sessions would carry only the CNAME item. In particular, this P35L12 approach should be applied to the multiple sessions of a layered P35L13 encoding scheme (see Section 2.4). P35L14 P35L15 6.4 Sender and Receiver Reports P35L16 P35L17 RTP receivers provide reception quality feedback using RTCP report P35L18 packets which may take one of two forms depending upon whether or not P35L19 the receiver is also a sender. The only difference between the P35L20 sender report (SR) and receiver report (RR) forms, besides the packet P35L21 type code, is that the sender report includes a 20-byte sender P35L22 information section for use by active senders. The SR is issued if a P35L23 site has sent any data packets during the interval since issuing the P35L24 last report or the previous one, otherwise the RR is issued. P35L25 P35L26 Both the SR and RR forms include zero or more reception report P35L27 blocks, one for each of the synchronization sources from which this P35L28 receiver has received RTP data packets since the last report. P35L29 Reports are not issued for contributing sources listed in the CSRC P35L30 list. Each reception report block provides statistics about the data P35L31 received from the particular source indicated in that block. Since a P35L32 maximum of 31 reception report blocks will fit in an SR or RR packet, P35L33 additional RR packets SHOULD be stacked after the initial SR or RR P35L34 packet as needed to contain the reception reports for all sources P35L35 heard during the interval since the last report. If there are too P35L36 many sources to fit all the necessary RR packets into one compound P35L37 RTCP packet without exceeding the MTU of the network path, then only P35L38 the subset that will fit into one MTU SHOULD be included in each P35L39 interval. The subsets SHOULD be selected round-robin across multiple P35L40 intervals so that all sources are reported. P35L41 P35L42 The next sections define the formats of the two reports, how they may P35L43 be extended in a profile-specific manner if an application requires P35L44 additional feedback information, and how the reports may be used. P35L45 Details of reception reporting by translators and mixers is given in P35L46 Section 7. P35L47 P35L48 P36L1 6.4.1 SR: Sender Report RTCP Packet P36L2 P36L3 0 1 2 3 P36L4 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P36L5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L6 header |V=2|P| RC | PT=SR=200 | length | P36L7 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L8 | SSRC of sender | P36L9 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P36L10 sender | NTP timestamp, most significant word | P36L11 info +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L12 | NTP timestamp, least significant word | P36L13 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L14 | RTP timestamp | P36L15 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L16 | sender's packet count | P36L17 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L18 | sender's octet count | P36L19 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P36L20 report | SSRC_1 (SSRC of first source) | P36L21 block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L22 1 | fraction lost | cumulative number of packets lost | P36L23 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L24 | extended highest sequence number received | P36L25 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L26 | interarrival jitter | P36L27 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L28 | last SR (LSR) | P36L29 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L30 | delay since last SR (DLSR) | P36L31 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P36L32 report | SSRC_2 (SSRC of second source) | P36L33 block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L34 2 : ... : P36L35 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P36L36 | profile-specific extensions | P36L37 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P36L38 P36L39 The sender report packet consists of three sections, possibly P36L40 followed by a fourth profile-specific extension section if defined. P36L41 The first section, the header, is 8 octets long. The fields have the P36L42 following meaning: P36L43 P36L44 version (V): 2 bits P36L45 Identifies the version of RTP, which is the same in RTCP packets P36L46 as in RTP data packets. The version defined by this specification P36L47 is two (2). P36L48 P37L1 padding (P): 1 bit P37L2 If the padding bit is set, this individual RTCP packet contains P37L3 some additional padding octets at the end which are not part of P37L4 the control information but are included in the length field. The P37L5 last octet of the padding is a count of how many padding octets P37L6 should be ignored, including itself (it will be a multiple of P37L7 four). Padding may be needed by some encryption algorithms with P37L8 fixed block sizes. In a compound RTCP packet, padding is only P37L9 required on one individual packet because the compound packet is P37L10 encrypted as a whole for the method in Section 9.1. Thus, padding P37L11 MUST only be added to the last individual packet, and if padding P37L12 is added to that packet, the padding bit MUST be set only on that P37L13 packet. This convention aids the header validity checks described P37L14 in Appendix A.2 and allows detection of packets from some early P37L15 implementations that incorrectly set the padding bit on the first P37L16 individual packet and add padding to the last individual packet. P37L17 P37L18 reception report count (RC): 5 bits P37L19 The number of reception report blocks contained in this packet. A P37L20 value of zero is valid. P37L21 P37L22 packet type (PT): 8 bits P37L23 Contains the constant 200 to identify this as an RTCP SR packet. P37L24 P37L25 length: 16 bits P37L26 The length of this RTCP packet in 32-bit words minus one, P37L27 including the header and any padding. (The offset of one makes P37L28 zero a valid length and avoids a possible infinite loop in P37L29 scanning a compound RTCP packet, while counting 32-bit words P37L30 avoids a validity check for a multiple of 4.) P37L31 P37L32 SSRC: 32 bits P37L33 The synchronization source identifier for the originator of this P37L34 SR packet. P37L35 P37L36 The second section, the sender information, is 20 octets long and is P37L37 present in every sender report packet. It summarizes the data P37L38 transmissions from this sender. The fields have the following P37L39 meaning: P37L40 P37L41 NTP timestamp: 64 bits P37L42 Indicates the wallclock time (see Section 4) when this report was P37L43 sent so that it may be used in combination with timestamps P37L44 returned in reception reports from other receivers to measure P37L45 round-trip propagation to those receivers. Receivers should P37L46 expect that the measurement accuracy of the timestamp may be P37L47 limited to far less than the resolution of the NTP timestamp. The P37L48 measurement uncertainty of the timestamp is not indicated as it P38L1 may not be known. On a system that has no notion of wallclock P38L2 time but does have some system-specific clock such as "system P38L3 uptime", a sender MAY use that clock as a reference to calculate P38L4 relative NTP timestamps. It is important to choose a commonly P38L5 used clock so that if separate implementations are used to produce P38L6 the individual streams of a multimedia session, all P38L7 implementations will use the same clock. Until the year 2036, P38L8 relative and absolute timestamps will differ in the high bit so P38L9 (invalid) comparisons will show a large difference; by then one P38L10 hopes relative timestamps will no longer be needed. A sender that P38L11 has no notion of wallclock or elapsed time MAY set the NTP P38L12 timestamp to zero. P38L13 P38L14 RTP timestamp: 32 bits P38L15 Corresponds to the same time as the NTP timestamp (above), but in P38L16 the same units and with the same random offset as the RTP P38L17 timestamps in data packets. This correspondence may be used for P38L18 intra- and inter-media synchronization for sources whose NTP P38L19 timestamps are synchronized, and may be used by media-independent P38L20 receivers to estimate the nominal RTP clock frequency. Note that P38L21 in most cases this timestamp will not be equal to the RTP P38L22 timestamp in any adjacent data packet. Rather, it MUST be P38L23 calculated from the corresponding NTP timestamp using the P38L24 relationship between the RTP timestamp counter and real time as P38L25 maintained by periodically checking the wallclock time at a P38L26 sampling instant. P38L27 P38L28 sender's packet count: 32 bits P38L29 The total number of RTP data packets transmitted by the sender P38L30 since starting transmission up until the time this SR packet was P38L31 generated. The count SHOULD be reset if the sender changes its P38L32 SSRC identifier. P38L33 P38L34 sender's octet count: 32 bits P38L35 The total number of payload octets (i.e., not including header or P38L36 padding) transmitted in RTP data packets by the sender since P38L37 starting transmission up until the time this SR packet was P38L38 generated. The count SHOULD be reset if the sender changes its P38L39 SSRC identifier. This field can be used to estimate the average P38L40 payload data rate. P38L41 P38L42 The third section contains zero or more reception report blocks P38L43 depending on the number of other sources heard by this sender since P38L44 the last report. Each reception report block conveys statistics on P38L45 the reception of RTP packets from a single synchronization source. P38L46 Receivers SHOULD NOT carry over statistics when a source changes its P38L47 SSRC identifier due to a collision. These statistics are: P38L48 P39L1 SSRC_n (source identifier): 32 bits P39L2 The SSRC identifier of the source to which the information in this P39L3 reception report block pertains. P39L4 P39L5 fraction lost: 8 bits P39L6 The fraction of RTP data packets from source SSRC_n lost since the P39L7 previous SR or RR packet was sent, expressed as a fixed point P39L8 number with the binary point at the left edge of the field. (That P39L9 is equivalent to taking the integer part after multiplying the P39L10 loss fraction by 256.) This fraction is defined to be the number P39L11 of packets lost divided by the number of packets expected, as P39L12 defined in the next paragraph. An implementation is shown in P39L13 Appendix A.3. If the loss is negative due to duplicates, the P39L14 fraction lost is set to zero. Note that a receiver cannot tell P39L15 whether any packets were lost after the last one received, and P39L16 that there will be no reception report block issued for a source P39L17 if all packets from that source sent during the last reporting P39L18 interval have been lost. P39L19 P39L20 cumulative number of packets lost: 24 bits P39L21 The total number of RTP data packets from source SSRC_n that have P39L22 been lost since the beginning of reception. This number is P39L23 defined to be the number of packets expected less the number of P39L24 packets actually received, where the number of packets received P39L25 includes any which are late or duplicates. Thus, packets that P39L26 arrive late are not counted as lost, and the loss may be negative P39L27 if there are duplicates. The number of packets expected is P39L28 defined to be the extended last sequence number received, as P39L29 defined next, less the initial sequence number received. This may P39L30 be calculated as shown in Appendix A.3. P39L31 P39L32 extended highest sequence number received: 32 bits P39L33 The low 16 bits contain the highest sequence number received in an P39L34 RTP data packet from source SSRC_n, and the most significant 16 P39L35 bits extend that sequence number with the corresponding count of P39L36 sequence number cycles, which may be maintained according to the P39L37 algorithm in Appendix A.1. Note that different receivers within P39L38 the same session will generate different extensions to the P39L39 sequence number if their start times differ significantly. P39L40 P39L41 interarrival jitter: 32 bits P39L42 An estimate of the statistical variance of the RTP data packet P39L43 interarrival time, measured in timestamp units and expressed as an P39L44 unsigned integer. The interarrival jitter J is defined to be the P39L45 mean deviation (smoothed absolute value) of the difference D in P39L46 packet spacing at the receiver compared to the sender for a pair P39L47 of packets. As shown in the equation below, this is equivalent to P39L48 the difference in the "relative transit time" for the two packets; P40L1 the relative transit time is the difference between a packet's RTP P40L2 timestamp and the receiver's clock at the time of arrival, P40L3 measured in the same units. P40L4 P40L5 If Si is the RTP timestamp from packet i, and Ri is the time of P40L6 arrival in RTP timestamp units for packet i, then for two packets P40L7 i and j, D may be expressed as P40L8 P40L9 D(i,j) = (Rj - Ri) - (Sj - Si) = (Rj - Sj) - (Ri - Si) P40L10 P40L11 The interarrival jitter SHOULD be calculated continuously as each P40L12 data packet i is received from source SSRC_n, using this P40L13 difference D for that packet and the previous packet i-1 in order P40L14 of arrival (not necessarily in sequence), according to the formula P40L15 P40L16 J(i) = J(i-1) + (|D(i-1,i)| - J(i-1))/16 P40L17 P40L18 Whenever a reception report is issued, the current value of J is P40L19 sampled. P40L20 P40L21 The jitter calculation MUST conform to the formula specified here P40L22 in order to allow profile-independent monitors to make valid P40L23 interpretations of reports coming from different implementations. P40L24 This algorithm is the optimal first-order estimator and the gain P40L25 parameter 1/16 gives a good noise reduction ratio while P40L26 maintaining a reasonable rate of convergence [22]. A sample P40L27 implementation is shown in Appendix A.8. See Section 6.4.4 for a P40L28 discussion of the effects of varying packet duration and delay P40L29 before transmission. P40L30 P40L31 last SR timestamp (LSR): 32 bits P40L32 The middle 32 bits out of 64 in the NTP timestamp (as explained in P40L33 Section 4) received as part of the most recent RTCP sender report P40L34 (SR) packet from source SSRC_n. If no SR has been received yet, P40L35 the field is set to zero. P40L36 P40L37 delay since last SR (DLSR): 32 bits P40L38 The delay, expressed in units of 1/65536 seconds, between P40L39 receiving the last SR packet from source SSRC_n and sending this P40L40 reception report block. If no SR packet has been received yet P40L41 from SSRC_n, the DLSR field is set to zero. P40L42 P40L43 Let SSRC_r denote the receiver issuing this receiver report. P40L44 Source SSRC_n can compute the round-trip propagation delay to P40L45 SSRC_r by recording the time A when this reception report block is P40L46 received. It calculates the total round-trip time A-LSR using the P40L47 last SR timestamp (LSR) field, and then subtracting this field to P40L48 leave the round-trip propagation delay as (A - LSR - DLSR). This P41L1 is illustrated in Fig. 2. Times are shown in both a hexadecimal P41L2 representation of the 32-bit fields and the equivalent floating- P41L3 point decimal representation. Colons indicate a 32-bit field P41L4 divided into a 16-bit integer part and 16-bit fraction part. P41L5 P41L6 This may be used as an approximate measure of distance to cluster P41L7 receivers, although some links have very asymmetric delays. P41L8 P41L9 [10 Nov 1995 11:33:25.125 UTC] [10 Nov 1995 11:33:36.5 UTC] P41L10 n SR(n) A=b710:8000 (46864.500 s) P41L11 ----------------------------------------------------------------> P41L12 v ^ P41L13 ntp_sec =0xb44db705 v ^ dlsr=0x0005:4000 ( 5.250s) P41L14 ntp_frac=0x20000000 v ^ lsr =0xb705:2000 (46853.125s) P41L15 (3024992005.125 s) v ^ P41L16 r v ^ RR(n) P41L17 ----------------------------------------------------------------> P41L18 |<-DLSR->| P41L19 (5.250 s) P41L20 P41L21 A 0xb710:8000 (46864.500 s) P41L22 DLSR -0x0005:4000 ( 5.250 s) P41L23 LSR -0xb705:2000 (46853.125 s) P41L24 ------------------------------- P41L25 delay 0x0006:2000 ( 6.125 s) P41L26 P41L27 Figure 2: Example for round-trip time computation P41L28 P41L29 P41L30 P41L31 P41L32 P41L33 P41L34 P41L35 P41L36 P41L37 P41L38 P41L39 P41L40 P41L41 P41L42 P41L43 P41L44 P41L45 P41L46 P41L47 P41L48 P42L1 6.4.2 RR: Receiver Report RTCP Packet P42L2 P42L3 0 1 2 3 P42L4 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P42L5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L6 header |V=2|P| RC | PT=RR=201 | length | P42L7 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L8 | SSRC of packet sender | P42L9 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P42L10 report | SSRC_1 (SSRC of first source) | P42L11 block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L12 1 | fraction lost | cumulative number of packets lost | P42L13 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L14 | extended highest sequence number received | P42L15 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L16 | interarrival jitter | P42L17 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L18 | last SR (LSR) | P42L19 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L20 | delay since last SR (DLSR) | P42L21 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P42L22 report | SSRC_2 (SSRC of second source) | P42L23 block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L24 2 : ... : P42L25 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P42L26 | profile-specific extensions | P42L27 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P42L28 P42L29 The format of the receiver report (RR) packet is the same as that of P42L30 the SR packet except that the packet type field contains the constant P42L31 201 and the five words of sender information are omitted (these are P42L32 the NTP and RTP timestamps and sender's packet and octet counts). P42L33 The remaining fields have the same meaning as for the SR packet. P42L34 P42L35 An empty RR packet (RC = 0) MUST be put at the head of a compound P42L36 RTCP packet when there is no data transmission or reception to P42L37 report. P42L38 P42L39 6.4.3 Extending the Sender and Receiver Reports P42L40 P42L41 A profile SHOULD define profile-specific extensions to the sender P42L42 report and receiver report if there is additional information that P42L43 needs to be reported regularly about the sender or receivers. This P42L44 method SHOULD be used in preference to defining another RTCP packet P42L45 type because it requires less overhead: P42L46 P42L47 o fewer octets in the packet (no RTCP header or SSRC field); P42L48 P43L1 o simpler and faster parsing because applications running under that P43L2 profile would be programmed to always expect the extension fields P43L3 in the directly accessible location after the reception reports. P43L4 P43L5 The extension is a fourth section in the sender- or receiver-report P43L6 packet which comes at the end after the reception report blocks, if P43L7 any. If additional sender information is required, then for sender P43L8 reports it would be included first in the extension section, but for P43L9 receiver reports it would not be present. If information about P43L10 receivers is to be included, that data SHOULD be structured as an P43L11 array of blocks parallel to the existing array of reception report P43L12 blocks; that is, the number of blocks would be indicated by the RC P43L13 field. P43L14 P43L15 6.4.4 Analyzing Sender and Receiver Reports P43L16 P43L17 It is expected that reception quality feedback will be useful not P43L18 only for the sender but also for other receivers and third-party P43L19 monitors. The sender may modify its transmissions based on the P43L20 feedback; receivers can determine whether problems are local, P43L21 regional or global; network managers may use profile-independent P43L22 monitors that receive only the RTCP packets and not the corresponding P43L23 RTP data packets to evaluate the performance of their networks for P43L24 multicast distribution. P43L25 P43L26 Cumulative counts are used in both the sender information and P43L27 receiver report blocks so that differences may be calculated between P43L28 any two reports to make measurements over both short and long time P43L29 periods, and to provide resilience against the loss of a report. The P43L30 difference between the last two reports received can be used to P43L31 estimate the recent quality of the distribution. The NTP timestamp P43L32 is included so that rates may be calculated from these differences P43L33 over the interval between two reports. Since that timestamp is P43L34 independent of the clock rate for the data encoding, it is possible P43L35 to implement encoding- and profile-independent quality monitors. P43L36 P43L37 An example calculation is the packet loss rate over the interval P43L38 between two reception reports. The difference in the cumulative P43L39 number of packets lost gives the number lost during that interval. P43L40 The difference in the extended last sequence numbers received gives P43L41 the number of packets expected during the interval. The ratio of P43L42 these two is the packet loss fraction over the interval. This ratio P43L43 should equal the fraction lost field if the two reports are P43L44 consecutive, but otherwise it may not. The loss rate per second can P43L45 be obtained by dividing the loss fraction by the difference in NTP P43L46 timestamps, expressed in seconds. The number of packets received is P43L47 the number of packets expected minus the number lost. The number of P43L48 P44L1 packets expected may also be used to judge the statistical validity P44L2 of any loss estimates. For example, 1 out of 5 packets lost has a P44L3 lower significance than 200 out of 1000. P44L4 P44L5 From the sender information, a third-party monitor can calculate the P44L6 average payload data rate and the average packet rate over an P44L7 interval without receiving the data. Taking the ratio of the two P44L8 gives the average payload size. If it can be assumed that packet P44L9 loss is independent of packet size, then the number of packets P44L10 received by a particular receiver times the average payload size (or P44L11 the corresponding packet size) gives the apparent throughput P44L12 available to that receiver. P44L13 P44L14 In addition to the cumulative counts which allow long-term packet P44L15 loss measurements using differences between reports, the fraction P44L16 lost field provides a short-term measurement from a single report. P44L17 This becomes more important as the size of a session scales up enough P44L18 that reception state information might not be kept for all receivers P44L19 or the interval between reports becomes long enough that only one P44L20 report might have been received from a particular receiver. P44L21 P44L22 The interarrival jitter field provides a second short-term measure of P44L23 network congestion. Packet loss tracks persistent congestion while P44L24 the jitter measure tracks transient congestion. The jitter measure P44L25 may indicate congestion before it leads to packet loss. The P44L26 interarrival jitter field is only a snapshot of the jitter at the P44L27 time of a report and is not intended to be taken quantitatively. P44L28 Rather, it is intended for comparison across a number of reports from P44L29 one receiver over time or from multiple receivers, e.g., within a P44L30 single network, at the same time. To allow comparison across P44L31 receivers, it is important the the jitter be calculated according to P44L32 the same formula by all receivers. P44L33 P44L34 Because the jitter calculation is based on the RTP timestamp which P44L35 represents the instant when the first data in the packet was sampled, P44L36 any variation in the delay between that sampling instant and the time P44L37 the packet is transmitted will affect the resulting jitter that is P44L38 calculated. Such a variation in delay would occur for audio packets P44L39 of varying duration. It will also occur for video encodings because P44L40 the timestamp is the same for all the packets of one frame but those P44L41 packets are not all transmitted at the same time. The variation in P44L42 delay until transmission does reduce the accuracy of the jitter P44L43 calculation as a measure of the behavior of the network by itself, P44L44 but it is appropriate to include considering that the receiver buffer P44L45 must accommodate it. When the jitter calculation is used as a P44L46 comparative measure, the (constant) component due to variation in P44L47 delay until transmission subtracts out so that a change in the P44L48 P45L1 network jitter component can then be observed unless it is relatively P45L2 small. If the change is small, then it is likely to be P45L3 inconsequential. P45L4 P45L5 6.5 SDES: Source Description RTCP Packet P45L6 P45L7 0 1 2 3 P45L8 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P45L9 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P45L10 header |V=2|P| SC | PT=SDES=202 | length | P45L11 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P45L12 chunk | SSRC/CSRC_1 | P45L13 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P45L14 | SDES items | P45L15 | ... | P45L16 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P45L17 chunk | SSRC/CSRC_2 | P45L18 2 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P45L19 | SDES items | P45L20 | ... | P45L21 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P45L22 P45L23 The SDES packet is a three-level structure composed of a header and P45L24 zero or more chunks, each of which is composed of items describing P45L25 the source identified in that chunk. The items are described P45L26 individually in subsequent sections. P45L27 P45L28 version (V), padding (P), length: P45L29 As described for the SR packet (see Section 6.4.1). P45L30 P45L31 packet type (PT): 8 bits P45L32 Contains the constant 202 to identify this as an RTCP SDES packet. P45L33 P45L34 source count (SC): 5 bits P45L35 The number of SSRC/CSRC chunks contained in this SDES packet. A P45L36 value of zero is valid but useless. P45L37 P45L38 Each chunk consists of an SSRC/CSRC identifier followed by a list of P45L39 zero or more items, which carry information about the SSRC/CSRC. P45L40 Each chunk starts on a 32-bit boundary. Each item consists of an 8- P45L41 bit type field, an 8-bit octet count describing the length of the P45L42 text (thus, not including this two-octet header), and the text P45L43 itself. Note that the text can be no longer than 255 octets, but P45L44 this is consistent with the need to limit RTCP bandwidth consumption. P45L45 P45L46 P45L47 P45L48 P46L1 The text is encoded according to the UTF-8 encoding specified in RFC P46L2 2279 [5]. US-ASCII is a subset of this encoding and requires no P46L3 additional encoding. The presence of multi-octet encodings is P46L4 indicated by setting the most significant bit of a character to a P46L5 value of one. P46L6 P46L7 Items are contiguous, i.e., items are not individually padded to a P46L8 32-bit boundary. Text is not null terminated because some multi- P46L9 octet encodings include null octets. The list of items in each chunk P46L10 MUST be terminated by one or more null octets, the first of which is P46L11 interpreted as an item type of zero to denote the end of the list. P46L12 No length octet follows the null item type octet, but additional null P46L13 octets MUST be included if needed to pad until the next 32-bit P46L14 boundary. Note that this padding is separate from that indicated by P46L15 the P bit in the RTCP header. A chunk with zero items (four null P46L16 octets) is valid but useless. P46L17 P46L18 End systems send one SDES packet containing their own source P46L19 identifier (the same as the SSRC in the fixed RTP header). A mixer P46L20 sends one SDES packet containing a chunk for each contributing source P46L21 from which it is receiving SDES information, or multiple complete P46L22 SDES packets in the format above if there are more than 31 such P46L23 sources (see Section 7). P46L24 P46L25 The SDES items currently defined are described in the next sections. P46L26 Only the CNAME item is mandatory. Some items shown here may be P46L27 useful only for particular profiles, but the item types are all P46L28 assigned from one common space to promote shared use and to simplify P46L29 profile-independent applications. Additional items may be defined in P46L30 a profile by registering the type numbers with IANA as described in P46L31 Section 15. P46L32 P46L33 6.5.1 CNAME: Canonical End-Point Identifier SDES Item P46L34 P46L35 0 1 2 3 P46L36 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P46L37 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P46L38 | CNAME=1 | length | user and domain name ... P46L39 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P46L40 P46L41 The CNAME identifier has the following properties: P46L42 P46L43 o Because the randomly allocated SSRC identifier may change if a P46L44 conflict is discovered or if a program is restarted, the CNAME P46L45 item MUST be included to provide the binding from the SSRC P46L46 identifier to an identifier for the source (sender or receiver) P46L47 that remains constant. P46L48 P47L1 o Like the SSRC identifier, the CNAME identifier SHOULD also be P47L2 unique among all participants within one RTP session. P47L3 P47L4 o To provide a binding across multiple media tools used by one P47L5 participant in a set of related RTP sessions, the CNAME SHOULD be P47L6 fixed for that participant. P47L7 P47L8 o To facilitate third-party monitoring, the CNAME SHOULD be suitable P47L9 for either a program or a person to locate the source. P47L10 P47L11 Therefore, the CNAME SHOULD be derived algorithmically and not P47L12 entered manually, when possible. To meet these requirements, the P47L13 following format SHOULD be used unless a profile specifies an P47L14 alternate syntax or semantics. The CNAME item SHOULD have the format P47L15 "user@host", or "host" if a user name is not available as on single- P47L16 user systems. For both formats, "host" is either the fully qualified P47L17 domain name of the host from which the real-time data originates, P47L18 formatted according to the rules specified in RFC 1034 [6], RFC 1035 P47L19 [7] and Section 2.1 of RFC 1123 [8]; or the standard ASCII P47L20 representation of the host's numeric address on the interface used P47L21 for the RTP communication. For example, the standard ASCII P47L22 representation of an IP Version 4 address is "dotted decimal", also P47L23 known as dotted quad, and for IP Version 6, addresses are textually P47L24 represented as groups of hexadecimal digits separated by colons (with P47L25 variations as detailed in RFC 3513 [23]). Other address types are P47L26 expected to have ASCII representations that are mutually unique. The P47L27 fully qualified domain name is more convenient for a human observer P47L28 and may avoid the need to send a NAME item in addition, but it may be P47L29 difficult or impossible to obtain reliably in some operating P47L30 environments. Applications that may be run in such environments P47L31 SHOULD use the ASCII representation of the address instead. P47L32 P47L33 Examples are "doe@sleepy.example.com", "doe@192.0.2.89" or P47L34 "doe@2201:056D::112E:144A:1E24" for a multi-user system. On a system P47L35 with no user name, examples would be "sleepy.example.com", P47L36 "192.0.2.89" or "2201:056D::112E:144A:1E24". P47L37 P47L38 The user name SHOULD be in a form that a program such as "finger" or P47L39 "talk" could use, i.e., it typically is the login name rather than P47L40 the personal name. The host name is not necessarily identical to the P47L41 one in the participant's electronic mail address. P47L42 P47L43 This syntax will not provide unique identifiers for each source if an P47L44 application permits a user to generate multiple sources from one P47L45 host. Such an application would have to rely on the SSRC to further P47L46 identify the source, or the profile for that application would have P47L47 to specify additional syntax for the CNAME identifier. P47L48 P48L1 If each application creates its CNAME independently, the resulting P48L2 CNAMEs may not be identical as would be required to provide a binding P48L3 across multiple media tools belonging to one participant in a set of P48L4 related RTP sessions. If cross-media binding is required, it may be P48L5 necessary for the CNAME of each tool to be externally configured with P48L6 the same value by a coordination tool. P48L7 P48L8 Application writers should be aware that private network address P48L9 assignments such as the Net-10 assignment proposed in RFC 1918 [24] P48L10 may create network addresses that are not globally unique. This P48L11 would lead to non-unique CNAMEs if hosts with private addresses and P48L12 no direct IP connectivity to the public Internet have their RTP P48L13 packets forwarded to the public Internet through an RTP-level P48L14 translator. (See also RFC 1627 [25].) To handle this case, P48L15 applications MAY provide a means to configure a unique CNAME, but the P48L16 burden is on the translator to translate CNAMEs from private P48L17 addresses to public addresses if necessary to keep private addresses P48L18 from being exposed. P48L19 P48L20 6.5.2 NAME: User Name SDES Item P48L21 P48L22 0 1 2 3 P48L23 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P48L24 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P48L25 | NAME=2 | length | common name of source ... P48L26 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P48L27 P48L28 This is the real name used to describe the source, e.g., "John Doe, P48L29 Bit Recycler". It may be in any form desired by the user. For P48L30 applications such as conferencing, this form of name may be the most P48L31 desirable for display in participant lists, and therefore might be P48L32 sent most frequently of those items other than CNAME. Profiles MAY P48L33 establish such priorities. The NAME value is expected to remain P48L34 constant at least for the duration of a session. It SHOULD NOT be P48L35 relied upon to be unique among all participants in the session. P48L36 P48L37 6.5.3 EMAIL: Electronic Mail Address SDES Item P48L38 P48L39 0 1 2 3 P48L40 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P48L41 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P48L42 | EMAIL=3 | length | email address of source ... P48L43 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P48L44 P48L45 The email address is formatted according to RFC 2822 [9], for P48L46 example, "John.Doe@example.com". The EMAIL value is expected to P48L47 remain constant for the duration of a session. P48L48 P49L1 6.5.4 PHONE: Phone Number SDES Item P49L2 P49L3 0 1 2 3 P49L4 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P49L5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P49L6 | PHONE=4 | length | phone number of source ... P49L7 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P49L8 P49L9 The phone number SHOULD be formatted with the plus sign replacing the P49L10 international access code. For example, "+1 908 555 1212" for a P49L11 number in the United States. P49L12 P49L13 6.5.5 LOC: Geographic User Location SDES Item P49L14 P49L15 0 1 2 3 P49L16 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P49L17 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P49L18 | LOC=5 | length | geographic location of site ... P49L19 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P49L20 P49L21 Depending on the application, different degrees of detail are P49L22 appropriate for this item. For conference applications, a string P49L23 like "Murray Hill, New Jersey" may be sufficient, while, for an P49L24 active badge system, strings like "Room 2A244, AT&T BL MH" might be P49L25 appropriate. The degree of detail is left to the implementation P49L26 and/or user, but format and content MAY be prescribed by a profile. P49L27 The LOC value is expected to remain constant for the duration of a P49L28 session, except for mobile hosts. P49L29 P49L30 6.5.6 TOOL: Application or Tool Name SDES Item P49L31 P49L32 0 1 2 3 P49L33 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P49L34 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P49L35 | TOOL=6 | length |name/version of source appl. ... P49L36 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P49L37 P49L38 A string giving the name and possibly version of the application P49L39 generating the stream, e.g., "videotool 1.2". This information may P49L40 be useful for debugging purposes and is similar to the Mailer or P49L41 Mail-System-Version SMTP headers. The TOOL value is expected to P49L42 remain constant for the duration of the session. P49L43 P49L44 P49L45 P49L46 P49L47 P49L48 P50L1 6.5.7 NOTE: Notice/Status SDES Item P50L2 P50L3 0 1 2 3 P50L4 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P50L5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P50L6 | NOTE=7 | length | note about the source ... P50L7 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P50L8 P50L9 The following semantics are suggested for this item, but these or P50L10 other semantics MAY be explicitly defined by a profile. The NOTE P50L11 item is intended for transient messages describing the current state P50L12 of the source, e.g., "on the phone, can't talk". Or, during a P50L13 seminar, this item might be used to convey the title of the talk. It P50L14 should be used only to carry exceptional information and SHOULD NOT P50L15 be included routinely by all participants because this would slow P50L16 down the rate at which reception reports and CNAME are sent, thus P50L17 impairing the performance of the protocol. In particular, it SHOULD P50L18 NOT be included as an item in a user's configuration file nor P50L19 automatically generated as in a quote-of-the-day. P50L20 P50L21 Since the NOTE item may be important to display while it is active, P50L22 the rate at which other non-CNAME items such as NAME are transmitted P50L23 might be reduced so that the NOTE item can take that part of the RTCP P50L24 bandwidth. When the transient message becomes inactive, the NOTE P50L25 item SHOULD continue to be transmitted a few times at the same P50L26 repetition rate but with a string of length zero to signal the P50L27 receivers. However, receivers SHOULD also consider the NOTE item P50L28 inactive if it is not received for a small multiple of the repetition P50L29 rate, or perhaps 20-30 RTCP intervals. P50L30 P50L31 6.5.8 PRIV: Private Extensions SDES Item P50L32 P50L33 0 1 2 3 P50L34 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P50L35 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P50L36 | PRIV=8 | length | prefix length |prefix string... P50L37 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P50L38 ... | value string ... P50L39 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P50L40 P50L41 This item is used to define experimental or application-specific SDES P50L42 extensions. The item contains a prefix consisting of a length-string P50L43 pair, followed by the value string filling the remainder of the item P50L44 and carrying the desired information. The prefix length field is 8 P50L45 bits long. The prefix string is a name chosen by the person defining P50L46 the PRIV item to be unique with respect to other PRIV items this P50L47 application might receive. The application creator might choose to P50L48 use the application name plus an additional subtype identification if P51L1 needed. Alternatively, it is RECOMMENDED that others choose a name P51L2 based on the entity they represent, then coordinate the use of the P51L3 name within that entity. P51L4 P51L5 Note that the prefix consumes some space within the item's total P51L6 length of 255 octets, so the prefix should be kept as short as P51L7 possible. This facility and the constrained RTCP bandwidth SHOULD P51L8 NOT be overloaded; it is not intended to satisfy all the control P51L9 communication requirements of all applications. P51L10 P51L11 SDES PRIV prefixes will not be registered by IANA. If some form of P51L12 the PRIV item proves to be of general utility, it SHOULD instead be P51L13 assigned a regular SDES item type registered with IANA so that no P51L14 prefix is required. This simplifies use and increases transmission P51L15 efficiency. P51L16 P51L17 6.6 BYE: Goodbye RTCP Packet P51L18 P51L19 0 1 2 3 P51L20 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P51L21 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P51L22 |V=2|P| SC | PT=BYE=203 | length | P51L23 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P51L24 | SSRC/CSRC | P51L25 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P51L26 : ... : P51L27 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ P51L28 (opt) | length | reason for leaving ... P51L29 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P51L30 P51L31 The BYE packet indicates that one or more sources are no longer P51L32 active. P51L33 P51L34 version (V), padding (P), length: P51L35 As described for the SR packet (see Section 6.4.1). P51L36 P51L37 packet type (PT): 8 bits P51L38 Contains the constant 203 to identify this as an RTCP BYE packet. P51L39 P51L40 source count (SC): 5 bits P51L41 The number of SSRC/CSRC identifiers included in this BYE packet. P51L42 A count value of zero is valid, but useless. P51L43 P51L44 The rules for when a BYE packet should be sent are specified in P51L45 Sections 6.3.7 and 8.2. P51L46 P51L47 P51L48 P52L1 If a BYE packet is received by a mixer, the mixer SHOULD forward the P52L2 BYE packet with the SSRC/CSRC identifier(s) unchanged. If a mixer P52L3 shuts down, it SHOULD send a BYE packet listing all contributing P52L4 sources it handles, as well as its own SSRC identifier. Optionally, P52L5 the BYE packet MAY include an 8-bit octet count followed by that many P52L6 octets of text indicating the reason for leaving, e.g., "camera P52L7 malfunction" or "RTP loop detected". The string has the same P52L8 encoding as that described for SDES. If the string fills the packet P52L9 to the next 32-bit boundary, the string is not null terminated. If P52L10 not, the BYE packet MUST be padded with null octets to the next 32- P52L11 bit boundary. This padding is separate from that indicated by the P P52L12 bit in the RTCP header. P52L13 P52L14 6.7 APP: Application-Defined RTCP Packet P52L15 P52L16 0 1 2 3 P52L17 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 P52L18 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P52L19 |V=2|P| subtype | PT=APP=204 | length | P52L20 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P52L21 | SSRC/CSRC | P52L22 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P52L23 | name (ASCII) | P52L24 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P52L25 | application-dependent data ... P52L26 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P52L27 P52L28 The APP packet is intended for experimental use as new applications P52L29 and new features are developed, without requiring packet type value P52L30 registration. APP packets with unrecognized names SHOULD be ignored. P52L31 After testing and if wider use is justified, it is RECOMMENDED that P52L32 each APP packet be redefined without the subtype and name fields and P52L33 registered with IANA using an RTCP packet type. P52L34 P52L35 version (V), padding (P), length: P52L36 As described for the SR packet (see Section 6.4.1). P52L37 P52L38 subtype: 5 bits P52L39 May be used as a subtype to allow a set of APP packets to be P52L40 defined under one unique name, or for any application-dependent P52L41 data. P52L42 P52L43 packet type (PT): 8 bits P52L44 Contains the constant 204 to identify this as an RTCP APP packet. P52L45 P52L46 P52L47 P52L48 P53L1 name: 4 octets P53L2 A name chosen by the person defining the set of APP packets to be P53L3 unique with respect to other APP packets this application might P53L4 receive. The application creator might choose to use the P53L5 application name, and then coordinate the allocation of subtype P53L6 values to others who want to define new packet types for the P53L7 application. Alternatively, it is RECOMMENDED that others choose P53L8 a name based on the entity they represent, then coordinate the use P53L9 of the name within that entity. The name is interpreted as a P53L10 sequence of four ASCII characters, with uppercase and lowercase P53L11 characters treated as distinct. P53L12 P53L13 application-dependent data: variable length P53L14 Application-dependent data may or may not appear in an APP packet. P53L15 It is interpreted by the application and not RTP itself. It MUST P53L16 be a multiple of 32 bits long. P53L17 P53L18 7. RTP Translators and Mixers P53L19 P53L20 In addition to end systems, RTP supports the notion of "translators" P53L21 and "mixers", which could be considered as "intermediate systems" at P53L22 the RTP level. Although this support adds some complexity to the P53L23 protocol, the need for these functions has been clearly established P53L24 by experiments with multicast audio and video applications in the P53L25 Internet. Example uses of translators and mixers given in Section P53L26 2.3 stem from the presence of firewalls and low bandwidth P53L27 connections, both of which are likely to remain. P53L28 P53L29 7.1 General Description P53L30 P53L31 An RTP translator/mixer connects two or more transport-level P53L32 "clouds". Typically, each cloud is defined by a common network and P53L33 transport protocol (e.g., IP/UDP) plus a multicast address and P53L34 transport level destination port or a pair of unicast addresses and P53L35 ports. (Network-level protocol translators, such as IP version 4 to P53L36 IP version 6, may be present within a cloud invisibly to RTP.) One P53L37 system may serve as a translator or mixer for a number of RTP P53L38 sessions, but each is considered a logically separate entity. P53L39 P53L40 In order to avoid creating a loop when a translator or mixer is P53L41 installed, the following rules MUST be observed: P53L42 P53L43 o Each of the clouds connected by translators and mixers P53L44 participating in one RTP session either MUST be distinct from all P53L45 the others in at least one of these parameters (protocol, address, P53L46 port), or MUST be isolated at the network level from the others. P53L47 P53L48 P54L1 o A derivative of the first rule is that there MUST NOT be multiple P54L2 translators or mixers connected in parallel unless by some P54L3 arrangement they partition the set of sources to be forwarded. P54L4 P54L5 Similarly, all RTP end systems that can communicate through one or P54L6 more RTP translators or mixers share the same SSRC space, that is, P54L7 the SSRC identifiers MUST be unique among all these end systems. P54L8 Section 8.2 describes the collision resolution algorithm by which P54L9 SSRC identifiers are kept unique and loops are detected. P54L10 P54L11 There may be many varieties of translators and mixers designed for P54L12 different purposes and applications. Some examples are to add or P54L13 remove encryption, change the encoding of the data or the underlying P54L14 protocols, or replicate between a multicast address and one or more P54L15 unicast addresses. The distinction between translators and mixers is P54L16 that a translator passes through the data streams from different P54L17 sources separately, whereas a mixer combines them to form one new P54L18 stream: P54L19 P54L20 Translator: Forwards RTP packets with their SSRC identifier P54L21 intact; this makes it possible for receivers to identify P54L22 individual sources even though packets from all the sources pass P54L23 through the same translator and carry the translator's network P54L24 source address. Some kinds of translators will pass through the P54L25 data untouched, but others MAY change the encoding of the data and P54L26 thus the RTP data payload type and timestamp. If multiple data P54L27 packets are re-encoded into one, or vice versa, a translator MUST P54L28 assign new sequence numbers to the outgoing packets. Losses in P54L29 the incoming packet stream may induce corresponding gaps in the P54L30 outgoing sequence numbers. Receivers cannot detect the presence P54L31 of a translator unless they know by some other means what payload P54L32 type or transport address was used by the original source. P54L33 P54L34 Mixer: Receives streams of RTP data packets from one or more P54L35 sources, possibly changes the data format, combines the streams in P54L36 some manner and then forwards the combined stream. Since the P54L37 timing among multiple input sources will not generally be P54L38 synchronized, the mixer will make timing adjustments among the P54L39 streams and generate its own timing for the combined stream, so it P54L40 is the synchronization source. Thus, all data packets forwarded P54L41 by a mixer MUST be marked with the mixer's own SSRC identifier. P54L42 In order to preserve the identity of the original sources P54L43 contributing to the mixed packet, the mixer SHOULD insert their P54L44 SSRC identifiers into the CSRC identifier list following the fixed P54L45 RTP header of the packet. A mixer that is also itself a P54L46 contributing source for some packet SHOULD explicitly include its P54L47 own SSRC identifier in the CSRC list for that packet. P54L48 P55L1 For some applications, it MAY be acceptable for a mixer not to P55L2 identify sources in the CSRC list. However, this introduces the P55L3 danger that loops involving those sources could not be detected. P55L4 P55L5 The advantage of a mixer over a translator for applications like P55L6 audio is that the output bandwidth is limited to that of one source P55L7 even when multiple sources are active on the input side. This may be P55L8 important for low-bandwidth links. The disadvantage is that P55L9 receivers on the output side don't have any control over which P55L10 sources are passed through or muted, unless some mechanism is P55L11 implemented for remote control of the mixer. The regeneration of P55L12 synchronization information by mixers also means that receivers can't P55L13 do inter-media synchronization of the original streams. A multi- P55L14 media mixer could do it. P55L15 P55L16 [E1] [E6] P55L17 | | P55L18 E1:17 | E6:15 | P55L19 | | E6:15 P55L20 V M1:48 (1,17) M1:48 (1,17) V M1:48 (1,17) P55L21 (M1)------------->----------------->-------------->[E7] P55L22 ^ ^ E4:47 ^ E4:47 P55L23 E2:1 | E4:47 | | M3:89 (64,45) P55L24 | | | P55L25 [E2] [E4] M3:89 (64,45) | P55L26 | legend: P55L27 [E3] --------->(M2)----------->(M3)------------| [End system] P55L28 E3:64 M2:12 (64) ^ (Mixer) P55L29 | E5:45 P55L30 | P55L31 [E5] source: SSRC (CSRCs) P55L32 -------------------> P55L33 P55L34 Figure 3: Sample RTP network with end systems, mixers and translators P55L35 P55L36 A collection of mixers and translators is shown in Fig. 3 to P55L37 illustrate their effect on SSRC and CSRC identifiers. In the figure, P55L38 end systems are shown as rectangles (named E), translators as P55L39 triangles (named T) and mixers as ovals (named M). The notation "M1: P55L40 48(1,17)" designates a packet originating a mixer M1, identified by P55L41 M1's (random) SSRC value of 48 and two CSRC identifiers, 1 and 17, P55L42 copied from the SSRC identifiers of packets from E1 and E2. P55L43 P55L44 7.2 RTCP Processing in Translators P55L45 P55L46 In addition to forwarding data packets, perhaps modified, translators P55L47 and mixers MUST also process RTCP packets. In many cases, they will P55L48 take apart the compound RTCP packets received from end systems to P56L1 aggregate SDES information and to modify the SR or RR packets. P56L2 Retransmission of this information may be triggered by the packet P56L3 arrival or by the RTCP interval timer of the translator or mixer P56L4 itself. P56L5 P56L6 A translator that does not modify the data packets, for example one P56L7 that just replicates between a multicast address and a unicast P56L8 address, MAY simply forward RTCP packets unmodified as well. A P56L9 translator that transforms the payload in some way MUST make P56L10 corresponding transformations in the SR and RR information so that it P56L11 still reflects the characteristics of the data and the reception P56L12 quality. These translators MUST NOT simply forward RTCP packets. In P56L13 general, a translator SHOULD NOT aggregate SR and RR packets from P56L14 different sources into one packet since that would reduce the P56L15 accuracy of the propagation delay measurements based on the LSR and P56L16 DLSR fields. P56L17 P56L18 SR sender information: A translator does not generate its own P56L19 sender information, but forwards the SR packets received from one P56L20 cloud to the others. The SSRC is left intact but the sender P56L21 information MUST be modified if required by the translation. If a P56L22 translator changes the data encoding, it MUST change the "sender's P56L23 byte count" field. If it also combines several data packets into P56L24 one output packet, it MUST change the "sender's packet count" P56L25 field. If it changes the timestamp frequency, it MUST change the P56L26 "RTP timestamp" field in the SR packet. P56L27 P56L28 SR/RR reception report blocks: A translator forwards reception P56L29 reports received from one cloud to the others. Note that these P56L30 flow in the direction opposite to the data. The SSRC is left P56L31 intact. If a translator combines several data packets into one P56L32 output packet, and therefore changes the sequence numbers, it MUST P56L33 make the inverse manipulation for the packet loss fields and the P56L34 "extended last sequence number" field. This may be complex. In P56L35 the extreme case, there may be no meaningful way to translate the P56L36 reception reports, so the translator MAY pass on no reception P56L37 report at all or a synthetic report based on its own reception. P56L38 The general rule is to do what makes sense for a particular P56L39 translation. P56L40 P56L41 A translator does not require an SSRC identifier of its own, but P56L42 MAY choose to allocate one for the purpose of sending reports P56L43 about what it has received. These would be sent to all the P56L44 connected clouds, each corresponding to the translation of the P56L45 data stream as sent to that cloud, since reception reports are P56L46 normally multicast to all participants. P56L47 P56L48 P57L1 SDES: Translators typically forward without change the SDES P57L2 information they receive from one cloud to the others, but MAY, P57L3 for example, decide to filter non-CNAME SDES information if P57L4 bandwidth is limited. The CNAMEs MUST be forwarded to allow SSRC P57L5 identifier collision detection to work. A translator that P57L6 generates its own RR packets MUST send SDES CNAME information P57L7 about itself to the same clouds that it sends those RR packets. P57L8 P57L9 BYE: Translators forward BYE packets unchanged. A translator P57L10 that is about to cease forwarding packets SHOULD send a BYE packet P57L11 to each connected cloud containing all the SSRC identifiers that P57L12 were previously being forwarded to that cloud, including the P57L13 translator's own SSRC identifier if it sent reports of its own. P57L14 P57L15 APP: Translators forward APP packets unchanged. P57L16 P57L17 7.3 RTCP Processing in Mixers P57L18 P57L19 Since a mixer generates a new data stream of its own, it does not P57L20 pass through SR or RR packets at all and instead generates new P57L21 information for both sides. P57L22 P57L23 SR sender information: A mixer does not pass through sender P57L24 information from the sources it mixes because the characteristics P57L25 of the source streams are lost in the mix. As a synchronization P57L26 source, the mixer SHOULD generate its own SR packets with sender P57L27 information about the mixed data stream and send them in the same P57L28 direction as the mixed stream. P57L29 P57L30 SR/RR reception report blocks: A mixer generates its own P57L31 reception reports for sources in each cloud and sends them out P57L32 only to the same cloud. It MUST NOT send these reception reports P57L33 to the other clouds and MUST NOT forward reception reports from P57L34 one cloud to the others because the sources would not be SSRCs P57L35 there (only CSRCs). P57L36 P57L37 SDES: Mixers typically forward without change the SDES P57L38 information they receive from one cloud to the others, but MAY, P57L39 for example, decide to filter non-CNAME SDES information if P57L40 bandwidth is limited. The CNAMEs MUST be forwarded to allow SSRC P57L41 identifier collision detection to work. (An identifier in a CSRC P57L42 list generated by a mixer might collide with an SSRC identifier P57L43 generated by an end system.) A mixer MUST send SDES CNAME P57L44 information about itself to the same clouds that it sends SR or RR P57L45 packets. P57L46 P57L47 P57L48 P58L1 Since mixers do not forward SR or RR packets, they will typically P58L2 be extracting SDES packets from a compound RTCP packet. To P58L3 minimize overhead, chunks from the SDES packets MAY be aggregated P58L4 into a single SDES packet which is then stacked on an SR or RR P58L5 packet originating from the mixer. A mixer which aggregates SDES P58L6 packets will use more RTCP bandwidth than an individual source P58L7 because the compound packets will be longer, but that is P58L8 appropriate since the mixer represents multiple sources. P58L9 Similarly, a mixer which passes through SDES packets as they are P58L10 received will be transmitting RTCP packets at higher than the P58L11 single source rate, but again that is correct since the packets P58L12 come from multiple sources. The RTCP packet rate may be different P58L13 on each side of the mixer. P58L14 P58L15 A mixer that does not insert CSRC identifiers MAY also refrain P58L16 from forwarding SDES CNAMEs. In this case, the SSRC identifier P58L17 spaces in the two clouds are independent. As mentioned earlier, P58L18 this mode of operation creates a danger that loops can't be P58L19 detected. P58L20 P58L21 BYE: Mixers MUST forward BYE packets. A mixer that is about to P58L22 cease forwarding packets SHOULD send a BYE packet to each P58L23 connected cloud containing all the SSRC identifiers that were P58L24 previously being forwarded to that cloud, including the mixer's P58L25 own SSRC identifier if it sent reports of its own. P58L26 P58L27 APP: The treatment of APP packets by mixers is application-specific. P58L28 P58L29 7.4 Cascaded Mixers P58L30 P58L31 An RTP session may involve a collection of mixers and translators as P58L32 shown in Fig. 3. If two mixers are cascaded, such as M2 and M3 in P58L33 the figure, packets received by a mixer may already have been mixed P58L34 and may include a CSRC list with multiple identifiers. The second P58L35 mixer SHOULD build the CSRC list for the outgoing packet using the P58L36 CSRC identifiers from already-mixed input packets and the SSRC P58L37 identifiers from unmixed input packets. This is shown in the output P58L38 arc from mixer M3 labeled M3:89(64,45) in the figure. As in the case P58L39 of mixers that are not cascaded, if the resulting CSRC list has more P58L40 than 15 identifiers, the remainder cannot be included. P58L41 P58L42 P58L43 P58L44 P58L45 P58L46 P58L47 P58L48 P59L1 8. SSRC Identifier Allocation and Use P59L2 P59L3 The SSRC identifier carried in the RTP header and in various fields P59L4 of RTCP packets is a random 32-bit number that is required to be P59L5 globally unique within an RTP session. It is crucial that the number P59L6 be chosen with care in order that participants on the same network or P59L7 starting at the same time are not likely to choose the same number. P59L8 P59L9 It is not sufficient to use the local network address (such as an P59L10 IPv4 address) for the identifier because the address may not be P59L11 unique. Since RTP translators and mixers enable interoperation among P59L12 multiple networks with different address spaces, the allocation P59L13 patterns for addresses within two spaces might result in a much P59L14 higher rate of collision than would occur with random allocation. P59L15 P59L16 Multiple sources running on one host would also conflict. P59L17 P59L18 It is also not sufficient to obtain an SSRC identifier simply by P59L19 calling random() without carefully initializing the state. An P59L20 example of how to generate a random identifier is presented in P59L21 Appendix A.6. P59L22 P59L23 8.1 Probability of Collision P59L24 P59L25 Since the identifiers are chosen randomly, it is possible that two or P59L26 more sources will choose the same number. Collision occurs with the P59L27 highest probability when all sources are started simultaneously, for P59L28 example when triggered automatically by some session management P59L29 event. If N is the number of sources and L the length of the P59L30 identifier (here, 32 bits), the probability that two sources P59L31 independently pick the same value can be approximated for large N P59L32 [26] as 1 - exp(-N**2 / 2**(L+1)). For N=1000, the probability is P59L33 roughly 10**-4. P59L34 P59L35 The typical collision probability is much lower than the worst-case P59L36 above. When one new source joins an RTP session in which all the P59L37 other sources already have unique identifiers, the probability of P59L38 collision is just the fraction of numbers used out of the space. P59L39 Again, if N is the number of sources and L the length of the P59L40 identifier, the probability of collision is N / 2**L. For N=1000, P59L41 the probability is roughly 2*10**-7. P59L42 P59L43 The probability of collision is further reduced by the opportunity P59L44 for a new source to receive packets from other participants before P59L45 sending its first packet (either data or control). If the new source P59L46 keeps track of the other participants (by SSRC identifier), then P59L47 P59L48 P60L1 before transmitting its first packet the new source can verify that P60L2 its identifier does not conflict with any that have been received, or P60L3 else choose again. P60L4 P60L5 8.2 Collision Resolution and Loop Detection P60L6 P60L7 Although the probability of SSRC identifier collision is low, all RTP P60L8 implementations MUST be prepared to detect collisions and take the P60L9 appropriate actions to resolve them. If a source discovers at any P60L10 time that another source is using the same SSRC identifier as its P60L11 own, it MUST send an RTCP BYE packet for the old identifier and P60L12 choose another random one. (As explained below, this step is taken P60L13 only once in case of a loop.) If a receiver discovers that two other P60L14 sources are colliding, it MAY keep the packets from one and discard P60L15 the packets from the other when this can be detected by different P60L16 source transport addresses or CNAMEs. The two sources are expected P60L17 to resolve the collision so that the situation doesn't last. P60L18 P60L19 Because the random SSRC identifiers are kept globally unique for each P60L20 RTP session, they can also be used to detect loops that may be P60L21 introduced by mixers or translators. A loop causes duplication of P60L22 data and control information, either unmodified or possibly mixed, as P60L23 in the following examples: P60L24 P60L25 o A translator may incorrectly forward a packet to the same P60L26 multicast group from which it has received the packet, either P60L27 directly or through a chain of translators. In that case, the P60L28 same packet appears several times, originating from different P60L29 network sources. P60L30 P60L31 o Two translators incorrectly set up in parallel, i.e., with the P60L32 same multicast groups on both sides, would both forward packets P60L33 from one multicast group to the other. Unidirectional translators P60L34 would produce two copies; bidirectional translators would form a P60L35 loop. P60L36 P60L37 o A mixer can close a loop by sending to the same transport P60L38 destination upon which it receives packets, either directly or P60L39 through another mixer or translator. In this case a source might P60L40 show up both as an SSRC on a data packet and a CSRC in a mixed P60L41 data packet. P60L42 P60L43 A source may discover that its own packets are being looped, or that P60L44 packets from another source are being looped (a third-party loop). P60L45 Both loops and collisions in the random selection of a source P60L46 identifier result in packets arriving with the same SSRC identifier P60L47 but a different source transport address, which may be that of the P60L48 end system originating the packet or an intermediate system. P61L1 Therefore, if a source changes its source transport address, it MAY P61L2 also choose a new SSRC identifier to avoid being interpreted as a P61L3 looped source. (This is not MUST because in some applications of RTP P61L4 sources may be expected to change addresses during a session.) Note P61L5 that if a translator restarts and consequently changes the source P61L6 transport address (e.g., changes the UDP source port number) on which P61L7 it forwards packets, then all those packets will appear to receivers P61L8 to be looped because the SSRC identifiers are applied by the original P61L9 source and will not change. This problem can be avoided by keeping P61L10 the source transport address fixed across restarts, but in any case P61L11 will be resolved after a timeout at the receivers. P61L12 P61L13 Loops or collisions occurring on the far side of a translator or P61L14 mixer cannot be detected using the source transport address if all P61L15 copies of the packets go through the translator or mixer, however, P61L16 collisions may still be detected when chunks from two RTCP SDES P61L17 packets contain the same SSRC identifier but different CNAMEs. P61L18 P61L19 To detect and resolve these conflicts, an RTP implementation MUST P61L20 include an algorithm similar to the one described below, though the P61L21 implementation MAY choose a different policy for which packets from P61L22 colliding third-party sources are kept. The algorithm described P61L23 below ignores packets from a new source or loop that collide with an P61L24 established source. It resolves collisions with the participant's P61L25 own SSRC identifier by sending an RTCP BYE for the old identifier and P61L26 choosing a new one. However, when the collision was induced by a P61L27 loop of the participant's own packets, the algorithm will choose a P61L28 new identifier only once and thereafter ignore packets from the P61L29 looping source transport address. This is required to avoid a flood P61L30 of BYE packets. P61L31 P61L32 This algorithm requires keeping a table indexed by the source P61L33 identifier and containing the source transport addresses from the P61L34 first RTP packet and first RTCP packet received with that identifier, P61L35 along with other state for that source. Two source transport P61L36 addresses are required since, for example, the UDP source port P61L37 numbers may be different on RTP and RTCP packets. However, it may be P61L38 assumed that the network address is the same in both source transport P61L39 addresses. P61L40 P61L41 Each SSRC or CSRC identifier received in an RTP or RTCP packet is P61L42 looked up in the source identifier table in order to process that P61L43 data or control information. The source transport address from the P61L44 packet is compared to the corresponding source transport address in P61L45 the table to detect a loop or collision if they don't match. For P61L46 control packets, each element with its own SSRC identifier, for P61L47 example an SDES chunk, requires a separate lookup. (The SSRC P61L48 identifier in a reception report block is an exception because it P62L1 identifies a source heard by the reporter, and that SSRC identifier P62L2 is unrelated to the source transport address of the RTCP packet sent P62L3 by the reporter.) If the SSRC or CSRC is not found, a new entry is P62L4 created. These table entries are removed when an RTCP BYE packet is P62L5 received with the corresponding SSRC identifier and validated by a P62L6 matching source transport address, or after no packets have arrived P62L7 for a relatively long time (see Section 6.2.1). P62L8 P62L9 Note that if two sources on the same host are transmitting with the P62L10 same source identifier at the time a receiver begins operation, it P62L11 would be possible that the first RTP packet received came from one of P62L12 the sources while the first RTCP packet received came from the other. P62L13 This would cause the wrong RTCP information to be associated with the P62L14 RTP data, but this situation should be sufficiently rare and harmless P62L15 that it may be disregarded. P62L16 P62L17 In order to track loops of the participant's own data packets, the P62L18 implementation MUST also keep a separate list of source transport P62L19 addresses (not identifiers) that have been found to be conflicting. P62L20 As in the source identifier table, two source transport addresses P62L21 MUST be kept to separately track conflicting RTP and RTCP packets. P62L22 Note that the conflicting address list should be short, usually P62L23 empty. Each element in this list stores the source addresses plus P62L24 the time when the most recent conflicting packet was received. An P62L25 element MAY be removed from the list when no conflicting packet has P62L26 arrived from that source for a time on the order of 10 RTCP report P62L27 intervals (see Section 6.2). P62L28 P62L29 For the algorithm as shown, it is assumed that the participant's own P62L30 source identifier and state are included in the source identifier P62L31 table. The algorithm could be restructured to first make a separate P62L32 comparison against the participant's own source identifier. P62L33 P62L34 if (SSRC or CSRC identifier is not found in the source P62L35 identifier table) { P62L36 create a new entry storing the data or control source P62L37 transport address, the SSRC or CSRC and other state; P62L38 } P62L39 P62L40 /* Identifier is found in the table */ P62L41 P62L42 else if (table entry was created on receipt of a control packet P62L43 and this is the first data packet or vice versa) { P62L44 store the source transport address from this packet; P62L45 } P62L46 else if (source transport address from the packet does not match P62L47 the one saved in the table entry for this identifier) { P62L48 P63L1 /* An identifier collision or a loop is indicated */ P63L2 P63L3 if (source identifier is not the participant's own) { P63L4 /* OPTIONAL error counter step */ P63L5 if (source identifier is from an RTCP SDES chunk P63L6 containing a CNAME item that differs from the CNAME P63L7 in the table entry) { P63L8 count a third-party collision; P63L9 } else { P63L10 count a third-party loop; P63L11 } P63L12 abort processing of data packet or control element; P63L13 /* MAY choose a different policy to keep new source */ P63L14 } P63L15 P63L16 /* A collision or loop of the participant's own packets */ P63L17 P63L18 else if (source transport address is found in the list of P63L19 conflicting data or control source transport P63L20 addresses) { P63L21 /* OPTIONAL error counter step */ P63L22 if (source identifier is not from an RTCP SDES chunk P63L23 containing a CNAME item or CNAME is the P63L24 participant's own) { P63L25 count occurrence of own traffic looped; P63L26 } P63L27 mark current time in conflicting address list entry; P63L28 abort processing of data packet or control element; P63L29 } P63L30 P63L31 /* New collision, change SSRC identifier */ P63L32 P63L33 else { P63L34 log occurrence of a collision; P63L35 create a new entry in the conflicting data or control P63L36 source transport address list and mark current time; P63L37 send an RTCP BYE packet with the old SSRC identifier; P63L38 choose a new SSRC identifier; P63L39 create a new entry in the source identifier table with P63L40 the old SSRC plus the source transport address from P63L41 the data or control packet being processed; P63L42 } P63L43 } P63L44 P63L45 In this algorithm, packets from a newly conflicting source address P63L46 will be ignored and packets from the original source address will be P63L47 kept. If no packets arrive from the original source for an extended P63L48 period, the table entry will be timed out and the new source will be P64L1 able to take over. This might occur if the original source detects P64L2 the collision and moves to a new source identifier, but in the usual P64L3 case an RTCP BYE packet will be received from the original source to P64L4 delete the state without having to wait for a timeout. P64L5 P64L6 If the original source address was received through a mixer (i.e., P64L7 learned as a CSRC) and later the same source is received directly, P64L8 the receiver may be well advised to switch to the new source address P64L9 unless other sources in the mix would be lost. Furthermore, for P64L10 applications such as telephony in which some sources such as mobile P64L11 entities may change addresses during the course of an RTP session, P64L12 the RTP implementation SHOULD modify the collision detection P64L13 algorithm to accept packets from the new source transport address. P64L14 To guard against flip-flopping between addresses if a genuine P64L15 collision does occur, the algorithm SHOULD include some means to P64L16 detect this case and avoid switching. P64L17 P64L18 When a new SSRC identifier is chosen due to a collision, the P64L19 candidate identifier SHOULD first be looked up in the source P64L20 identifier table to see if it was already in use by some other P64L21 source. If so, another candidate MUST be generated and the process P64L22 repeated. P64L23 P64L24 A loop of data packets to a multicast destination can cause severe P64L25 network flooding. All mixers and translators MUST implement a loop P64L26 detection algorithm like the one here so that they can break loops. P64L27 This should limit the excess traffic to no more than one duplicate P64L28 copy of the original traffic, which may allow the session to continue P64L29 so that the cause of the loop can be found and fixed. However, in P64L30 extreme cases where a mixer or translator does not properly break the P64L31 loop and high traffic levels result, it may be necessary for end P64L32 systems to cease transmitting data or control packets entirely. This P64L33 decision may depend upon the application. An error condition SHOULD P64L34 be indicated as appropriate. Transmission MAY be attempted again P64L35 periodically after a long, random time (on the order of minutes). P64L36 P64L37 8.3 Use with Layered Encodings P64L38 P64L39 For layered encodings transmitted on separate RTP sessions (see P64L40 Section 2.4), a single SSRC identifier space SHOULD be used across P64L41 the sessions of all layers and the core (base) layer SHOULD be used P64L42 for SSRC identifier allocation and collision resolution. When a P64L43 source discovers that it has collided, it transmits an RTCP BYE P64L44 packet on only the base layer but changes the SSRC identifier to the P64L45 new value in all layers. P64L46 P64L47 P64L48 P65L1 9. Security P65L2 P65L3 Lower layer protocols may eventually provide all the security P65L4 services that may be desired for applications of RTP, including P65L5 authentication, integrity, and confidentiality. These services have P65L6 been specified for IP in [27]. Since the initial audio and video P65L7 applications using RTP needed a confidentiality service before such P65L8 services were available for the IP layer, the confidentiality service P65L9 described in the next section was defined for use with RTP and RTCP. P65L10 That description is included here to codify existing practice. New P65L11 applications of RTP MAY implement this RTP-specific confidentiality P65L12 service for backward compatibility, and/or they MAY implement P65L13 alternative security services. The overhead on the RTP protocol for P65L14 this confidentiality service is low, so the penalty will be minimal P65L15 if this service is obsoleted by other services in the future. P65L16 P65L17 Alternatively, other services, other implementations of services and P65L18 other algorithms may be defined for RTP in the future. In P65L19 particular, an RTP profile called Secure Real-time Transport Protocol P65L20 (SRTP) [28] is being developed to provide confidentiality of the RTP P65L21 payload while leaving the RTP header in the clear so that link-level P65L22 header compression algorithms can still operate. It is expected that P65L23 SRTP will be the correct choice for many applications. SRTP is based P65L24 on the Advanced Encryption Standard (AES) and provides stronger P65L25 security than the service described here. No claim is made that the P65L26 methods presented here are appropriate for a particular security P65L27 need. A profile may specify which services and algorithms should be P65L28 offered by applications, and may provide guidance as to their P65L29 appropriate use. P65L30 P65L31 Key distribution and certificates are outside the scope of this P65L32 document. P65L33 P65L34 9.1 Confidentiality P65L35 P65L36 Confidentiality means that only the intended receiver(s) can decode P65L37 the received packets; for others, the packet contains no useful P65L38 information. Confidentiality of the content is achieved by P65L39 encryption. P65L40 P65L41 When it is desired to encrypt RTP or RTCP according to the method P65L42 specified in this section, all the octets that will be encapsulated P65L43 for transmission in a single lower-layer packet are encrypted as a P65L44 unit. For RTCP, a 32-bit random number redrawn for each unit MUST be P65L45 prepended to the unit before encryption. For RTP, no prefix is P65L46 prepended; instead, the sequence number and timestamp fields are P65L47 initialized with random offsets. This is considered to be a weak P65L48 P66L1 initialization vector (IV) because of poor randomness properties. In P66L2 addition, if the subsequent field, the SSRC, can be manipulated by an P66L3 enemy, there is further weakness of the encryption method. P66L4 P66L5 For RTCP, an implementation MAY segregate the individual RTCP packets P66L6 in a compound RTCP packet into two separate compound RTCP packets, P66L7 one to be encrypted and one to be sent in the clear. For example, P66L8 SDES information might be encrypted while reception reports were sent P66L9 in the clear to accommodate third-party monitors that are not privy P66L10 to the encryption key. In this example, depicted in Fig. 4, the SDES P66L11 information MUST be appended to an RR packet with no reports (and the P66L12 random number) to satisfy the requirement that all compound RTCP P66L13 packets begin with an SR or RR packet. The SDES CNAME item is P66L14 required in either the encrypted or unencrypted packet, but not both. P66L15 The same SDES information SHOULD NOT be carried in both packets as P66L16 this may compromise the encryption. P66L17 P66L18 UDP packet UDP packet P66L19 ----------------------------- ------------------------------ P66L20 [random][RR][SDES #CNAME ...] [SR #senderinfo #site1 #site2] P66L21 ----------------------------- ------------------------------ P66L22 encrypted not encrypted P66L23 P66L24 #: SSRC identifier P66L25 P66L26 Figure 4: Encrypted and non-encrypted RTCP packets P66L27 P66L28 The presence of encryption and the use of the correct key are P66L29 confirmed by the receiver through header or payload validity checks. P66L30 Examples of such validity checks for RTP and RTCP headers are given P66L31 in Appendices A.1 and A.2. P66L32 P66L33 To be consistent with existing implementations of the initial P66L34 specification of RTP in RFC 1889, the default encryption algorithm is P66L35 the Data Encryption Standard (DES) algorithm in cipher block chaining P66L36 (CBC) mode, as described in Section 1.1 of RFC 1423 [29], except that P66L37 padding to a multiple of 8 octets is indicated as described for the P P66L38 bit in Section 5.1. The initialization vector is zero because random P66L39 values are supplied in the RTP header or by the random prefix for P66L40 compound RTCP packets. For details on the use of CBC initialization P66L41 vectors, see [30]. P66L42 P66L43 Implementations that support the encryption method specified here P66L44 SHOULD always support the DES algorithm in CBC mode as the default P66L45 cipher for this method to maximize interoperability. This method was P66L46 chosen because it has been demonstrated to be easy and practical to P66L47 use in experimental audio and video tools in operation on the P66L48 Internet. However, DES has since been found to be too easily broken. P67L1 It is RECOMMENDED that stronger encryption algorithms such as P67L2 Triple-DES be used in place of the default algorithm. Furthermore, P67L3 secure CBC mode requires that the first block of each packet be XORed P67L4 with a random, independent IV of the same size as the cipher's block P67L5 size. For RTCP, this is (partially) achieved by prepending each P67L6 packet with a 32-bit random number, independently chosen for each P67L7 packet. For RTP, the timestamp and sequence number start from random P67L8 values, but consecutive packets will not be independently randomized. P67L9 It should be noted that the randomness in both cases (RTP and RTCP) P67L10 is limited. High-security applications SHOULD consider other, more P67L11 conventional, protection means. Other encryption algorithms MAY be P67L12 specified dynamically for a session by non-RTP means. In particular, P67L13 the SRTP profile [28] based on AES is being developed to take into P67L14 account known plaintext and CBC plaintext manipulation concerns, and P67L15 will be the correct choice in the future. P67L16 P67L17 As an alternative to encryption at the IP level or at the RTP level P67L18 as described above, profiles MAY define additional payload types for P67L19 encrypted encodings. Those encodings MUST specify how padding and P67L20 other aspects of the encryption are to be handled. This method P67L21 allows encrypting only the data while leaving the headers in the P67L22 clear for applications where that is desired. It may be particularly P67L23 useful for hardware devices that will handle both decryption and P67L24 decoding. It is also valuable for applications where link-level P67L25 compression of RTP and lower-layer headers is desired and P67L26 confidentiality of the payload (but not addresses) is sufficient P67L27 since encryption of the headers precludes compression. P67L28 P67L29 9.2 Authentication and Message Integrity P67L30 P67L31 Authentication and message integrity services are not defined at the P67L32 RTP level since these services would not be directly feasible without P67L33 a key management infrastructure. It is expected that authentication P67L34 and integrity services will be provided by lower layer protocols. P67L35 P67L36 10. Congestion Control P67L37 P67L38 All transport protocols used on the Internet need to address P67L39 congestion control in some way [31]. RTP is not an exception, but P67L40 because the data transported over RTP is often inelastic (generated P67L41 at a fixed or controlled rate), the means to control congestion in P67L42 RTP may be quite different from those for other transport protocols P67L43 such as TCP. In one sense, inelasticity reduces the risk of P67L44 congestion because the RTP stream will not expand to consume all P67L45 available bandwidth as a TCP stream can. However, inelasticity also P67L46 means that the RTP stream cannot arbitrarily reduce its load on the P67L47 network to eliminate congestion when it occurs. P67L48 P68L1 Since RTP may be used for a wide variety of applications in many P68L2 different contexts, there is no single congestion control mechanism P68L3 that will work for all. Therefore, congestion control SHOULD be P68L4 defined in each RTP profile as appropriate. For some profiles, it P68L5 may be sufficient to include an applicability statement restricting P68L6 the use of that profile to environments where congestion is avoided P68L7 by engineering. For other profiles, specific methods such as data P68L8 rate adaptation based on RTCP feedback may be required. P68L9 P68L10 11. RTP over Network and Transport Protocols P68L11 P68L12 This section describes issues specific to carrying RTP packets within P68L13 particular network and transport protocols. The following rules P68L14 apply unless superseded by protocol-specific definitions outside this P68L15 specification. P68L16 P68L17 RTP relies on the underlying protocol(s) to provide demultiplexing of P68L18 RTP data and RTCP control streams. For UDP and similar protocols, P68L19 RTP SHOULD use an even destination port number and the corresponding P68L20 RTCP stream SHOULD use the next higher (odd) destination port number. P68L21 For applications that take a single port number as a parameter and P68L22 derive the RTP and RTCP port pair from that number, if an odd number P68L23 is supplied then the application SHOULD replace that number with the P68L24 next lower (even) number to use as the base of the port pair. For P68L25 applications in which the RTP and RTCP destination port numbers are P68L26 specified via explicit, separate parameters (using a signaling P68L27 protocol or other means), the application MAY disregard the P68L28 restrictions that the port numbers be even/odd and consecutive P68L29 although the use of an even/odd port pair is still encouraged. The P68L30 RTP and RTCP port numbers MUST NOT be the same since RTP relies on P68L31 the port numbers to demultiplex the RTP data and RTCP control P68L32 streams. P68L33 P68L34 In a unicast session, both participants need to identify a port pair P68L35 for receiving RTP and RTCP packets. Both participants MAY use the P68L36 same port pair. A participant MUST NOT assume that the source port P68L37 of the incoming RTP or RTCP packet can be used as the destination P68L38 port for outgoing RTP or RTCP packets. When RTP data packets are P68L39 being sent in both directions, each participant's RTCP SR packets P68L40 MUST be sent to the port that the other participant has specified for P68L41 reception of RTCP. The RTCP SR packets combine sender information P68L42 for the outgoing data plus reception report information for the P68L43 incoming data. If a side is not actively sending data (see Section P68L44 6.4), an RTCP RR packet is sent instead. P68L45 P68L46 It is RECOMMENDED that layered encoding applications (see Section P68L47 2.4) use a set of contiguous port numbers. The port numbers MUST be P68L48 distinct because of a widespread deficiency in existing operating P69L1 systems that prevents use of the same port with multiple multicast P69L2 addresses, and for unicast, there is only one permissible address. P69L3 Thus for layer n, the data port is P + 2n, and the control port is P P69L4 + 2n + 1. When IP multicast is used, the addresses MUST also be P69L5 distinct because multicast routing and group membership are managed P69L6 on an address granularity. However, allocation of contiguous IP P69L7 multicast addresses cannot be assumed because some groups may require P69L8 different scopes and may therefore be allocated from different P69L9 address ranges. P69L10 P69L11 The previous paragraph conflicts with the SDP specification, RFC 2327 P69L12 [15], which says that it is illegal for both multiple addresses and P69L13 multiple ports to be specified in the same session description P69L14 because the association of addresses with ports could be ambiguous. P69L15 It is intended that this restriction will be relaxed in a revision of P69L16 RFC 2327 to allow an equal number of addresses and ports to be P69L17 specified with a one-to-one mapping implied. P69L18 P69L19 RTP data packets contain no length field or other delineation, P69L20 therefore RTP relies on the underlying protocol(s) to provide a P69L21 length indication. The maximum length of RTP packets is limited only P69L22 by the underlying protocols. P69L23 P69L24 If RTP packets are to be carried in an underlying protocol that P69L25 provides the abstraction of a continuous octet stream rather than P69L26 messages (packets), an encapsulation of the RTP packets MUST be P69L27 defined to provide a framing mechanism. Framing is also needed if P69L28 the underlying protocol may contain padding so that the extent of the P69L29 RTP payload cannot be determined. The framing mechanism is not P69L30 defined here. P69L31 P69L32 A profile MAY specify a framing method to be used even when RTP is P69L33 carried in protocols that do provide framing in order to allow P69L34 carrying several RTP packets in one lower-layer protocol data unit, P69L35 such as a UDP packet. Carrying several RTP packets in one network or P69L36 transport packet reduces header overhead and may simplify P69L37 synchronization between different streams. P69L38 P69L39 12. Summary of Protocol Constants P69L40 P69L41 This section contains a summary listing of the constants defined in P69L42 this specification. P69L43 P69L44 The RTP payload type (PT) constants are defined in profiles rather P69L45 than this document. However, the octet of the RTP header which P69L46 contains the marker bit(s) and payload type MUST avoid the reserved P69L47 values 200 and 201 (decimal) to distinguish RTP packets from the RTCP P69L48 SR and RR packet types for the header validation procedure described P70L1 in Appendix A.1. For the standard definition of one marker bit and a P70L2 7-bit payload type field as shown in this specification, this P70L3 restriction means that payload types 72 and 73 are reserved. P70L4 P70L5 12.1 RTCP Packet Types P70L6 P70L7 abbrev. name value P70L8 SR sender report 200 P70L9 RR receiver report 201 P70L10 SDES source description 202 P70L11 BYE goodbye 203 P70L12 APP application-defined 204 P70L13 P70L14 These type values were chosen in the range 200-204 for improved P70L15 header validity checking of RTCP packets compared to RTP packets or P70L16 other unrelated packets. When the RTCP packet type field is compared P70L17 to the corresponding octet of the RTP header, this range corresponds P70L18 to the marker bit being 1 (which it usually is not in data packets) P70L19 and to the high bit of the standard payload type field being 1 (since P70L20 the static payload types are typically defined in the low half). P70L21 This range was also chosen to be some distance numerically from 0 and P70L22 255 since all-zeros and all-ones are common data patterns. P70L23 P70L24 Since all compound RTCP packets MUST begin with SR or RR, these codes P70L25 were chosen as an even/odd pair to allow the RTCP validity check to P70L26 test the maximum number of bits with mask and value. P70L27 P70L28 Additional RTCP packet types may be registered through IANA (see P70L29 Section 15). P70L30 P70L31 12.2 SDES Types P70L32 P70L33 abbrev. name value P70L34 END end of SDES list 0 P70L35 CNAME canonical name 1 P70L36 NAME user name 2 P70L37 EMAIL user's electronic mail address 3 P70L38 PHONE user's phone number 4 P70L39 LOC geographic user location 5 P70L40 TOOL name of application or tool 6 P70L41 NOTE notice about the source 7 P70L42 PRIV private extensions 8 P70L43 P70L44 Additional SDES types may be registered through IANA (see Section P70L45 15). P70L46 P70L47 P70L48 P71L1 13. RTP Profiles and Payload Format Specifications P71L2 P71L3 A complete specification of RTP for a particular application will P71L4 require one or more companion documents of two types described here: P71L5 profiles, and payload format specifications. P71L6 P71L7 RTP may be used for a variety of applications with somewhat differing P71L8 requirements. The flexibility to adapt to those requirements is P71L9 provided by allowing multiple choices in the main protocol P71L10 specification, then selecting the appropriate choices or defining P71L11 extensions for a particular environment and class of applications in P71L12 a separate profile document. Typically an application will operate P71L13 under only one profile in a particular RTP session, so there is no P71L14 explicit indication within the RTP protocol itself as to which P71L15 profile is in use. A profile for audio and video applications may be P71L16 found in the companion RFC 3551. Profiles are typically titled "RTP P71L17 Profile for ...". P71L18 P71L19 The second type of companion document is a payload format P71L20 specification, which defines how a particular kind of payload data, P71L21 such as H.261 encoded video, should be carried in RTP. These P71L22 documents are typically titled "RTP Payload Format for XYZ P71L23 Audio/Video Encoding". Payload formats may be useful under multiple P71L24 profiles and may therefore be defined independently of any particular P71L25 profile. The profile documents are then responsible for assigning a P71L26 default mapping of that format to a payload type value if needed. P71L27 P71L28 Within this specification, the following items have been identified P71L29 for possible definition within a profile, but this list is not meant P71L30 to be exhaustive: P71L31 P71L32 RTP data header: The octet in the RTP data header that contains P71L33 the marker bit and payload type field MAY be redefined by a P71L34 profile to suit different requirements, for example with more or P71L35 fewer marker bits (Section 5.3, p. 18). P71L36 P71L37 Payload types: Assuming that a payload type field is included, P71L38 the profile will usually define a set of payload formats (e.g., P71L39 media encodings) and a default static mapping of those formats to P71L40 payload type values. Some of the payload formats may be defined P71L41 by reference to separate payload format specifications. For each P71L42 payload type defined, the profile MUST specify the RTP timestamp P71L43 clock rate to be used (Section 5.1, p. 14). P71L44 P71L45 RTP data header additions: Additional fields MAY be appended to P71L46 the fixed RTP data header if some additional functionality is P71L47 required across the profile's class of applications independent of P71L48 payload type (Section 5.3, p. 18). P72L1 RTP data header extensions: The contents of the first 16 bits of P72L2 the RTP data header extension structure MUST be defined if use of P72L3 that mechanism is to be allowed under the profile for P72L4 implementation-specific extensions (Section 5.3.1, p. 18). P72L5 P72L6 RTCP packet types: New application-class-specific RTCP packet P72L7 types MAY be defined and registered with IANA. P72L8 P72L9 RTCP report interval: A profile SHOULD specify that the values P72L10 suggested in Section 6.2 for the constants employed in the P72L11 calculation of the RTCP report interval will be used. Those are P72L12 the RTCP fraction of session bandwidth, the minimum report P72L13 interval, and the bandwidth split between senders and receivers. P72L14 A profile MAY specify alternate values if they have been P72L15 demonstrated to work in a scalable manner. P72L16 P72L17 SR/RR extension: An extension section MAY be defined for the P72L18 RTCP SR and RR packets if there is additional information that P72L19 should be reported regularly about the sender or receivers P72L20 (Section 6.4.3, p. 42 and 43). P72L21 P72L22 SDES use: The profile MAY specify the relative priorities for P72L23 RTCP SDES items to be transmitted or excluded entirely (Section P72L24 6.3.9); an alternate syntax or semantics for the CNAME item P72L25 (Section 6.5.1); the format of the LOC item (Section 6.5.5); the P72L26 semantics and use of the NOTE item (Section 6.5.7); or new SDES P72L27 item types to be registered with IANA. P72L28 P72L29 Security: A profile MAY specify which security services and P72L30 algorithms should be offered by applications, and MAY provide P72L31 guidance as to their appropriate use (Section 9, p. 65). P72L32 P72L33 String-to-key mapping: A profile MAY specify how a user-provided P72L34 password or pass phrase is mapped into an encryption key. P72L35 P72L36 Congestion: A profile SHOULD specify the congestion control P72L37 behavior appropriate for that profile. P72L38 P72L39 Underlying protocol: Use of a particular underlying network or P72L40 transport layer protocol to carry RTP packets MAY be required. P72L41 P72L42 Transport mapping: A mapping of RTP and RTCP to transport-level P72L43 addresses, e.g., UDP ports, other than the standard mapping P72L44 defined in Section 11, p. 68 may be specified. P72L45 P72L46 P72L47 P72L48 P73L1 Encapsulation: An encapsulation of RTP packets may be defined to P73L2 allow multiple RTP data packets to be carried in one lower-layer P73L3 packet or to provide framing over underlying protocols that do not P73L4 already do so (Section 11, p. 69). P73L5 P73L6 It is not expected that a new profile will be required for every P73L7 application. Within one application class, it would be better to P73L8 extend an existing profile rather than make a new one in order to P73L9 facilitate interoperation among the applications since each will P73L10 typically run under only one profile. Simple extensions such as the P73L11 definition of additional payload type values or RTCP packet types may P73L12 be accomplished by registering them through IANA and publishing their P73L13 descriptions in an addendum to the profile or in a payload format P73L14 specification. P73L15 P73L16 14. Security Considerations P73L17 P73L18 RTP suffers from the same security liabilities as the underlying P73L19 protocols. For example, an impostor can fake source or destination P73L20 network addresses, or change the header or payload. Within RTCP, the P73L21 CNAME and NAME information may be used to impersonate another P73L22 participant. In addition, RTP may be sent via IP multicast, which P73L23 provides no direct means for a sender to know all the receivers of P73L24 the data sent and therefore no measure of privacy. Rightly or not, P73L25 users may be more sensitive to privacy concerns with audio and video P73L26 communication than they have been with more traditional forms of P73L27 network communication [33]. Therefore, the use of security P73L28 mechanisms with RTP is important. These mechanisms are discussed in P73L29 Section 9. P73L30 P73L31 RTP-level translators or mixers may be used to allow RTP traffic to P73L32 reach hosts behind firewalls. Appropriate firewall security P73L33 principles and practices, which are beyond the scope of this P73L34 document, should be followed in the design and installation of these P73L35 devices and in the admission of RTP applications for use behind the P73L36 firewall. P73L37 P73L38 15. IANA Considerations P73L39 P73L40 Additional RTCP packet types and SDES item types may be registered P73L41 through the Internet Assigned Numbers Authority (IANA). Since these P73L42 number spaces are small, allowing unconstrained registration of new P73L43 values would not be prudent. To facilitate review of requests and to P73L44 promote shared use of new types among multiple applications, requests P73L45 for registration of new values must be documented in an RFC or other P73L46 permanent and readily available reference such as the product of P73L47 another cooperative standards body (e.g., ITU-T). Other requests may P73L48 also be accepted, under the advice of a "designated expert." P74L1 (Contact the IANA for the contact information of the current expert.) P74L2 P74L3 RTP profile specifications SHOULD register with IANA a name for the P74L4 profile in the form "RTP/xxx", where xxx is a short abbreviation of P74L5 the profile title. These names are for use by higher-level control P74L6 protocols, such as the Session Description Protocol (SDP), RFC 2327 P74L7 [15], to refer to transport methods. P74L8 P74L9 16. Intellectual Property Rights Statement P74L10 P74L11 The IETF takes no position regarding the validity or scope of any P74L12 intellectual property or other rights that might be claimed to P74L13 pertain to the implementation or use of the technology described in P74L14 this document or the extent to which any license under such rights P74L15 might or might not be available; neither does it represent that it P74L16 has made any effort to identify any such rights. Information on the P74L17 IETF's procedures with respect to rights in standards-track and P74L18 standards-related documentation can be found in BCP-11. Copies of P74L19 claims of rights made available for publication and any assurances of P74L20 licenses to be made available, or the result of an attempt made to P74L21 obtain a general license or permission for the use of such P74L22 proprietary rights by implementors or users of this specification can P74L23 be obtained from the IETF Secretariat. P74L24 P74L25 The IETF invites any interested party to bring to its attention any P74L26 copyrights, patents or patent applications, or other proprietary P74L27 rights which may cover technology that may be required to practice P74L28 this standard. Please address the information to the IETF Executive P74L29 Director. P74L30 P74L31 17. Acknowledgments P74L32 P74L33 This memorandum is based on discussions within the IETF Audio/Video P74L34 Transport working group chaired by Stephen Casner and Colin Perkins. P74L35 The current protocol has its origins in the Network Voice Protocol P74L36 and the Packet Video Protocol (Danny Cohen and Randy Cole) and the P74L37 protocol implemented by the vat application (Van Jacobson and Steve P74L38 McCanne). Christian Huitema provided ideas for the random identifier P74L39 generator. Extensive analysis and simulation of the timer P74L40 reconsideration algorithm was done by Jonathan Rosenberg. The P74L41 additions for layered encodings were specified by Michael Speer and P74L42 Steve McCanne. P74L43 P74L44 P74L45 P74L46 P74L47 P74L48 P75L1 Appendix A - Algorithms P75L2 P75L3 We provide examples of C code for aspects of RTP sender and receiver P75L4 algorithms. There may be other implementation methods that are P75L5 faster in particular operating environments or have other advantages. P75L6 These implementation notes are for informational purposes only and P75L7 are meant to clarify the RTP specification. P75L8 P75L9 The following definitions are used for all examples; for clarity and P75L10 brevity, the structure definitions are only valid for 32-bit big- P75L11 endian (most significant octet first) architectures. Bit fields are P75L12 assumed to be packed tightly in big-endian bit order, with no P75L13 additional padding. Modifications would be required to construct a P75L14 portable implementation. P75L15 P75L16 /* P75L17 * rtp.h -- RTP header file P75L18 */ P75L19 #include P75L20 P75L21 /* P75L22 * The type definitions below are valid for 32-bit architectures and P75L23 * may have to be adjusted for 16- or 64-bit architectures. P75L24 */ P75L25 typedef unsigned char u_int8; P75L26 typedef unsigned short u_int16; P75L27 typedef unsigned int u_int32; P75L28 typedef short int16; P75L29 P75L30 /* P75L31 * Current protocol version. P75L32 */ P75L33 #define RTP_VERSION 2 P75L34 P75L35 #define RTP_SEQ_MOD (1<<16) P75L36 #define RTP_MAX_SDES 255 /* maximum text length for SDES */ P75L37 P75L38 typedef enum { P75L39 RTCP_SR = 200, P75L40 RTCP_RR = 201, P75L41 RTCP_SDES = 202, P75L42 RTCP_BYE = 203, P75L43 RTCP_APP = 204 P75L44 } rtcp_type_t; P75L45 P75L46 typedef enum { P75L47 RTCP_SDES_END = 0, P75L48 RTCP_SDES_CNAME = 1, P76L1 RTCP_SDES_NAME = 2, P76L2 RTCP_SDES_EMAIL = 3, P76L3 RTCP_SDES_PHONE = 4, P76L4 RTCP_SDES_LOC = 5, P76L5 RTCP_SDES_TOOL = 6, P76L6 RTCP_SDES_NOTE = 7, P76L7 RTCP_SDES_PRIV = 8 P76L8 } rtcp_sdes_type_t; P76L9 P76L10 /* P76L11 * RTP data header P76L12 */ P76L13 typedef struct { P76L14 unsigned int version:2; /* protocol version */ P76L15 unsigned int p:1; /* padding flag */ P76L16 unsigned int x:1; /* header extension flag */ P76L17 unsigned int cc:4; /* CSRC count */ P76L18 unsigned int m:1; /* marker bit */ P76L19 unsigned int pt:7; /* payload type */ P76L20 unsigned int seq:16; /* sequence number */ P76L21 u_int32 ts; /* timestamp */ P76L22 u_int32 ssrc; /* synchronization source */ P76L23 u_int32 csrc[1]; /* optional CSRC list */ P76L24 } rtp_hdr_t; P76L25 P76L26 /* P76L27 * RTCP common header word P76L28 */ P76L29 typedef struct { P76L30 unsigned int version:2; /* protocol version */ P76L31 unsigned int p:1; /* padding flag */ P76L32 unsigned int count:5; /* varies by packet type */ P76L33 unsigned int pt:8; /* RTCP packet type */ P76L34 u_int16 length; /* pkt len in words, w/o this word */ P76L35 } rtcp_common_t; P76L36 P76L37 /* P76L38 * Big-endian mask for version, padding bit and packet type pair P76L39 */ P76L40 #define RTCP_VALID_MASK (0xc000 | 0x2000 | 0xfe) P76L41 #define RTCP_VALID_VALUE ((RTP_VERSION << 14) | RTCP_SR) P76L42 P76L43 /* P76L44 * Reception report block P76L45 */ P76L46 typedef struct { P76L47 u_int32 ssrc; /* data source being reported */ P76L48 unsigned int fraction:8; /* fraction lost since last SR/RR */ P77L1 int lost:24; /* cumul. no. pkts lost (signed!) */ P77L2 u_int32 last_seq; /* extended last seq. no. received */ P77L3 u_int32 jitter; /* interarrival jitter */ P77L4 u_int32 lsr; /* last SR packet from this source */ P77L5 u_int32 dlsr; /* delay since last SR packet */ P77L6 } rtcp_rr_t; P77L7 P77L8 /* P77L9 * SDES item P77L10 */ P77L11 typedef struct { P77L12 u_int8 type; /* type of item (rtcp_sdes_type_t) */ P77L13 u_int8 length; /* length of item (in octets) */ P77L14 char data[1]; /* text, not null-terminated */ P77L15 } rtcp_sdes_item_t; P77L16 P77L17 /* P77L18 * One RTCP packet P77L19 */ P77L20 typedef struct { P77L21 rtcp_common_t common; /* common header */ P77L22 union { P77L23 /* sender report (SR) */ P77L24 struct { P77L25 u_int32 ssrc; /* sender generating this report */ P77L26 u_int32 ntp_sec; /* NTP timestamp */ P77L27 u_int32 ntp_frac; P77L28 u_int32 rtp_ts; /* RTP timestamp */ P77L29 u_int32 psent; /* packets sent */ P77L30 u_int32 osent; /* octets sent */ P77L31 rtcp_rr_t rr[1]; /* variable-length list */ P77L32 } sr; P77L33 P77L34 /* reception report (RR) */ P77L35 struct { P77L36 u_int32 ssrc; /* receiver generating this report */ P77L37 rtcp_rr_t rr[1]; /* variable-length list */ P77L38 } rr; P77L39 P77L40 /* source description (SDES) */ P77L41 struct rtcp_sdes { P77L42 u_int32 src; /* first SSRC/CSRC */ P77L43 rtcp_sdes_item_t item[1]; /* list of SDES items */ P77L44 } sdes; P77L45 P77L46 /* BYE */ P77L47 struct { P77L48 u_int32 src[1]; /* list of sources */ P78L1 /* can't express trailing text for reason */ P78L2 } bye; P78L3 } r; P78L4 } rtcp_t; P78L5 P78L6 typedef struct rtcp_sdes rtcp_sdes_t; P78L7 P78L8 /* P78L9 * Per-source state information P78L10 */ P78L11 typedef struct { P78L12 u_int16 max_seq; /* highest seq. number seen */ P78L13 u_int32 cycles; /* shifted count of seq. number cycles */ P78L14 u_int32 base_seq; /* base seq number */ P78L15 u_int32 bad_seq; /* last 'bad' seq number + 1 */ P78L16 u_int32 probation; /* sequ. packets till source is valid */ P78L17 u_int32 received; /* packets received */ P78L18 u_int32 expected_prior; /* packet expected at last interval */ P78L19 u_int32 received_prior; /* packet received at last interval */ P78L20 u_int32 transit; /* relative trans time for prev pkt */ P78L21 u_int32 jitter; /* estimated jitter */ P78L22 /* ... */ P78L23 } source; P78L24 P78L25 A.1 RTP Data Header Validity Checks P78L26 P78L27 An RTP receiver should check the validity of the RTP header on P78L28 incoming packets since they might be encrypted or might be from a P78L29 different application that happens to be misaddressed. Similarly, if P78L30 encryption according to the method described in Section 9 is enabled, P78L31 the header validity check is needed to verify that incoming packets P78L32 have been correctly decrypted, although a failure of the header P78L33 validity check (e.g., unknown payload type) may not necessarily P78L34 indicate decryption failure. P78L35 P78L36 Only weak validity checks are possible on an RTP data packet from a P78L37 source that has not been heard before: P78L38 P78L39 o RTP version field must equal 2. P78L40 P78L41 o The payload type must be known, and in particular it must not be P78L42 equal to SR or RR. P78L43 P78L44 o If the P bit is set, then the last octet of the packet must P78L45 contain a valid octet count, in particular, less than the total P78L46 packet length minus the header size. P78L47 P78L48 P79L1 o The X bit must be zero if the profile does not specify that the P79L2 header extension mechanism may be used. Otherwise, the extension P79L3 length field must be less than the total packet size minus the P79L4 fixed header length and padding. P79L5 P79L6 o The length of the packet must be consistent with CC and payload P79L7 type (if payloads have a known length). P79L8 P79L9 The last three checks are somewhat complex and not always possible, P79L10 leaving only the first two which total just a few bits. If the SSRC P79L11 identifier in the packet is one that has been received before, then P79L12 the packet is probably valid and checking if the sequence number is P79L13 in the expected range provides further validation. If the SSRC P79L14 identifier has not been seen before, then data packets carrying that P79L15 identifier may be considered invalid until a small number of them P79L16 arrive with consecutive sequence numbers. Those invalid packets MAY P79L17 be discarded or they MAY be stored and delivered once validation has P79L18 been achieved if the resulting delay is acceptable. P79L19 P79L20 The routine update_seq shown below ensures that a source is declared P79L21 valid only after MIN_SEQUENTIAL packets have been received in P79L22 sequence. It also validates the sequence number seq of a newly P79L23 received packet and updates the sequence state for the packet's P79L24 source in the structure to which s points. P79L25 P79L26 When a new source is heard for the first time, that is, its SSRC P79L27 identifier is not in the table (see Section 8.2), and the per-source P79L28 state is allocated for it, s->probation is set to the number of P79L29 sequential packets required before declaring a source valid P79L30 (parameter MIN_SEQUENTIAL) and other variables are initialized: P79L31 P79L32 init_seq(s, seq); P79L33 s->max_seq = seq - 1; P79L34 s->probation = MIN_SEQUENTIAL; P79L35 P79L36 A non-zero s->probation marks the source as not yet valid so the P79L37 state may be discarded after a short timeout rather than a long one, P79L38 as discussed in Section 6.2.1. P79L39 P79L40 After a source is considered valid, the sequence number is considered P79L41 valid if it is no more than MAX_DROPOUT ahead of s->max_seq nor more P79L42 than MAX_MISORDER behind. If the new sequence number is ahead of P79L43 max_seq modulo the RTP sequence number range (16 bits), but is P79L44 smaller than max_seq, it has wrapped around and the (shifted) count P79L45 of sequence number cycles is incremented. A value of one is returned P79L46 to indicate a valid sequence number. P79L47 P79L48 P80L1 Otherwise, the value zero is returned to indicate that the validation P80L2 failed, and the bad sequence number plus 1 is stored. If the next P80L3 packet received carries the next higher sequence number, it is P80L4 considered the valid start of a new packet sequence presumably caused P80L5 by an extended dropout or a source restart. Since multiple complete P80L6 sequence number cycles may have been missed, the packet loss P80L7 statistics are reset. P80L8 P80L9 Typical values for the parameters are shown, based on a maximum P80L10 misordering time of 2 seconds at 50 packets/second and a maximum P80L11 dropout of 1 minute. The dropout parameter MAX_DROPOUT should be a P80L12 small fraction of the 16-bit sequence number space to give a P80L13 reasonable probability that new sequence numbers after a restart will P80L14 not fall in the acceptable range for sequence numbers from before the P80L15 restart. P80L16 P80L17 void init_seq(source *s, u_int16 seq) P80L18 { P80L19 s->base_seq = seq; P80L20 s->max_seq = seq; P80L21 s->bad_seq = RTP_SEQ_MOD + 1; /* so seq == bad_seq is false */ P80L22 s->cycles = 0; P80L23 s->received = 0; P80L24 s->received_prior = 0; P80L25 s->expected_prior = 0; P80L26 /* other initialization */ P80L27 } P80L28 P80L29 int update_seq(source *s, u_int16 seq) P80L30 { P80L31 u_int16 udelta = seq - s->max_seq; P80L32 const int MAX_DROPOUT = 3000; P80L33 const int MAX_MISORDER = 100; P80L34 const int MIN_SEQUENTIAL = 2; P80L35 P80L36 /* P80L37 * Source is not valid until MIN_SEQUENTIAL packets with P80L38 * sequential sequence numbers have been received. P80L39 */ P80L40 if (s->probation) { P80L41 /* packet is in sequence */ P80L42 if (seq == s->max_seq + 1) { P80L43 s->probation--; P80L44 s->max_seq = seq; P80L45 if (s->probation == 0) { P80L46 init_seq(s, seq); P80L47 s->received++; P80L48 return 1; P81L1 } P81L2 } else { P81L3 s->probation = MIN_SEQUENTIAL - 1; P81L4 s->max_seq = seq; P81L5 } P81L6 return 0; P81L7 } else if (udelta < MAX_DROPOUT) { P81L8 /* in order, with permissible gap */ P81L9 if (seq < s->max_seq) { P81L10 /* P81L11 * Sequence number wrapped - count another 64K cycle. P81L12 */ P81L13 s->cycles += RTP_SEQ_MOD; P81L14 } P81L15 s->max_seq = seq; P81L16 } else if (udelta <= RTP_SEQ_MOD - MAX_MISORDER) { P81L17 /* the sequence number made a very large jump */ P81L18 if (seq == s->bad_seq) { P81L19 /* P81L20 * Two sequential packets -- assume that the other side P81L21 * restarted without telling us so just re-sync P81L22 * (i.e., pretend this was the first packet). P81L23 */ P81L24 init_seq(s, seq); P81L25 } P81L26 else { P81L27 s->bad_seq = (seq + 1) & (RTP_SEQ_MOD-1); P81L28 return 0; P81L29 } P81L30 } else { P81L31 /* duplicate or reordered packet */ P81L32 } P81L33 s->received++; P81L34 return 1; P81L35 } P81L36 P81L37 The validity check can be made stronger requiring more than two P81L38 packets in sequence. The disadvantages are that a larger number of P81L39 initial packets will be discarded (or delayed in a queue) and that P81L40 high packet loss rates could prevent validation. However, because P81L41 the RTCP header validation is relatively strong, if an RTCP packet is P81L42 received from a source before the data packets, the count could be P81L43 adjusted so that only two packets are required in sequence. If P81L44 initial data loss for a few seconds can be tolerated, an application P81L45 MAY choose to discard all data packets from a source until a valid P81L46 RTCP packet has been received from that source. P81L47 P81L48 P82L1 Depending on the application and encoding, algorithms may exploit P82L2 additional knowledge about the payload format for further validation. P82L3 For payload types where the timestamp increment is the same for all P82L4 packets, the timestamp values can be predicted from the previous P82L5 packet received from the same source using the sequence number P82L6 difference (assuming no change in payload type). P82L7 P82L8 A strong "fast-path" check is possible since with high probability P82L9 the first four octets in the header of a newly received RTP data P82L10 packet will be just the same as that of the previous packet from the P82L11 same SSRC except that the sequence number will have increased by one. P82L12 Similarly, a single-entry cache may be used for faster SSRC lookups P82L13 in applications where data is typically received from one source at a P82L14 time. P82L15 P82L16 A.2 RTCP Header Validity Checks P82L17 P82L18 The following checks should be applied to RTCP packets. P82L19 P82L20 o RTP version field must equal 2. P82L21 P82L22 o The payload type field of the first RTCP packet in a compound P82L23 packet must be equal to SR or RR. P82L24 P82L25 o The padding bit (P) should be zero for the first packet of a P82L26 compound RTCP packet because padding should only be applied, if it P82L27 is needed, to the last packet. P82L28 P82L29 o The length fields of the individual RTCP packets must add up to P82L30 the overall length of the compound RTCP packet as received. This P82L31 is a fairly strong check. P82L32 P82L33 The code fragment below performs all of these checks. The packet P82L34 type is not checked for subsequent packets since unknown packet types P82L35 may be present and should be ignored. P82L36 P82L37 u_int32 len; /* length of compound RTCP packet in words */ P82L38 rtcp_t *r; /* RTCP header */ P82L39 rtcp_t *end; /* end of compound RTCP packet */ P82L40 P82L41 if ((*(u_int16 *)r & RTCP_VALID_MASK) != RTCP_VALID_VALUE) { P82L42 /* something wrong with packet format */ P82L43 } P82L44 end = (rtcp_t *)((u_int32 *)r + len); P82L45 P82L46 do r = (rtcp_t *)((u_int32 *)r + r->common.length + 1); P82L47 while (r < end && r->common.version == 2); P82L48 P83L1 if (r != end) { P83L2 /* something wrong with packet format */ P83L3 } P83L4 P83L5 A.3 Determining Number of Packets Expected and Lost P83L6 P83L7 In order to compute packet loss rates, the number of RTP packets P83L8 expected and actually received from each source needs to be known, P83L9 using per-source state information defined in struct source P83L10 referenced via pointer s in the code below. The number of packets P83L11 received is simply the count of packets as they arrive, including any P83L12 late or duplicate packets. The number of packets expected can be P83L13 computed by the receiver as the difference between the highest P83L14 sequence number received (s->max_seq) and the first sequence number P83L15 received (s->base_seq). Since the sequence number is only 16 bits P83L16 and will wrap around, it is necessary to extend the highest sequence P83L17 number with the (shifted) count of sequence number wraparounds P83L18 (s->cycles). Both the received packet count and the count of cycles P83L19 are maintained the RTP header validity check routine in Appendix A.1. P83L20 P83L21 extended_max = s->cycles + s->max_seq; P83L22 expected = extended_max - s->base_seq + 1; P83L23 P83L24 The number of packets lost is defined to be the number of packets P83L25 expected less the number of packets actually received: P83L26 P83L27 lost = expected - s->received; P83L28 P83L29 Since this signed number is carried in 24 bits, it should be clamped P83L30 at 0x7fffff for positive loss or 0x800000 for negative loss rather P83L31 than wrapping around. P83L32 P83L33 The fraction of packets lost during the last reporting interval P83L34 (since the previous SR or RR packet was sent) is calculated from P83L35 differences in the expected and received packet counts across the P83L36 interval, where expected_prior and received_prior are the values P83L37 saved when the previous reception report was generated: P83L38 P83L39 expected_interval = expected - s->expected_prior; P83L40 s->expected_prior = expected; P83L41 received_interval = s->received - s->received_prior; P83L42 s->received_prior = s->received; P83L43 lost_interval = expected_interval - received_interval; P83L44 if (expected_interval == 0 || lost_interval <= 0) fraction = 0; P83L45 else fraction = (lost_interval << 8) / expected_interval; P83L46 P83L47 The resulting fraction is an 8-bit fixed point number with the binary P83L48 point at the left edge. P84L1 A.4 Generating RTCP SDES Packets P84L2 P84L3 This function builds one SDES chunk into buffer b composed of argc P84L4 items supplied in arrays type, value and length. It returns a P84L5 pointer to the next available location within b. P84L6 P84L7 char *rtp_write_sdes(char *b, u_int32 src, int argc, P84L8 rtcp_sdes_type_t type[], char *value[], P84L9 int length[]) P84L10 { P84L11 rtcp_sdes_t *s = (rtcp_sdes_t *)b; P84L12 rtcp_sdes_item_t *rsp; P84L13 int i; P84L14 int len; P84L15 int pad; P84L16 P84L17 /* SSRC header */ P84L18 s->src = src; P84L19 rsp = &s->item[0]; P84L20 P84L21 /* SDES items */ P84L22 for (i = 0; i < argc; i++) { P84L23 rsp->type = type[i]; P84L24 len = length[i]; P84L25 if (len > RTP_MAX_SDES) { P84L26 /* invalid length, may want to take other action */ P84L27 len = RTP_MAX_SDES; P84L28 } P84L29 rsp->length = len; P84L30 memcpy(rsp->data, value[i], len); P84L31 rsp = (rtcp_sdes_item_t *)&rsp->data[len]; P84L32 } P84L33 P84L34 /* terminate with end marker and pad to next 4-octet boundary */ P84L35 len = ((char *) rsp) - b; P84L36 pad = 4 - (len & 0x3); P84L37 b = (char *) rsp; P84L38 while (pad--) *b++ = RTCP_SDES_END; P84L39 P84L40 return b; P84L41 } P84L42 P84L43 P84L44 P84L45 P84L46 P84L47 P84L48 P85L1 A.5 Parsing RTCP SDES Packets P85L2 P85L3 This function parses an SDES packet, calling functions find_member() P85L4 to find a pointer to the information for a session member given the P85L5 SSRC identifier and member_sdes() to store the new SDES information P85L6 for that member. This function expects a pointer to the header of P85L7 the RTCP packet. P85L8 P85L9 void rtp_read_sdes(rtcp_t *r) P85L10 { P85L11 int count = r->common.count; P85L12 rtcp_sdes_t *sd = &r->r.sdes; P85L13 rtcp_sdes_item_t *rsp, *rspn; P85L14 rtcp_sdes_item_t *end = (rtcp_sdes_item_t *) P85L15 ((u_int32 *)r + r->common.length + 1); P85L16 source *s; P85L17 P85L18 while (--count >= 0) { P85L19 rsp = &sd->item[0]; P85L20 if (rsp >= end) break; P85L21 s = find_member(sd->src); P85L22 P85L23 for (; rsp->type; rsp = rspn ) { P85L24 rspn = (rtcp_sdes_item_t *)((char*)rsp+rsp->length+2); P85L25 if (rspn >= end) { P85L26 rsp = rspn; P85L27 break; P85L28 } P85L29 member_sdes(s, rsp->type, rsp->data, rsp->length); P85L30 } P85L31 sd = (rtcp_sdes_t *) P85L32 ((u_int32 *)sd + (((char *)rsp - (char *)sd) >> 2)+1); P85L33 } P85L34 if (count >= 0) { P85L35 /* invalid packet format */ P85L36 } P85L37 } P85L38 P85L39 A.6 Generating a Random 32-bit Identifier P85L40 P85L41 The following subroutine generates a random 32-bit identifier using P85L42 the MD5 routines published in RFC 1321 [32]. The system routines may P85L43 not be present on all operating systems, but they should serve as P85L44 hints as to what kinds of information may be used. Other system P85L45 calls that may be appropriate include P85L46 P85L47 P85L48 P86L1 o getdomainname(), P86L2 P86L3 o getwd(), or P86L4 P86L5 o getrusage(). P86L6 P86L7 "Live" video or audio samples are also a good source of random P86L8 numbers, but care must be taken to avoid using a turned-off P86L9 microphone or blinded camera as a source [17]. P86L10 P86L11 Use of this or a similar routine is recommended to generate the P86L12 initial seed for the random number generator producing the RTCP P86L13 period (as shown in Appendix A.7), to generate the initial values for P86L14 the sequence number and timestamp, and to generate SSRC values. P86L15 Since this routine is likely to be CPU-intensive, its direct use to P86L16 generate RTCP periods is inappropriate because predictability is not P86L17 an issue. Note that this routine produces the same result on P86L18 repeated calls until the value of the system clock changes unless P86L19 different values are supplied for the type argument. P86L20 P86L21 /* P86L22 * Generate a random 32-bit quantity. P86L23 */ P86L24 #include /* u_long */ P86L25 #include /* gettimeofday() */ P86L26 #include /* get..() */ P86L27 #include /* printf() */ P86L28 #include /* clock() */ P86L29 #include /* uname() */ P86L30 #include "global.h" /* from RFC 1321 */ P86L31 #include "md5.h" /* from RFC 1321 */ P86L32 P86L33 #define MD_CTX MD5_CTX P86L34 #define MDInit MD5Init P86L35 #define MDUpdate MD5Update P86L36 #define MDFinal MD5Final P86L37 P86L38 static u_long md_32(char *string, int length) P86L39 { P86L40 MD_CTX context; P86L41 union { P86L42 char c[16]; P86L43 u_long x[4]; P86L44 } digest; P86L45 u_long r; P86L46 int i; P86L47 P86L48 MDInit (&context); P87L1 MDUpdate (&context, string, length); P87L2 MDFinal ((unsigned char *)&digest, &context); P87L3 r = 0; P87L4 for (i = 0; i < 3; i++) { P87L5 r ^= digest.x[i]; P87L6 } P87L7 return r; P87L8 } /* md_32 */ P87L9 P87L10 /* P87L11 * Return random unsigned 32-bit quantity. Use 'type' argument if P87L12 * you need to generate several different values in close succession. P87L13 */ P87L14 u_int32 random32(int type) P87L15 { P87L16 struct { P87L17 int type; P87L18 struct timeval tv; P87L19 clock_t cpu; P87L20 pid_t pid; P87L21 u_long hid; P87L22 uid_t uid; P87L23 gid_t gid; P87L24 struct utsname name; P87L25 } s; P87L26 P87L27 gettimeofday(&s.tv, 0); P87L28 uname(&s.name); P87L29 s.type = type; P87L30 s.cpu = clock(); P87L31 s.pid = getpid(); P87L32 s.hid = gethostid(); P87L33 s.uid = getuid(); P87L34 s.gid = getgid(); P87L35 /* also: system uptime */ P87L36 P87L37 return md_32((char *)&s, sizeof(s)); P87L38 } /* random32 */ P87L39 P87L40 A.7 Computing the RTCP Transmission Interval P87L41 P87L42 The following functions implement the RTCP transmission and reception P87L43 rules described in Section 6.2. These rules are coded in several P87L44 functions: P87L45 P87L46 o rtcp_interval() computes the deterministic calculated interval, P87L47 measured in seconds. The parameters are defined in Section 6.3. P87L48 P88L1 o OnExpire() is called when the RTCP transmission timer expires. P88L2 P88L3 o OnReceive() is called whenever an RTCP packet is received. P88L4 P88L5 Both OnExpire() and OnReceive() have event e as an argument. This is P88L6 the next scheduled event for that participant, either an RTCP report P88L7 or a BYE packet. It is assumed that the following functions are P88L8 available: P88L9 P88L10 o Schedule(time t, event e) schedules an event e to occur at time t. P88L11 When time t arrives, the function OnExpire is called with e as an P88L12 argument. P88L13 P88L14 o Reschedule(time t, event e) reschedules a previously scheduled P88L15 event e for time t. P88L16 P88L17 o SendRTCPReport(event e) sends an RTCP report. P88L18 P88L19 o SendBYEPacket(event e) sends a BYE packet. P88L20 P88L21 o TypeOfEvent(event e) returns EVENT_BYE if the event being P88L22 processed is for a BYE packet to be sent, else it returns P88L23 EVENT_REPORT. P88L24 P88L25 o PacketType(p) returns PACKET_RTCP_REPORT if packet p is an RTCP P88L26 report (not BYE), PACKET_BYE if its a BYE RTCP packet, and P88L27 PACKET_RTP if its a regular RTP data packet. P88L28 P88L29 o ReceivedPacketSize() and SentPacketSize() return the size of the P88L30 referenced packet in octets. P88L31 P88L32 o NewMember(p) returns a 1 if the participant who sent packet p is P88L33 not currently in the member list, 0 otherwise. Note this function P88L34 is not sufficient for a complete implementation because each CSRC P88L35 identifier in an RTP packet and each SSRC in a BYE packet should P88L36 be processed. P88L37 P88L38 o NewSender(p) returns a 1 if the participant who sent packet p is P88L39 not currently in the sender sublist of the member list, 0 P88L40 otherwise. P88L41 P88L42 o AddMember() and RemoveMember() to add and remove participants from P88L43 the member list. P88L44 P88L45 o AddSender() and RemoveSender() to add and remove participants from P88L46 the sender sublist of the member list. P88L47 P88L48 P89L1 These functions would have to be extended for an implementation that P89L2 allows the RTCP bandwidth fractions for senders and non-senders to be P89L3 specified as explicit parameters rather than fixed values of 25% and P89L4 75%. The extended implementation of rtcp_interval() would need to P89L5 avoid division by zero if one of the parameters was zero. P89L6 P89L7 double rtcp_interval(int members, P89L8 int senders, P89L9 double rtcp_bw, P89L10 int we_sent, P89L11 double avg_rtcp_size, P89L12 int initial) P89L13 { P89L14 /* P89L15 * Minimum average time between RTCP packets from this site (in P89L16 * seconds). This time prevents the reports from `clumping' when P89L17 * sessions are small and the law of large numbers isn't helping P89L18 * to smooth out the traffic. It also keeps the report interval P89L19 * from becoming ridiculously small during transient outages like P89L20 * a network partition. P89L21 */ P89L22 double const RTCP_MIN_TIME = 5.; P89L23 /* P89L24 * Fraction of the RTCP bandwidth to be shared among active P89L25 * senders. (This fraction was chosen so that in a typical P89L26 * session with one or two active senders, the computed report P89L27 * time would be roughly equal to the minimum report time so that P89L28 * we don't unnecessarily slow down receiver reports.) The P89L29 * receiver fraction must be 1 - the sender fraction. P89L30 */ P89L31 double const RTCP_SENDER_BW_FRACTION = 0.25; P89L32 double const RTCP_RCVR_BW_FRACTION = (1-RTCP_SENDER_BW_FRACTION); P89L33 /* P89L34 /* To compensate for "timer reconsideration" converging to a P89L35 * value below the intended average. P89L36 */ P89L37 double const COMPENSATION = 2.71828 - 1.5; P89L38 P89L39 double t; /* interval */ P89L40 double rtcp_min_time = RTCP_MIN_TIME; P89L41 int n; /* no. of members for computation */ P89L42 P89L43 /* P89L44 * Very first call at application start-up uses half the min P89L45 * delay for quicker notification while still allowing some time P89L46 * before reporting for randomization and to learn about other P89L47 * sources so the report interval will converge to the correct P89L48 * interval more quickly. P90L1 */ P90L2 if (initial) { P90L3 rtcp_min_time /= 2; P90L4 } P90L5 /* P90L6 * Dedicate a fraction of the RTCP bandwidth to senders unless P90L7 * the number of senders is large enough that their share is P90L8 * more than that fraction. P90L9 */ P90L10 n = members; P90L11 if (senders <= members * RTCP_SENDER_BW_FRACTION) { P90L12 if (we_sent) { P90L13 rtcp_bw *= RTCP_SENDER_BW_FRACTION; P90L14 n = senders; P90L15 } else { P90L16 rtcp_bw *= RTCP_RCVR_BW_FRACTION; P90L17 n -= senders; P90L18 } P90L19 } P90L20 P90L21 /* P90L22 * The effective number of sites times the average packet size is P90L23 * the total number of octets sent when each site sends a report. P90L24 * Dividing this by the effective bandwidth gives the time P90L25 * interval over which those packets must be sent in order to P90L26 * meet the bandwidth target, with a minimum enforced. In that P90L27 * time interval we send one report so this time is also our P90L28 * average time between reports. P90L29 */ P90L30 t = avg_rtcp_size * n / rtcp_bw; P90L31 if (t < rtcp_min_time) t = rtcp_min_time; P90L32 P90L33 /* P90L34 * To avoid traffic bursts from unintended synchronization with P90L35 * other sites, we then pick our actual next report interval as a P90L36 * random number uniformly distributed between 0.5*t and 1.5*t. P90L37 */ P90L38 t = t * (drand48() + 0.5); P90L39 t = t / COMPENSATION; P90L40 return t; P90L41 } P90L42 P90L43 void OnExpire(event e, P90L44 int members, P90L45 int senders, P90L46 double rtcp_bw, P90L47 int we_sent, P90L48 double *avg_rtcp_size, P91L1 int *initial, P91L2 time_tp tc, P91L3 time_tp *tp, P91L4 int *pmembers) P91L5 { P91L6 /* This function is responsible for deciding whether to send an P91L7 * RTCP report or BYE packet now, or to reschedule transmission. P91L8 * It is also responsible for updating the pmembers, initial, tp, P91L9 * and avg_rtcp_size state variables. This function should be P91L10 * called upon expiration of the event timer used by Schedule(). P91L11 */ P91L12 P91L13 double t; /* Interval */ P91L14 double tn; /* Next transmit time */ P91L15 P91L16 /* In the case of a BYE, we use "timer reconsideration" to P91L17 * reschedule the transmission of the BYE if necessary */ P91L18 P91L19 if (TypeOfEvent(e) == EVENT_BYE) { P91L20 t = rtcp_interval(members, P91L21 senders, P91L22 rtcp_bw, P91L23 we_sent, P91L24 *avg_rtcp_size, P91L25 *initial); P91L26 tn = *tp + t; P91L27 if (tn <= tc) { P91L28 SendBYEPacket(e); P91L29 exit(1); P91L30 } else { P91L31 Schedule(tn, e); P91L32 } P91L33 P91L34 } else if (TypeOfEvent(e) == EVENT_REPORT) { P91L35 t = rtcp_interval(members, P91L36 senders, P91L37 rtcp_bw, P91L38 we_sent, P91L39 *avg_rtcp_size, P91L40 *initial); P91L41 tn = *tp + t; P91L42 if (tn <= tc) { P91L43 SendRTCPReport(e); P91L44 *avg_rtcp_size = (1./16.)*SentPacketSize(e) + P91L45 (15./16.)*(*avg_rtcp_size); P91L46 *tp = tc; P91L47 P91L48 /* We must redraw the interval. Don't reuse the P92L1 one computed above, since its not actually P92L2 distributed the same, as we are conditioned P92L3 on it being small enough to cause a packet to P92L4 be sent */ P92L5 P92L6 t = rtcp_interval(members, P92L7 senders, P92L8 rtcp_bw, P92L9 we_sent, P92L10 *avg_rtcp_size, P92L11 *initial); P92L12 P92L13 Schedule(t+tc,e); P92L14 *initial = 0; P92L15 } else { P92L16 Schedule(tn, e); P92L17 } P92L18 *pmembers = members; P92L19 } P92L20 } P92L21 P92L22 void OnReceive(packet p, P92L23 event e, P92L24 int *members, P92L25 int *pmembers, P92L26 int *senders, P92L27 double *avg_rtcp_size, P92L28 double *tp, P92L29 double tc, P92L30 double tn) P92L31 { P92L32 /* What we do depends on whether we have left the group, and are P92L33 * waiting to send a BYE (TypeOfEvent(e) == EVENT_BYE) or an RTCP P92L34 * report. p represents the packet that was just received. */ P92L35 P92L36 if (PacketType(p) == PACKET_RTCP_REPORT) { P92L37 if (NewMember(p) && (TypeOfEvent(e) == EVENT_REPORT)) { P92L38 AddMember(p); P92L39 *members += 1; P92L40 } P92L41 *avg_rtcp_size = (1./16.)*ReceivedPacketSize(p) + P92L42 (15./16.)*(*avg_rtcp_size); P92L43 } else if (PacketType(p) == PACKET_RTP) { P92L44 if (NewMember(p) && (TypeOfEvent(e) == EVENT_REPORT)) { P92L45 AddMember(p); P92L46 *members += 1; P92L47 } P92L48 if (NewSender(p) && (TypeOfEvent(e) == EVENT_REPORT)) { P93L1 AddSender(p); P93L2 *senders += 1; P93L3 } P93L4 } else if (PacketType(p) == PACKET_BYE) { P93L5 *avg_rtcp_size = (1./16.)*ReceivedPacketSize(p) + P93L6 (15./16.)*(*avg_rtcp_size); P93L7 P93L8 if (TypeOfEvent(e) == EVENT_REPORT) { P93L9 if (NewSender(p) == FALSE) { P93L10 RemoveSender(p); P93L11 *senders -= 1; P93L12 } P93L13 P93L14 if (NewMember(p) == FALSE) { P93L15 RemoveMember(p); P93L16 *members -= 1; P93L17 } P93L18 P93L19 if (*members < *pmembers) { P93L20 tn = tc + P93L21 (((double) *members)/(*pmembers))*(tn - tc); P93L22 *tp = tc - P93L23 (((double) *members)/(*pmembers))*(tc - *tp); P93L24 P93L25 /* Reschedule the next report for time tn */ P93L26 P93L27 Reschedule(tn, e); P93L28 *pmembers = *members; P93L29 } P93L30 P93L31 } else if (TypeOfEvent(e) == EVENT_BYE) { P93L32 *members += 1; P93L33 } P93L34 } P93L35 } P93L36 P93L37 P93L38 P93L39 P93L40 P93L41 P93L42 P93L43 P93L44 P93L45 P93L46 P93L47 P93L48 P94L1 A.8 Estimating the Interarrival Jitter P94L2 P94L3 The code fragments below implement the algorithm given in Section P94L4 6.4.1 for calculating an estimate of the statistical variance of the P94L5 RTP data interarrival time to be inserted in the interarrival jitter P94L6 field of reception reports. The inputs are r->ts, the timestamp from P94L7 the incoming packet, and arrival, the current time in the same units. P94L8 Here s points to state for the source; s->transit holds the relative P94L9 transit time for the previous packet, and s->jitter holds the P94L10 estimated jitter. The jitter field of the reception report is P94L11 measured in timestamp units and expressed as an unsigned integer, but P94L12 the jitter estimate is kept in a floating point. As each data packet P94L13 arrives, the jitter estimate is updated: P94L14 P94L15 int transit = arrival - r->ts; P94L16 int d = transit - s->transit; P94L17 s->transit = transit; P94L18 if (d < 0) d = -d; P94L19 s->jitter += (1./16.) * ((double)d - s->jitter); P94L20 P94L21 When a reception report block (to which rr points) is generated for P94L22 this member, the current jitter estimate is returned: P94L23 P94L24 rr->jitter = (u_int32) s->jitter; P94L25 P94L26 Alternatively, the jitter estimate can be kept as an integer, but P94L27 scaled to reduce round-off error. The calculation is the same except P94L28 for the last line: P94L29 P94L30 s->jitter += d - ((s->jitter + 8) >> 4); P94L31 P94L32 In this case, the estimate is sampled for the reception report as: P94L33 P94L34 rr->jitter = s->jitter >> 4; P94L35 P94L36 P94L37 P94L38 P94L39 P94L40 P94L41 P94L42 P94L43 P94L44 P94L45 P94L46 P94L47 P94L48 P95L1 Appendix B - Changes from RFC 1889 P95L2 P95L3 Most of this RFC is identical to RFC 1889. There are no changes in P95L4 the packet formats on the wire, only changes to the rules and P95L5 algorithms governing how the protocol is used. The biggest change is P95L6 an enhancement to the scalable timer algorithm for calculating when P95L7 to send RTCP packets: P95L8 P95L9 o The algorithm for calculating the RTCP transmission interval P95L10 specified in Sections 6.2 and 6.3 and illustrated in Appendix A.7 P95L11 is augmented to include "reconsideration" to minimize transmission P95L12 in excess of the intended rate when many participants join a P95L13 session simultaneously, and "reverse reconsideration" to reduce P95L14 the incidence and duration of false participant timeouts when the P95L15 number of participants drops rapidly. Reverse reconsideration is P95L16 also used to possibly shorten the delay before sending RTCP SR P95L17 when transitioning from passive receiver to active sender mode. P95L18 P95L19 o Section 6.3.7 specifies new rules controlling when an RTCP BYE P95L20 packet should be sent in order to avoid a flood of packets when P95L21 many participants leave a session simultaneously. P95L22 P95L23 o The requirement to retain state for inactive participants for a P95L24 period long enough to span typical network partitions was removed P95L25 from Section 6.2.1. In a session where many participants join for P95L26 a brief time and fail to send BYE, this requirement would cause a P95L27 significant overestimate of the number of participants. The P95L28 reconsideration algorithm added in this revision compensates for P95L29 the large number of new participants joining simultaneously when a P95L30 partition heals. P95L31 P95L32 It should be noted that these enhancements only have a significant P95L33 effect when the number of session participants is large (thousands) P95L34 and most of the participants join or leave at the same time. This P95L35 makes testing in a live network difficult. However, the algorithm P95L36 was subjected to a thorough analysis and simulation to verify its P95L37 performance. Furthermore, the enhanced algorithm was designed to P95L38 interoperate with the algorithm in RFC 1889 such that the degree of P95L39 reduction in excess RTCP bandwidth during a step join is proportional P95L40 to the fraction of participants that implement the enhanced P95L41 algorithm. Interoperation of the two algorithms has been verified P95L42 experimentally on live networks. P95L43 P95L44 Other functional changes were: P95L45 P95L46 o Section 6.2.1 specifies that implementations may store only a P95L47 sampling of the participants' SSRC identifiers to allow scaling to P95L48 very large sessions. Algorithms are specified in RFC 2762 [21]. P96L1 o In Section 6.2 it is specified that RTCP sender and non-sender P96L2 bandwidths may be set as separate parameters of the session rather P96L3 than a strict percentage of the session bandwidth, and may be set P96L4 to zero. The requirement that RTCP was mandatory for RTP sessions P96L5 using IP multicast was relaxed. However, a clarification was also P96L6 added that turning off RTCP is NOT RECOMMENDED. P96L7 P96L8 o In Sections 6.2, 6.3.1 and Appendix A.7, it is specified that the P96L9 fraction of participants below which senders get dedicated RTCP P96L10 bandwidth changes from the fixed 1/4 to a ratio based on the RTCP P96L11 sender and non-sender bandwidth parameters when those are given. P96L12 The condition that no bandwidth is dedicated to senders when there P96L13 are no senders was removed since that is expected to be a P96L14 transitory state. It also keeps non-senders from using sender P96L15 RTCP bandwidth when that is not intended. P96L16 P96L17 o Also in Section 6.2 it is specified that the minimum RTCP interval P96L18 may be scaled to smaller values for high bandwidth sessions, and P96L19 that the initial RTCP delay may be set to zero for unicast P96L20 sessions. P96L21 P96L22 o Timing out a participant is to be based on inactivity for a number P96L23 of RTCP report intervals calculated using the receiver RTCP P96L24 bandwidth fraction even for active senders. P96L25 P96L26 o Sections 7.2 and 7.3 specify that translators and mixers should P96L27 send BYE packets for the sources they are no longer forwarding. P96L28 P96L29 o Rule changes for layered encodings are defined in Sections 2.4, P96L30 6.3.9, 8.3 and 11. In the last of these, it is noted that the P96L31 address and port assignment rule conflicts with the SDP P96L32 specification, RFC 2327 [15], but it is intended that this P96L33 restriction will be relaxed in a revision of RFC 2327. P96L34 P96L35 o The convention for using even/odd port pairs for RTP and RTCP in P96L36 Section 11 was clarified to refer to destination ports. The P96L37 requirement to use an even/odd port pair was removed if the two P96L38 ports are specified explicitly. For unicast RTP sessions, P96L39 distinct port pairs may be used for the two ends (Sections 3, 7.1 P96L40 and 11). P96L41 P96L42 o A new Section 10 was added to explain the requirement for P96L43 congestion control in applications using RTP. P96L44 P96L45 o In Section 8.2, the requirement that a new SSRC identifier MUST be P96L46 chosen whenever the source transport address is changed has been P96L47 relaxed to say that a new SSRC identifier MAY be chosen. P96L48 Correspondingly, it was clarified that an implementation MAY P97L1 choose to keep packets from the new source address rather than the P97L2 existing source address when an SSRC collision occurs between two P97L3 other participants, and SHOULD do so for applications such as P97L4 telephony in which some sources such as mobile entities may change P97L5 addresses during the course of an RTP session. P97L6 P97L7 o An indentation bug in the RFC 1889 printing of the pseudo-code for P97L8 the collision detection and resolution algorithm in Section 8.2 P97L9 has been corrected by translating the syntax to pseudo C language, P97L10 and the algorithm has been modified to remove the restriction that P97L11 both RTP and RTCP must be sent from the same source port number. P97L12 P97L13 o The description of the padding mechanism for RTCP packets was P97L14 clarified and it is specified that padding MUST only be applied to P97L15 the last packet of a compound RTCP packet. P97L16 P97L17 o In Section A.1, initialization of base_seq was corrected to be seq P97L18 rather than seq - 1, and the text was corrected to say the bad P97L19 sequence number plus 1 is stored. The initialization of max_seq P97L20 and other variables for the algorithm was separated from the text P97L21 to make clear that this initialization must be done in addition to P97L22 calling the init_seq() function (and a few words lost in RFC 1889 P97L23 when processing the document from source to output form were P97L24 restored). P97L25 P97L26 o Clamping of number of packets lost in Section A.3 was corrected to P97L27 use both positive and negative limits. P97L28 P97L29 o The specification of "relative" NTP timestamp in the RTCP SR P97L30 section now defines these timestamps to be based on the most P97L31 common system-specific clock, such as system uptime, rather than P97L32 on session elapsed time which would not be the same for multiple P97L33 applications started on the same machine at different times. P97L34 P97L35 Non-functional changes: P97L36 P97L37 o It is specified that a receiver MUST ignore packets with payload P97L38 types it does not understand. P97L39 P97L40 o In Fig. 2, the floating point NTP timestamp value was corrected, P97L41 some missing leading zeros were added in a hex number, and the UTC P97L42 timezone was specified. P97L43 P97L44 o The inconsequence of NTP timestamps wrapping around in the year P97L45 2036 is explained. P97L46 P97L47 P97L48 P98L1 o The policy for registration of RTCP packet types and SDES types P98L2 was clarified in a new Section 15, IANA Considerations. The P98L3 suggestion that experimenters register the numbers they need and P98L4 then unregister those which prove to be unneeded has been removed P98L5 in favor of using APP and PRIV. Registration of profile names was P98L6 also specified. P98L7 P98L8 o The reference for the UTF-8 character set was changed from an P98L9 X/Open Preliminary Specification to be RFC 2279. P98L10 P98L11 o The reference for RFC 1597 was updated to RFC 1918 and the P98L12 reference for RFC 2543 was updated to RFC 3261. P98L13 P98L14 o The last paragraph of the introduction in RFC 1889, which P98L15 cautioned implementors to limit deployment in the Internet, was P98L16 removed because it was deemed no longer relevant. P98L17 P98L18 o A non-normative note regarding the use of RTP with Source-Specific P98L19 Multicast (SSM) was added in Section 6. P98L20 P98L21 o The definition of "RTP session" in Section 3 was expanded to P98L22 acknowledge that a single session may use multiple destination P98L23 transport addresses (as was always the case for a translator or P98L24 mixer) and to explain that the distinguishing feature of an RTP P98L25 session is that each corresponds to a separate SSRC identifier P98L26 space. A new definition of "multimedia session" was added to P98L27 reduce confusion about the word "session". P98L28 P98L29 o The meaning of "sampling instant" was explained in more detail as P98L30 part of the definition of the timestamp field of the RTP header in P98L31 Section 5.1. P98L32 P98L33 o Small clarifications of the text have been made in several places, P98L34 some in response to questions from readers. In particular: P98L35 P98L36 - In RFC 1889, the first five words of the second sentence of P98L37 Section 2.2 were lost in processing the document from source to P98L38 output form, but are now restored. P98L39 P98L40 - A definition for "RTP media type" was added in Section 3 to P98L41 allow the explanation of multiplexing RTP sessions in Section P98L42 5.2 to be more clear regarding the multiplexing of multiple P98L43 media. That section also now explains that multiplexing P98L44 multiple sources of the same medium based on SSRC identifiers P98L45 may be appropriate and is the norm for multicast sessions. P98L46 P98L47 - The definition for "non-RTP means" was expanded to include P98L48 examples of other protocols constituting non-RTP means. P99L1 - The description of the session bandwidth parameter is expanded P99L2 in Section 6.2, including a clarification that the control P99L3 traffic bandwidth is in addition to the session bandwidth for P99L4 the data traffic. P99L5 P99L6 - The effect of varying packet duration on the jitter calculation P99L7 was explained in Section 6.4.4. P99L8 P99L9 - The method for terminating and padding a sequence of SDES items P99L10 was clarified in Section 6.5. P99L11 P99L12 - IPv6 address examples were added in the description of SDES P99L13 CNAME in Section 6.5.1, and "example.com" was used in place of P99L14 other example domain names. P99L15 P99L16 - The Security section added a formal reference to IPSEC now that P99L17 it is available, and says that the confidentiality method P99L18 defined in this specification is primarily to codify existing P99L19 practice. It is RECOMMENDED that stronger encryption P99L20 algorithms such as Triple-DES be used in place of the default P99L21 algorithm, and noted that the SRTP profile based on AES will be P99L22 the correct choice in the future. A caution about the weakness P99L23 of the RTP header as an initialization vector was added. It P99L24 was also noted that payload-only encryption is necessary to P99L25 allow for header compression. P99L26 P99L27 - The method for partial encryption of RTCP was clarified; in P99L28 particular, SDES CNAME is carried in only one part when the P99L29 compound RTCP packet is split. P99L30 P99L31 - It is clarified that only one compound RTCP packet should be P99L32 sent per reporting interval and that if there are too many P99L33 active sources for the reports to fit in the MTU, then a subset P99L34 of the sources should be selected round-robin over multiple P99L35 intervals. P99L36 P99L37 - A note was added in Appendix A.1 that packets may be saved P99L38 during RTP header validation and delivered upon success. P99L39 P99L40 - Section 7.3 now explains that a mixer aggregating SDES packets P99L41 uses more RTCP bandwidth due to longer packets, and a mixer P99L42 passing through RTCP naturally sends packets at higher than the P99L43 single source rate, but both behaviors are valid. P99L44 P99L45 - Section 13 clarifies that an RTP application may use multiple P99L46 profiles but typically only one in a given session. P99L47 P99L48 P100L1 - The terms MUST, SHOULD, MAY, etc. are used as defined in RFC P100L2 2119. P100L3 P100L4 - The bibliography was divided into normative and informative P100L5 references. P100L6 P100L7 References P100L8 P100L9 Normative References P100L10 P100L11 [1] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video P100L12 Conferences with Minimal Control", RFC 3551, July 2003. P100L13 P100L14 [2] Bradner, S., "Key Words for Use in RFCs to Indicate Requirement P100L15 Levels", BCP 14, RFC 2119, March 1997. P100L16 P100L17 [3] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981. P100L18 P100L19 [4] Mills, D., "Network Time Protocol (Version 3) Specification, P100L20 Implementation and Analysis", RFC 1305, March 1992. P100L21 P100L22 [5] Yergeau, F., "UTF-8, a Transformation Format of ISO 10646", RFC P100L23 2279, January 1998. P100L24 P100L25 [6] Mockapetris, P., "Domain Names - Concepts and Facilities", STD P100L26 13, RFC 1034, November 1987. P100L27 P100L28 [7] Mockapetris, P., "Domain Names - Implementation and P100L29 Specification", STD 13, RFC 1035, November 1987. P100L30 P100L31 [8] Braden, R., "Requirements for Internet Hosts - Application and P100L32 Support", STD 3, RFC 1123, October 1989. P100L33 P100L34 [9] Resnick, P., "Internet Message Format", RFC 2822, April 2001. P100L35 P100L36 Informative References P100L37 P100L38 [10] Clark, D. and D. Tennenhouse, "Architectural Considerations for P100L39 a New Generation of Protocols," in SIGCOMM Symposium on P100L40 Communications Architectures and Protocols , (Philadelphia, P100L41 Pennsylvania), pp. 200--208, IEEE Computer Communications P100L42 Review, Vol. 20(4), September 1990. P100L43 P100L44 [11] Schulzrinne, H., "Issues in designing a transport protocol for P100L45 audio and video conferences and other multiparticipant real-time P100L46 applications." expired Internet Draft, October 1993. P100L47 P100L48 P101L1 [12] Comer, D., Internetworking with TCP/IP , vol. 1. Englewood P101L2 Cliffs, New Jersey: Prentice Hall, 1991. P101L3 P101L4 [13] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., P101L5 Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: P101L6 Session Initiation Protocol", RFC 3261, June 2002. P101L7 P101L8 [14] International Telecommunication Union, "Visual telephone systems P101L9 and equipment for local area networks which provide a non- P101L10 guaranteed quality of service", Recommendation H.323, P101L11 Telecommunication Standardization Sector of ITU, Geneva, P101L12 Switzerland, July 2003. P101L13 P101L14 [15] Handley, M. and V. Jacobson, "SDP: Session Description P101L15 Protocol", RFC 2327, April 1998. P101L16 P101L17 [16] Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming P101L18 Protocol (RTSP)", RFC 2326, April 1998. P101L19 P101L20 [17] Eastlake 3rd, D., Crocker, S. and J. Schiller, "Randomness P101L21 Recommendations for Security", RFC 1750, December 1994. P101L22 P101L23 [18] Bolot, J.-C., Turletti, T. and I. Wakeman, "Scalable Feedback P101L24 Control for Multicast Video Distribution in the Internet", in P101L25 SIGCOMM Symposium on Communications Architectures and Protocols, P101L26 (London, England), pp. 58--67, ACM, August 1994. P101L27 P101L28 [19] Busse, I., Deffner, B. and H. Schulzrinne, "Dynamic QoS Control P101L29 of Multimedia Applications Based on RTP", Computer P101L30 Communications , vol. 19, pp. 49--58, January 1996. P101L31 P101L32 [20] Floyd, S. and V. Jacobson, "The Synchronization of Periodic P101L33 Routing Messages", in SIGCOMM Symposium on Communications P101L34 Architectures and Protocols (D. P. Sidhu, ed.), (San Francisco, P101L35 California), pp. 33--44, ACM, September 1993. Also in [34]. P101L36 P101L37 [21] Rosenberg, J. and H. Schulzrinne, "Sampling of the Group P101L38 Membership in RTP", RFC 2762, February 2000. P101L39 P101L40 [22] Cadzow, J., Foundations of Digital Signal Processing and Data P101L41 Analysis New York, New York: Macmillan, 1987. P101L42 P101L43 [23] Hinden, R. and S. Deering, "Internet Protocol Version 6 (IPv6) P101L44 Addressing Architecture", RFC 3513, April 2003. P101L45 P101L46 [24] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G. and E. P101L47 Lear, "Address Allocation for Private Internets", RFC 1918, P101L48 February 1996. P102L1 [25] Lear, E., Fair, E., Crocker, D. and T. Kessler, "Network 10 P102L2 Considered Harmful (Some Practices Shouldn't be Codified)", RFC P102L3 1627, July 1994. P102L4 P102L5 [26] Feller, W., An Introduction to Probability Theory and its P102L6 Applications, vol. 1. New York, New York: John Wiley and Sons, P102L7 third ed., 1968. P102L8 P102L9 [27] Kent, S. and R. Atkinson, "Security Architecture for the P102L10 Internet Protocol", RFC 2401, November 1998. P102L11 P102L12 [28] Baugher, M., Blom, R., Carrara, E., McGrew, D., Naslund, M., P102L13 Norrman, K. and D. Oran, "Secure Real-time Transport Protocol", P102L14 Work in Progress, April 2003. P102L15 P102L16 [29] Balenson, D., "Privacy Enhancement for Internet Electronic Mail: P102L17 Part III", RFC 1423, February 1993. P102L18 P102L19 [30] Voydock, V. and S. Kent, "Security Mechanisms in High-Level P102L20 Network Protocols", ACM Computing Surveys, vol. 15, pp. 135-171, P102L21 June 1983. P102L22 P102L23 [31] Floyd, S., "Congestion Control Principles", BCP 41, RFC 2914, P102L24 September 2000. P102L25 P102L26 [32] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April P102L27 1992. P102L28 P102L29 [33] Stubblebine, S., "Security Services for Multimedia P102L30 Conferencing", in 16th National Computer Security Conference, P102L31 (Baltimore, Maryland), pp. 391--395, September 1993. P102L32 P102L33 [34] Floyd, S. and V. Jacobson, "The Synchronization of Periodic P102L34 Routing Messages", IEEE/ACM Transactions on Networking, vol. 2, P102L35 pp. 122--136, April 1994. P102L36 P102L37 P102L38 P102L39 P102L40 P102L41 P102L42 P102L43 P102L44 P102L45 P102L46 P102L47 P102L48 P103L1 Authors' Addresses P103L2 P103L3 Henning Schulzrinne P103L4 Department of Computer Science P103L5 Columbia University P103L6 1214 Amsterdam Avenue P103L7 New York, NY 10027 P103L8 United States P103L9 P103L10 EMail: schulzrinne@cs.columbia.edu P103L11 P103L12 P103L13 Stephen L. Casner P103L14 Packet Design P103L15 3400 Hillview Avenue, Building 3 P103L16 Palo Alto, CA 94304 P103L17 United States P103L18 P103L19 EMail: casner@acm.org P103L20 P103L21 P103L22 Ron Frederick P103L23 Blue Coat Systems Inc. P103L24 650 Almanor Avenue P103L25 Sunnyvale, CA 94085 P103L26 United States P103L27 P103L28 EMail: ronf@bluecoat.com P103L29 P103L30 P103L31 Van Jacobson P103L32 Packet Design P103L33 3400 Hillview Avenue, Building 3 P103L34 Palo Alto, CA 94304 P103L35 United States P103L36 P103L37 EMail: van@packetdesign.com P103L38 P103L39 P103L40 P103L41 P103L42 P103L43 P103L44 P103L45 P103L46 P103L47 P103L48 P104L1 Full Copyright Statement P104L2 P104L3 Copyright (C) The Internet Society (2003). All Rights Reserved. P104L4 P104L5 This document and translations of it may be copied and furnished to P104L6 others, and derivative works that comment on or otherwise explain it P104L7 or assist in its implementation may be prepared, copied, published P104L8 and distributed, in whole or in part, without restriction of any P104L9 kind, provided that the above copyright notice and this paragraph are P104L10 included on all such copies and derivative works. However, this P104L11 document itself may not be modified in any way, such as by removing P104L12 the copyright notice or references to the Internet Society or other P104L13 Internet organizations, except as needed for the purpose of P104L14 developing Internet standards in which case the procedures for P104L15 copyrights defined in the Internet Standards process must be P104L16 followed, or as required to translate it into languages other than P104L17 English. P104L18 P104L19 The limited permissions granted above are perpetual and will not be P104L20 revoked by the Internet Society or its successors or assigns. P104L21 P104L22 This document and the information contained herein is provided on an P104L23 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING P104L24 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING P104L25 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION P104L26 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF P104L27 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. P104L28 P104L29 Acknowledgement P104L30 P104L31 Funding for the RFC Editor function is currently provided by the P104L32 Internet Society. P104L33 P104L34 P104L35 P104L36 P104L37 P104L38 P104L39 P104L40 P104L41 P104L42 P104L43 P104L44 P104L45 P104L46 P104L47 P104L48