Network Working Group                                       J. Rosenberg
Request for Comments: 3261                                   dynamicsoft
Obsoletes: 2543                                           H. Schulzrinne
Category: Standards Track                                    Columbia U.
                                                            G. Camarillo
                                                                Ericsson
                                                             A. Johnston
                                                                WorldCom
                                                             J. Peterson
                                                                 Neustar
                                                               R. Sparks
                                                             dynamicsoft
                                                              M. Handley
                                                                    ICIR
                                                             E. Schooler
                                                                    AT&T
                                                               June 2002

                    SIP: Session Initiation Protocol

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This document describes Session Initiation Protocol (SIP), an
   application-layer control (signaling) protocol for creating,
   modifying, and terminating sessions with one or more participants.
   These sessions include Internet telephone calls, multimedia
   distribution, and multimedia conferences.

   SIP invitations used to create sessions carry session descriptions
   that allow participants to agree on a set of compatible media types.
   SIP makes use of elements called proxy servers to help route requests
   to the user's current location, authenticate and authorize users for
   services, implement provider call-routing policies, and provide
   features to users.  SIP also provides a registration function that
   allows users to upload their current locations for use by proxy
   servers.  SIP runs on top of several different transport protocols.

Table of Contents

   1  Introduction
   2  Overview of SIP Functionality
   3  Terminology
   4  Overview of Operation
   5  Structure of the Protocol
   6  Definitions
   7  SIP Messages
   7.1  Requests
   7.2  Responses
   7.3  Header Fields
   7.3.1  Header Field Format
   7.3.2  Header Field Classification
   7.3.3  Compact Form
   7.4  Bodies
   7.4.1  Message Body Type
   7.4.2  Message Body Length
   7.5  Framing SIP Messages
   8  General User Agent Behavior
   8.1  UAC Behavior
   8.1.1  Generating the Request
   8.1.1.1  Request-URI
   8.1.1.2  To
   8.1.1.3  From
   8.1.1.4  Call-ID
   8.1.1.5  CSeq
   8.1.1.6  Max-Forwards
   8.1.1.7  Via
   8.1.1.8  Contact
   8.1.1.9  Supported and Require
   8.1.1.10  Additional Message Components
   8.1.2  Sending the Request
   8.1.3  Processing Responses
   8.1.3.1  Transaction Layer Errors
   8.1.3.2  Unrecognized Responses
   8.1.3.3  Vias
   8.1.3.4  Processing 3xx Responses
   8.1.3.5  Processing 4xx Responses
   8.2  UAS Behavior
   8.2.1  Method Inspection
   8.2.2  Header Inspection
   8.2.2.1  To and Request-URI
   8.2.2.2  Merged Requests
   8.2.2.3  Require
   8.2.3  Content Processing
   8.2.4  Applying Extensions
   8.2.5  Processing the Request
   8.2.6  Generating the Response
   8.2.6.1  Sending a Provisional Response
   8.2.6.2  Headers and Tags
   8.2.7  Stateless UAS Behavior
   8.3  Redirect Servers
   9  Canceling a Request
   9.1  Client Behavior
   9.2  Server Behavior
   10  Registrations
   10.1  Overview
   10.2  Constructing the REGISTER Request
   10.2.1  Adding Bindings
   10.2.1.1  Setting the Expiration Interval of Contact Addresses
   10.2.1.2  Preferences among Contact Addresses
   10.2.2  Removing Bindings
   10.2.3  Fetching Bindings
   10.2.4  Refreshing Bindings
   10.2.5  Setting the Internal Clock
   10.2.6  Discovering a Registrar
   10.2.7  Transmitting a Request
   10.2.8  Error Responses
   10.3  Processing REGISTER Requests
   11  Querying for Capabilities
   11.1  Construction of OPTIONS Request
   11.2  Processing of OPTIONS Request
   12  Dialogs
   12.1  Creation of a Dialog
   12.1.1  UAS behavior
   12.1.2  UAC Behavior
   12.2  Requests within a Dialog
   12.2.1  UAC Behavior
   12.2.1.1  Generating the Request
   12.2.1.2  Processing the Responses
   12.2.2  UAS Behavior
   12.3  Termination of a Dialog
   13  Initiating a Session
   13.1  Overview
   13.2  UAC Processing
   13.2.1  Creating the Initial INVITE
   13.2.2  Processing INVITE Responses
   13.2.2.1  1xx Responses
   13.2.2.2  3xx Responses
   13.2.2.3  4xx, 5xx and 6xx Responses
   13.2.2.4  2xx Responses
   13.3  UAS Processing
   13.3.1  Processing of the INVITE
   13.3.1.1  Progress
   13.3.1.2  The INVITE is Redirected
   13.3.1.3  The INVITE is Rejected
   13.3.1.4  The INVITE is Accepted
   14  Modifying an Existing Session
   14.1  UAC Behavior
   14.2  UAS Behavior
   15  Terminating a Session
   15.1  Terminating a Session with a BYE Request
   15.1.1  UAC Behavior
   15.1.2  UAS Behavior
   16  Proxy Behavior
   16.1  Overview
   16.2  Stateful Proxy
   16.3  Request Validation
   16.4  Route Information Preprocessing
   16.5  Determining Request Targets
   16.6  Request Forwarding
   16.7  Response Processing
   16.8  Processing Timer C
   16.9  Handling Transport Errors
   16.10  CANCEL Processing
   16.11  Stateless Proxy
   16.12  Summary of Proxy Route Processing
   16.12.1  Examples
   16.12.1.1  Basic SIP Trapezoid
   16.12.1.2  Traversing a Strict-Routing Proxy
   16.12.1.3  Rewriting Record-Route Header Field Values
   17  Transactions
   17.1  Client Transaction
   17.1.1  INVITE Client Transaction
   17.1.1.1  Overview of INVITE Transaction
   17.1.1.2  Formal Description
   17.1.1.3  Construction of the ACK Request
   17.1.2  Non-INVITE Client Transaction
   17.1.2.1  Overview of the non-INVITE Transaction
   17.1.2.2  Formal Description
   17.1.3  Matching Responses to Client Transactions
   17.1.4  Handling Transport Errors
   17.2  Server Transaction
   17.2.1  INVITE Server Transaction
   17.2.2  Non-INVITE Server Transaction
   17.2.3  Matching Requests to Server Transactions
   17.2.4  Handling Transport Errors
   18  Transport
   18.1  Clients
   18.1.1  Sending Requests
   18.1.2  Receiving Responses
   18.2  Servers
   18.2.1  Receiving Requests
   18.2.2  Sending Responses
   18.3  Framing
   18.4  Error Handling
   19  Common Message Components
   19.1  SIP and SIPS Uniform Resource Indicators
   19.1.1  SIP and SIPS URI Components
   19.1.2  Character Escaping Requirements
   19.1.3  Example SIP and SIPS URIs
   19.1.4  URI Comparison
   19.1.5  Forming Requests from a URI
   19.1.6  Relating SIP URIs and tel URLs
   19.2  Option Tags
   19.3  Tags
   20  Header Fields
   20.1  Accept
   20.2  Accept-Encoding
   20.3  Accept-Language
   20.4  Alert-Info
   20.5  Allow
   20.6  Authentication-Info
   20.7  Authorization
   20.8  Call-ID
   20.9  Call-Info
   20.10  Contact
   20.11  Content-Disposition
   20.12  Content-Encoding
   20.13  Content-Language
   20.14  Content-Length
   20.15  Content-Type
   20.16  CSeq
   20.17  Date
   20.18  Error-Info
   20.19  Expires
   20.20  From
   20.21  In-Reply-To
   20.22  Max-Forwards
   20.23  Min-Expires
   20.24  MIME-Version
   20.25  Organization
   20.26  Priority
   20.27  Proxy-Authenticate
   20.28  Proxy-Authorization
   20.29  Proxy-Require
   20.30  Record-Route
   20.31  Reply-To
   20.32  Require
   20.33  Retry-After
   20.34  Route
   20.35  Server
   20.36  Subject
   20.37  Supported
   20.38  Timestamp
   20.39  To
   20.40  Unsupported
   20.41  User-Agent
   20.42  Via
   20.43  Warning
   20.44  WWW-Authenticate
   21  Response Codes
   21.1  Provisional 1xx
   21.1.1  100 Trying
   21.1.2  180 Ringing
   21.1.3  181 Call Is Being Forwarded
   21.1.4  182 Queued
   21.1.5  183 Session Progress
   21.2  Successful 2xx
   21.2.1  200 OK
   21.3  Redirection 3xx
   21.3.1  300 Multiple Choices
   21.3.2  301 Moved Permanently
   21.3.3  302 Moved Temporarily
   21.3.4  305 Use Proxy
   21.3.5  380 Alternative Service
   21.4  Request Failure 4xx
   21.4.1  400 Bad Request
   21.4.2  401 Unauthorized
   21.4.3  402 Payment Required
   21.4.4  403 Forbidden
   21.4.5  404 Not Found
   21.4.6  405 Method Not Allowed
   21.4.7  406 Not Acceptable
   21.4.8  407 Proxy Authentication Required
   21.4.9  408 Request Timeout
   21.4.10  410 Gone
   21.4.11  413 Request Entity Too Large
   21.4.12  414 Request-URI Too Long
   21.4.13  415 Unsupported Media Type
   21.4.14  416 Unsupported URI Scheme
   21.4.15  420 Bad Extension
   21.4.16  421 Extension Required
   21.4.17  423 Interval Too Brief
   21.4.18  480 Temporarily Unavailable
   21.4.19  481 Call/Transaction Does Not Exist
   21.4.20  482 Loop Detected
   21.4.21  483 Too Many Hops
   21.4.22  484 Address Incomplete
   21.4.23  485 Ambiguous
   21.4.24  486 Busy Here
   21.4.25  487 Request Terminated
   21.4.26  488 Not Acceptable Here
   21.4.27  491 Request Pending
   21.4.28  493 Undecipherable
   21.5  Server Failure 5xx
   21.5.1  500 Server Internal Error
   21.5.2  501 Not Implemented
   21.5.3  502 Bad Gateway
   21.5.4  503 Service Unavailable
   21.5.5  504 Server Time-out
   21.5.6  505 Version Not Supported
   21.5.7  513 Message Too Large
   21.6  Global Failures 6xx
   21.6.1  600 Busy Everywhere
   21.6.2  603 Decline
   21.6.3  604 Does Not Exist Anywhere
   21.6.4  606 Not Acceptable
   22  Usage of HTTP Authentication
   22.1  Framework
   22.2  User-to-User Authentication
   22.3  Proxy-to-User Authentication
   22.4  The Digest Authentication Scheme
   23  S/MIME
   23.1  S/MIME Certificates
   23.2  S/MIME Key Exchange
   23.3  Securing MIME bodies
   23.4  SIP Header Privacy and Integrity using S/MIME: Tunneling SIP
   23.4.1  Integrity and Confidentiality Properties of SIP Headers
   23.4.1.1  Integrity
   23.4.1.2  Confidentiality
   23.4.2  Tunneling Integrity and Authentication
   23.4.3  Tunneling Encryption
   24  Examples
   24.1  Registration
   24.2  Session Setup
   25  Augmented BNF for the SIP Protocol
   25.1  Basic Rules
   26  Security Considerations: Threat Model and Security Usage
       Recommendations
   26.1  Attacks and Threat Models
   26.1.1  Registration Hijacking
   26.1.2  Impersonating a Server
   26.1.3  Tampering with Message Bodies
   26.1.4  Tearing Down Sessions
   26.1.5  Denial of Service and Amplification
   26.2  Security Mechanisms
   26.2.1  Transport and Network Layer Security
   26.2.2  SIPS URI Scheme
   26.2.3  HTTP Authentication
   26.2.4  S/MIME
   26.3  Implementing Security Mechanisms
   26.3.1  Requirements for Implementers of SIP
   26.3.2  Security Solutions
   26.3.2.1  Registration
   26.3.2.2  Interdomain Requests
   26.3.2.3  Peer-to-Peer Requests
   26.3.2.4  DoS Protection
   26.4  Limitations
   26.4.1  HTTP Digest
   26.4.2  S/MIME
   26.4.3  TLS
   26.4.4  SIPS URIs
   26.5  Privacy
   27  IANA Considerations
   27.1  Option Tags
   27.2  Warn-Codes
   27.3  Header Field Names
   27.4  Method and Response Codes
   27.5  The "message/sip" MIME type.
   27.6  New Content-Disposition Parameter Registrations
   28  Changes From RFC 2543
   28.1  Major Functional Changes
   28.2  Minor Functional Changes
   29  Normative References
   30  Informative References
   A  Table of Timer Values
   Acknowledgments
   Authors' Addresses
   Full Copyright Statement

1 Introduction

   There are many applications of the Internet that require the creation
   and management of a session, where a session is considered an
   exchange of data between an association of participants.  The
   implementation of these applications is complicated by the practices
   of participants: users may move between endpoints, they may be
   addressable by multiple names, and they may communicate in several
   different media - sometimes simultaneously.  Numerous protocols have
   been authored that carry various forms of real-time multimedia
   session data such as voice, video, or text messages.  The Session
   Initiation Protocol (SIP) works in concert with these protocols by
   enabling Internet endpoints (called user agents) to discover one
   another and to agree on a characterization of a session they would
   like to share.
   For locating prospective session participants, and for other
   functions, SIP enables the creation of an infrastructure of network
   hosts (called proxy servers) to which user agents can send
   registrations, invitations to sessions, and other requests.  SIP is
   an agile, general-purpose tool for creating, modifying, and
   terminating sessions that works independently of underlying transport
   protocols and without dependency on the type of session that is being
   established.

2 Overview of SIP Functionality

   SIP is an application-layer control protocol that can establish,
   modify, and terminate multimedia sessions (conferences) such as
   Internet telephony calls.  SIP can also invite participants to
   already existing sessions, such as multicast conferences.  Media can
   be added to (and removed from) an existing session.  SIP
   transparently supports name mapping and redirection services, which
   support personal mobility [27] - users can maintain a single
   externally visible identifier regardless of their network location.

   SIP supports five facets of establishing and terminating multimedia
   communications:

      User location: determination of the end system to be used for
           communication;

      User availability: determination of the willingness of the called
           party to engage in communications;

      User capabilities: determination of the media and media parameters
           to be used;

      Session setup: "ringing", establishment of session parameters at
           both called and calling party;

      Session management: including transfer and termination of
           sessions, modifying session parameters, and invoking
           services.

   SIP is not a vertically integrated communications system.  SIP is
   rather a component that can be used with other IETF protocols to
   build a complete multimedia architecture.
   Typically, these architectures will include protocols such as the
   Real-time Transport Protocol (RTP) (RFC 1889 [28]) for transporting
   real-time data and providing QoS feedback, the Real Time Streaming
   Protocol (RTSP) (RFC 2326 [29]) for controlling delivery of streaming
   media, the Media Gateway Control Protocol (MEGACO) (RFC 3015 [30])
   for controlling gateways to the Public Switched Telephone Network
   (PSTN), and the Session Description Protocol (SDP) (RFC 2327 [1]) for
   describing multimedia sessions.  Therefore, SIP should be used in
   conjunction with other protocols in order to provide complete
   services to the users.  However, the basic functionality and
   operation of SIP do not depend on any of these protocols.

   SIP does not provide services.  Rather, SIP provides primitives that
   can be used to implement different services.  For example, SIP can
   locate a user and deliver an opaque object to his current location.
   If this primitive is used to deliver a session description written in
   SDP, for instance, the endpoints can agree on the parameters of a
   session.  If the same primitive is used to deliver a photo of the
   caller as well as the session description, a "caller ID" service can
   be easily implemented.  As this example shows, a single primitive is
   typically used to provide several different services.

   SIP does not offer conference control services such as floor control
   or voting and does not prescribe how a conference is to be managed.
   SIP can be used to initiate a session that uses some other conference
   control protocol.  Since SIP messages and the sessions they establish
   can pass through entirely different networks, SIP cannot, and does
   not, provide any kind of network resource reservation capabilities.

   The nature of the services provided makes security particularly
   important.
   To that end, SIP provides a suite of security services, which include
   denial-of-service prevention, authentication (both user to user and
   proxy to user), integrity protection, and encryption and privacy
   services.

   SIP works with both IPv4 and IPv6.

3 Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
   RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as
   described in BCP 14, RFC 2119 [2] and indicate requirement levels for
   compliant SIP implementations.

4 Overview of Operation

   This section introduces the basic operations of SIP using simple
   examples.  This section is tutorial in nature and does not contain
   any normative statements.

   The first example shows the basic functions of SIP: location of an
   endpoint, signal of a desire to communicate, negotiation of session
   parameters to establish the session, and teardown of the session once
   established.

   Figure 1 shows a typical example of a SIP message exchange between
   two users, Alice and Bob.  (Each message is labeled with the letter
   "F" and a number for reference by the text.)  In this example, Alice
   uses a SIP application on her PC (referred to as a softphone) to call
   Bob on his SIP phone over the Internet.  Also shown are two SIP proxy
   servers that act on behalf of Alice and Bob to facilitate the session
   establishment.  This typical arrangement is often referred to as the
   "SIP trapezoid" as shown by the geometric shape of the dotted lines
   in Figure 1.

   Alice "calls" Bob using his SIP identity, a type of Uniform Resource
   Identifier (URI) called a SIP URI.  SIP URIs are defined in Section
   19.1.  A SIP URI has a form similar to that of an email address,
   typically containing a username and a host name.  In this case, it is
   sip:bob@biloxi.com, where biloxi.com is the domain of Bob's SIP
   service provider.  Alice has a SIP URI of sip:alice@atlanta.com.
   Alice might have typed in Bob's URI or perhaps clicked on a hyperlink
   or an entry in an address book.

   SIP also provides a secure URI, called a SIPS URI.  An example would
   be sips:bob@biloxi.com.  A call made to a SIPS URI guarantees that
   secure, encrypted transport (namely TLS) is used to carry all SIP
   messages from the caller to the domain of the callee.  From there,
   the request is sent securely to the callee, but with security
   mechanisms that depend on the policy of the domain of the callee.

                     atlanta.com  . . . biloxi.com
                 .      proxy              proxy     .
               .                                       .
       Alice's  . . . . . . . . . . . . . . . . . . . .  Bob's
      softphone                                        SIP Phone
         |                |                |                |
         |    INVITE F1   |                |                |
         |--------------->|    INVITE F2   |                |
         |  100 Trying F3 |--------------->|    INVITE F4   |
         |<---------------|  100 Trying F5 |--------------->|
         |                |<---------------| 180 Ringing F6 |
         |                | 180 Ringing F7 |<---------------|
         | 180 Ringing F8 |<---------------|     200 OK F9  |
         |<---------------|    200 OK F10  |<---------------|
         |    200 OK F11  |<---------------|                |
         |<---------------|                |                |
         |                       ACK F12                    |
         |------------------------------------------------->|
         |                   Media Session                  |
         |<================================================>|
         |                       BYE F13                    |
         |<-------------------------------------------------|
         |                     200 OK F14                   |
         |------------------------------------------------->|
         |                                                  |

         Figure 1: SIP session setup example with SIP trapezoid

   SIP is based on an HTTP-like request/response transaction model.
   Each transaction consists of a request that invokes a particular
   method, or function, on the server and at least one response.  In
   this example, the transaction begins with Alice's softphone sending
   an INVITE request addressed to Bob's SIP URI.  INVITE is an example
   of a SIP method that specifies the action that the requestor (Alice)
   wants the server (Bob) to take.  The INVITE request contains a number
   of header fields.  Header fields are named attributes that provide
   additional information about a message.  The ones present in an
   INVITE include a unique identifier for the call, the destination
   address, Alice's address, and information about the type of session
   that Alice wishes to establish with Bob.  The INVITE (message F1 in
   Figure 1) might look like this:

      INVITE sip:bob@biloxi.com SIP/2.0
      Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds
      Max-Forwards: 70
      To: Bob <sip:bob@biloxi.com>
      From: Alice <sip:alice@atlanta.com>;tag=1928301774
      Call-ID: a84b4c76e66710@pc33.atlanta.com
      CSeq: 314159 INVITE
      Contact: <sip:alice@pc33.atlanta.com>
      Content-Type: application/sdp
      Content-Length: 142

      (Alice's SDP not shown)

   The first line of the text-encoded message contains the method name
   (INVITE).  The lines that follow are a list of header fields.  This
   example contains a minimum required set.  The header fields are
   briefly described below:

   Via contains the address (pc33.atlanta.com) at which Alice is
   expecting to receive responses to this request.  It also contains a
   branch parameter that identifies this transaction.

   To contains a display name (Bob) and a SIP or SIPS URI
   (sip:bob@biloxi.com) towards which the request was originally
   directed.  Display names are described in RFC 2822 [3].

   From also contains a display name (Alice) and a SIP or SIPS URI
   (sip:alice@atlanta.com) that indicate the originator of the request.
   This header field also has a tag parameter containing a random string
   (1928301774) that was added to the URI by the softphone.
It is used for identification purposes.

Call-ID contains a globally unique identifier for this call, generated by the combination of a random string and the softphone's host name or IP address. The combination of the To tag, From tag, and Call-ID completely defines a peer-to-peer SIP relationship between Alice and Bob and is referred to as a dialog.

CSeq or Command Sequence contains an integer and a method name. The CSeq number is incremented for each new request within a dialog and is a traditional sequence number.

Contact contains a SIP or SIPS URI that represents a direct route to contact Alice, usually composed of a username at a fully qualified domain name (FQDN). While an FQDN is preferred, many end systems do not have registered domain names, so IP addresses are permitted. While the Via header field tells other elements where to send the response, the Contact header field tells other elements where to send future requests.

Max-Forwards serves to limit the number of hops a request can make on the way to its destination. It consists of an integer that is decremented by one at each hop.

Content-Type contains a description of the message body (not shown).

Content-Length contains an octet (byte) count of the message body.

The complete set of SIP header fields is defined in Section 20.

The details of the session, such as the type of media, codec, or sampling rate, are not described using SIP. Rather, the body of a SIP message contains a description of the session, encoded in some other protocol format. One such format is the Session Description Protocol (SDP) (RFC 2327 [1]). This SDP message (not shown in the example) is carried by the SIP message in a way that is analogous to a document attachment being carried by an email message, or a web page being carried in an HTTP message.
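As a rough illustration of the header fields described above, the following Python sketch splits a text-encoded request into its start line, header fields, and body, and then extracts the values that identify a dialog. It is a simplified parser written for this example and not the grammar of Section 25. The function names (parse_message, tag_of, dialog_id) are hypothetical, the Contact URI is an assumed value, and Content-Length is set to 0 because Alice's SDP is left out.

```python
# Simplified, illustrative parsing only: it ignores header folding,
# compact forms, and most of the real SIP grammar.

# The example INVITE without its SDP body, so Content-Length is 0 here
# (an assumption of this sketch). The Contact URI is also assumed.
INVITE = (
    "INVITE sip:bob@biloxi.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@biloxi.com>\r\n"
    "From: Alice <sip:alice@atlanta.com>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc33.atlanta.com\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Contact: <sip:alice@pc33.atlanta.com>\r\n"
    "Content-Type: application/sdp\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_message(text):
    """Return (start_line, headers, body) for a text-encoded message."""
    head, _, body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Store names in lower case so lookups are case-insensitive.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return lines[0], headers, body


def tag_of(value):
    """Return the tag parameter of a To or From value, or None."""
    # Header parameters follow the closing '>' when the URI is bracketed.
    params = value.rpartition(">")[2] if ">" in value else value
    for param in params.split(";")[1:]:
        key, _, val = param.strip().partition("=")
        if key.lower() == "tag":
            return val
    return None


def dialog_id(headers):
    """Call-ID, From tag, and To tag. The To tag stays None until the
    callee answers and adds one."""
    return (
        headers["call-id"][0],
        tag_of(headers["from"][0]),
        tag_of(headers["to"][0]),
    )


start_line, headers, body = parse_message(INVITE)
method = start_line.split()[0]
cseq_number, cseq_method = headers["cseq"][0].split()
print(method, cseq_number, cseq_method)
print(dialog_id(headers))
```

The To tag is still missing at this point, which matches the text: the three values only define a dialog once the callee has supplied its own tag in a response.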
Since the softphone does not know the location of Bob or the SIP server in the biloxi.com domain, the softphone sends the INVITE to the SIP server that serves Alice's domain, atlanta.com. The address of the atlanta.com SIP server could have been configured in Alice's softphone, or it could have been discovered by DHCP, for example.

The atlanta.com SIP server is a type of SIP server known as a proxy server. A proxy server receives SIP requests and forwards them on behalf of the requestor. In this example, the proxy server receives the INVITE request and sends a 100 (Trying) response back to Alice's softphone. The 100 (Trying) response indicates that the INVITE has been received and that the proxy is working on her behalf to route the INVITE to the destination. Responses in SIP use a three-digit code followed by a descriptive phrase. This response contains the same To, From, Call-ID, CSeq and branch parameter in the Via as the INVITE, which allows Alice's softphone to correlate this response to the sent INVITE.

The atlanta.com proxy server locates the proxy server at biloxi.com, possibly by performing a particular type of DNS (Domain Name System) lookup to find the SIP server that serves the biloxi.com domain. This is described in [4]. As a result, it obtains the IP address of the biloxi.com proxy server and forwards, or proxies, the INVITE request there. Before forwarding the request, the atlanta.com proxy server adds an additional Via header field value that contains its own address (the INVITE already contains Alice's address in the first Via).

The biloxi.com proxy server receives the INVITE and responds with a 100 (Trying) response back to the atlanta.com proxy server to indicate that it has received the INVITE and is processing the request. The proxy server consults a database, generically called a location service, that contains the current IP address of Bob. (We shall see in the next section how this database can be populated.)
The biloxi.com proxy server adds another Via header field value with its own address to the INVITE and proxies it to Bob's SIP phone.

Bob's SIP phone receives the INVITE and alerts Bob to the incoming call from Alice so that Bob can decide whether to answer the call, that is, Bob's phone rings. Bob's SIP phone indicates this in a 180 (Ringing) response, which is routed back through the two proxies in the reverse direction. Each proxy uses the Via header field to determine where to send the response and removes its own address from the top. As a result, although DNS and location service lookups were required to route the initial INVITE, the 180 (Ringing) response can be returned to the caller without lookups or without state being maintained in the proxies. This also has the desirable property that each proxy that sees the INVITE will also see all responses to the INVITE.

When Alice's softphone receives the 180 (Ringing) response, it passes this information to Alice, perhaps using an audio ringback tone or by displaying a message on Alice's screen.

In this example, Bob decides to answer the call. When he picks up the handset, his SIP phone sends a 200 (OK) response to indicate that the call has been answered. The 200 (OK) contains a message body with the SDP media description of the type of session that Bob is willing to establish with Alice. As a result, there is a two-phase exchange of SDP messages: Alice sent one to Bob, and Bob sent one back to Alice. This two-phase exchange provides basic negotiation capabilities and is based on a simple offer/answer model of SDP exchange. If Bob did not wish to answer the call or was busy on another call, an error response would have been sent instead of the 200 (OK), which would have resulted in no media session being established. The complete list of SIP response codes is in Section 21.
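The Via handling in this walkthrough, where each proxy adds its own address on the way out and removes it again on the way back, can be sketched as a simple stack. The sketch below is a minimal model under stated assumptions: Via values are reduced to bare host names (real values also carry the transport and a branch parameter), and the function names forward_request and route_response are hypothetical.

```python
def forward_request(via_stack, own_address):
    """A proxy places its own address on top before forwarding a request."""
    return [own_address] + via_stack


def route_response(via_stack, own_address):
    """A proxy removes its own topmost Via and learns the next hop.

    Returns (next_hop, remaining_stack). No DNS or location service
    lookup is needed: the next hop is simply the new topmost Via.
    """
    if not via_stack or via_stack[0] != own_address:
        raise ValueError("topmost Via does not name this proxy")
    remaining = via_stack[1:]
    if not remaining:
        raise ValueError("no further Via value to route the response to")
    return remaining[0], remaining


# Request path: Alice -> atlanta.com proxy -> biloxi.com proxy -> Bob.
vias = ["pc33.atlanta.com"]                                # added by Alice
vias = forward_request(vias, "bigbox3.site3.atlanta.com")  # atlanta.com proxy
vias = forward_request(vias, "server10.biloxi.com")        # biloxi.com proxy

# Bob copies the Via values into his response; it then retraces the path.
response_vias = list(vias)
hop1, response_vias = route_response(response_vias, "server10.biloxi.com")
hop2, response_vias = route_response(response_vias, "bigbox3.site3.atlanta.com")
print(hop1, hop2, response_vias)
```

Because the return path is carried in the message itself, the proxies in this model keep no per-call state between forwarding the request and relaying the response.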
The 200 (OK) (message F9 in Figure 1) might look like this as Bob sends it out:

      SIP/2.0 200 OK
      Via: SIP/2.0/UDP server10.biloxi.com
         ;branch=z9hG4bKnashds8;received=192.0.2.3
      Via: SIP/2.0/UDP bigbox3.site3.atlanta.com
         ;branch=z9hG4bK77ef4c2312983.1;received=192.0.2.2
      Via: SIP/2.0/UDP pc33.atlanta.com
         ;branch=z9hG4bK776asdhds ;received=192.0.2.1
      To: Bob <sip:bob@biloxi.com>;tag=a6c85cf
      From: Alice <sip:alice@atlanta.com>;tag=1928301774
      Call-ID: a84b4c76e66710@pc33.atlanta.com
      CSeq: 314159 INVITE
      Contact: <sip:bob@192.0.2.4>
      Content-Type: application/sdp
      Content-Length: 131

      (Bob's SDP not shown)

The first line of the response contains the response code (200) and the reason phrase (OK). The remaining lines contain header fields. The Via, To, From, Call-ID, and CSeq header fields are copied from the INVITE request. (There are three Via header field values - one added by Alice's SIP phone, one added by the atlanta.com proxy, and one added by the biloxi.com proxy.) Bob's SIP phone has added a tag parameter to the To header field. This tag will be incorporated by both endpoints into the dialog and will be included in all future requests and responses in this call. The Contact header field contains a URI at which Bob can be directly reached at his SIP phone. The Content-Type and Content-Length refer to the message body (not shown) that contains Bob's SDP media information.

In addition to DNS and location service lookups shown in this example, proxy servers can make flexible "routing decisions" to decide where to send a request. For example, if Bob's SIP phone returned a 486 (Busy Here) response, the biloxi.com proxy server could proxy the INVITE to Bob's voicemail server. A proxy server can also send an INVITE to a number of locations at the same time. This type of parallel search is known as forking.
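The two routing decisions just mentioned, falling back to voicemail after a 486 (Busy Here) and forking to several locations, can be modelled in a few lines of Python. This is only a toy model under stated assumptions: the network is replaced by a lookup table of canned status codes, the voicemail URI is invented for the example, and the names try_in_order, fork, and send_invite are hypothetical. A real proxy would send forked requests concurrently and apply the response processing rules of Section 16.

```python
# Hypothetical targets; the voicemail URI is made up for this example.
PHONE = "sip:bob@192.0.2.4"
VOICEMAIL = "sip:bob-voicemail@biloxi.com"

# Canned final responses standing in for the network.
CANNED_STATUS = {PHONE: 486, VOICEMAIL: 200}


def send_invite(target):
    """Stand-in for sending an INVITE and waiting for its final response."""
    return CANNED_STATUS[target]


def try_in_order(targets, send):
    """Try targets one at a time, stopping at the first 2xx response."""
    attempts = []
    for target in targets:
        status = send(target)
        attempts.append((target, status))
        if 200 <= status < 300:
            break
    return attempts


def fork(targets, send):
    """Send to every target without waiting for earlier results.

    Simulated here with a plain loop; only the 'no waiting between
    requests' idea is modelled, not real concurrency.
    """
    return {target: send(target) for target in targets}


print(try_in_order([PHONE, VOICEMAIL], send_invite))
print(fork([PHONE, VOICEMAIL], send_invite))
```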
In this case, the 200 (OK) is routed back through the two proxies and is received by Alice's softphone, which then stops the ringback tone and indicates that the call has been answered. Finally, Alice's softphone sends an acknowledgement message, ACK, to Bob's SIP phone to confirm the reception of the final response (200 (OK)). In this example, the ACK is sent directly from Alice's softphone to Bob's SIP phone, bypassing the two proxies. This occurs because the endpoints have learned each other's address from the Contact header fields through the INVITE/200 (OK) exchange, which was not known when the initial INVITE was sent. The lookups performed by the two proxies are no longer needed, so the proxies drop out of the call flow. This completes the INVITE/200/ACK three-way handshake used to establish SIP sessions. Full details on session setup are in Section 13.

Alice and Bob's media session has now begun, and they send media packets using the format to which they agreed in the exchange of SDP. In general, the end-to-end media packets take a different path from the SIP signaling messages.

During the session, either Alice or Bob may decide to change the characteristics of the media session. This is accomplished by sending a re-INVITE containing a new media description. This re-INVITE references the existing dialog so that the other party knows that it is to modify an existing session instead of establishing a new session. The other party sends a 200 (OK) to accept the change. The requestor responds to the 200 (OK) with an ACK. If the other party does not accept the change, he sends an error response such as 488 (Not Acceptable Here), which also receives an ACK. However, the failure of the re-INVITE does not cause the existing call to fail - the session continues using the previously negotiated characteristics. Full details on session modification are in Section 14.

At the end of the call, Bob disconnects (hangs up) first and generates a BYE message. This BYE is routed directly to Alice's softphone, again bypassing the proxies. Alice confirms receipt of the BYE with a 200 (OK) response, which terminates the session and the BYE transaction. No ACK is sent - an ACK is only sent in response to a response to an INVITE request. The reasons for this special handling for INVITE will be discussed later, but relate to the reliability mechanisms in SIP, the length of time it can take for a ringing phone to be answered, and forking. For this reason, request handling in SIP is often classified as either INVITE or non-INVITE, referring to all other methods besides INVITE. Full details on session termination are in Section 15.

Section 24.2 describes the messages shown in Figure 1 in full.

In some cases, it may be useful for proxies in the SIP signaling path to see all the messaging between the endpoints for the duration of the session. For example, if the biloxi.com proxy server wished to remain in the SIP messaging path beyond the initial INVITE, it would add to the INVITE a required routing header field known as Record-Route that contained a URI resolving to the hostname or IP address of the proxy. This information would be received by both Bob's SIP phone and (due to the Record-Route header field being passed back in the 200 (OK)) Alice's softphone and stored for the duration of the dialog. The biloxi.com proxy server would then receive and proxy the ACK, BYE, and 200 (OK) to the BYE. Each proxy can independently decide to receive subsequent messages, and those messages will pass through all proxies that elect to receive them. This capability is frequently used for proxies that are providing mid-call features.

Registration is another common operation in SIP. Registration is one way that the biloxi.com server can learn the current location of Bob.
Upon initialization, and at periodic intervals, Bob's SIP phone sends REGISTER messages to a server in the biloxi.com domain known as a SIP registrar. The REGISTER messages associate Bob's SIP or SIPS URI (sip:bob@biloxi.com) with the machine into which he is currently logged (conveyed as a SIP or SIPS URI in the Contact header field). The registrar writes this association, also called a binding, to a database, called the location service, where it can be used by the proxy in the biloxi.com domain. Often, a registrar server for a domain is co-located with the proxy for that domain. It is an important concept that the distinction between types of SIP servers is logical, not physical.

Bob is not limited to registering from a single device. For example, both his SIP phone at home and the one in the office could send registrations. This information is stored together in the location service and allows a proxy to perform various types of searches to locate Bob. Similarly, more than one user can be registered on a single device at the same time.

The location service is just an abstract concept. It generally contains information that allows a proxy to input a URI and receive a set of zero or more URIs that tell the proxy where to send the request. Registrations are one way to create this information, but not the only way. Arbitrary mapping functions can be configured at the discretion of the administrator.

Finally, it is important to note that in SIP, registration is used for routing incoming SIP requests and has no role in authorizing outgoing requests. Authorization and authentication are handled in SIP either on a request-by-request basis with a challenge/response mechanism, or by using a lower layer scheme as discussed in Section 26.

The complete set of SIP message details for this registration example is in Section 24.1.
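The abstract location service described above (a URI goes in, a set of zero or more URIs comes out) can be modelled as a small in-memory table. The sketch below is one possible shape, not something the protocol prescribes: the class and method names are hypothetical, the two contact addresses are invented, and expiry of bindings and authentication of the REGISTER are left out entirely.

```python
class LocationService:
    """Maps an address-of-record to the contacts currently bound to it."""

    def __init__(self):
        self._bindings = {}

    def add_binding(self, aor, contact):
        """What a registrar would do on a successful REGISTER."""
        self._bindings.setdefault(aor, set()).add(contact)

    def remove_binding(self, aor, contact):
        """Drop one binding; an unknown binding is ignored."""
        self._bindings.get(aor, set()).discard(contact)

    def lookup(self, aor):
        """Return zero or more contact URIs for a proxy to try."""
        return sorted(self._bindings.get(aor, set()))


service = LocationService()
# Bob registers from two devices (both contact URIs are made up here).
service.add_binding("sip:bob@biloxi.com", "sip:bob@192.0.2.4")
service.add_binding("sip:bob@biloxi.com", "sip:bob@office.biloxi.com")

print(service.lookup("sip:bob@biloxi.com"))
print(service.lookup("sip:carol@biloxi.com"))  # no bindings, empty list
```

Keeping the lookup separate from how bindings are created reflects the point made in the text: registration is only one way to populate the table, and an administrator could fill it by other means.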
Additional operations in SIP, such as querying for the capabilities of a SIP server or client using OPTIONS, or canceling a pending request using CANCEL, will be introduced in later sections.

5 Structure of the Protocol

SIP is structured as a layered protocol, which means that its behavior is described in terms of a set of fairly independent processing stages with only a loose coupling between each stage. The protocol behavior is described as layers for the purpose of presentation, allowing the description of functions common across elements in a single section. It does not dictate an implementation in any way. When we say that an element "contains" a layer, we mean it is compliant to the set of rules defined by that layer.

Not every element specified by the protocol contains every layer. Furthermore, the elements specified by SIP are logical elements, not physical ones. A physical realization can choose to act as different logical elements, perhaps even on a transaction-by-transaction basis.

The lowest layer of SIP is its syntax and encoding. Its encoding is specified using an augmented Backus-Naur Form grammar (BNF). The complete BNF is specified in Section 25; an overview of a SIP message's structure can be found in Section 7.

The second layer is the transport layer. It defines how a client sends requests and receives responses and how a server receives requests and sends responses over the network. All SIP elements contain a transport layer. The transport layer is described in Section 18.

The third layer is the transaction layer. Transactions are a fundamental component of SIP. A transaction is a request sent by a client transaction (using the transport layer) to a server transaction, along with all responses to that request sent from the server transaction back to the client.
The transaction layer handles application-layer retransmissions, matching of responses to requests, and application-layer timeouts. Any task that a user agent client (UAC) accomplishes takes place using a series of transactions. Discussion of transactions can be found in Section 17. User agents contain a transaction layer, as do stateful proxies. Stateless proxies do not contain a transaction layer. The transaction layer has a client component (referred to as a client transaction) and a server component (referred to as a server transaction), each of which is represented by a finite state machine that is constructed to process a particular request.

The layer above the transaction layer is called the transaction user (TU). Each of the SIP entities, except the stateless proxy, is a transaction user. When a TU wishes to send a request, it creates a client transaction instance and passes it the request along with the destination IP address, port, and transport to which to send the request. A TU that creates a client transaction can also cancel it. When a client cancels a transaction, it requests that the server stop further processing, revert to the state that existed before the transaction was initiated, and generate a specific error response to that transaction. This is done with a CANCEL request, which constitutes its own transaction, but references the transaction to be cancelled (Section 9).

The SIP elements, that is, user agent clients and servers, stateless and stateful proxies and registrars, contain a core that distinguishes them from each other. Cores, except for the stateless proxy, are transaction users. While the behavior of the UAC and UAS cores depends on the method, there are some common rules for all methods (Section 8). For a UAC, these rules govern the construction of a request; for a UAS, they govern the processing of a request and generating a response.
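One of the transaction layer duties named above, matching responses to requests, can be pictured as a table of pending client transactions. The sketch below keys that table on the branch parameter of the topmost Via together with the CSeq method. That choice of key is an assumption made for this illustration, and the class and method names are hypothetical; Section 17 gives the normative matching rules, and retransmissions and timers are not modelled here.

```python
class ClientTransactionTable:
    """Tracks requests that are still waiting for responses."""

    def __init__(self):
        self._pending = {}

    def start(self, branch, method, request):
        """Record a new client transaction when the TU sends a request."""
        self._pending[(branch, method)] = request

    def match(self, top_via_branch, cseq_method):
        """Find the request a response belongs to, or None if unknown."""
        return self._pending.get((top_via_branch, cseq_method))


table = ClientTransactionTable()
# Branch value taken from the example INVITE; the label is illustrative.
table.start("z9hG4bK776asdhds", "INVITE", "INVITE F1")

# A response carrying the same branch and CSeq method correlates
# with the original request.
print(table.match("z9hG4bK776asdhds", "INVITE"))

# A response with a different method finds no pending transaction.
print(table.match("z9hG4bK776asdhds", "BYE"))
```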
Since registrations play an important role in SIP, a UAS that handles a REGISTER is given the special name registrar. Section 10 describes UAC and UAS core behavior for the REGISTER method. Section 11 describes UAC and UAS core behavior for the OPTIONS method, used for determining the capabilities of a UA.

Certain other requests are sent within a dialog. A dialog is a peer-to-peer SIP relationship between two user agents that persists for some time. The dialog facilitates sequencing of messages and proper routing of requests between the user agents. The INVITE method is the only way defined in this specification to establish a dialog. When a UAC sends a request that is within the context of a dialog, it follows the common UAC rules as discussed in Section 8 but also the rules for mid-dialog requests. Section 12 discusses dialogs and presents the procedures for their construction and maintenance, in addition to construction of requests within a dialog.

The most important method in SIP is the INVITE method, which is used to establish a session between participants. A session is a collection of participants, and streams of media between them, for the purposes of communication. Section 13 discusses how sessions are initiated, resulting in one or more SIP dialogs. Section 14 discusses how characteristics of that session are modified through the use of an INVITE request within a dialog. Finally, Section 15 discusses how a session is terminated.

The procedures of Sections 8, 10, 11, 12, 13, 14, and 15 deal entirely with the UA core (Section 9 describes cancellation, which applies to both UA core and proxy core). Section 16 discusses the proxy element, which facilitates routing of messages between user agents.

6 Definitions

The following terms have special significance for SIP.
Address-of-Record: An address-of-record (AOR) is a SIP or SIPS URI that points to a domain with a location service that can map the URI to another URI where the user might be available. Typically, the location service is populated through registrations. An AOR is frequently thought of as the "public address" of the user.

Back-to-Back User Agent: A back-to-back user agent (B2BUA) is a logical entity that receives a request and processes it as a user agent server (UAS). In order to determine how the request should be answered, it acts as a user agent client (UAC) and generates requests. Unlike a proxy server, it maintains dialog state and must participate in all requests sent on the dialogs it has established. Since it is a concatenation of a UAC and UAS, no explicit definitions are needed for its behavior.

Call: A call is an informal term that refers to some communication between peers, generally set up for the purposes of a multimedia conversation.

Call Leg: Another name for a dialog [31]; no longer used in this specification.

Call Stateful: A proxy is call stateful if it retains state for a dialog from the initiating INVITE to the terminating BYE request. A call stateful proxy is always transaction stateful, but the converse is not necessarily true.

Client: A client is any network element that sends SIP requests and receives SIP responses. Clients may or may not interact directly with a human user. User agent clients and proxies are clients.

Conference: A multimedia session (see below) that contains multiple participants.

Core: Core designates the functions specific to a particular type of SIP entity, i.e., specific to either a stateful or stateless proxy, a user agent or registrar. All cores, except those for the stateless proxy, are transaction users.

Dialog: A dialog is a peer-to-peer SIP relationship between two UAs that persists for some time.
A dialog is established by SIP messages, such as a 2xx response to an INVITE request. A dialog is identified by a call identifier, local tag, and a remote tag. A dialog was formerly known as a call leg in RFC 2543.

Downstream: A direction of message forwarding within a transaction that refers to the direction that requests flow from the user agent client to user agent server.

Final Response: A response that terminates a SIP transaction, as opposed to a provisional response that does not. All 2xx, 3xx, 4xx, 5xx and 6xx responses are final.

Header: A header is a component of a SIP message that conveys information about the message. It is structured as a sequence of header fields.

Header Field: A header field is a component of the SIP message header. A header field can appear as one or more header field rows. Header field rows consist of a header field name and zero or more header field values. Multiple header field values on a given header field row are separated by commas. Some header fields can only have a single header field value, and as a result, always appear as a single header field row.

Header Field Value: A header field value is a single value; a header field consists of zero or more header field values.

Home Domain: The domain providing service to a SIP user. Typically, this is the domain present in the URI in the address-of-record of a registration.

Informational Response: Same as a provisional response.

Initiator, Calling Party, Caller: The party initiating a session (and dialog) with an INVITE request. A caller retains this role from the time it sends the initial INVITE that established a dialog until the termination of that dialog.

Invitation: An INVITE request.

Invitee, Invited User, Called Party, Callee: The party that receives an INVITE request for the purpose of establishing a new session.
A callee retains this role from the time it receives the INVITE until the termination of the dialog established by that INVITE.

Location Service: A location service is used by a SIP redirect or proxy server to obtain information about a callee's possible location(s). It contains a list of bindings of address-of-record keys to zero or more contact addresses. The bindings can be created and removed in many ways; this specification defines a REGISTER method that updates the bindings.

Loop: A request that arrives at a proxy, is forwarded, and later arrives back at the same proxy. When it arrives the second time, its Request-URI is identical to the first time, and other header fields that affect proxy operation are unchanged, so that the proxy would make the same processing decision on the request it made the first time. Looped requests are errors, and the procedures for detecting them and handling them are described by the protocol.

Loose Routing: A proxy is said to be loose routing if it follows the procedures defined in this specification for processing of the Route header field. These procedures separate the destination of the request (present in the Request-URI) from the set of proxies that need to be visited along the way (present in the Route header field). A proxy compliant to these mechanisms is also known as a loose router.

Message: Data sent between SIP elements as part of the protocol. SIP messages are either requests or responses.

Method: The method is the primary function that a request is meant to invoke on a server. The method is carried in the request message itself. Example methods are INVITE and BYE.

Outbound Proxy: A proxy that receives requests from a client, even though it may not be the server resolved by the Request-URI. Typically, a UA is manually configured with an outbound proxy, or can learn about one through auto-configuration protocols.
Parallel Search: In a parallel search, a proxy issues several requests to possible user locations upon receiving an incoming request. Rather than issuing one request and then waiting for the final response before issuing the next request as in a sequential search, a parallel search issues requests without waiting for the result of previous requests.

Provisional Response: A response used by the server to indicate progress, but that does not terminate a SIP transaction. 1xx responses are provisional; other responses are considered final.

Proxy, Proxy Server: An intermediary entity that acts as both a server and a client for the purpose of making requests on behalf of other clients. A proxy server primarily plays the role of routing, which means its job is to ensure that a request is sent to another entity "closer" to the targeted user. Proxies are also useful for enforcing policy (for example, making sure a user is allowed to make a call). A proxy interprets, and, if necessary, rewrites specific parts of a request message before forwarding it.

Recursion: A client recurses on a 3xx response when it generates a new request to one or more of the URIs in the Contact header field in the response.

Redirect Server: A redirect server is a user agent server that generates 3xx responses to requests it receives, directing the client to contact an alternate set of URIs.

Registrar: A registrar is a server that accepts REGISTER requests and places the information it receives in those requests into the location service for the domain it handles.

Regular Transaction: A regular transaction is any transaction with a method other than INVITE, ACK, or CANCEL.

Request: A SIP message sent from a client to a server, for the purpose of invoking a particular operation.
Response: A SIP message sent from a server to a client, for indicating the status of a request sent from the client to the server.

Ringback: Ringback is the signaling tone produced by the calling party's application indicating that a called party is being alerted (ringing).

Route Set: A route set is an ordered collection of SIP or SIPS URIs that represent a list of proxies that must be traversed when sending a particular request. A route set can be learned, through headers like Record-Route, or it can be configured.

Server: A server is a network element that receives requests in order to service them and sends back responses to those requests. Examples of servers are proxies, user agent servers, redirect servers, and registrars.

Sequential Search: In a sequential search, a proxy server attempts