There are two major protocol stacks supporting VoIP:
- ITU-T H.323: Packet-based multimedia communications systems, and
- Session Initiation Protocol (SIP) by the Internet Engineering Task Force (IETF)
The two protocol stacks differ in philosophy, flexibility, message formats, scalability, and other aspects (cf. Refs. [1, 2]). For example, H.323 is a vertically integrated suite of protocols for voice, video, and data communication over packet-based networks, whereas SIP is a more flexible standard for initiating multimedia sessions between endpoints. This note covers VoIP applications based on the SIP protocol stack and highlights selected functional items. SIP was first developed in 1996; its current standard version was published in 2002 as RFC 3261 (cf. Ref. [3]).
Unlike TDM-based voice communication (as provided by the PSTN, where voice and signaling coexist within the same channel, hence the term in-band signaling), a VoIP network strictly separates the voice and signaling channels. Signaling sessions are mostly handled by a server, which replaces a standard PBX in the IP environment, while the voice stream is set up point-to-point between end terminals. Figure 1 illustrates signaling paths and voice paths in VoIP operating over a LAN.
SIP is a signaling protocol for multimedia sessions over IP. It supports all key phases of a session: it initiates the session, maintains it (including negotiation between endpoints) and, eventually, assists in terminating it so that interactive communication between end users closes gracefully. As a network protocol, SIP is flexible: sessions are not limited to voice but can also carry video, chat, interactive games, and virtual reality.
SIP messaging is ASCII text based (in contrast with H.323 messaging, which is in binary format). Consequently, messages consist of relatively long character strings and are less suitable for networks where bandwidth, delay, and/or processing are a concern. That concern has been recognized, and efforts to reduce message size have been pursued; these are reflected in, among others, RFC 3485 and RFC 3486, which address compression of SIP messages.
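To make the text-based format concrete, the following minimal sketch serializes an INVITE request as CRLF-delimited text and reports its size. The helper name `build_request` and every address, tag, and identifier in it are illustrative assumptions, not values from a real deployment.

```python
def build_request(method, request_uri, headers, body=""):
    """Serialize a SIP request as CRLF-delimited text (illustrative helper)."""
    lines = [f"{method} {request_uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in headers]
    lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # A blank line separates the headers from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


# All addresses, tags, and identifiers below are made-up illustration values.
invite = build_request(
    "INVITE",
    "sip:bob@example.com",
    [
        ("Via", "SIP/2.0/UDP pc1.example.org;branch=z9hG4bK776asdhds"),
        ("Max-Forwards", "70"),
        ("To", "Bob <sip:bob@example.com>"),
        ("From", "Alice <sip:alice@example.org>;tag=1928301774"),
        ("Call-ID", "a84b4c76e66710@pc1.example.org"),
        ("CSeq", "314159 INVITE"),
        ("Contact", "<sip:alice@pc1.example.org>"),
    ],
)
print(invite)
print(len(invite.encode("utf-8")), "bytes on the wire")
```

Even this body-less request runs to several hundred bytes, which illustrates why message size can matter on constrained links.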
The structure of the SIP protocol is layered. The lowest layer of SIP is its syntax and encoding. The second layer is the transport layer. It defines how a client sends requests and receives responses and how a server receives requests and sends responses over the network. The third layer is the transaction layer. Transactions are fundamental components of SIP. A transaction is a request sent by a client transaction (using the transport layer) to a server transaction, along with all responses to that request sent from the server back to the client.
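The idea of a transaction as one request plus all of its responses can be sketched as a small lookup table. The sketch assumes that a response is matched to its request by the branch parameter of the top Via header together with the CSeq method; the class name `ClientTransactions` and the branch values are hypothetical, and timers, retransmissions, and the transport layer are left out.

```python
class ClientTransactions:
    """Tracks requests awaiting responses, keyed by (Via branch, CSeq method)."""

    def __init__(self):
        self._pending = {}

    def send(self, branch, method):
        # Handing the request to the transport layer is omitted in this sketch.
        self._pending[(branch, method)] = []

    def receive(self, branch, method, status):
        """Attach a response to its transaction; return False if none matches."""
        responses = self._pending.get((branch, method))
        if responses is None:
            return False
        responses.append(status)
        return True

    def responses(self, branch, method):
        return list(self._pending.get((branch, method), []))


tx = ClientTransactions()
tx.send("z9hG4bK776asdhds", "INVITE")            # hypothetical branch value
tx.receive("z9hG4bK776asdhds", "INVITE", 100)
tx.receive("z9hG4bK776asdhds", "INVITE", 180)
tx.receive("z9hG4bK776asdhds", "INVITE", 200)
stray = tx.receive("z9hG4bKunknown", "INVITE", 200)  # no matching request

print(tx.responses("z9hG4bK776asdhds", "INVITE"))  # [100, 180, 200]
print(stray)                                       # False
```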
SIP defines components such as:
- UAC (User agent client) – client in the terminal that initiates SIP signaling;
- UAS (User agent server) – server in the terminal that responds to the SIP signaling from the UAC;
- UA (User Agent) – SIP network terminal (SIP telephones, or gateway to other networks), contains UAC and UAS;
- Proxy server – receives connection requests from the UA and transfers them to another proxy server if the particular station is not in its administration;
- Redirect server – receives connection requests and returns them to the requester together with the destination data, instead of forwarding them to the called party;
- Location server – receives registration requests from the UA and updates the terminal database with them (RFC 3261 calls the element that accepts registrations a registrar).
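The division of labor among these components can be sketched as follows. This is a simplified illustration under stated assumptions: the names `LocationService`, `proxy_route`, and `redirect_route` are hypothetical, all URIs are made up, and the 404 fallback is a simplification for the case where no destination is known.

```python
class LocationService:
    """Maps a user's public address to the contact where the UA can be reached."""

    def __init__(self):
        self._bindings = {}

    def register(self, address, contact):
        # Invoked when a REGISTER request from a UA is accepted.
        self._bindings[address] = contact

    def lookup(self, address):
        return self._bindings.get(address)


def proxy_route(location, target, next_proxy=None):
    """A proxy forwards the request itself, possibly to another proxy."""
    contact = location.lookup(target)
    if contact is not None:
        return ("forward", contact)
    if next_proxy is not None:
        return ("forward", next_proxy)  # target is outside this proxy's administration
    return ("reject", 404)


def redirect_route(location, target):
    """A redirect server hands the destination back to the requester."""
    contact = location.lookup(target)
    if contact is None:
        return ("reject", 404)
    return ("redirect", contact)


loc = LocationService()
loc.register("sip:bob@example.com", "sip:bob@192.0.2.4")

print(proxy_route(loc, "sip:bob@example.com"))
print(redirect_route(loc, "sip:bob@example.com"))
print(proxy_route(loc, "sip:carol@example.net", next_proxy="sip:proxy.example.net"))
```

The only difference between the two routing functions is who contacts the destination next: the proxy does so itself, while the redirect server leaves it to the requester.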
Basic SIP request messages include:
- INVITE – request to establish connection
- ACK – acknowledgement of the final response to an INVITE
- BYE – connection termination
- CANCEL – termination of non-established connection
- REGISTER – UA registration in SIP proxy
- OPTIONS – inquiry of server options.
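A receiver identifies the request type from the first line of the message. The sketch below parses that line and checks the method against the six listed above; the function name `parse_request_line` is an assumption, and rejecting other methods is a choice made for this illustration only, since SIP extensions define further methods.

```python
BASIC_METHODS = {"INVITE", "ACK", "BYE", "CANCEL", "REGISTER", "OPTIONS"}


def parse_request_line(line):
    """Split 'METHOD request-URI SIP/2.0' into (method, uri)."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, uri, version = parts
    if version != "SIP/2.0":
        raise ValueError(f"unsupported version: {version}")
    if method not in BASIC_METHODS:
        # Only the basic methods from the list above are accepted here.
        raise ValueError(f"method not handled in this sketch: {method}")
    return method, uri


print(parse_request_line("REGISTER sip:registrar.example.com SIP/2.0\r\n"))
# ('REGISTER', 'sip:registrar.example.com')
```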
Responses to SIP requests carry a three-digit numeric status code whose first digit gives the response class. Here are a few examples:
- 1XX – informational messages (100 – trying, 180 – ringing, 183 – session progress)
- 2XX – successful request completion (200 – OK)
- 3XX – redirection, the request should be directed elsewhere (302 – moved temporarily, 305 – use proxy)
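Because the first digit determines the class, a client can classify a response without knowing every individual code. The sketch below shows this; the function names are illustrative, and the table also includes the 4XX, 5XX, and 6XX classes defined by RFC 3261, which the list above does not show.

```python
RESPONSE_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "client error",    # not listed above; defined in RFC 3261
    5: "server error",    # not listed above; defined in RFC 3261
    6: "global failure",  # not listed above; defined in RFC 3261
}


def classify(status):
    """Return the response class for a three-digit SIP status code."""
    if not 100 <= status <= 699:
        raise ValueError(f"not a SIP status code: {status}")
    return RESPONSE_CLASSES[status // 100]


def is_final(status):
    """1XX responses are provisional; anything from 200 up ends the transaction."""
    return status >= 200


for code in (100, 180, 200, 302):
    print(code, classify(code), "final" if is_final(code) else "provisional")
```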
An example of SIP-based transactions related to connection establishment, maintenance and closure is given in Figure 2.
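The figure is not reproduced in code here, but a typical basic call can be sketched as a small state machine seen from the caller's side. The message sequence assumed below (INVITE, 100, 180, 200, ACK, then BYE and 200) is a common pattern rather than a transcription of Figure 2, and the state names are hypothetical.

```python
# (current state, message) -> next state; a simplified, caller-side view.
TRANSITIONS = {
    ("idle", "INVITE"): "calling",
    ("calling", "100"): "calling",
    ("calling", "180"): "ringing",
    ("calling", "200"): "answered",
    ("ringing", "200"): "answered",
    ("answered", "ACK"): "established",     # voice flows once established
    ("established", "BYE"): "terminating",
    ("terminating", "200"): "terminated",
}


def run_call(messages):
    """Replay a message sequence and return the final call state."""
    state = "idle"
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {message} in state {state}")
        state = TRANSITIONS[key]
    return state


print(run_call(["INVITE", "100", "180", "200", "ACK"]))
# established
print(run_call(["INVITE", "100", "180", "200", "ACK", "BYE", "200"]))
# terminated
```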
VOCAL’s VoIP solutions (based on the SIP protocol) provide a comprehensive mobile VoIP software library to develop custom VoIP applications. Selected details are included in Ref. [4]. Contact us to discuss your mobile VoIP application with our engineering staff.
References
- [1] VoIP Technologies, Shigeru Kashihara (Editor), INTECH, 2011
- [2] Mobile Voice over IP (MVOIP): An Application-level Protocol for Call Hand-off in Real Time Applications, G. Ayorkor Mills-Tettey and David Kotz, Proc. of 21st IEEE Int. Perf., Comp. and Comm. Conf., pp. 271-279, 2002
- [3] RFC 3261: Session Initiation Protocol
- [4] VOCAL’s Analog Telephone Adapter Reference Design