Complete Communications Engineering

There are two major protocol stacks supporting VoIP: H.323 and SIP (Session Initiation Protocol).

The two protocol stacks differ in philosophy, flexibility, specific formats, scalability, and other aspects (cf. Refs. [1, 2]). For example, H.323 is a vertically integrated suite of protocols for voice, video, and data communication over packet-based networks, whereas SIP is a more flexible standard for initiating multimedia sessions between endpoints. This note covers VoIP applications based on the SIP protocol stack and highlights selected functional items. SIP was first developed in 1996; its current standard version was published in 2002 as RFC 3261 (cf. Ref. [3]).

Unlike TDM-based voice communication as provided by the PSTN, where voice and signaling coexist within the same channel (hence the term in-band signaling), a VoIP network strictly separates the voice and signaling channels. Signaling sessions are mostly handled by a server, which replaces a standard PBX in the IP environment, while the voice stream is set up point-to-point between end terminals. Figure 1 illustrates the signaling paths and voice paths in VoIP operating over a LAN.

Mobile VOIP SIP signaling
Figure 1: Voice and signaling communication channels are strictly separated in a VoIP network, as shown here. The signaling methods are provided by SIP.

SIP is a signaling protocol for multimedia sessions over IP. It supports all key phases of a session: it initiates the session, maintains it (including facilitating session negotiation between endpoints) and, eventually, assists in terminating it, resulting in a graceful closure of interactive communication between end users. As a network protocol, SIP is flexible: sessions are not limited to voice, but also support communication via video, chat, interactive games, and virtual reality.

SIP messaging is ASCII text based (in contrast with H.323 messaging, which uses a binary format). Consequently, messages are composed of relatively long character strings and are less suitable for networks where bandwidth, delay, and/or processing are a concern. This concern has been recognized, and efforts to compress SIP messages have been pursued; these efforts are reflected in RFC 3485 and RFC 3486, among others.
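To make the text-based nature of SIP concrete, here is a minimal sketch that serializes and parses a SIP request as plain ASCII text. The helper names (`build_request`, `parse_message`) and the example addresses, tag, and branch values are illustrative assumptions, not part of any particular SIP library; a real stack would also handle repeated headers, compact header forms, and line folding.

```python
def build_request(method, uri, headers, body=""):
    """Serialize a SIP request as CRLF-delimited ASCII text (hypothetical helper)."""
    lines = [f"{method} {uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in headers]
    lines.append(f"Content-Length: {len(body.encode('ascii'))}")
    # An empty line separates the headers from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_message(message):
    """Split a SIP message into start line, header dict, and body.

    Simplification: a repeated header (e.g. several Via lines) overwrites
    the earlier value; header names are lower-cased for lookup.
    """
    head, _, body = message.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Example values below are made up for illustration.
invite = build_request(
    "INVITE",
    "sip:bob@example.com",
    [
        ("Via", "SIP/2.0/UDP client.example.org;branch=z9hG4bK776asdhds"),
        ("Max-Forwards", "70"),
        ("To", "Bob <sip:bob@example.com>"),
        ("From", "Alice <sip:alice@example.org>;tag=1928301774"),
        ("Call-ID", "a84b4c76e66710@client.example.org"),
        ("CSeq", "314159 INVITE"),
        ("Contact", "<sip:alice@client.example.org>"),
    ],
)

start_line, headers, body = parse_message(invite)
print(start_line)
print(len(invite.encode("ascii")), "bytes on the wire")
```

Even this body-less request runs to a few hundred bytes, which illustrates why message size matters on constrained links.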

The structure of the SIP protocol is layered. The lowest layer of SIP is its syntax and encoding. The second layer is the transport layer. It defines how a client sends requests and receives responses and how a server receives requests and sends responses over the network. The third layer is the transaction layer. Transactions are fundamental components of SIP. A transaction is a request sent by a client transaction (using the transport layer) to a server transaction, along with all responses to that request sent from the server back to the client.
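The transaction concept can be sketched in a few lines. The example below assumes that a response is matched to its client transaction by the branch parameter of the top Via header together with the method in CSeq, which is the matching rule described in RFC 3261. The class and method names are hypothetical, and the sketch deliberately omits timers, retransmissions, and the transaction state machines.

```python
from dataclasses import dataclass, field


@dataclass
class ClientTransaction:
    """One request plus every response received for it (simplified)."""
    branch: str
    method: str
    request: str
    responses: list = field(default_factory=list)


class TransactionLayer:
    """Hypothetical, heavily simplified client-side transaction layer."""

    def __init__(self, send):
        self._send = send          # callable standing in for the transport layer
        self._transactions = {}    # (branch, method) -> ClientTransaction

    def send_request(self, branch, method, request):
        tx = ClientTransaction(branch, method, request)
        self._transactions[(branch, method)] = tx
        self._send(request)        # hand the request to the transport layer
        return tx

    def receive_response(self, branch, cseq_method, status_code):
        """Attach a response to its transaction; return None for strays."""
        tx = self._transactions.get((branch, cseq_method))
        if tx is None:
            return None
        tx.responses.append(status_code)
        return tx


sent = []                                  # stand-in for the network
layer = TransactionLayer(sent.append)
tx = layer.send_request("z9hG4bK776asdhds", "INVITE",
                        "INVITE sip:bob@example.com SIP/2.0")
layer.receive_response("z9hG4bK776asdhds", "INVITE", 180)
layer.receive_response("z9hG4bK776asdhds", "INVITE", 200)
stray = layer.receive_response("z9hG4bKunknown", "INVITE", 200)
print(tx.responses, stray)
```

Keying on both the branch and the method keeps, for example, a CANCEL separate from the INVITE it refers to, even though the two share a branch value.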

SIP defines components such as:

Basic SIP messages include:

Responses to SIP messages carry numeric status codes. Here are a few examples:
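As an illustration, the sketch below groups three-digit SIP status codes into the six response classes defined in RFC 3261 and distinguishes provisional from final responses. The function names and the small sample table are illustrative assumptions; the reason phrases shown are the standard ones for those codes.

```python
# Response classes keyed by the first digit of the status code (RFC 3261).
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}

# A few well-known status codes with their standard reason phrases.
SAMPLE_RESPONSES = {
    100: "Trying",
    180: "Ringing",
    200: "OK",
    404: "Not Found",
    486: "Busy Here",
    603: "Decline",
}


def classify(status_code):
    """Return the response class for a three-digit SIP status code."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


def is_final(status_code):
    """1xx responses are provisional; 2xx through 6xx end the transaction."""
    return status_code >= 200


for code, reason in SAMPLE_RESPONSES.items():
    kind = "final" if is_final(code) else "provisional"
    print(f"{code} {reason}: {classify(code)} ({kind})")
```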

An example of SIP-based transactions related to connection establishment, maintenance and closure is given in Figure 2.

Mobile VoIP SIP connection
Figure 2: Establishing a SIP-based connection, maintaining the connection and connection closure
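A typical basic call of this kind can be sketched as a small state machine seen from the caller's side. This is an assumption-laden illustration rather than a transcription of Figure 2: the state names are hypothetical labels, and the event sequence (INVITE, 100 Trying, 180 Ringing, 200 OK, ACK, BYE, 200 OK) is the common basic call flow, which may differ in detail from the figure.

```python
# (current state, event) -> next state; state names are illustrative labels.
TRANSITIONS = {
    ("idle", "INVITE"): "calling",             # caller starts the session
    ("calling", "100"): "calling",             # 100 Trying: request received
    ("calling", "180"): "ringing",             # 180 Ringing: callee alerted
    ("calling", "200"): "answered",            # answer without ringing
    ("ringing", "200"): "answered",            # 200 OK: callee answered
    ("answered", "ACK"): "established",        # caller confirms; media flows
    ("established", "BYE"): "terminating",     # either side hangs up
    ("terminating", "200"): "closed",          # 200 OK to BYE: session closed
}


def run_call(events, state="idle"):
    """Replay signaling events and return the list of states visited."""
    trace = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {event!r} in state {state!r}")
        state = TRANSITIONS[key]
        trace.append(state)
    return trace


trace = run_call(["INVITE", "100", "180", "200", "ACK", "BYE", "200"])
print(" -> ".join(trace))
```

The voice stream itself is not modeled here; in this sketch it would exist only while the call is in the established state, flowing directly between the end terminals.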

VOCAL’s VoIP solutions (based on H.323 protocol) provide a comprehensive mobile VoIP software library to develop custom VoIP applications. Selected details are included in Ref. [4]. Contact us to discuss your mobile VoIP application with our engineering staff.

More Information


  1. VoIP Technologies, Shigeru Kashihara (Editor), INTECH 2011
  2. Mobile Voice over IP (MVOIP): An Application-level Protocol for Call Hand-off in Real Time Applications, G. Ayorkor Mills-Tettey and David Kotz, Proc. of 21st IEEE Int. Perf., Comp. and Comm. Conf., pp. 271-279, 2002
  3. RFC 3261: Session Initiation Protocol
  4. VOCAL’s Analog Telephone Adapter Reference Design