No matter what type of communication is taking place, you need a reliable framework to start the conversation.
As in a typical conversation, you need an introduction before jumping in. Session Initiation Protocol (SIP) provides that introduction for the digital world. It’s a simple way to start and end online conversations, and it’s the foundation of real-time technology that has revolutionized communication as we know it.
- Session Initiation Protocol (SIP) Overview
- Noteworthy SIP Protocol Features
- How Does the SIP Protocol Work?
- Differences Between SIP and VoIP
Session Initiation Protocol (SIP) Overview
Acronyms overrun the world of digital communication. They seem complicated, but most are simple. Session Initiation Protocol, known as SIP, is no exception.
A protocol is a set of universal standards for computer communication. A SIP session is any live communication, such as phone calls, conferencing, and even video. And initiation means the beginning.
The most common example of SIP in practice is for phone calls with Voice over IP, or VoIP. But it has many other uses as well. Video conferencing, instant messaging, and even computer games may use SIP.
The Internet Engineering Task Force (IETF) first standardized SIP in 1999 as RFC 2543, then revised it in 2002 as RFC 3261, which remains the core specification. SIP borrows from two other popular protocols. The first is Hypertext Transfer Protocol, or HTTP, which establishes the widely adopted methods for viewing websites. SIP interactions look much like those used for accessing the web.
The second protocol SIP borrows from is Simple Mail Transfer Protocol, or SMTP, which is what we use to send email. Like email headers, SIP headers carry metadata for a session between two parties. To take advantage of SIP, you need a SIP-capable endpoint, such as a SIP phone or a softphone app.
On a technical level, SIP signaling runs over either UDP or TCP, by default on port 5060. Port 5061 is the default for SIP secured with TLS. By comparison, browsing the web typically occurs over ports 80 and 443.
Noteworthy SIP Protocol Features
SIP resides in the application layer and is only a signaling protocol. It doesn’t handle any details of the multimedia sessions themselves.
The Session Description Protocol, or SDP, handles media details. Real-time Transport Protocol, or RTP, handles the delivery of data.
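To make that division of labor concrete, here is a minimal Python sketch of what an SDP body describes. The session values (the addresses, the port, the user name alice) are invented for illustration, and the helper name `media_streams` is hypothetical. It reads only the `m=` lines, which announce each media stream; a real SDP parser handles many more fields.

```python
# Example SDP body, as it might travel inside a SIP request (illustrative values).
SDP_BODY = """\
v=0
o=alice 2890844526 2890844526 IN IP4 198.51.100.10
s=-
c=IN IP4 198.51.100.10
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
"""


def media_streams(sdp: str) -> list:
    """Return one dict per m= line: media type, port, transport, and formats."""
    streams = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # An m= line reads: m=<media> <port> <proto> <fmt> [<fmt> ...]
            media, port, proto, *formats = line[2:].split()
            streams.append(
                {"media": media, "port": int(port), "proto": proto, "formats": formats}
            )
    return streams


print(media_streams(SDP_BODY))
```

In this sketch, SDP announces one audio stream that will be delivered over RTP. SIP only carries the description between the two parties.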
A session starts when a SIP client sends an INVITE request to one or more people. In this message, the SIP client establishes parameters for the request. It does this using the header fields previously mentioned.
These fields can show user availability, user location, and user capabilities, as well as other data. Upon receiving the request, the other party or parties either reply with an acceptance or rejection.
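As a rough illustration, the opening request (called an INVITE in SIP) could be assembled like this in Python. The function name `build_invite`, the addresses, and the default branch, tag, and host values are all placeholders chosen for this sketch. A real client generates unique identifiers for each call and usually attaches a session description as the body.

```python
def build_invite(
    caller: str,
    callee: str,
    call_id: str,
    via_host: str = "client.example.com",
    branch: str = "z9hG4bK776asdhds",  # RFC 3261 branch values start with z9hG4bK
    tag: str = "1928301774",
) -> str:
    """Assemble a bare-bones INVITE with no body (a sketch, not a full client)."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",                  # request line: method, target, version
        f"Via: SIP/2.0/UDP {via_host};branch={branch}",  # where responses should go
        "Max-Forwards: 70",                              # hop limit
        f"From: <sip:{caller}>;tag={tag}",               # who is calling
        f"To: <sip:{callee}>",                           # who is being called
        f"Call-ID: {call_id}",                           # identifies this call
        "CSeq: 1 INVITE",                                # sequence number and method
        f"Contact: <sip:{caller}>",                      # where to reach the caller directly
        "Content-Length: 0",
    ]
    # Headers end with a blank line; every line ends with CRLF.
    return "\r\n".join(lines) + "\r\n\r\n"


print(build_invite("alice@example.com", "bob@example.com", "a84b4c76e66710@client.example.com"))
```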
A SIP address can also indicate that a secure transmission is required. When a request uses the URI scheme sips, messaging must take place over Transport Layer Security (TLS).
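Here is a simplified Python sketch of how a client might read a SIP address and decide whether TLS is required. The function name `parse_sip_uri` is hypothetical, and the parsing is deliberately minimal: it ignores passwords, IPv6 literals, and URI headers. The default ports are 5060 for sip and 5061 for sips.

```python
DEFAULT_PORTS = {"sip": 5060, "sips": 5061}


def parse_sip_uri(uri: str) -> dict:
    """Split a SIP URI into its parts and flag whether TLS is required."""
    scheme, _, rest = uri.partition(":")
    scheme = scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a SIP URI: {uri!r}")
    rest = rest.split(";", 1)[0]               # drop URI parameters such as ;transport=tcp
    user, _, hostport = rest.rpartition("@")   # the user part is optional
    host, _, port = hostport.partition(":")
    return {
        "scheme": scheme,
        "user": user,
        "host": host,
        "port": int(port) if port else DEFAULT_PORTS[scheme],
        "requires_tls": scheme == "sips",
    }


print(parse_sip_uri("sips:alice@example.com"))
print(parse_sip_uri("sip:bob@example.com:5080;transport=tcp"))
```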
Telephone calls made over SIP can be relayed to the traditional phone network by a SIP service provider like Nextiva.
Understanding the SIP User-Agent Field
Another feature available for SIP requests is the User-Agent field. This field can be used to specify the software and hardware of the requesting device. The called party can then use this information to check for compatibility.
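For example, the called party could pull the User-Agent value out of an incoming request with a few lines of Python. The helper name `header_value` and the product string ExampleSoftphone/1.0 are made up for this sketch. Header names are matched without regard to case, since SIP treats them as case-insensitive.

```python
REQUEST = (
    "OPTIONS sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@client.example.com\r\n"
    "CSeq: 1 OPTIONS\r\n"
    "User-Agent: ExampleSoftphone/1.0\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def header_value(message: str, name: str):
    """Return the first value of the named header, or None if it is absent."""
    head = message.split("\r\n\r\n", 1)[0]    # everything before the body
    for line in head.split("\r\n")[1:]:       # skip the request line
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == name.lower():
            return value.strip()
    return None


print(header_value(REQUEST, "User-Agent"))
```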
As an open standard, SIP allows users with different service providers to communicate. SIP can also convert an IP telephony call into a connection to a landline through a process known as SIP trunking. This allows you to communicate with both VoIP and landline numbers instead of having to choose between VoIP or landline.
During the conversation, SIP can control session management. For example, it can add a new user to the stream, such as on a conference call. Or it can include an additional media type, like sharing files or turning off video.
Finally, when the session is over, SIP tears it down. One party sends a BYE request and the other confirms it, so both sides know the conversation has ended.
How Does the SIP Protocol Work?
Session Initiation Protocol works on a request-and-response model. For every SIP transaction, one device sends a request, and the other device receives it and responds.
Each response carries a three-digit status code, and the first digit indicates the class of the response.
For example, 1xx codes mean the device received the request and is still processing it. 2xx codes mean success, 3xx codes signal redirection, 4xx and 5xx codes report client and server errors, and 6xx codes report global failures.
The most common code is 200 (OK), meaning the request completed successfully.
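The first digit of the code is all a device needs to sort a response into its class. A small Python sketch of that lookup, using the six classes defined in RFC 3261 (the function name `classify` is just for illustration):

```python
RESPONSE_CLASSES = {
    1: "provisional",     # request received, still processing (e.g. 180 Ringing)
    2: "success",         # e.g. 200 OK
    3: "redirection",     # e.g. 302 Moved Temporarily
    4: "client error",    # e.g. 404 Not Found, 486 Busy Here
    5: "server error",    # e.g. 503 Service Unavailable
    6: "global failure",  # e.g. 603 Decline
}


def classify(code: int) -> str:
    """Map a SIP status code to its response class using the first digit."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a valid SIP status code: {code}")
    return RESPONSE_CLASSES[code // 100]


for code in (180, 200, 302, 486):
    print(code, classify(code))
```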
A SIP request or reply is relatively short, with just a few lines explaining the details of the call.
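To show how little there is to a typical message, here is a sample response and a minimal Python parser for it. The message values are illustrative, and `parse_response` is a hypothetical helper that skips details a real SIP stack must handle, such as repeated headers and compact header names.

```python
RESPONSE = (
    "SIP/2.0 200 OK\r\n"
    "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>;tag=a6c85cf\r\n"
    "Call-ID: a84b4c76e66710@client.example.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_response(raw: str) -> dict:
    """Split a SIP response into status line, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")           # a blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)   # e.g. "SIP/2.0", "200", "OK"
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return {
        "version": version,
        "code": int(code),
        "reason": reason,
        "headers": headers,
        "body": body,
    }


parsed = parse_response(RESPONSE)
print(parsed["code"], parsed["reason"], parsed["headers"]["cseq"])
```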
What is the role of a SIP registrar?
While SIP messages can contact another party directly, they usually go through a SIP proxy server—kind of like a switchboard.
The SIP server handles SIP requests and directs them to individual users. From there, devices establish trusted communication with each other. Where does a SIP request come from? Most likely, these requests originate from a SIP Phone or a softphone app.
A SIP registrar is similar to an address book. It associates the various users with the access points on the IP network where one can reach them.
Notably, a user’s address is not an IP address, but a separate SIP address that resembles an email address, such as sip:alice@example.com. Because the registrar can map this one identifier to several devices, a smartphone and a desk phone can ring at the same time.
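The address-book idea can be sketched as a small in-memory lookup table in Python. The class name `Registrar`, its methods, and the addresses are all hypothetical. A real registrar also authenticates users and expires bindings after a set time.

```python
class Registrar:
    """Toy registrar: maps one SIP address to every device registered for it."""

    def __init__(self):
        self._bindings = {}  # SIP address -> list of contact addresses

    def register(self, address: str, contact: str) -> None:
        """Record that `address` can currently be reached at `contact`."""
        contacts = self._bindings.setdefault(address, [])
        if contact not in contacts:
            contacts.append(contact)

    def lookup(self, address: str) -> list:
        """Return every contact for `address`, so all of them can ring at once."""
        return list(self._bindings.get(address, []))


registrar = Registrar()
registrar.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060")   # desk phone
registrar.register("sip:alice@example.com", "sip:alice@198.51.100.7:5060") # smartphone app
print(registrar.lookup("sip:alice@example.com"))
```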
A related but different type of server is a redirect server. It works much like the post office’s change-of-address service, which points mail toward a new location.
Like a registrar, a redirect server has a list of locations. But instead of making connections, the server sends a 3xx redirect message that tells the caller where the user can now be reached.
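A redirect server’s answer can be sketched in Python as follows. The function name `redirect_response` and the addresses are assumptions made for this example. The sketch also leaves out the Via, From, To, Call-ID, and CSeq headers that a real response copies from the request.

```python
def redirect_response(request_uri: str, locations: dict) -> str:
    """Answer with a 302 pointing at the new address instead of connecting the call."""
    new_contact = locations.get(request_uri)
    if new_contact is None:
        return "SIP/2.0 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    return (
        "SIP/2.0 302 Moved Temporarily\r\n"
        f"Contact: <{new_contact}>\r\n"   # where the caller should try next
        "Content-Length: 0\r\n"
        "\r\n"
    )


LOCATIONS = {"sip:bob@example.com": "sip:bob@branch-office.example.com"}
print(redirect_response("sip:bob@example.com", LOCATIONS))
```

The caller then sends a fresh request to the address in the Contact header.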
Does SIP use TCP or UDP?
Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are different ways to send data packets. Both methods are called transport protocols.
In a call, each audio packet usually carries around 10 to 30 milliseconds of sound, and its exact size depends on the codec used. SIP itself operates independently of the transport protocol, which network administrators appreciate.
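The arithmetic behind those packet sizes is straightforward. The Python sketch below assumes the G.711 codec, which runs at 64 kbit/s (8,000 samples per second at 8 bits each), and adds the usual 40 bytes of RTP, UDP, and IPv4 headers. Other codecs produce different numbers.

```python
G711_BITRATE_BPS = 64_000      # G.711: 8,000 samples per second, 8 bits per sample
HEADER_BYTES = 12 + 8 + 20     # RTP + UDP + IPv4 headers, without options


def payload_bytes(bitrate_bps: int, packet_ms: int) -> int:
    """Bytes of encoded audio carried by one packet of the given duration."""
    return bitrate_bps * packet_ms // (8 * 1000)   # bits/s * ms -> bytes


for ms in (10, 20, 30):
    audio = payload_bytes(G711_BITRATE_BPS, ms)
    print(
        f"{ms} ms: {audio} bytes of audio, "
        f"{audio + HEADER_BYTES} bytes at the IP layer, "
        f"{1000 / ms:.1f} packets per second"
    )
```

At the common 20 ms setting, each packet holds 160 bytes of audio and the sender emits 50 packets every second.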
TCP is designed for accuracy
TCP prevents confusion by confirming delivery. Before any data flows, the two sides open a connection with an exchange known as the three-way handshake. After that, the receiver acknowledges the data it gets. If the sender doesn’t receive an acknowledgment in time, it resends the missing data.
The receiving application gets the data complete and in the original order. Applications where accuracy is a top priority, like email and websites, typically use this protocol.
UDP is built for speed
UDP maximizes speed. This protocol sends packets without confirmation, and packets may even arrive out of order. For services like live audio or video, a steady stream matters more than a perfect one.
Skipping or misplacing a few milliseconds of audio is preferable to a lag of a few seconds waiting for a packet to be confirmed or resent.
For the initial SIP request, however, either transport can be used. Because the request is small, the possible lag of TCP isn’t noticeable. Even so, some services choose to use UDP. It depends on the carrier and the technical details of your setup.
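As a hedged illustration of the UDP option, the Python sketch below sends a small SIP OPTIONS request as a single datagram using the standard socket module. To stay self-contained it sends to a local socket on the loopback interface rather than to a real SIP server, and the message contents and the helper name `send_over_udp` are invented for the example. A TCP version would open a connection first, for instance with `socket.create_connection`.

```python
import socket

OPTIONS = (
    "OPTIONS sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@client.example.com\r\n"
    "CSeq: 1 OPTIONS\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
).encode("utf-8")


def send_over_udp(payload: bytes, address: tuple) -> None:
    """Fire one datagram at the address: no connection setup, no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


# Loopback demo: a local socket stands in for the SIP server.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
    receiver.bind(("127.0.0.1", 0))     # let the OS pick a free port
    receiver.settimeout(2.0)            # UDP gives no guarantee, so don't wait forever
    send_over_udp(OPTIONS, receiver.getsockname())
    received, sender = receiver.recvfrom(4096)

print(f"received {len(received)} bytes from {sender[0]}")
```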
Differences Between SIP and VoIP
If you’re researching how VoIP phone service works, you might be confused by all the acronyms.
Voice over Internet Protocol, or VoIP, is a set of protocols used for voice calling over the internet, called internet telephony. Internet phone service uses VoIP, which also utilizes the SIP protocol.
Built into the core of VoIP is the SIP protocol, along with other open standards. You don’t have to choose between VoIP and SIP protocol, as they often come together. They function much like HTTP and TLS, which are used in conjunction with each other to provide fast, reliable, and secure communication.
A more direct comparison would be between PRI and SIP. PRI (Primary Rate Interface) is an older way of establishing a communication channel, the same job SIP does today.
Congrats, you’ve now mastered the SIP Protocol
Just like that, now you have a much better understanding of the SIP protocol and how it works. While technical, the concept is easy to digest.
Session Initiation Protocol is a critical set of standards to establish real-time digital communication. It helps two or more parties have a successful, productive interaction.
Unlike many of the complex protocols in telecommunications, SIP protocol is a simpler one. It controls the beginning, end, channels, and users during the call. With this deeper understanding, you can fix common VoIP problems and prevent them from happening.
One of the most valuable features of the SIP protocol is trunking. Trunking allows you to provide SIP-based phone service to your PBX instead of completely overhauling your company’s phone system. It lowers costs and improves service. What could be better than that?
If you’re looking for SIP trunking in your business, consider Nextiva as your next SIP trunking provider. In 2020, U.S. News & World Report rated Nextiva as the best business phone service, making it the top choice for businesses across the United States.
Julie Bai is a product manager at Nextiva, UCaaS evangelist, no-bull communicator and translator for people, dog lover, and mother to an adorably active boy.