The purpose of NAT (Network Address Translation) is to allow multiple hosts on a private LAN not directly reachable from a WAN to send information to and receive it from hosts on the WAN. This is done with the help of the NAT server, which is connected to the WAN by one interface with a public IP address, and to the LAN by another interface with a private address. This document describes issues connected with the implementation of NAT and its implications for the operation of PortaSIP, with an overview of some fundamental NAT concepts.
The NAT server acts as a router for hosts on the LAN. When an IP packet addressed to a host on the WAN comes from a host on the LAN, the NAT server replaces the private IP address in the packet with the public IP address of its WAN interface and sends the packet on to its destination. The NAT server also performs in-memory mapping between the public WAN address the packet was sent to and the private LAN address it was received from, so that when the reply comes, it can carry out a reverse translation (e.g., replace the public destination address of the packet with the private one and forward it to the destination on the LAN). Since the NAT server can potentially map multiple private addresses into a single public one, it is possible that a TCP or UDP packet originally sent from, for example, port A of the host on the private LAN will then, after being processed in the translation, be sent from a completely different port B of the NAT’s WAN interface. The following figure illustrates this: here “HOST 1” is a host on a private network with private IP address 192.168.0.7; “NAT” is the NAT server connected to the WAN via an interface with public IP address 18.104.22.168; and “Server” is the host on the WAN with which “HOST 1” communicates.
A problem relating to the SIP User Agent (UA) arises when the UA is situated behind a NAT server. When establishing a multimedia session, the NAT server sends UDP information indicating which port it should use to send a media stream to the remote UA. Since there is a NAT server between them, the actual UDP port to which the remote UA should send its RTP stream may differ from the port reported by the UA on a private LAN (12345 vs. 56789 in the figure above) and there is no reliable way for such a UA to discover this mapping.
However, as was noted above, the packets may not have an altered post-translation port in all cases. If the ports are equal, a multimedia session will be established without difficulty. Unfortunately, there are no formal rules that can be applied to ensure correct operation, but there are some factors which influence mapping. The following are the major factors:
- How the NAT server is implemented internally. Most NAT servers try to preserve the original source port when forwarding, if possible. This is not strictly required, however, and therefore some of them will just use a random source port for outgoing connections.
- Whether or not another session has already been established through the NAT from a different host on the LAN with the same source port. In this case, the NAT server is likely to allocate a random port for sending out packets to the WAN. Please note that the term “already established” is somewhat vague in this context. The NAT server has no way to tell when a UDP session is finished, so generally it uses an inactivity timer, removing the mapping when that timer expires. Again, the actual length of the timeout period is implementation-specific and may vary from vendor to vendor, or even from one version by the same vendor to another.
NAT and SIP
There are two parts to a SIP-based phone call. The first is the signaling (that is, the protocol messages that set up the phone call) and the second is the actual media stream (e.g., the RTP packets that travel directly between the end devices, for example, between client and gateway).
SIP signaling can traverse NAT in a fairly straightforward way, since there is usually one proxy. The first hop from NAT receives the SIP messages from the client (via the NAT), and then returns messages to the same location. The proxy needs to return SIP packets to the same port it received them from, e.g., to the IP:port that the packets were sent from (not to any standard SIP port, e.g., 5060). SIP has tags which tell the proxy to do this. The “received” tag tells the proxy to return a packet to a specific IP and the “rport” tag contains the port to return it to. Note that SIP signaling should be able to traverse any type of NAT as long as the proxy returns SIP messages to the NAT from the same source port it received the initial message from. The initial SIP message, sent to the proxy IP:port, initiates mapping on the NAT, and the proxy returns packets to the NAT from that same IP:port. This is enabled in any NAT scenario.
Registering a client which is behind a NAT requires either a registrar that can save the IP:port in its registration information, based on the port and IP that it identifies as the source of the SIP message, or a client that is aware of its external mapped address and port and can insert them into the contact information as the IP:port for receiving SIP messages. You should be careful to use a registration interval shorter than the keepalive time for NAT mapping.
RTP – media stream
An RTP that must traverse a NAT cannot be managed as easily as SIP signaling. In the case of RTP, the SIP message body contains the information that the endpoints need in order to communicate directly with each other. This information is contained in the SDP message. The endpoint clients fill in this information according to what they know about themselves. A client sitting behind a NAT knows only its internal IP:port, and this is what it enters in the SDP body of the outgoing SIP message. When the destination endpoint wishes to begin sending packets to the originating endpoint, it will use the received SDP information containing the internal IP:port of the originating endpoint, and so the packets will never arrive.
Understanding the SIP server’s role in NAT traversal
Below is a simplified scheme of a typical SIP call:
It must be understood that SIP signaling messages between two endpoints always pass through a proxy server, while media streams usually flow from one endpoint to another directly. Since the SIP Server is located on a public network, it can identify the real IP addresses of both parties and correct them in the SIP message, if necessary, before sending this message further. Also, the SIP Server can identify the real source ports from which SIP messages arrive, and correct these as well. This allows SIP signaling to flow freely even if one or both UAs participating in a call are on private networks behind NATs.
Unfortunately, due to the fact that an RTP media stream uses a different UDP port, flowing not through the SIP server but directly from one UA to another, there is no such simple and universal NAT traversal solution. There are 3 ways of dealing with this problem:
- Insert an RTP proxy integrated with the SIP Server into the RTP path. The RTP proxy can then perform the same trick for the media stream as the SIP Server does for signaling: identify the real source IP address/UDP port for each party and use these addresses/ports as targets for RTP, rather than using the private addresses/ports indicated by the UAs. This method helps in all cases where properly configured UAs supporting symmetric media are used. However, it adds another hop in media propagation, thus increasing audio delay and possibly decreasing quality due to greater packet loss.
- Assume that the NAT will not change the UDP port when resending an RTP stream from its WAN interface, in which case the SIP Server can correct the IP address for the RTP stream in SIP messages. This method is quite unreliable; in some cases it works, while in others it fails.
- Use “smart” UAs or NAT routers, or a combination of both, which are able to figure out the correct WAN IP address/port for the media by themselves. There are several technologies available for this purpose, such as STUN, UPnP and so on. A detailed description of them lies beyond the scope of this document, but may easily be found on the Internet.
Which NAT traversal method is the best?
There is no “ideal” solution, since all methods have their own advantages and drawbacks. However, the RTP proxy method is the preferred solution due to the fact that it allows you to provide service regardless of the type or configuration of SIP phone and NAT router. Thus you can say to customers: “Take this box, and your IP phone will work anywhere in the world!”.
In general, the “smart” method will only work if you are both an ISP and ITSP, and so provide your customers with both DSL/cable routers and SIP phones. In this case, they can only use the service while on your network.
NAT call scenarios and setup guidelines
With regard to NAT traversal, there are several distinct SIP call scenarios, each of which should be handled differently. These scenarios differ in that, in case 2, the media stream will always pass through one or more NATs, as the endpoints cannot communicate with each other directly, while in cases 1 and 3 it is possible to arrange things so that a media stream flows directly from one endpoint to another.
Calls between SIP phones
- A call is made from one SIP UA (SIP phone) to another SIP UA (SIP phone), with both phones on public IP addresses (outside a NAT). In this case, the phones can communicate directly and no RTP proxying is required.
- A call is made from one SIP UA (SIP phone) to another SIP UA (SIP phone), and at least one of the phones is on a private network behind a NAT. Here an RTP proxy should be used to prevent “no audio” problems.
- A call is made from one SIP UA (SIP phone) to another SIP UA (SIP phone), with both phones on the same private network (behind the same NAT). This scenario is likely to be encountered in a corporate environment, where a hosted PBX service is provided. In this case, it is beneficial to enable both phones to communicate directly (via their private IP addresses), so that the voice traffic never leaves the LAN.
Calls between SIP phones and remote gateways
- A call is made from/to a SIP phone on a public IP address from/to a VoIP GW (a VoIP GW is always assumed to be on a public IP address). In this case, the RTP stream may flow directly between the GW and SIP phone, and no RTP proxying is required.
- A call is made from/to a UA under a NAT from/to a VoIP GW, and the remote gateway supports SIP COMEDIA extensions. In this case, the RTP stream may flow directly between the gateway and the SIP phone, and there is no need to use an RTP proxy. However,
you need to configure your Cisco GW in order to ensure proper NAT traversal:
sip-ua nat symmetric check-media-src.
- A call is made from/to a UA under a NAT from/to a VoIP GW, and the remote gateway does not support SIP COMEDIA extensions. An RTP proxy is required in this case.
RTP proxy in PortaSIP
This provides an effective NAT traversal solution according to the RTP proxy method described above. The RTP proxy is fully controlled by PortaSIP, and is absolutely transparent to the SIP phone.
The RTP proxy does not perform any transcoding by default, and therefore requires a minimum amount of system resources for call processing. PortaSIP can support about 4500 simultaneous calls when performing RTP proxying on an average PC server.
During the call initiation phase, PortaSwitch gathers information about the NAT status of both parties (caller and called) participating in the call and decides about RTP proxying.
For a SIP phone, the possible conditions are:
- SIP phone on a public IP address.
- SIP phone behind NAT.
Thus, the RTP proxy engagement logic for calls between SIP phones can be summarized as follows:
- If both phones are on public IP addresses, do not use an RTP proxy; rather, allow the media stream to go directly between them.
- If both phones are behind the same NAT router, do not use an RTP proxy; rather, allow the media stream to go directly between them.
- Otherwise, the RTP proxy is used.
Calls between SIP phones and remote gateways
If the called (or calling) party is a remote gateway or remote SIP proxy, its NAT traversal capabilities are described in the PortaBilling configuration under connection properties. The possible values are:
- Optimal – this connection supports NAT traversal, so it can communicate with an IP phone behind NAT directly. This is the best possible scenario, since you can entirely avoid using an RTP proxy when exchanging calls with this carrier.
- OnNat – this connection does not support NAT traversal. Direct communication with an IP phone is possible only if that phone is on a public IP address.
- Always – regardless of NAT traversal capabilities, you must always use an RTP proxy when communicating with this carrier. This may be necessary if you do not want to allow them to see your customer’s real IP address, or perhaps simply because this carrier has a good network connection to your SIP server, but a poor connection to the rest of the world. Thus you will need to proxy his traffic to ensure good call quality.
- Direct – always send a call directly to this gateway, and never engage an RTP proxy.
PortaSIP cannot detect whether a remote gateway supports Comedia extensions (symmetric NAT traversal). If you do not use your own gateway for termination, you should clarify this matter with your vendor and set up the NAT traversal status accordingly.
After the NAT status of the IP phone (behind NAT or on a public IP) and the NAT traversal status of the connection have been identified, a decision is made as follows:
- If the connection has Always NAT traversal status, activate the RTP proxy.
- If the connection has Direct NAT traversal status, do not activate the RTP proxy.
- If the phone is behind NAT and the connectio