Media formats and the codecs that process them vary widely. Newer endpoints may support the latest codecs, while most VoIP carriers still support traditional telephony codecs (G.729 and G.711). To deliver good sound quality and resolve codec incompatibility issues, PortaSIP can perform transcoding and transrating.
For example, to deliver good call quality to customers even with limited bandwidth, you use Speex and iLBC codecs in your mobile application. Let’s say that your vendor uses a traditional G.711 codec as a first codec and G.729 as a fallback. When a call from a mobile app goes to a vendor, PortaSIP converts the media from Speex into G.711 for the vendor, and then back, for the app. This ensures that your customers can hear each other during their calls.
PortaSIP supports transcoding for the following codecs: G.711, GSM, Opus, Speex, iLBC, LPC and G.729.
Let’s have a closer look at how it works. When a call is being established, PortaSIP:
- acquires a list of codecs from the calling party (A),
- modifies that list (e.g., adds codecs that it can transcode and/or reorders the codecs according to a setting defined in a connection),
- sends the updated list of codecs to the called party (B),
- receives a list of codecs from the called party (B),
- selects the codec to be used for the called party (B) (the first codec in the list), and
- answers the calling party (A) with the codec to be used.
Once a call is answered, PortaSIP receives the media stream as is, extracts the audio data, converts it, and then sends the converted stream to the other party.
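The negotiation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PortaSIP's implementation: `build_offer`, `negotiate` and the `preferred_order` parameter are hypothetical names, the sketch assumes every caller codec can be transcoded, and the order in which PortaSIP really arranges the offer depends on the connection settings and may differ from what this sketch produces.

```python
# Codecs listed in this section as supported for transcoding.
TRANSCODABLE = ["G.711", "GSM", "Opus", "Speex", "iLBC", "LPC", "G.729"]


def build_offer(caller_codecs, preferred_order=None):
    """Return the codec list to offer to the called party (B).

    Hypothetical helper: keeps the caller's codecs, appends every
    transcodable codec that is missing, then optionally reorders the
    result according to a connection-level preference list.
    """
    offer = list(caller_codecs)
    offer += [c for c in TRANSCODABLE if c not in offer]
    if preferred_order:
        rank = {codec: i for i, codec in enumerate(preferred_order)}
        # Codecs named in preferred_order move to the front; the sort is
        # stable, so all other codecs keep their relative order.
        offer.sort(key=lambda c: rank.get(c, len(rank)))
    return offer


def negotiate(caller_codecs, callee_answer):
    """Pick one codec per call leg: the first entry of each party's list."""
    caller_codec = caller_codecs[0]
    callee_codec = callee_answer[0]
    needs_transcoding = caller_codec != callee_codec
    return caller_codec, callee_codec, needs_transcoding


# Mobile app (A) prefers Speex; the vendor (B) answers with G.711 first.
offer = build_offer(["Speex", "iLBC"])
caller_codec, callee_codec, needs_transcoding = negotiate(
    ["Speex", "iLBC"], ["G.711", "G.729"]
)
print(offer)
print(caller_codec, callee_codec, needs_transcoding)  # Speex G.711 True
```

Each leg ends up with its own first choice, and the media is converted only when the two choices differ.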
So when a customer makes a call from a mobile app, PortaSIP offers the following list of codecs to the vendor: Speex, Opus, iLBC, G.711, GSM, LPC, and G.729.
The vendor replies with its list of codecs, starting with the preferred one: G.711, then G.729. PortaSIP selects G.711 for the vendor and answers the caller with the caller's preferred codec, Speex. Thus, each party uses the codec that it prefers (Speex and G.711).
Some codecs (e.g., G.723, DVI4) are not yet supported for transcoding. If a caller requests such a codec for a call, PortaSIP still includes it in the offer. The codecs actually used for the call then depend on the callee:
- if the callee chooses the unsupported codec, PortaSIP passes the media stream as is (e.g., both parties use G.723 or both use DVI4),
- if the callee chooses another codec, PortaSIP answers the caller with another of the codecs that the caller requested and converts between the two.
For example, let’s say that your mobile application is developed to use G.723 as the preferred codec. Speex and G.711 are also workable. Your vendor supports both G.729 (as a preferred one) and G.711.
When a call arrives, PortaSIP offers the vendor an updated list of codecs: Opus, Speex, G.711, GSM, LPC, G.729. Upon the vendor’s reply, PortaSIP selects G.729 for the vendor and sends Speex in its answer to the caller. PortaSIP then converts the media stream to G.729 for the vendor and to Speex for the customer.
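The rule for codecs that cannot be transcoded can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than PortaSIP's actual logic: `select_leg_codecs` is a hypothetical function, and picking the caller's first transcodable codec is an assumption chosen because it reproduces the outcome of the example above.

```python
# Codecs listed in this section as supported for transcoding.
TRANSCODABLE = {"G.711", "GSM", "Opus", "Speex", "iLBC", "LPC", "G.729"}


def select_leg_codecs(caller_codecs, callee_answer):
    """Return (caller_leg_codec, callee_leg_codec) for one call."""
    callee_codec = callee_answer[0]  # the callee's first codec is used

    if callee_codec not in TRANSCODABLE:
        # No conversion is possible, so both legs must share this codec.
        if callee_codec not in caller_codecs:
            raise ValueError("callee chose a codec the caller never offered")
        return callee_codec, callee_codec

    # Assumption: answer the caller with its first transcodable codec.
    for codec in caller_codecs:
        if codec in TRANSCODABLE:
            return codec, callee_codec
    raise ValueError("caller offered no codec that can be transcoded")


# The example from the text: the app prefers G.723, the vendor prefers G.729.
print(select_leg_codecs(["G.723", "Speex", "G.711"], ["G.729", "G.711"]))
# ('Speex', 'G.729')

# Both sides settle on G.723: the stream is passed as is.
print(select_leg_codecs(["G.723", "Speex"], ["G.723"]))
# ('G.723', 'G.723')
```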
Transcoding places additional load on the system, so performance depends on the hardware capabilities of your system and the codecs that you use. For example, on an average PC server, PortaSIP can support up to:
- 1000 simultaneous conversions between G.729 and G.711, or
- 600 simultaneous conversions between Speex and G.729, or
- 500 simultaneous conversions between iLBC and G.711.
Transrating
Transrating is the process of changing the packetization time of a voice codec, that is, the amount of voice (in milliseconds) transmitted in a single packet. An example is converting G.729 with 30 ms packets to G.729 with 20 ms packets.
Let’s say that you have customers who use your voice services via satellite. To preserve sound quality, their media is transmitted with a 30 ms packetization time. Your telco only accepts 20 ms, so when one of these customers makes a call, PortaSwitch adjusts the packetization time from 30 ms to 20 ms toward the telco, and from 20 ms back to 30 ms toward the customer.
The flow is similar to transcoding. When a call is being established, PortaSIP:
- acquires a list of codecs from the calling party (A) (e.g., G.729 30 ms),
- modifies that list (e.g., adds codecs that it can transcode and/or reorders the codecs according to a setting defined in a connection),
- sends the updated list of codecs to the called party (B),
- receives a list of codecs from the called party (B),
- selects the codec to be used for the called party (B) (e.g., G.729 20 ms), and
- answers the calling party (A) with a codec to be used (G.729 30 ms).
Once a call is answered, PortaSIP receives the media stream with 30 ms packetization time from the customer, extracts the audio data, converts its packetization time to 20 ms and sends the converted stream to the vendor. Media flowing from the vendor to the customer is converted in the opposite direction, from 20 ms to 30 ms.
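For G.729, the repacketization step can be pictured with the following Python sketch. It relies on the fact that a G.729 speech frame covers 10 ms of audio and occupies 10 bytes, so a 30 ms packet carries three frames and a 20 ms packet carries two. The function `repacketize` is a hypothetical helper, and the sketch deliberately ignores RTP headers, timestamps, sequence numbers and comfort-noise frames, so it shows the idea rather than how PortaSIP implements it.

```python
G729_FRAME_MS = 10     # one G.729 frame carries 10 ms of audio
G729_FRAME_BYTES = 10  # 8 kbit/s * 10 ms = 80 bits = 10 bytes


def repacketize(payloads, out_ptime_ms):
    """Regroup G.729 payloads into packets of out_ptime_ms each."""
    if out_ptime_ms <= 0 or out_ptime_ms % G729_FRAME_MS:
        raise ValueError("packetization time must be a multiple of 10 ms")
    out_size = (out_ptime_ms // G729_FRAME_MS) * G729_FRAME_BYTES

    buffer = b""
    for payload in payloads:
        buffer += payload
        # Emit as many full-size packets as the buffered audio allows.
        while len(buffer) >= out_size:
            yield buffer[:out_size]
            buffer = buffer[out_size:]
    if buffer:
        yield buffer  # trailing audio shorter than one full packet


# Two 30 ms packets (30 bytes each) become three 20 ms packets.
incoming = [bytes([1]) * 30, bytes([2]) * 30]
outgoing = list(repacketize(incoming, 20))
print([len(p) for p in outgoing])  # [20, 20, 20]
```

The audio bytes themselves are not re-encoded here; only their grouping into packets changes, which is why transrating within one codec is cheaper than a conversion between two codecs.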
PortaSIP supports transrating for the following codecs: G.711, GSM, Opus, Speex, iLBC, LPC and G.729.
Some codecs (e.g., G.723, DVI4) are not yet supported for transcoding and therefore for transrating. Thus, if both parties use G.723, for example, PortaSIP passes the media stream as is.
Since transcoding and transrating are resource-intensive processes, they are disabled by default. To enable both transcoding and transrating, configure PortaSIP to always proxy the media stream and then enable the Transcoding.allow_transcoding option on the configuration server.
Thus, when transcoding and transrating are enabled, PortaSIP serves as the mediator between the two endpoint clients. The RTP proxy converts audio traffic when the endpoints use one of the codecs supported by PortaSIP, but their preferred codecs and/or packetization time differ.
The RTP proxy passes the audio traffic, as is, when:
- both endpoints use the same codec and encoding parameters, or
- both endpoints use a common codec that is not supported by PortaSIP (e.g., G.723). In this case, issues with sound quality may appear.
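The RTP proxy behavior described above amounts to a small decision table, sketched here in Python. The `Leg` class, the `proxy_action` function and the returned labels are hypothetical names introduced for illustration; they are not PortaSIP identifiers.

```python
from dataclasses import dataclass

# Codecs listed in this section as supported for transcoding and transrating.
TRANSCODABLE = {"G.711", "GSM", "Opus", "Speex", "iLBC", "LPC", "G.729"}


@dataclass(frozen=True)
class Leg:
    """Negotiated media parameters for one side of the call."""
    codec: str
    ptime_ms: int


def proxy_action(a: Leg, b: Leg) -> str:
    """Decide what the RTP proxy does with audio between legs a and b."""
    if a.codec == b.codec:
        # Same codec: convert only if it is supported and ptime differs.
        if a.codec not in TRANSCODABLE or a.ptime_ms == b.ptime_ms:
            return "pass-through"
        return "transrate"
    if a.codec in TRANSCODABLE and b.codec in TRANSCODABLE:
        return "transcode"
    raise ValueError("different codecs, and at least one is unsupported")


print(proxy_action(Leg("Speex", 20), Leg("G.711", 20)))  # transcode
print(proxy_action(Leg("G.729", 30), Leg("G.729", 20)))  # transrate
print(proxy_action(Leg("G.723", 30), Leg("G.723", 30)))  # pass-through
```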
With transcoding and transrating, you no longer need to worry about codec compatibility between customer-premises equipment (CPE) and carriers. You can choose any combination of CPEs and carriers to optimize sound quality and minimize your termination costs.