Our first three WebRTC articles taught us what exactly WebRTC is, how to create a simple signaling server and a WebRTC web app client, and how to create a simple WebRTC Android app client. However, those examples mostly followed a happy path, without much added complexity. That is, of course, good for learning purposes, but in the real world we often need to solve more complex requirements, find new ways of troubleshooting, and handle the errors that come along the way. In this article, we cover just that.

Exploring more options

Using the standard “vanilla” WebRTC implementation is great, and most of the time it is all we need to create a simple WebRTC app. However, sometimes we need a bit more control over our app(s), perhaps to change some internal parameters that are unique to our use-case (such as using a specific certificate, relay server, audio/video codec, bitrate etc.), and luckily WebRTC doesn’t lack support for those situations either. Let’s explore a few such options.

Note: all code snippets shown in the text below are also available in this article’s GitHub repository.

Peer connection parameters

Firstly, let’s dive into one option WebRTC gives us that is sometimes hidden in plain sight: creating the peer connection object with all (or most) of its available parameters. The following TypeScript (Angular) method shows how to create a WebRTC peer connection while applying every available parameter:

createPeerConnection(
  iceServers: RTCIceServer[],
  sdpSemantics: 'unified-plan' | 'plan-b',
  bundlePolicy: RTCBundlePolicy,
  iceTransportPolicy: RTCIceTransportPolicy,
  rtcpMuxPolicy: RTCRtcpMuxPolicy,
  peerIdentity: string,
  certificates: RTCCertificate[],
  iceCandidatePoolSize: number): RTCPeerConnection {
  return new RTCPeerConnection({
    iceServers, sdpSemantics, bundlePolicy, iceTransportPolicy,
    rtcpMuxPolicy, peerIdentity, certificates, iceCandidatePoolSize
  } as RTCConfiguration);
}

As seen, as many as eight different peer connection parameters can currently be applied to customize the peer connection. The parameters are the following:

1) ICE servers

A list of servers that may be used by the Interactive Connectivity Establishment (ICE) agent. Each server can be one of two types, STUN or TURN. They are important so that remote peers can find each other over the Internet in order to start the connection, and, if needed, to serve as a media data relay. More about ICE servers and their purpose can be read in the first WebRTC article.
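As a quick illustration, an ICE server list containing one STUN and one TURN entry might look like the sketch below. The URLs and credentials are placeholders, and a local interface stands in for the browser’s RTCIceServer type so the snippet stays self-contained:

```typescript
// A stand-in for the browser's RTCIceServer dictionary.
interface IceServerConfig {
  urls: string;
  username?: string;   // only TURN servers need credentials
  credential?: string;
}

// Placeholder servers, for illustration only.
const iceServers: IceServerConfig[] = [
  { urls: 'stun:stun.example.com:3478' },
  {
    urls: 'turn:turn.example.com:3478',
    username: 'webrtc-user',
    credential: 'secret',
  },
];
```

A STUN entry only needs a URL, while a TURN entry also carries the credentials that authorize using it as a relay.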

2) SDP semantics

Session Description Protocol (SDP) semantics define which internal WebRTC algorithm will be used for the peer connection on both ends. It can be either “plan-b” (an outdated semantics created by Google for Chrome, now deprecated) or “unified-plan” (the current WebRTC standard semantics, supported by every modern browser and native platform).

3) Bundle policy

Defines how to handle the negotiation of candidates when the remote peer is not compatible with the SDP BUNDLE standard. It can be set to:

  • “balanced” – ICE agent creates one “RTCDtlsTransport” object for each type of content: audio, video and data channel.
  • “max-compat” – One “RTCDtlsTransport” object is created per media track, plus a separate one for data channels.
  • “max-bundle” – A single “RTCDtlsTransport” object is created to carry all peer connection data.

4) ICE transport policy

The policy that determines which ICE candidates are considered when establishing the connection. Can be set to “all” (all ICE candidates are considered) or “relay” (only relayed ICE candidates are considered, meaning mostly those passed through a TURN server).

5) RTCP mux policy

The RTCP mux policy used when gathering ICE candidates. Can be either “negotiate” (gather both RTP and RTCP candidates, to support non-multiplexed RTCP) or “require” (gather only RTP candidates and multiplex RTCP over them).

6) Peer identity

It specifies the target peer identity for the peer connection. If this value is set, the peer connection will not connect to a remote peer unless it can authenticate with the given name.

7) Certificates

The list of RTC certificates which will be used by the peer connection for authentication.

8) ICE candidate pool size

The size of the ICE candidate pool to pre-gather. Setting this allows ICE candidates to be collected before the connection attempt actually starts, which can speed up connection establishment.
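Putting it all together, a configuration combining the eight parameters might look like the sketch below. All values are illustrative only (the STUN URL and peer identity are placeholders); in a browser this object would be passed to the RTCPeerConnection constructor:

```typescript
// Illustrative values only; the STUN URL and peer identity are placeholders.
const peerConnectionConfig = {
  iceServers: [{ urls: 'stun:stun.example.com:3478' }],
  sdpSemantics: 'unified-plan',  // the current standard semantics
  bundlePolicy: 'max-bundle',    // one transport for all data
  iceTransportPolicy: 'all',     // consider every candidate type
  rtcpMuxPolicy: 'require',      // multiplex RTCP over RTP candidates
  peerIdentity: 'remote-peer',   // hypothetical target identity
  certificates: [],              // empty: let WebRTC generate a certificate
  iceCandidatePoolSize: 10,      // pre-gather up to 10 candidates
};
```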

Using these parameters can really benefit the peer connection, especially if we are trying to optimize the connection or adapt it to a very specific use-case. More about each parameter can be found on the Mozilla developer page for the RTCPeerConnection class.

Changing audio/video codecs

The next interesting possibility WebRTC offers is manually choosing which audio/video codecs the peers will use when sending media data. By default, WebRTC has an algorithm that automatically chooses the most widely supported codecs, which in most cases means “OPUS” (with default parameters) for audio and “VP8” for video. If we want to change that (for example, to use “VP9” or “H264” for video), we need to use a slightly hacky technique called SDP munging.

SDP munging refers to the process of changing the SDP content manually (via code), without using WebRTC-provided APIs. It is done because the WebRTC APIs don’t yet cover all the possibilities WebRTC offers. Over time, WebRTC developers will add API support for more and more of them, but for now SDP munging is the only viable solution for such use-cases.

To change the codecs via SDP munging, we need to rearrange the payload numbers inside the SDP’s “m=” lines so that the preferred codec’s payloads are listed in front of the others. The TypeScript (Angular) methods to do this are the following:

setCodecs(sdp: RTCSessionDescriptionInit, type: 'audio' | 'video', codecMimeType: string): RTCSessionDescriptionInit {
  const sdpLines = sdp.sdp.split('\r\n');
  sdpLines.forEach((str, i) => {
    if (str.startsWith('m=' + type)) {
      const lineWords = str.split(' ');
      const payloads = this.getPayloads(sdp.sdp, codecMimeType);
      if (payloads.length > 0) {
        // remove the preferred payloads from their current positions
        // (payload numbers start at index 3 of the m= line)
        payloads.forEach(payload => {
          const index = lineWords.indexOf(payload, 3);
          if (index !== -1) {
            lineWords.splice(index, 1);
          }
        });
        // rebuild the m= line: 'm=<type> <port> <protocol>' first...
        str = lineWords[0] + ' ' + lineWords[1] + ' ' + lineWords[2];
        // ...then the preferred payloads...
        payloads.forEach(payload => {
          str = str + ' ' + payload;
        });
        // ...then the remaining payloads in their original order
        for (let k = 3; k < lineWords.length; k++) {
          str = str + ' ' + lineWords[k];
        }
      }
      sdpLines[i] = str;
    }
  });
  return new RTCSessionDescription({
    type: sdp.type,
    sdp: sdpLines.join('\r\n'),
  });
}

getPayloads(sdp: string, codecMimeType: string): string[] {
  // 'video/VP8' -> 'VP8', since rtpmap lines contain only the codec name
  const codecName = codecMimeType.split('/').pop() || codecMimeType;
  const payloads: string[] = [];
  const sdpLines = sdp.split('\r\n');
  sdpLines.forEach(str => {
    if (str.indexOf('a=rtpmap:') !== -1 && str.indexOf(codecName) !== -1) {
      // 'a=rtpmap:96 VP8/90000' -> '96'
      payloads.push(str.split('a=rtpmap:').pop().split(' ')[0]);
    }
  });
  // remove duplicates
  return payloads.filter((v, i) => payloads.indexOf(v) === i);
}

Of course, to know which codecs are available, we need to fetch that list from the WebRTC APIs. A TypeScript (Angular) example of fetching the available codec MIME types is the following:

getCodecs(type: 'audio' | 'video'): string[] {
  return RTCRtpSender.getCapabilities(type).codecs
    .map(c => c.mimeType)
    .filter((value, index, self) => self.indexOf(value) === index);
}

Once the codec order in the SDP lines is changed, WebRTC should start using the preferred codecs for the media communication.
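To make the effect of the reordering concrete, here is a small self-contained sketch of the same technique applied to a made-up SDP fragment, moving VP9’s payload (98 in this fabricated example) to the front of the m=video line:

```typescript
// Minimal, self-contained illustration of payload reordering.
// The SDP below is a fabricated fragment, not captured from a real session.
const sampleSdp = [
  'm=video 9 UDP/TLS/RTP/SAVPF 96 97 98',
  'a=rtpmap:96 VP8/90000',
  'a=rtpmap:97 rtx/90000',
  'a=rtpmap:98 VP9/90000',
].join('\r\n');

function preferPayload(sdp: string, payload: string): string {
  return sdp
    .split('\r\n')
    .map((line) => {
      if (!line.startsWith('m=video')) { return line; }
      const words = line.split(' ');
      // payload numbers start at index 3; drop the preferred one...
      const others = words.slice(3).filter((p) => p !== payload);
      // ...and rebuild the m= line with it listed first
      return [...words.slice(0, 3), payload, ...others].join(' ');
    })
    .join('\r\n');
}

// preferPayload(sampleSdp, '98') now starts with:
// 'm=video 9 UDP/TLS/RTP/SAVPF 98 96 97'
```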

Changing bitrates

The bitrates that WebRTC operates with (in SDP) are split into three main categories:

  • Start bitrate – The bitrate at which WebRTC will try to start the media session. If the provided value is too high, WebRTC will automatically fall back to the highest available bitrate.
  • Minimum (min) bitrate – The bitrate below which WebRTC will try never to go. If the provided value is too high, WebRTC will once again go to the highest available bitrate and won’t go lower.
  • Maximum (max) bitrate – The upper limit that WebRTC will never exceed. Setting the max bitrate too low can result in audio/video quality loss, while setting it too high can cause unnecessary network data usage, so this parameter should be set carefully.

By default, if one of those bitrates is not provided, WebRTC will automatically choose one that’s optimal by its algorithm.

In terms of implementation, changing bitrates is once again done via SDP munging, since no WebRTC API for it is available yet. The TypeScript (Angular) code example can be seen here:

changeBitrate(sdp: RTCSessionDescriptionInit, start: string, min: string, max: string): RTCSessionDescriptionInit {
  const sdpLines = sdp.sdp.split('\r\n');
  sdpLines.forEach((str, i) => {
    if (str.indexOf('a=fmtp') !== -1) {
      if (str.indexOf('x-google-') === -1) {
        // no bitrate attributes yet, append them
        sdpLines[i] = str + `;x-google-max-bitrate=${max};x-google-min-bitrate=${min};x-google-start-bitrate=${start}`;
      } else {
        // replace the existing bitrate attributes
        sdpLines[i] = str.split(';x-google-')[0] + `;x-google-max-bitrate=${max};x-google-min-bitrate=${min};x-google-start-bitrate=${start}`;
      }
    }
  });
  return new RTCSessionDescription({
    type: sdp.type,
    sdp: sdpLines.join('\r\n'),
  });
}

As seen in the code example, the bitrate manipulation is done by adding or updating the “x-google-start/min/max-bitrate=” values in the SDP content.
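The attribute handling can be isolated into a small self-contained helper for illustration; the fmtp line below is made up, and the function mirrors the add-or-replace logic of the method above:

```typescript
// Append (or replace) the Google bitrate attributes on a single a=fmtp line.
function addBitrates(fmtpLine: string, start: number, min: number, max: number): string {
  // drop any previously appended x-google-* values first
  const base = fmtpLine.split(';x-google-')[0];
  return base +
    `;x-google-max-bitrate=${max}` +
    `;x-google-min-bitrate=${min}` +
    `;x-google-start-bitrate=${start}`;
}
```

For example, addBitrates('a=fmtp:96 profile-id=0', 300, 100, 1500) yields 'a=fmtp:96 profile-id=0;x-google-max-bitrate=1500;x-google-min-bitrate=100;x-google-start-bitrate=300', and calling it again on that result simply replaces the old values.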

Recovering from errors

The WebRTC session establishment algorithm is a multi-step process in which each step must stay in sync with all the previous ones in order to work (more on that in our second WebRTC article). If something goes wrong in one of the steps (for example a network or synchronization error), we are often stuck in an unstable state where the session is not established, and all the user experiences is a black screen instead of video, or silence instead of audio. To tackle that, WebRTC introduced the concept of an “ICE restart” – a mechanism that allows a WebRTC application to request that ICE candidate gathering be redone on both ends of the connection. With it, we can save our WebRTC connection from some errors that would otherwise result in session failure. An ICE restart can be performed either by calling the peer connection’s “restartIce” method on browsers that support it, or by sending a new offer with the “iceRestart: true” parameter, as seen in the TypeScript (Angular) code example below:

doIceRestart(peerConnection: RTCPeerConnection | any, messageSender: MessageSender): void {
  try {
    // try using the new restartIce method
    peerConnection.restartIce();
  } catch (error) {
    // if it is not supported, use the old implementation
    peerConnection.createOffer({ iceRestart: true })
      .then((sdp: RTCSessionDescriptionInit) => {
        peerConnection.setLocalDescription(sdp);
        messageSender.sendMessage(sdp);
      });
  }
}
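In practice, an ICE restart is usually triggered when the connection reports failure. The sketch below (the names are my own) uses a minimal interface standing in for the peer connection so the logic stays testable outside the browser; in a real app the restart callback would call doIceRestart:

```typescript
// Minimal stand-in for the parts of RTCPeerConnection we need here.
interface IceWatchable {
  iceConnectionState: string;
  oniceconnectionstatechange: (() => void) | null;
}

// Watch the ICE connection state and trigger a restart callback on failure.
function restartOnFailure(pc: IceWatchable, restart: () => void): void {
  pc.oniceconnectionstatechange = () => {
    if (pc.iceConnectionState === 'failed') {
      restart(); // e.g. doIceRestart(peerConnection, messageSender)
    }
  };
}
```

In a browser, the watched object would be the real RTCPeerConnection; many apps also wait a few seconds in the “disconnected” state before restarting, since that state often recovers on its own.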

Troubleshooting

If an ICE restart doesn’t help, we need a way to analyze the peer connection and pinpoint what goes wrong, and where. To help developers with that, WebRTC gathers a well-organized set of stats during each WebRTC session. Those stats can be viewed live in one of two ways: via the browser’s WebRTC stats page, or via the “getStats” WebRTC API.

Browser’s WebRTC stats page

Almost every Internet browser that supports WebRTC has its own WebRTC stats page, which allows users to access WebRTC stats data in real time. For Google Chrome and Opera, the WebRTC stats page can be found at chrome://webrtc-internals/, while for Firefox it’s at about:webrtc. The pages are fairly similar, containing all the data about ICE candidates, audio/video streams, connection state changes etc. An example of a WebRTC stats page can be seen here:

[Screenshot: a browser’s WebRTC stats page]

The “getStats” WebRTC API

All the stats shown in the browser’s WebRTC stats page can also be obtained from code using WebRTC APIs. For that purpose, WebRTC developers created the “getStats” API, which returns data reports from which we can take all the WebRTC stats we need, and act upon them. An example TypeScript (Angular) “getStats” call, which takes all peer connection stats and displays them in the console, can be seen below:

logStats(peerConnection: RTCPeerConnection, type: 'inbound' | 'outbound' | 'all') {
  peerConnection.getStats().then(stats => {
    stats.forEach(report => {
      switch (type) {
        case 'inbound':
          if (report.type === 'inbound-rtp') {
            console.log(report);
          }
          break;
        case 'outbound':
          if (report.type === 'outbound-rtp') {
            console.log(report);
          }
          break;
        default:
          console.log(report);
      }
    });
  });
}
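One common way to act upon those reports is to derive the actual media bitrate from two consecutive “inbound-rtp” reports, using their bytesReceived and timestamp fields. The helper below is a self-contained sketch of that calculation (the interface name is my own):

```typescript
// The two fields of an 'inbound-rtp' report that the calculation needs.
interface InboundSample {
  bytesReceived: number;
  timestamp: number; // DOMHighResTimeStamp, in milliseconds
}

// Approximate inbound bitrate (bits per second) between two samples.
function inboundBitrate(prev: InboundSample, curr: InboundSample): number {
  const seconds = (curr.timestamp - prev.timestamp) / 1000;
  if (seconds <= 0) {
    return 0;
  }
  return ((curr.bytesReceived - prev.bytesReceived) * 8) / seconds;
}
```

For example, receiving 125,000 bytes over one second corresponds to 1,000,000 bits per second (1 Mbps).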

Technology for modern web-based real-time communication systems

In this article we showed that even though WebRTC is simple to implement and use, not all real-world use-cases are that straightforward, but, importantly, WebRTC has a solution for those cases too. Whether that means using a lesser-known WebRTC API like “restartIce” or “getStats”, or doing a bit of SDP munging, WebRTC covers almost any complex requirement that comes along, and that’s why it is, and will remain, the technology of choice for modern web-based real-time communication systems!
