back to all posts

Intro to WebRTC

reading time

~ 5 min read

last Modified

published

WebRTC

WebRTC (Web Real-Time Communication) is a peer-to-peer protocol that enables real-time communication in the browser without needing any additional plugins. It contains several APIs to make video streaming, audio streaming, screen sharing, file sharing, data sharing, and much more possible.

Peer-to-Peer

Here, Peer-to-peer means the direct connection from the browser to the browser without needing an intermediate server that acts as a central authority. However, during the initial setup (ie this browser user in India is connecting with someone from the US) The browser needs to know who’s browser to connect to. For that, a signaling server is being used.

Connect Peers

Signalling Server can be WebSocket server or http long polling or you can use any already existing SDK available. The spec of WebRTC does not limit what protocol can be used as a signaling server. The role of signaling server (web socket for example) is that it sends data for initial connection to occur between the users and it does not need to know what content of data users are sending to each other. Ok, where does it fits, this signaling server??

WebRTC uses Offer/Answer model; Suppose if User A wants to connect to User B. User A sends offer to User B. And User B creates an Answer in response to the offer sent by User A. This has to be sent between the users via any medium for connection initialization. This is where the web socket (Signalling server) comes in and helps A and B to negotiate with each other and establish a connection. After the connection, it does not need a web socket server anymore. User A and B can directly communicate with each other.

Psuedo Code for signaling server

socket.on('offer', (offer: any) => {
socket.except(socket.id).emit('offer', offer);
})

// Handle 'answer' messages
socket.on('answer', (answer: any) => {
socket.except(socket.id).emit('answer', answer);
});

// Handle 'ice-candidate' messages
socket.on('iceCandidate', (candidate: any) => {
socket.except(socket.id).emit('iceCandidate', candidate);
});

The signalling server can be simple websocket server that handles the initial offer/answer/iceCandidate functionality to establish peer-to-peer connection

Interactive Connectivity Establishment (ICE)

WebRTC contains a collection of protocols; STUN and TURN protocols for connection establishment (to discover the user on the internet so that other users using the browser communicate with each other since one user might be behind different NAT or firewall) These STUN AND TURN protocol comes under ICE Framework. ICE takes care of connectivity between users. ICE is framework to allow web browser to connect with peers. And Restart ICE happens when the user changes the network, for example: from Wifi to a 4G network like thatIC, The ICE Process begin again to connect the two devices.

Session Traversal Utilities for NAT (STUN) - It is a protocol that is used in WebRTC to discover the public IP of the device which is behind the NAT or Firewall.

Traversal Using Relays around NAT (TURN) - Sometimes direct peer-to-peer communication is not possible even after using STUN, this acts as a relay server (intermediate between User A and B) to stream video/audio, etc. It adds some extra latency since it is not direct.

There are several options available for self-hosting STUN/TURN servers as well as public ones (from Google STUNs). But public ones are best avoided when commercial use for protecting user data and privacy.

When developing on local, STUN is optional requirement since our testing devices are all under same network.

Example

To better understand let’s take an example! We are creating a video calling app in which we can able to enter roomCode and call the user who has that roomCode.

It follows as:

  1. Get Permission to use media devices such as Video, Audio, Screen Sharing, etc using getUserMedia API from the browser.

  2. ICE starts its process to find the public IP of the devices which are trying to communicate. It uses STUN or TURN techniques to find the public IP of the users.

  3. User A types in the room code of User B and presses Connect.

  4. On Connect, It creates an Offer, and Send it to WebSocket Server which is the Signalling server.

  5. On WebSocket Server, receives the offer and the server uses some sort of storage containing room codes of the users. It scans thru it and finds that it belongs to User B.

  6. And now it sends this offer to User B

  7. On User B, It receives an offer and creates an answer if the user wants to answer. It sends the answer to the web socket server.

  8. Now the server got the answer from User B that they are accepting the connection.

  9. And it sends the answer back to User A.

  10. User A receives an answer and now it directly connects to User B.

  11. No need for a web socket anymore, the job of the signaling server is done. Now the browser of User A and B communicates with each other without any intermediate signaling server.

Visual Representaion of How it works:

the above steps in the visual representation with numbered step by step

WebRTC is ofcourse UDP based protocol since speed over reliability. And for Signaling Server, TCP is used to establish connection between peers :)

experimental code: link

More on use-cases

  • Webrtc can be used for sharing text/blob using RTCDataChannel, it can able to send and receive texts/file transfer using the channel in real-time.