
WebRTC from scratch

17:21:32 04 January 2024 UTC

P2P is no free lunch, but it’s surprisingly simple once you figure it out. You need a signaling server that can handle peer connections and a clear scheme for peer messaging, and a good understanding of concurrent applications is a must. The solution I’m going to walk you through is designed as a teaching aid rather than as best practice. We will cover a simple signaling server written in Go, how messages are processed, and how to handle connection establishment in a basic client-side app. Once we have covered how it all works, I’ll also mention things to consider when integrating WebRTC into your own applications.

You will need a basic understanding of WebRTC and of JavaScript.

The server

A core part of WebRTC is signaling: peers need some way to exchange messages before they can talk to each other directly. In most cases this is a WebSocket server that peers connect to and that forwards messages between them. Our server is written in Go, a simple C-like language. Here’s what we are working with for the initial setup:

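package main

import (
	"log"
	"net/http"

	"github.com/gorilla/websocket"
)

// toWebSocket upgrades plain HTTP requests to WebSocket connections.
var toWebSocket = websocket.Upgrader{}

func main() {
	// serve the built client from ./dist and expose the signaling endpoint
	http.Handle("/", http.FileServer(http.Dir("./dist")))
	http.HandleFunc("/ws", handleSignaling)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func handleSignaling(w http.ResponseWriter, r *http.Request) {
	conn, err := toWebSocket.Upgrade(w, r, nil)
	if err != nil {
		log.Println("upgrade failed:", err)
		return
	}
	defer conn.Close()
	select {} // hold the connection open without spinning the CPU
}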

This is about as basic as it gets. Since the server has to manage several connections at once, and each handler runs in its own goroutine, we will use a data structure that is safe to access concurrently:

// needs "sync" in the import block
type Peers struct {
    newPeer     sync.Mutex        // serializes onboarding; a named field, so call peers.newPeer.Lock()
    oldPeers    []*websocket.Conn // peers that have already joined
    connections int               // number of joined peers, also the next free index in oldPeers
}

var peers Peers

const MAX_CONN = 5

func main() {
    peers = Peers{oldPeers: make([]*websocket.Conn, MAX_CONN), connections: 0}
    //....
}

func handleSignaling(w http.ResponseWriter, r *http.Request) {
    //...
    peers.newPeer.Lock()
    peers.newPeer.Unlock()
    //...
}

This peers struct has a Mutex on it to make sure that new peers wait their turn. The idea here is that we keep track of old peers and onboard new peers one at a time. From here on we will focus only on the handleSignaling function when talking about the server. The logic for handling each new peer looks like this:

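//...
    peers.newPeer.Lock()
    if peers.connections > 0 {
        //TODO onboard new peers with old
    }
    peers.oldPeers[peers.connections] = conn
    peers.connections++
    peers.newPeer.Unlock()

    // hold the connection until the room is full; read the counter under
    // the lock and sleep between checks (needs "time") instead of spinning
    for {
        peers.newPeer.Lock()
        full := peers.connections == MAX_CONN
        peers.newPeer.Unlock()
        if full {
            return
        }
        time.Sleep(100 * time.Millisecond)
    }
}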
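The bookkeeping part of this can be tried out on its own. What follows is only a sketch under a few assumptions: Registry, NewRegistry, Add, Full and ErrFull are hypothetical names that are not in the repo, and peers are plain strings so it runs without a WebSocket library. It also adds a capacity check, so a peer arriving after the room is full gets an error instead of writing past the end of the slice. It covers only the counting, not the onboarding, which the server above keeps inside the locked section.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrFull is returned when the registry has no free slot left.
var ErrFull = errors.New("registry is full")

// Registry is a hypothetical stand-in for the Peers struct. It stores
// plain strings so the sketch runs without a WebSocket library.
type Registry struct {
	mu    sync.Mutex
	peers []string
	max   int
}

// NewRegistry creates a registry that accepts at most max peers.
func NewRegistry(max int) *Registry {
	return &Registry{max: max}
}

// Add registers a peer and returns the peers that joined before it,
// which are the ones the newcomer still has to be onboarded with.
func (r *Registry) Add(peer string) ([]string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.peers) >= r.max {
		return nil, ErrFull
	}
	old := append([]string(nil), r.peers...) // copy so callers cannot mutate our slice
	r.peers = append(r.peers, peer)
	return old, nil
}

// Full reports whether every slot is taken.
func (r *Registry) Full() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.peers) >= r.max
}

func main() {
	reg := NewRegistry(2)
	var wg sync.WaitGroup
	// three peers race for two slots; which one loses depends on scheduling
	for _, name := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if _, err := reg.Add(name); err != nil {
				fmt.Println(name, "rejected:", err)
			}
		}(name)
	}
	wg.Wait()
	fmt.Println("full:", reg.Full())
}
```

Returning a copy of the earlier peers keeps the lock short, at the cost of no longer guaranteeing that onboarding itself happens one peer at a time.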

It’s worth knowing that all P2P signaling happens between two peers. You can parallelize this process, but that requires handling a lot of edge cases that will not be covered here. Signaling also has a certain order to it. For one peer to establish a connection with another, they have to:

1. Exchange SDP info (determine what data will be transferred between the two of them)
2. Collect ICE candidates from the other peer (find a path through the networks to the other peer)

One issue most tutorials do not cover is trickle ICE. ICE candidates are discovered asynchronously while the peers are still agreeing on common codecs (the SDP exchange), but a peer cannot add the other side’s candidates until it has applied that side’s SDP, since a candidate only makes sense within the session the SDP describes. So we will do this one peer at a time, starting with the new peer and then moving to the old peer. Note that we are sending JSON strings of the SDP object and the ICE candidate object.

For each peer we read from its socket connection and check whether the message is an SDP offer. If it is not, it is a candidate, which we enqueue for the other peer to receive when it is ready. The code for both looks roughly like this:

// needs "strings"; ice ([]string) and canIce (bool) are declared before the loop
for {
    _, msg, err := conn.ReadMessage()
    if err != nil {
        return // the peer went away
    }
    sMsg := string(msg)

    if strings.Contains(sMsg, "offer") {
        oldPeer.WriteMessage(websocket.TextMessage, []byte(sMsg))
    } else if /* check if we are ready for ice */ {
        canIce = true
    } else if strings.Contains(sMsg, "candidate") {
        ice = append(ice, sMsg)
    }

    if canIce {
        for _, c := range ice {
            conn.WriteMessage(websocket.TextMessage, []byte(c))
        }
        ice = ice[:0] // clear the queue so candidates are not sent twice
    }
    //repeat but for the old peer/other peer
}
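The queue-then-flush part of that loop can be pulled out into a small helper. This is a sketch only, assuming a hypothetical CandidateQueue type whose send callback stands in for a call such as conn.WriteMessage. It is meant to be used from a single goroutine, as in the loop above, so it does no locking.

```go
package main

import "fmt"

// CandidateQueue is a hypothetical helper: it holds ICE candidate messages
// until the receiving side is ready, then passes everything straight through.
type CandidateQueue struct {
	ready   bool
	pending []string
	send    func(string) // stand-in for a wrapper around conn.WriteMessage
}

// Push queues the candidate, or sends it at once if the receiver is ready.
func (q *CandidateQueue) Push(candidate string) {
	if q.ready {
		q.send(candidate)
		return
	}
	q.pending = append(q.pending, candidate)
}

// MarkReady flushes everything queued so far, in arrival order.
func (q *CandidateQueue) MarkReady() {
	q.ready = true
	for _, c := range q.pending {
		q.send(c)
	}
	q.pending = nil // nothing left to resend
}

func main() {
	var sent []string
	q := &CandidateQueue{send: func(m string) { sent = append(sent, m) }}

	q.Push("candidate-1")
	q.Push("candidate-2")
	fmt.Println(len(sent)) // prints 0: nothing goes out before the receiver is ready

	q.MarkReady()
	q.Push("candidate-3") // sent immediately now
	fmt.Println(sent)     // prints [candidate-1 candidate-2 candidate-3]
}
```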

The only special cases here are in determining when the old peer and the new peer are ready for ICE. The new peer has to tell the signaling server that it is ready, and for the old peer we can just look for an answer in its response:

    //for the new peer
    } else if sMsg == "ready" {
        canIce = true
    }
    //for the old peer
    } else if strings.Contains(sMsg, "answer") {
        conn.WriteMessage(websocket.TextMessage, []byte(sMsg))
        canIce = true
    }
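Substring checks like these are easy to follow, but they can misfire when the SDP text itself happens to contain a word such as "offer". One alternative is to decode the JSON and look at its fields. The sketch below is not what the repo does, and it rests on assumptions about the message shapes: SDP objects carry a "type" field, ICE candidate objects carry a "candidate" string, and "ready" and "done" arrive as bare strings. Kind and classify are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Kind names the message types the signaling loop cares about.
type Kind int

const (
	Unknown Kind = iota
	Offer
	Answer
	Candidate
	Ready
	Done
)

// classify decodes just enough of a message to tell what it is.
// Assumed shapes: SDP objects have a "type" field, ICE candidate objects
// have a "candidate" string, and "ready"/"done" are bare strings.
func classify(raw []byte) Kind {
	switch string(raw) {
	case "ready":
		return Ready
	case "done":
		return Done
	}
	var m struct {
		Type      string  `json:"type"`
		Candidate *string `json:"candidate"`
	}
	if err := json.Unmarshal(raw, &m); err != nil {
		return Unknown // not JSON, or not a shape we know
	}
	switch {
	case m.Type == "offer":
		return Offer
	case m.Type == "answer":
		return Answer
	case m.Candidate != nil:
		return Candidate
	}
	return Unknown
}

func main() {
	// an answer whose SDP text contains the word "offer"
	tricky := []byte(`{"type":"answer","sdp":"s=offer-like session name"}`)
	fmt.Println(classify(tricky) == Answer) // prints true

	ice := []byte(`{"candidate":"candidate:1 1 udp 1 192.0.2.1 9 typ host","sdpMid":"0"}`)
	fmt.Println(classify(ice) == Candidate) // prints true
}
```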

The for loop for onboarding peers has to end at some point. You could end it when one of the peers is done, but there is an edge case where a connection is established before all the ICE candidates have been sent. This results in one peer still trying to signal to the previous peer while the next peer is trying to connect. So our for loop will be controlled by both peers telling us they are done:

for !newDone || !oldDone {
   //... handling for the new peer
   } else if sMsg == "done" {
        newDone = true
   }
   //... handling for the old peer
   } else if sMsg == "done" {
        oldDone = true
   }
   //...
}

And with that we have a functioning signaling server. Note that you should check out the repo for the complete application.

The client

Our client code revolves around configuring and managing multiple RTCPeerConnection objects, and then passing information between these objects and the signaling server. There are several good references online; I recommend reading the MDN docs on the API to get a better understanding. One important thing to know is that each client has to hold one RTCPeerConnection in memory for every other peer in the network, so n - 1 of them in a network of n peers. This is why mesh networks can only expand so far, and why companies often run their own peers on their own servers (yes, WebRTC is not restricted to the browser). With that out of the way, our client looks something like this:

const channels = [];
const peers = Array.from({ length: 4 }, createPeer);
let index = 0;
const signaling = new WebSocket("ws://localhost:8080/ws");
signaling.addEventListener("message", async function (event) {
  const data = JSON.parse(event.data);
  switch (data.type) {
    case "ready":
      //create an sdp offer and send it to the signaling server
      break;
    case "offer":
      //send back an sdp answer to the signaling server
      break;
    case "answer":
      //process the answer and let the signaling server know we are ready
      break;
    default:
      if (data.candidate) {
        //add an ice candidate to the peer object
      }
      break;
  }
});


function createPeer() {
  const peer = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
  });

  peer.addEventListener("iceconnectionstatechange", function (e) {
    if (peer.iceConnectionState === "connected") {
      //we are connected, move on to the next peer
      signaling.send("done");
      index++;

      //ice starts asynchronously, so start listening on the next peer (if there is one)
      if (index < peers.length) {
        peers[index].addEventListener("icecandidate", function (event) {
          //send an ice candidate
        });
      }
    }
  });

  //setting up channels
  return peer;
}

//setting up the first peer to gather ice
peers[index].addEventListener("icecandidate", function (event) {
   //send an ice candidate
});
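To put numbers on the earlier point about mesh size: the client above creates 4 connection objects, which matches a room of 5 peers. The sketch below works out how both the per-peer count and the total number of links grow with room size. It is written in Go to match the server code, and connectionsPerPeer and totalConnections are hypothetical helper names.

```go
package main

import "fmt"

// connectionsPerPeer returns how many connections one peer holds in a
// full mesh of n peers: one for every other peer.
func connectionsPerPeer(n int) int {
	if n < 1 {
		return 0
	}
	return n - 1
}

// totalConnections returns the number of distinct peer-to-peer links in a
// full mesh of n peers, which is n*(n-1)/2.
func totalConnections(n int) int {
	if n < 2 {
		return 0
	}
	return n * (n - 1) / 2
}

func main() {
	for _, n := range []int{2, 5, 10, 50} {
		fmt.Printf("%2d peers: %2d per peer, %4d links in total\n",
			n, connectionsPerPeer(n), totalConnections(n))
	}
}
```

The per-peer count grows linearly, but the total grows quadratically, which is one way to see why a full mesh stops scaling fairly quickly.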

The important thing to note here is making sure the signaling server gets the ICE candidates at the right time. You can still set up a connection without all of your ICE candidates, but for consistent results it’s best to make sure they are sent at the correct times. When it comes to actually sending and receiving data, all we need to do is keep track of the channels we are using. You can send text, files, and just about any other kind of data over these channels:

function createPeer() {
  //...
  const local = peer.createDataChannel("chat");
  channels.push(local); //keep track of it so the send handler can reach it
  peer.addEventListener("datachannel", function ({ channel }) {
    channel.addEventListener("message", function (e) {
      //add e.data (our text) to something in the dom
    });
  });
  return peer;
}

//input is assumed to be the text field element
document.getElementById("send").addEventListener("click", function () {
  for (const ch of channels) {
    if (ch.readyState === "open") {
      ch.send(`${input.value}`);
    }
  }
  //...
});

Nice and simple. Make sure all your data is sent to everyone, and you are good to go.

Issues

Of course this solution isn’t perfect and is only meant to showcase how WebRTC works. When using WebRTC in your own app, you need to understand what to look out for. The biggest problems are network edge cases such as random disconnects, and your signaling server getting flooded with connections. The other issue is privacy: everyone who joins the network learns the other members’ IP addresses, which can lead to doxxing. All of these issues need to be addressed.

Thank you for taking the time to read this. Feel free to open an issue on git if you think this post needs improvement.