Building 1v1 and Multi‑Party WebRTC Calls: From Demo to Architecture

This article builds a basic 1v1 WebRTC audio‑video demo with Vue, explains the code behind call setup, media handling, and data channels, and then surveys multi‑party architectures (Mesh, SFU, MCU): their trade‑offs, deployment challenges, and practical solutions for production environments.

大转转FE

Introduction

The series begins with an overview of WebRTC fundamentals from a front‑end perspective, explaining core concepts, APIs, and the role of real‑time communication in modern web applications. It then progresses to practical implementation, culminating in a comprehensive guide to building both simple and complex WebRTC solutions.

1v1 Audio‑Video Call Demo

Demo Background

A lightweight signaling server, based on the previous article, is used together with a Vue‑based user interface. Two local browser windows simulate users "Zhuangzhuang" and "Caihuoxia" to demonstrate peer‑to‑peer video communication.

Start two browser windows – each represents one user.

Zhuangzhuang view

Initiate video call – the callee receives a permission prompt, accepts it, and the call starts.

Permission prompt
Local video view
Remote video view

Message exchange – the caller sends a text message that appears on both sides.

Message sent
Message received

Toggle video mode – the following code switches the video track on or off.

// Switch video track enabled state
const senders = this.localRtcPc.getSenders();
const send = senders.find(s => s.track && s.track.kind === 'video');
send.track.enabled = !send.track.enabled;
Video mode after toggle

The demo covers the core WebRTC workflow: peer connection establishment, media stream rendering, and a simple IM‑like data channel.

1v1 Call Feature Details

1.1 Feature Decomposition

Caller side – initializes a PeerConnection, obtains local media, adds tracks, renders the local preview, creates an offer, and sends it via the signaling server.

async initCallerInfo(callerId, calleeId) {
  // Initialize the peer connection (RTCPeerConnection is the standard constructor)
  this.localRtcPc = new RTCPeerConnection();

  // Get local audio/video stream
  const localStream = await this.getLocalUserMedia({ audio: true, video: true });

  // Add each track, associated with its stream, to the PeerConnection
  localStream.getTracks().forEach(track => this.localRtcPc.addTrack(track, localStream));

  // Render local stream to preview element
  await this.setDomVideoStream("localdemo", localStream);

  // Set up event callbacks
  this.onPcEvent(this.localRtcPc, callerId, calleeId);

  // Create and send offer
  const offer = await this.localRtcPc.createOffer();
  await this.localRtcPc.setLocalDescription(offer);
  this.linkSocket.emit("offer", { targetUid: calleeId, userId: callerId, offer });
}

Callback listeners – create the data channel and handle incoming tracks, ICE candidates, and renegotiation events.

onPcEvent(pc, localUid, remoteUid) {
  // Keep a reference to the component for use inside the callbacks below
  const that = this;

  // Create data channel
  that.channel = pc.createDataChannel('chat');

  // Remote track handling
  pc.ontrack = function(event) {
    that.setRemoteDomVideoStream('remoteVideo', event.track);
  };

  // Negotiation needed
  pc.onnegotiationneeded = function(e) {
    console.log('Renegotiation', e);
  };

  // ICE candidate handling
  pc.onicecandidate = function(event) {
    if (event.candidate) {
      that.linkSocket.emit('candidate', {
        targetUid: remoteUid,
        userId: localUid,
        candidate: event.candidate,
      });
    } else {
      console.log('No more candidates');
    }
  };
}

Callee side – receives the offer, applies it as the remote description, creates an answer, and sends it back.

async onRemoteOffer(fromUid, offer) {
  await this.localRtcPc.setRemoteDescription(offer);
  const answer = await this.localRtcPc.createAnswer();
  await this.localRtcPc.setLocalDescription(answer);
  this.linkSocket.emit('answer', { targetUid: fromUid, userId: getParams('userId'), answer });
}
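
The listings above stop once the answer has been emitted. They do not show how the caller applies that answer, or how either side consumes the candidate messages sent from onicecandidate. The following is a minimal sketch of those two handlers, assuming the signaling server relays the answer and candidate payloads unchanged. The name createSignalingHandlers and the pending queue are hypothetical and not part of the original demo. The queue is there because addIceCandidate rejects when no remote description has been set yet.

```javascript
// Hypothetical helper (not from the original demo). `pc` is an RTCPeerConnection.
function createSignalingHandlers(pc) {
  const pending = []; // candidates that arrived before the remote description

  // Add every queued candidate now that a remote description exists.
  async function flushPending() {
    while (pending.length > 0) {
      await pc.addIceCandidate(pending.shift());
    }
  }

  return {
    // Caller side: apply the callee's answer, then drain queued candidates.
    async onRemoteAnswer(answer) {
      await pc.setRemoteDescription(answer);
      await flushPending();
    },
    // Both sides: add a remote ICE candidate, or queue it if it arrived early.
    async onRemoteCandidate(candidate) {
      if (pc.remoteDescription) {
        await pc.addIceCandidate(candidate);
      } else {
        pending.push(candidate);
      }
    },
    // Callee side: call this after setRemoteDescription(offer).
    flushPending,
    pendingCount: () => pending.length,
  };
}
```

If the signaling socket exposes an `on` method (an assumption, since the article only shows `emit`), the wiring could look like `linkSocket.on('candidate', msg => handlers.onRemoteCandidate(msg.candidate))`.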

Media stream changes during a call – using RTCRtpSender to enable/disable video or replace tracks for screen sharing.

// Toggle video/audio mode
const senders = this.localRtcPc.getSenders();
const send = senders.find(s => s.track && s.track.kind === 'video');
send.track.enabled = !send.track.enabled;

// Switch to screen share: reuse the video sender and swap its track
const stream = await this.getShareMedia();
const [videoTrack] = stream.getVideoTracks();
await send.replaceTrack(videoTrack);
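
A sender's track can be null (for example after replaceTrack(null)), and a connection may have no video sender at all, so it can help to centralise the lookup in one guarded helper. The sketch below assumes `pc` is the RTCPeerConnection from the listings. The names findSender, toggleTrack, and swapVideoTrack are illustrative and do not appear in the original demo.

```javascript
// Illustrative helpers, not part of the original demo.

// Find the sender carrying a live track of the given kind ('audio' or 'video').
function findSender(pc, kind) {
  return pc.getSenders().find(s => s.track !== null && s.track.kind === kind) || null;
}

// Flip track.enabled and return the new state, or null if no such sender exists.
// A disabled track stays attached to the connection; only its output is muted.
function toggleTrack(pc, kind) {
  const sender = findSender(pc, kind);
  if (!sender) return null;
  sender.track.enabled = !sender.track.enabled;
  return sender.track.enabled;
}

// Replace the outgoing video track, e.g. camera to screen share.
// replaceTrack() returns a promise, so failures surface to the caller.
async function swapVideoTrack(pc, newTrack) {
  const sender = findSender(pc, 'video');
  if (!sender) throw new Error('no video sender on this connection');
  await sender.replaceTrack(newTrack);
  return sender;
}
```

Returning the new enabled state from toggleTrack lets the UI update its button label without reading the track again.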

Data channel (IM‑like) implementation – creates a reliable, ordered data channel for text messages.

// Create data channel
this.channel = pc.createDataChannel('my channel', { protocol: 'json', ordered: true });

// Fires on the remote peer once the other side has created a channel
pc.ondatachannel = function(ev) {
  console.log('Data channel created!');
  ev.channel.onopen = function() { console.log('Data channel opened'); };
  ev.channel.onmessage = function(msg) { console.log('Data channel message', msg.data); };
};
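
The listing above only logs incoming data. For a chat feature the payload usually needs some structure, and send() throws if the channel is not open yet. Here is a minimal sketch of one possible message envelope, assuming text messages are exchanged as JSON strings. The field names and the helpers encodeChatMessage, decodeChatMessage, and sendChat are hypothetical, not taken from the demo.

```javascript
// Hypothetical message envelope for the chat channel (not from the original demo).
function encodeChatMessage(from, text, now = Date.now()) {
  return JSON.stringify({ type: 'chat', from, text, sentAt: now });
}

// Return the parsed message, or null for malformed or non-chat payloads.
function decodeChatMessage(raw) {
  try {
    const msg = JSON.parse(raw);
    if (msg && msg.type === 'chat' && typeof msg.text === 'string') return msg;
  } catch (err) {
    // Not valid JSON: treat as an unknown payload.
  }
  return null;
}

// Send only when the channel is open; returns whether the message was sent.
function sendChat(channel, from, text) {
  if (channel.readyState !== 'open') return false;
  channel.send(encodeChatMessage(from, text));
  return true;
}
```

On the receiving side, the onmessage callback could pass its event data through decodeChatMessage and ignore null results.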

1.2 Production Issues and Solutions

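Network problems – jitter, latency, packet loss, and NAT or firewall traversal failures. Solutions: adaptive bitrate and retransmission for unstable links; STUN for discovering public addresses, with TURN relays as a fallback when no direct connection can be established.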
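
Two of these measures can be sketched on the client. The first function builds an ICE configuration with one STUN and one TURN server; the hostnames and credentials are placeholders. The second is an assumed back-off policy for a manual bitrate cap, and the thresholds and limits are illustrative rather than recommended values. Browsers already run their own congestion control, so this cap only sits on top of it. The loss figure might come from the fractionLost field of remote-inbound-rtp entries in pc.getStats().

```javascript
// Placeholder ICE configuration: hostnames and credentials are made up.
// Pass the result to `new RTCPeerConnection(config)`.
function buildIceConfig(turnUser, turnCredential) {
  return {
    iceServers: [
      { urls: 'stun:stun.example.com:3478' },
      { urls: 'turn:turn.example.com:3478', username: turnUser, credential: turnCredential },
    ],
  };
}

// Assumed policy (not from the article): halve the cap above 10% loss,
// raise it by 10% below 2% loss, otherwise hold. Values are bits per second.
function nextMaxBitrate(current, lossFraction, min = 150000, max = 2500000) {
  let next = current;
  if (lossFraction > 0.10) next = current * 0.5;
  else if (lossFraction < 0.02) next = current * 1.1;
  return Math.round(Math.min(max, Math.max(min, next)));
}

// Apply the cap through the standard getParameters/setParameters pair.
// Returns false when the sender exposes no encodings to modify.
async function applyMaxBitrate(sender, bitrate) {
  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) return false;
  params.encodings[0].maxBitrate = bitrate;
  await sender.setParameters(params);
  return true;
}
```

Keeping nextMaxBitrate as a pure function makes the policy easy to tune and test independently of any live connection.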

Device compatibility – diverse browsers and hardware. Solutions: extensive testing, dynamic resolution/frame‑rate adjustment, progressive enhancement.
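
Progressive enhancement on the capture side can be sketched as a ladder of constraint sets tried in order. The ladder values below are illustrative assumptions, and getMediaWithFallback is a hypothetical helper. The capture function is injected, so in a browser it could be `c => navigator.mediaDevices.getUserMedia(c)`.

```javascript
// Assumed fallback ladder, from most to least demanding; the numbers are illustrative.
const CONSTRAINT_LADDER = [
  { audio: true, video: { width: { ideal: 1280 }, height: { ideal: 720 }, frameRate: { ideal: 30 } } },
  { audio: true, video: { width: { ideal: 640 }, height: { ideal: 360 }, frameRate: { ideal: 15 } } },
  { audio: true, video: false }, // audio-only as the last resort
];

// Try each constraint set in order and return the first stream that opens.
async function getMediaWithFallback(getUserMedia, ladder = CONSTRAINT_LADDER) {
  let lastError = null;
  for (const constraints of ladder) {
    try {
      const stream = await getUserMedia(constraints);
      return { stream, constraints };
    } catch (err) {
      // A permission denial will not be fixed by weaker constraints, so stop here.
      if (err && err.name === 'NotAllowedError') throw err;
      lastError = err;
    }
  }
  throw lastError || new Error('no constraint set supplied');
}
```

Returning the constraints that succeeded lets the UI tell the user when the call has dropped to a lower quality or to audio only.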

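Security – data protection. Solutions: encryption in transit (WebRTC mandates DTLS for data channels and DTLS‑SRTP, typically AES‑based, for media), plus strong authentication and authorization on the signaling path.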

Data storage – persisting chat history. Solutions: distributed databases, encrypted storage, regular backup and recovery.

WebRTC Multi‑Party Communication Architecture

Overview

When moving beyond 1v1 calls, three main architectures are used: Mesh (full mesh of P2P connections), SFU (Selective Forwarding Unit), and MCU (Multipoint Conferencing Unit). Each has distinct trade‑offs regarding bandwidth, server load, scalability, and latency.

Mesh Architecture

Structure – every participant establishes a direct P2P connection with every other participant.

Pros – no media processing on the server, minimal server bandwidth.

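Cons – each client uploads to and decodes from N−1 peers, so per‑client bandwidth and CPU grow linearly with group size, while the total number of connections grows quadratically (N(N−1)/2); unsuitable for large meetings.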

Typical use case – small group calls.

Practical notes – monitor bandwidth limits and cumulative network latency.
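
On the client, a mesh means one peer connection per remote participant. The following is a minimal sketch of such a registry, with the connection factory injected so that it could be `() => new RTCPeerConnection(config)` in a browser. The class name MeshPeers and the function meshLinkCount are hypothetical.

```javascript
// Hypothetical registry keeping one peer connection per remote user in a mesh.
class MeshPeers {
  constructor(createPc) {
    this.createPc = createPc;
    this.peers = new Map(); // remoteUid -> peer connection
  }

  // Return the existing connection for a user, creating it on first use.
  connectionFor(remoteUid) {
    if (!this.peers.has(remoteUid)) {
      this.peers.set(remoteUid, this.createPc(remoteUid));
    }
    return this.peers.get(remoteUid);
  }

  // Close and forget the connection when a participant leaves.
  remove(remoteUid) {
    const pc = this.peers.get(remoteUid);
    if (!pc) return false;
    pc.close();
    this.peers.delete(remoteUid);
    return true;
  }

  get size() {
    return this.peers.size;
  }
}

// Total P2P links in a full mesh of n participants: n(n-1)/2.
function meshLinkCount(n) {
  return (n * (n - 1)) / 2;
}
```

Each incoming offer, answer, or candidate would then be routed to `connectionFor(msg.userId)`, reusing the 1v1 logic once per remote participant.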

Mesh architecture diagram

SFU Architecture

Structure – clients send a single media stream to the server, which forwards it to other participants without mixing.

Pros – saves client uplink bandwidth (one outgoing stream regardless of group size), scales to many participants.

Cons – server must handle forwarding of all streams, increasing load.

Typical use case – medium‑size conferences, online education, live streaming.

Practical notes – ensure sufficient server performance; monitor added latency from forwarding.

Common tools – Mediasoup, SRS.

SFU architecture diagram

MCU Architecture

Structure – a central server receives all streams, decodes, mixes them into a single composite stream, and re‑encodes for distribution.

Pros – light client load, unified view for all participants.

Cons – high server CPU and bandwidth requirements.

Typical use case – large conferences, online education, enterprise training.

Practical notes – higher server cost and potential processing latency.

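Common tools – Kurento, FreeSWITCH. (Jitsi Videobridge and SRS are primarily forwarding servers, so they fit the SFU category better.)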

MCU architecture diagram

Real‑World Application Cases

Live streaming – low‑latency delivery, multi‑stream forwarding, interactive features; tools: SRS, FFmpeg.

Multi‑person meetings – multi‑stream management, dynamic layout, noise suppression; tools: Mediasoup, Jitsi.

Online education – split‑screen, real‑time Q&A, high‑quality sync; tools: Agora SDK, Zoom SDK.

Remote collaboration – screen sharing, file sync, recording; tools: Microsoft Teams, Zoom, Slack‑WebRTC integration.

Choosing the right architecture depends on participant count, required features, server capacity, and bandwidth constraints. Mesh works for tiny groups, while SFU and MCU are better for medium to large scale scenarios.
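
The bandwidth side of that choice can be estimated with simple arithmetic. The sketch below is a rough model only: it assumes every participant publishes exactly one stream at the same bitrate and subscribes to everyone else, and it ignores simulcast, signaling traffic, TURN relaying, and the MCU's encoding cost. The function name estimateLoad is hypothetical.

```javascript
// Rough bandwidth model for n participants, each publishing one stream at `kbps`.
function estimateLoad(architecture, n, kbps) {
  const others = n - 1;
  switch (architecture) {
    case 'mesh': // every client sends to and receives from every other client
      return { clientUpKbps: others * kbps, clientDownKbps: others * kbps,
               serverInKbps: 0, serverOutKbps: 0 };
    case 'sfu': // one uplink per client; the server fans each stream out
      return { clientUpKbps: kbps, clientDownKbps: others * kbps,
               serverInKbps: n * kbps, serverOutKbps: n * others * kbps };
    case 'mcu': // one uplink and one mixed downlink per client
      return { clientUpKbps: kbps, clientDownKbps: kbps,
               serverInKbps: n * kbps, serverOutKbps: n * kbps };
    default:
      throw new Error(`unknown architecture: ${architecture}`);
  }
}
```

For four participants at 500 kbps, this model gives a per‑client uplink of 1500 kbps for Mesh against 500 kbps for SFU and MCU, while the SFU server's outbound traffic reaches 6000 kbps.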

Conclusion

The article provides a complete front‑end‑focused walkthrough of WebRTC, from a simple 1v1 demo to advanced multi‑party architectures, covering implementation details, code snippets, media handling, data channels, and production‑grade considerations such as network variability, device compatibility, security, and storage. Armed with this knowledge, developers can select and implement the most suitable WebRTC solution for their specific real‑time communication needs.
