Architectures

WebRTC (Web Real-Time Communication) is an open-source project originally released by Google and since standardized by the W3C and IETF. It is supported by all major browsers, which eliminates the need for third-party audio and video streaming plugins.

However, while WebRTC makes setup and direct communication easy for small-scale applications, it introduces complexity as businesses scale. WebRTC was designed to connect peers directly (peer-to-peer) without intermediary media servers, which means it can't handle large audiences on its own. There are a few different approaches to implementing scalable backends for WebRTC.

Mesh

Connects participants directly to each other, creating a web-like mesh network in which every participant sends a separate copy of their stream to every other participant. Mesh infrastructures are ideal for: small group video calls (2–4 participants); P2P file sharing applications; low-latency gaming applications where a direct connection is crucial.

graph LR
    A[fa:fa-mobile User A] -->  |Stream A| B
    A  --> |Stream A| C
    A  --> |Stream A| D
    B[fa:fa-tablet User B] --> |Stream B| A
    B  --> |Stream B| C
    B  --> |Stream B| D
    C[fa:fa-desktop User C]  --> |Stream C| A
    C  --> |Stream C| B
    C  --> |Stream C| D
    D[fa:fa-mobile User D] -->  |Stream D| A
    D -->  |Stream D| B
    D -->  |Stream D| C
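
The reason a mesh suits only small groups can be made concrete by counting streams. The following is a minimal sketch, assuming every participant both sends to and receives from every other participant, as in the diagram above. The function name `mesh_load` and the dictionary keys are illustrative, not part of any WebRTC API.

```python
def mesh_load(participants: int) -> dict:
    """Count media streams in a full mesh (assumption: everyone sends to everyone)."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    peers = participants - 1
    return {
        # Each participant encodes and uploads one copy of its stream per peer.
        "uploads_per_participant": peers,
        # Each participant also downloads one stream from every peer.
        "downloads_per_participant": peers,
        # Directed streams across the whole call, one per arrow in the diagram.
        "total_streams": participants * peers,
        # Bidirectional peer connections (each pair shares one connection).
        "peer_connections": participants * peers // 2,
    }


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, mesh_load(n))
```

For four participants this gives 12 directed streams, matching the 12 arrows in the diagram. The total grows quadratically, and each device's upload load grows linearly with group size, which is why mesh calls are usually kept small.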

Selective Forwarding Unit (SFU)

Acts as a central hub, forwarding media streams between participants without modifying the content. SFU infrastructures are ideal for: Video conferencing for medium-sized groups (5–15 participants); Webinars where one-way communication (presenter to audience) is primary; Applications requiring efficient use of server resources.

graph TD
    S{SFU}
    A[fa:fa-mobile User A] --> |Stream A|S
    B[fa:fa-tablet User B] --> |Stream B|S
    C[fa:fa-desktop User C] --> |Stream C|S
    D[fa:fa-mobile User D] --> |Stream D|S
    E[fa:fa-mobile User E] --> |Stream E|S
    F[fa:fa-mobile User F] --> |Stream F|S
    S --> |B|A
    S --> |C|A
    S --> |D|A
    S --> |E|A
    S --> |F|A
    S --> |A|B
    S --> |C|B
    S --> |D|B
    S --> |E|B
    S --> |F|B
    S --> |A|C
    S --> |B|C
    S --> |D|C
    S --> |E|C
    S --> |F|C
    S --> |A|D
    S --> |B|D
    S --> |C|D
    S --> |E|D
    S --> |F|D
    S --> |A|E
    S --> |B|E
    S --> |C|E
    S --> |D|E
    S --> |F|E
    S --> |A|F
    S --> |B|F
    S --> |C|F
    S --> |D|F
    S --> |E|F
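
The same counting exercise shows where an SFU moves the load. This is a rough sketch, assuming every participant publishes one stream and subscribes to all the others, with no simulcast or selective subscription. The name `sfu_load` and its keys are hypothetical.

```python
def sfu_load(participants: int) -> dict:
    """Count media streams through an SFU (assumption: all publish, all subscribe to all)."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    return {
        # Each participant uploads a single stream to the SFU.
        "uploads_per_participant": 1,
        # Each participant still downloads one stream per other participant.
        "downloads_per_participant": participants - 1,
        # The server receives one stream from each participant...
        "server_inbound_streams": participants,
        # ...and forwards it, unmodified, to everyone else.
        "server_outbound_streams": participants * (participants - 1),
    }


if __name__ == "__main__":
    print(sfu_load(6))
```

With six participants the server forwards 30 streams, one per outbound arrow in the diagram above. Compared with a mesh, upload per participant stays at one stream regardless of group size, while the fan-out cost shifts to the server.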

Multipoint Control Unit (MCU)

Centralized server that receives, processes (mixing, recording, layout), and distributes media streams among participants. MCU infrastructures are ideal for: Large conferences or webinars with many participants (15+); Applications requiring advanced features like stream recording, content mixing, or complex layouts; Scenarios with unreliable or varying network conditions on participant ends.

graph TD
    A[fa:fa-mobile A] --> |Stream A|M(MCU)
    B[fa:fa-tablet B] --> |Stream B|M
    M --> |Mixed Stream|C[fa:fa-desktop C]
    M --> |Mixed Stream|D[fa:fa-mobile D]
    M --> |Mixed Stream|E[fa:fa-desktop E]
    M --> |Mixed Stream|F[fa:fa-tablet F]
    M --> |Mixed Stream|G[fa:fa-mobile G]
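
For comparison, the stream counts for the MCU layout in the diagram can be sketched the same way. This assumes a split between publishers, who only send, and viewers, who only receive a single mixed stream. The name `mcu_load` and its keys are illustrative, and the sketch deliberately ignores the CPU cost of decoding and mixing on the server.

```python
def mcu_load(publishers: int, viewers: int) -> dict:
    """Count media streams through an MCU (assumption: publishers send, viewers receive one mix)."""
    if publishers < 1 or viewers < 1:
        raise ValueError("need at least one publisher and one viewer")
    return {
        # Each publisher uploads a single stream to the MCU.
        "uploads_per_publisher": 1,
        # Each viewer downloads one mixed stream, however many publishers there are.
        "downloads_per_viewer": 1,
        # The server receives one stream per publisher to mix.
        "server_inbound_streams": publishers,
        # The server sends one mixed stream per viewer.
        "server_outbound_streams": viewers,
    }


if __name__ == "__main__":
    print(mcu_load(publishers=2, viewers=5))
```

Because every viewer receives one stream no matter how many publishers there are, client bandwidth stays flat as the session grows. The trade-off is that the server has to process the media rather than simply forward it.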

Hybrid

Combines elements of the other architectures to create a solution that adapts to each session, for example Mesh for small groups and MCU for large conferences. Hybrid architectures are generally used when the number of participants varies within a single session, or when a scenario requires both scalability and advanced features like recording, layout management, or stream mixing.

graph TD;
    subgraph Broadcast
        D[User A] --> M(MCU)
        E[User B] --> M
        M --> J[C]  
        M --> K[D]
        M --> L[E] 
        M --> N[F]
        M --> O[G] 
        M --> P[H]
    end
    subgraph Large Group
        A[User A] --> S{SFU}
        B[User B] --> S
        C[User C] --> S
        Q[User D] --> S
        R[User E] --> S
        S --> A
        S --> A
        S --> B
        S --> B
        S --> C
        S --> C
        S --> Q
        S --> Q
        S --> R
        S --> R
    end    
    subgraph Small Group  
        H[User A] --> F[User B]
        H --> G[User C]
        F --> H
        F --> G
        G --> H
        G --> F
    end
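
A hybrid backend needs some rule for deciding which topology a session should use. The following is a minimal sketch of such a rule, using the participant ranges given in the sections above as thresholds. The cut-offs, the `needs_server_processing` flag, and all names here are assumptions for illustration, and a real deployment would tune them against measured bandwidth and server cost.

```python
from enum import Enum


class Architecture(Enum):
    MESH = "mesh"
    SFU = "sfu"
    MCU = "mcu"


# Assumed thresholds, taken from the ranges quoted in this article.
MESH_MAX_PARTICIPANTS = 4
SFU_MAX_PARTICIPANTS = 15


def pick_architecture(participants: int, needs_server_processing: bool = False) -> Architecture:
    """Pick a topology for a session based on its size and feature needs."""
    if participants < 2:
        raise ValueError("a session needs at least two participants")
    # Mixing, recording, or layout composition happens on the server, so use an MCU.
    if needs_server_processing:
        return Architecture.MCU
    if participants <= MESH_MAX_PARTICIPANTS:
        return Architecture.MESH
    if participants <= SFU_MAX_PARTICIPANTS:
        return Architecture.SFU
    return Architecture.MCU


if __name__ == "__main__":
    for size in (3, 10, 40):
        print(size, pick_architecture(size).value)
```

A session could call a function like this again whenever participants join or leave, and migrate between topologies when the answer changes.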

Making the Choice

Choosing the right WebRTC architecture depends on your application's specific needs. Consider factors like the number of participants, latency requirements, security concerns, and cost before making a decision. With WebRTC there's no one-size-fits-all solution.

Here's where a Communication Platform as a Service (CPaaS) provider like Agora becomes invaluable. CPaaS providers offer pre-built, scalable solutions that remove these complexities and allow teams to focus on their core business.