WebRTC (Web Real-Time Communication) is an open-source project, originally released by Google and since standardized by the W3C and IETF. It is supported by all major browsers, eliminating the need for third-party audio and video streaming plugins.
However, while WebRTC makes setup and direct communication easy for small-scale applications, it introduces complexity as a business scales. WebRTC was designed for peer-to-peer connections, with media flowing directly between participants rather than through an intermediary server, which means it can't handle large audiences on its own. There are a few different approaches to implementing scalable backends for WebRTC.
A mesh architecture connects participants directly to each other, creating a web-like mesh network. Mesh infrastructures are ideal for: small group video calls (2–4 participants); P2P file sharing applications; low-latency gaming applications where a direct connection is crucial.
graph LR
A[fa:fa-mobile User A] --> |Stream A| B
A --> |Stream A| C
A --> |Stream A| D
B[fa:fa-tablet User B] --> |Stream B| A
B --> |Stream B| C
B --> |Stream B| D
C[fa:fa-desktop User C] --> |Stream C| A
C --> |Stream C| B
C --> |Stream C| D
D[fa:fa-mobile User D] --> |Stream D| A
D --> |Stream D| B
D --> |Stream D| C
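The mesh diagram above already needs 12 arrows for 4 users, and that count grows quadratically. Here is a minimal sketch of the arithmetic, assuming every participant sends one stream to every other participant. The function names are illustrative and not part of any WebRTC API.

```python
def mesh_streams_per_client(participants: int) -> int:
    """Streams each client uploads (and, symmetrically, downloads) in a full mesh."""
    if participants < 1:
        raise ValueError("participants must be at least 1")
    return participants - 1


def mesh_total_streams(participants: int) -> int:
    """Total one-way streams: every client sends to every other client."""
    return participants * mesh_streams_per_client(participants)


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(n, mesh_streams_per_client(n), mesh_total_streams(n))
```

Under this assumption, a 10-person call would require each client to upload 9 copies of its own stream, which is why mesh is usually kept to very small groups.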
A Selective Forwarding Unit (SFU) acts as a central hub, forwarding media streams between participants without modifying the content. SFU infrastructures are ideal for: video conferencing for medium-sized groups (5–15 participants); webinars where one-way communication (presenter to audience) is primary; applications requiring efficient use of server resources.
graph TD
S{SFU}
A[fa:fa-mobile User A] --> |Stream A|S
B[fa:fa-tablet User B] --> |Stream B|S
C[fa:fa-desktop User C] --> |Stream C|S
D[fa:fa-mobile User D] --> |Stream D|S
E[fa:fa-mobile User E] --> |Stream E|S
F[fa:fa-mobile User F] --> |Stream F|S
S --> |B|A
S --> |C|A
S --> |D|A
S --> |E|A
S --> |F|A
S --> |A|B
S --> |C|B
S --> |D|B
S --> |E|B
S --> |F|B
S --> |A|C
S --> |B|C
S --> |D|C
S --> |E|C
S --> |F|C
S --> |A|D
S --> |B|D
S --> |C|D
S --> |E|D
S --> |F|D
S --> |A|E
S --> |B|E
S --> |C|E
S --> |D|E
S --> |F|E
S --> |A|F
S --> |B|F
S --> |C|F
S --> |D|F
S --> |E|F
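The SFU diagram above can be summarized with a small calculation: each client uploads once, and the server takes over the fan-out. This sketch assumes every participant publishes one stream and subscribes to everyone else, with no simulcast or selective subscription. `SfuLoad` and `sfu_load` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SfuLoad:
    client_uploads: int    # streams each client sends to the SFU
    client_downloads: int  # streams each client receives from the SFU
    server_inbound: int    # streams arriving at the SFU
    server_outbound: int   # streams the SFU forwards unchanged


def sfu_load(participants: int) -> SfuLoad:
    """Stream counts when everyone publishes once and watches everyone else."""
    if participants < 1:
        raise ValueError("participants must be at least 1")
    others = participants - 1
    return SfuLoad(
        client_uploads=1,
        client_downloads=others,
        server_inbound=participants,
        server_outbound=participants * others,
    )


if __name__ == "__main__":
    print(sfu_load(6))
```

For the six users in the diagram, this gives 6 inbound and 30 outbound streams at the server, while each client's upload stays at a single stream regardless of group size.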
A Multipoint Control Unit (MCU) is a centralized server that receives, processes (mixing, recording, layout), and distributes media streams among participants. MCU infrastructures are ideal for: large conferences or webinars with many participants (15+); applications requiring advanced features like stream recording, content mixing, or complex layouts; scenarios with unreliable or varying network conditions on participant ends.
graph TD
A[fa:fa-mobile A] --> |Stream A|M(MCU)
B[fa:fa-tablet B] --> |Stream B|M
M --> |Mixed Stream|C[fa:fa-desktop C]
M --> |Mixed Stream|D[fa:fa-mobile D]
M --> |Mixed Stream|E[fa:fa-desktop E]
M --> |Mixed Stream|F[fa:fa-tablet F]
M --> |Mixed Stream|G[fa:fa-mobile G]
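The trade-off in the MCU diagram above is that receivers download only one stream, while the server pays for decoding and re-encoding. This is a rough sketch, assuming the MCU produces a single shared mix for all receivers. Real deployments may render several layouts. `McuLoad` and `mcu_load` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McuLoad:
    streams_decoded: int    # one incoming stream per sender, decoded for mixing
    mixes_encoded: int      # assumed: one shared mix, encoded once
    streams_delivered: int  # one copy of the mix per receiver


def mcu_load(senders: int, receivers: int) -> McuLoad:
    """Server-side work for a single-mix MCU session."""
    if senders < 1:
        raise ValueError("at least one sender is required")
    if receivers < 0:
        raise ValueError("receivers cannot be negative")
    return McuLoad(
        streams_decoded=senders,
        mixes_encoded=1,
        streams_delivered=receivers,
    )


if __name__ == "__main__":
    # Two senders and five receivers, as in the diagram.
    print(mcu_load(senders=2, receivers=5))
```

Each receiver's download stays at one stream however many senders join, which is what makes this model tolerant of weak client connections at the cost of server CPU.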
A hybrid architecture combines elements of the other architectures to create a solution that adapts to each session: for example, mesh for small groups and an MCU for large conferences. Hybrid architectures are generally used when the number of participants varies within a single session, or in scenarios requiring both scalability and advanced features like recording, layout management, or stream mixing.
graph TD;
subgraph Broadcast
D[User A] --> M(MCU)
E[User B] --> M
M --> J[C]
M --> K[D]
M --> L[E]
M --> N[F]
M --> O[G]
M --> P[H]
end
subgraph Large Group
A[User A] --> S{SFU}
B[User B] --> S
C[User C] --> S
Q[User D] --> S
R[User E] --> S
S --> A
S --> A
S --> B
S --> B
S --> C
S --> C
S --> Q
S --> Q
S --> R
S --> R
end
subgraph Small Group
H[User A] --> F[User B]
H --> G[User C]
F --> H
F --> G
G --> H
G --> F
end
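A hybrid backend needs some rule for switching between topologies. The sketch below uses the participant ranges quoted earlier (2–4 for mesh, 5–15 for SFU, 15+ for MCU) as thresholds. Treating exactly 15 as SFU, and the `needs_mixing` flag, are assumptions made for illustration. A production system would also weigh bandwidth, cost, and latency.

```python
MESH_MAX = 4  # upper bound of the "2-4 participants" guidance for mesh
SFU_MAX = 15  # assumed: a session of exactly 15 stays on the SFU


def choose_architecture(participants: int, needs_mixing: bool = False) -> str:
    """Pick a topology from session size and whether server-side mixing is needed."""
    if participants < 2:
        raise ValueError("a session needs at least 2 participants")
    # Recording, layouts, and mixing require a server that processes media.
    if needs_mixing or participants > SFU_MAX:
        return "mcu"
    if participants > MESH_MAX:
        return "sfu"
    return "mesh"


if __name__ == "__main__":
    for size in (3, 10, 50):
        print(size, choose_architecture(size))
```

Calling the function again whenever someone joins or leaves would let a session move from mesh to SFU to MCU as it grows.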
Choosing the right WebRTC architecture depends on your application's specific needs. Consider factors like the number of participants, latency requirements, security concerns, and cost before making a decision. With WebRTC there's no one-size-fits-all solution.
This is where a Communication Platform as a Service (CPaaS) provider like Agora becomes invaluable. CPaaS providers offer pre-built, scalable solutions that remove these complexities and allow teams to focus on their core business.