
WebRTC Video Conferencing App: The Full Guide


Video conferencing apps have become the norm in modern workplaces. They are indispensable for businesses that run remote and hybrid working models, because they keep communication efficient and bring distributed teams closer together.

Zoom, Cisco Webex, Microsoft Teams, Google Meet and Slack are among the most popular options for video conferencing on the web, and we were approached to develop a competitor to them.

Fortunately, there are many technologies and ready-made components which enable us to build a video conferencing application fairly quickly. 

In this article, we will have a look at the development of a video conferencing application based on WebRTC.

The first section introduces the client’s requirements, which include common features but also some less common ones, for example streaming the video conference to an RTMP server or bridging WebRTC with Cisco SIP devices. The next section describes the technological challenges, including hardware and network requirements, software compatibility, interfacing, end-to-end encryption and virtualization.

Finally, we describe our solution, which combines existing technologies, customized components and parts built from scratch. For such a project, it is best to choose a reliable software house, for example Altamira.

Client’s requirements

The application is supposed to contain the features found in common video conferencing applications: audio and video calls, screen sharing, instant messaging, file sharing, call recording, high-quality video, custom layouts, changing the video background and end-to-end encryption. Additionally, the client requested call-control features such as waiting rooms, muting and kicking users, and raising a hand.

Example of the main features of video conferencing app

However, the client’s requirements also include less common features such as breakout sessions, the ability to stream calls to an RTMP server, and compatibility with the SIP protocol as used by Cisco, including support for some Cisco devices.

Choosing the right technology is an important design decision in video conferencing application development because it affects the performance and usability of the app. The application processes live data from its users, so it places high demands on the architecture. For example, the WebRTC server has to support 100 participants in a single video conference.

Traffic limits and costs

Continuing with the example, suppose 100 participants in a call each stream HD-quality video at 250 KBps, and each client shows 10 video streams on screen at the same time. Each client then needs 2.5 MBps of download and 0.25 MBps of upload bandwidth, and the server has to handle 25 MBps of incoming and 250 MBps of outgoing traffic (per 100 users). As a result, we need to take our server provider's data transfer charges into account when designing such a solution.
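The arithmetic above can be reproduced with a short sketch. It assumes decimal units (1 MB = 1000 KB) and an SFU-style server that receives one stream from every participant and forwards the visible streams to each of them. The names `CallProfile` and `bandwidth_mbytes_per_s` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class CallProfile:
    participants: int = 100            # users in one conference
    stream_kbytes_per_s: float = 250   # one HD stream, kilobytes per second
    visible_streams: int = 10          # streams one client displays at once


def bandwidth_mbytes_per_s(profile: CallProfile) -> dict:
    """Return per-client and per-server traffic in megabytes per second."""
    stream_mb = profile.stream_kbytes_per_s / 1000  # KB/s -> MB/s
    client_upload = stream_mb                       # each client sends one stream
    client_download = stream_mb * profile.visible_streams
    server_upload = client_download * profile.participants
    return {
        "client_upload": client_upload,
        "client_download": client_download,
        # The server receives what all clients upload ...
        "server_download": client_upload * profile.participants,
        # ... and sends what all clients download.
        "server_upload": server_upload,
        # Outgoing volume per hour in gigabytes, useful for transfer pricing.
        "server_egress_gb_per_hour": server_upload * 3600 / 1000,
    }


if __name__ == "__main__":
    for name, value in bandwidth_mbytes_per_s(CallProfile()).items():
        print(f"{name}: {value}")
```

With the default profile, the outgoing traffic works out to 900 GB for every hour of a full 100-participant call, which is the figure to multiply by the provider's transfer price.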

Data stream compatibility and transcoding

To save traffic, we can shrink the stream by lowering either its quality or its resolution. Both are achieved by transcoding.

What is transcoding? Transcoding is the process of decoding, reformatting and re-encoding video, either to decrease the size of a video stream or to convert it into a format that can be played on a device which does not support the original one.

Transcoding can also help us bridge the different formats used by Cisco SIP devices. However, transcoding at scale requires a GPU, which costs more: monthly costs can be about 1,000 US dollars higher per server, or more depending on the GPU.

Regarding technology, we have good experience with GStreamer and FFmpeg for developing a transcoding layer for our applications.

However, there is a big question about the feasibility of transcoding that we need to ask: is transcoding worth implementing if all it adds is support for some additional devices?
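As a rough illustration of what a transcoding step looks like, the sketch below builds an FFmpeg command line that lowers both the resolution and the bitrate of a video. It is not the transcoding layer from the project. The file names, the target height and the bitrate are assumptions, and a live pipeline would read from a network stream rather than a file.

```python
import shlex


def build_downscale_command(src: str, dst: str,
                            height: int = 360,
                            video_kbps: int = 500) -> list[str]:
    """Build an ffmpeg invocation that re-encodes video at a lower size."""
    return [
        "ffmpeg",
        "-i", src,                      # input file or stream URL
        "-vf", f"scale=-2:{height}",    # new height, width keeps aspect ratio
        "-c:v", "libx264",              # re-encode video as H.264
        "-b:v", f"{video_kbps}k",       # target video bitrate
        "-c:a", "copy",                 # leave the audio track untouched
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names, printed instead of executed.
    command = build_downscale_command("input.mp4", "output_360p.mp4")
    print(shlex.join(command))
```

The command is only printed here. Passing the list to `subprocess.run` would execute it on a machine where FFmpeg is installed.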

Hardware requirements

In terms of performance per cost, GPUs are more effective than CPUs for transcoding. For example, with a 24-core Xeon E5-2690 (2.60 GHz) CPU it is possible to transcode 22 input streams (720p) into 88 output streams, whereas with an NVIDIA Tesla M60 16 GB GPU it is possible to transcode 70 input streams into 280 output streams, which is more than 3 times the CPU throughput.

We are able to calculate specific network and hardware requirements from the expected number of users and their streams, and scale up or down in response to load spikes using Amazon AWS.
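A back-of-the-envelope version of such a calculation, using only the per-server figures quoted above, might look like this. It is a sketch that ignores headroom, codec differences and failover capacity, and the function name is invented for the example.

```python
import math

# Input streams one server can transcode, taken from the figures in the text.
CPU_INPUT_STREAMS = 22   # 24-core Xeon E5-2690
GPU_INPUT_STREAMS = 70   # NVIDIA Tesla M60 16 GB


def servers_needed(input_streams: int, per_server_capacity: int) -> int:
    """Smallest number of servers that can transcode all input streams."""
    if input_streams < 0 or per_server_capacity <= 0:
        raise ValueError("stream count must be >= 0 and capacity > 0")
    return math.ceil(input_streams / per_server_capacity)


if __name__ == "__main__":
    streams = 100  # e.g. one stream per participant in the running example
    print("CPU servers:", servers_needed(streams, CPU_INPUT_STREAMS))
    print("GPU servers:", servers_needed(streams, GPU_INPUT_STREAMS))
    print("GPU/CPU throughput ratio:",
          round(GPU_INPUT_STREAMS / CPU_INPUT_STREAMS, 2))
```

For 100 input streams this gives five CPU servers against two GPU servers, which is the kind of comparison to set against the higher monthly price of GPU instances.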

Compatibility and signaling

Devices use various formats for encoding streams because they are built by different vendors, for example Cisco. However, devices can negotiate all the information about a communication “session” with each other using the SIP protocol.

SIP is a signaling protocol used for initiating, maintaining, and terminating communication sessions that include voice, video and messaging applications. SIP is used in Internet telephony, in private IP telephone systems, as well as mobile phone calling over LTE (VoLTE).

WebRTC is another standard alongside SIP, and the two need to be integrated to make calls from WebRTC to SIP and vice versa. A translation matrix of signaling messages between SIP and WebRTC needs to be designed and implemented so that the SIP and WebRTC servers can communicate.
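A translation matrix of this kind can start out as a pair of lookup tables. The sketch below is a simplified illustration, not the matrix used in the project. The SIP side uses standard request and response names, while the WebRTC-side message names (`offer`, `answer`, `ringing`, `hangup`) are hypothetical, because WebRTC leaves the signaling format to the application.

```python
from typing import Optional

# SIP message -> application-level WebRTC signaling message (hypothetical names).
SIP_TO_WEBRTC = {
    "INVITE": "offer",          # carries the caller's SDP
    "180 Ringing": "ringing",
    "200 OK": "answer",         # only when it answers an INVITE with SDP
    "BYE": "hangup",            # ends an established call
    "CANCEL": "hangup",         # aborts a call that was not answered yet
}

# Reverse direction. "hangup" maps to BYE here; a real gateway would send
# CANCEL instead while the call is still ringing.
WEBRTC_TO_SIP = {
    "offer": "INVITE",
    "ringing": "180 Ringing",
    "answer": "200 OK",
    "hangup": "BYE",
}


def translate(message: str, table: dict) -> Optional[str]:
    """Look up the counterpart of a signaling message, or None if unmapped."""
    return table.get(message)


if __name__ == "__main__":
    print(translate("INVITE", SIP_TO_WEBRTC))
    print(translate("hangup", WEBRTC_TO_SIP))
    print(translate("OPTIONS", SIP_TO_WEBRTC))
```

A real gateway also has to track dialog state and rewrite the SDP bodies, so the table only covers the naming part of the problem.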

Security and end-to-end encryption

Some SIP servers do not implement sufficient security measures on their own and should not face the Internet directly, because they would be hacked. One example is Asterisk, one of the best-known PBXs.

What is PBX? PBX is a system that connects telephone extensions to the public switched telephone network (PSTN) and provides internal communication for a business. An IP PBX is a PBX system with IP connectivity and may provide additional audio, video, or instant messaging communication utilizing the TCP/IP protocol stack.

Therefore, it is important to implement additional security measures that protect exposed PBX servers from attacks, such as port knocking or intrusion prevention frameworks such as fail2ban.

End-to-end encryption is a very tricky case because the clients (i.e. phones or other end devices) need to communicate with each other to set it up and exchange their keys. We were facing the question: “Is it possible to establish end-to-end encryption between a Cisco device and a WebRTC client, and even among multiple participants?”
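To make the idea behind such intrusion prevention concrete, here is a minimal sketch of the underlying logic: count failed attempts per IP address inside a time window and ban the address once a threshold is reached. This is not fail2ban's code or configuration. The class is invented for illustration, and its parameter names only echo fail2ban's `maxretry` and `findtime` options.

```python
from collections import defaultdict, deque


class FailureTracker:
    """Ban an IP after max_retry failures within find_time seconds."""

    def __init__(self, max_retry: int = 5, find_time: float = 600.0):
        self.max_retry = max_retry
        self.find_time = find_time
        self._failures = defaultdict(deque)  # ip -> timestamps of failures
        self.banned = set()

    def record_failure(self, ip: str, now: float) -> bool:
        """Register one failed attempt; return True if the IP is banned."""
        window = self._failures[ip]
        window.append(now)
        # Forget failures that are older than the observation window.
        while window and now - window[0] > self.find_time:
            window.popleft()
        if len(window) >= self.max_retry:
            self.banned.add(ip)
        return ip in self.banned


if __name__ == "__main__":
    tracker = FailureTracker(max_retry=3, find_time=60)
    # 203.0.113.7 is a documentation-only address used as a stand-in attacker.
    for second in (0, 10, 20):
        print(second, tracker.record_failure("203.0.113.7", now=second))
```

In a real deployment the failures would come from the PBX log (for example rejected registrations) and a ban would translate into a firewall rule.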


PBX dockerization

Conveniently, we can package all our software using Docker. However, containerizing PBX software specifically is tricky, because the Docker configuration has to define explicitly which ports are exposed so that the PBX can communicate with the phones through them.

The case is particularly tricky because the ports over which the PBX communicates are negotiated dynamically. This means that you do not know in advance which ports the PBX will use, so you need to expose every port it might choose.

Our solution is based on a couple of existing open-source products, which we describe below. These products cover the client side, the server side and their integration.
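One common way to keep this manageable is to restrict the PBX to a fixed media port range and publish exactly that range. The sketch below builds the matching `docker run` port arguments. The port numbers and the image name are assumptions, and the PBX itself must be configured to use the same range (in Asterisk, for instance, through `rtpstart` and `rtpend` in `rtp.conf`).

```python
def docker_port_args(sip_port: int = 5060,
                     rtp_start: int = 10000,
                     rtp_end: int = 10100) -> list[str]:
    """Build `docker run` -p arguments for SIP signaling and an RTP range."""
    if not (0 < rtp_start <= rtp_end <= 65535):
        raise ValueError("invalid RTP port range")
    return [
        "-p", f"{sip_port}:{sip_port}/udp",   # SIP signaling over UDP
        "-p", f"{sip_port}:{sip_port}/tcp",   # SIP signaling over TCP
        # Media ports are chosen per call, so the whole range is published.
        "-p", f"{rtp_start}-{rtp_end}:{rtp_start}-{rtp_end}/udp",
    ]


if __name__ == "__main__":
    # "my-pbx-image" is a placeholder image name.
    command = ["docker", "run", "-d", *docker_port_args(), "my-pbx-image"]
    print(" ".join(command))
```

A narrow range limits the number of simultaneous calls, while publishing a very large range is slow to set up in Docker, which is why some deployments run the PBX container with host networking instead.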

Client side

The client side is a client application that runs on your Android or iOS phone, in a web browser, or even on a SIP phone, and lets you join a video conference or make an audio call.

Our preferred choice for the client application is Jitsi. It has a low-level JavaScript API for providing a customized UI for Jitsi Meet.

Jitsi is a collection of free and open-source multiplatform voice, video conferencing, and instant messaging applications for the web platform, Windows, Linux, macOS, iOS, and Android.

Jitsi actually includes much more than just a client. It also provides a WebRTC server, SIP connectivity and an SFU (selective forwarding unit) for multiparty conferences.

Another client to consider is Linphone, an open-source client application for all platforms, including mobile ones such as Android and iPhone.

WebRTC server

You can choose from a couple of options. In the table below, you can see the most common WebRTC servers.

 

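| Server | Mode | Web UI | Mobile app | JS library | Mobile SDK | Server-side recording | Horizontal scalability (out of the box) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Jitsi | SFU | yes | yes | yes | yes | yes | yes |
| Janus | SFU | no | no | yes | yes | yes | no |
| Kurento | SFU/MCU | yes | yes | yes | yes | yes | no |
| Mediasoup | SFU | no | no | yes | no | yes | no |
| Medooze | SFU/MCU | no | no | yes | yes | yes | no |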

Did you know? You can configure SIP-first servers, for example Asterisk, as WebRTC servers.

SIP server

 

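| Feature | Asterisk | SIP Foundry | Elastix | FreeSWITCH | PBXInAFlash | FreePBX | OpenSIPS | Kamailio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Free to use | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes |
| Web-based GUI/config | Yes | Yes | Yes | No (third-party add-ons) | Yes | Yes | No | No (third-party add-ons) |
| Video support | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes |
| IM/chat | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Presence | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes |
| Auto attendant/IVR | Yes | Yes | Yes | Yes | Yes | Yes | No | No |
| Integration support | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Mobile apps | Yes | Yes | Yes | No | No | Yes | No | No |
| WebRTC support | Yes | No | Yes | Yes | No | Yes | No | Yes |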

Tip:

Did you know that you can build a SIP trunk between SIP servers to connect clients registered on different SIP servers, or even include devices with SIM cards, i.e. assign a SIM card to one of your SIP numbers?

Tip II:

To control cameras, we can add another channel to the SIP communication. One example uses Linphone and Asterisk and controls a camera with a keypad.

WebRTC end-to-end encryption

Conveniently, WebRTC clients can implement end-to-end encryption using so-called insertable streams, which let the application transform encoded media frames before they are sent and after they are received.

In the case of a SIP-first PBX, the endpoints (phones) have to support ZRTP, a protocol with which endpoints negotiate encryption keys between themselves and set up encrypted connections.

Please note that SRTP, even when deployed with SIP/TLS support, does not provide end-to-end encryption. The PBX is a trusted third party and can act as a man-in-the-middle to intercept traffic. Currently only ZRTP-enabled technology provides end-to-end encryption.

Conclusions

Even though many open-source software products are available, developing a software solution similar to Zoom, Cisco Webex, Microsoft Teams, Google Meet or Slack is challenging. No wonder these companies spend tens (some of them hundreds) of millions of dollars on development, according to the statistics.

However, by relying on an experienced software development partner, you can overcome these challenges and get considerable benefits for your business at the best price-quality ratio. Altamira’s team, with its deep expertise in the field, can become a stable and reliable tech partner for making your business ambitions a reality.
