Design and Implementation of WRTC Real‑Time Audio/Video and IP‑Telephony Solution
This article describes the background, architecture, signaling flow, client and server components, and code details of TEG's WRTC solution for real‑time audio/video calls and IP‑telephony, including WebRTC fundamentals, SIP integration, and DTMF support.
Background – In the mobile‑internet era many services need audio/video communication. TEG built a WebRTC‑based solution (WRTC) for businesses such as 58, Ganji, and Anjuke, and added the ability to route a voice call to an IP phone so that offline users can still be reached.
WebRTC Overview – WebRTC is a large open‑source framework used by many companies (Alibaba Cloud, NetEase Cloud, Qiniu Cloud). Its modules (APM, JitterBuffer, Camera/Camera2, soft‑codec framework, etc.) provide rich audio/video processing capabilities that are worth studying.
Audio/Video Call Architecture – The initial architecture connected the client directly to a FreeSWITCH SIP gateway for IP‑phone calls. To reduce client complexity, SIP handling was moved to the server side, so the server bridges audio streams via SIP and can optionally record calls.
Call Flow – The caller and callee exchange signaling through a Room/Signaling service; if a direct call fails, the server initiates an IP‑phone call through the carrier. The flow includes room management, SDP offer/answer exchange, and media negotiation.
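The fallback decision in this flow can be sketched as a small pure function. This is only an illustration: the enum names, the function `decideNextStep`, and the choice of which failures trigger the IP‑phone fallback (callee offline or no answer, but not an explicit refusal or busy signal) are assumptions, not details taken from the WRTC implementation.

```cpp
#include <cassert>

// Hypothetical outcomes of the direct (app-to-app) call attempt.
enum class DirectCallResult { Answered, CalleeOffline, Timeout, Refused, Busy };

// Hypothetical actions the server can take next.
enum class NextStep { StartMedia, FallbackToIpPhone, EndCall };

// Decide what happens after the direct call attempt finishes.
// Assumption: only "could not reach the callee" failures fall back to
// an IP-phone call; an explicit refusal or busy signal ends the call.
constexpr NextStep decideNextStep(DirectCallResult result) {
    switch (result) {
        case DirectCallResult::Answered:
            return NextStep::StartMedia;
        case DirectCallResult::CalleeOffline:
        case DirectCallResult::Timeout:
            return NextStep::FallbackToIpPhone;
        case DirectCallResult::Refused:
        case DirectCallResult::Busy:
            return NextStep::EndCall;
    }
    return NextStep::EndCall;  // Unreachable for valid enum values.
}
```

Keeping the decision separate from the signaling code makes the fallback policy easy to change and to test on its own.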
Client Side
1. Room Management – Methods for requesting room info, joining a room, and notifying busy status are provided (see code snippet).
/**
 @brief Request RoomInfo (the backend authenticates the caller and allocates the roomId, etc.)
 @param completeHandler Completion callback block
 @since v1.0.0
 */
+ (void)requestRoomInfo:(CompleteHandler)completeHandler;
/**
 @brief Join a room
 @param roomId ID of the room
 @param params Parameter dictionary
 @param completeHandler Completion callback
 @since v1.1.1
 */
+ (void)joinToRoom:(NSString *)roomId
        Parameters:(NSDictionary *)params
          Complete:(CompleteHandler)completeHandler;
/**
 @brief Notify that the client is currently busy
 @param roomId The roomId sent by the third-party caller
 @since v1.0.0
 */
+ (void)notifyBusy:(NSString *)roomId;
2. Signaling Management – WebSocket is used for SDP and ICE candidate exchange; audio uses Opus, video uses H.264.
3. SDP Example – An offer SDP for a one‑to‑one audio call is shown.
offer
...
a=mid:audio
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=sendrecv
a=rtcp-mux
a=rtpmap:111 opus/48000/2
...
4. Status Management – The signaling server tracks call states such as busy, refuse, cancel, enabling analytics and stability improvements.
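As a rough illustration of how tracked call states can feed analytics, the sketch below tallies how calls end and reports a per‑state rate. The `CallEnd` enum and the `CallStats` class are hypothetical names invented for this example, and the `Completed` state is an assumption added so that the rates have a meaningful denominator.

```cpp
#include <cassert>
#include <map>

// Hypothetical terminal states; busy, refuse, and cancel come from the
// text above, Completed is an assumption added for this sketch.
enum class CallEnd { Completed, Busy, Refuse, Cancel };

// Counts how calls ended so that rates can be reported later.
class CallStats {
public:
    // Record one finished call.
    void record(CallEnd end) {
        ++counts_[end];
        ++total_;
    }

    // Number of calls that ended in the given state.
    int count(CallEnd end) const {
        auto it = counts_.find(end);
        return it == counts_.end() ? 0 : it->second;
    }

    // Fraction of all recorded calls that ended in the given state.
    double rate(CallEnd end) const {
        return total_ == 0 ? 0.0 : static_cast<double>(count(end)) / total_;
    }

private:
    std::map<CallEnd, int> counts_;
    int total_ = 0;
};
```

A rising busy or cancel rate after a release would be one signal that something in the call setup path has regressed.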
Server Side
Key services include room management, signaling, and ICE (STUN/TURN) handling. NAT traversal types are described, and when symmetric NAT is detected the server forces relay mode.
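The relay rule can be sketched as follows. The NAT categories use the classic STUN classification; the enum and function names are hypothetical, and the assumption that a symmetric NAT on either side is enough to force relay mode is ours rather than something the text states. The `RelayOnly` value plays a role similar to a relay‑only ICE transport policy on the client.

```cpp
#include <cassert>

// NAT categories in the classic STUN classification.
enum class NatType { FullCone, RestrictedCone, PortRestrictedCone, Symmetric };

// Hypothetical policy value handed to the client:
// All       = try host, server-reflexive, and relay candidates;
// RelayOnly = use TURN relay candidates only.
enum class IcePolicy { All, RelayOnly };

// Assumption: a symmetric NAT on either endpoint forces relay mode,
// because direct hole punching is unlikely to succeed in that case.
constexpr IcePolicy choosePolicy(NatType caller, NatType callee) {
    return (caller == NatType::Symmetric || callee == NatType::Symmetric)
               ? IcePolicy::RelayOnly
               : IcePolicy::All;
}
```

Forcing relay up front trades some latency and TURN bandwidth for a shorter, more predictable connection setup.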
The server also converts WebRTC offers to SIP INVITE messages for the carrier gateway, handling the 100 Trying, 180 Ringing / 183 Session Progress, and 200 OK responses to complete the two‑stage media negotiation.
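The response handling for an outgoing INVITE can be sketched as a small state machine driven by SIP status codes. The state names and the function `onSipResponse` are hypothetical, and treating every final response of 300 or above as a failure is a simplification (a real gateway may follow redirects or answer authentication challenges).

```cpp
#include <cassert>

// Hypothetical states of one outgoing INVITE transaction.
enum class CallState { Inviting, Proceeding, Ringing, Connected, Failed };

// Advance the call state for one SIP response status code.
constexpr CallState onSipResponse(CallState state, int code) {
    // Connected and Failed are terminal in this sketch.
    if (state == CallState::Connected || state == CallState::Failed) {
        return state;
    }
    if (code == 100) {  // 100 Trying: the gateway received the INVITE.
        return state == CallState::Inviting ? CallState::Proceeding : state;
    }
    if (code == 180 || code == 183) {  // Ringing / Session Progress.
        return CallState::Ringing;
    }
    if (code >= 200 && code < 300) {  // 200 OK carries the SDP answer.
        return CallState::Connected;
    }
    if (code >= 300) {  // Simplification: any other final response fails.
        return CallState::Failed;
    }
    return state;  // Other provisional responses leave the state unchanged.
}
```

Feeding the codes 100, 180, and 200 in order moves a call from Inviting through Proceeding and Ringing to Connected, while a 486 Busy Here at any non‑terminal point ends it in Failed.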
DTMF Support – To enable dual‑tone multi‑frequency dialing for IP‑phone extensions, the PeerConnection class was extended with an insertDtmf method.
bool PeerConnection::insertDtmf(const std::string& audio_track_id, const int ext_number, const int duration) {
  // Default to failure so the result is defined when DTMF is unsupported
  bool isInsert = false;
  // Check whether this audio track can send DTMF signals
  bool canInsertDtmf = session_->CanInsertDtmf(audio_track_id);
  if (canInsertDtmf) {
    // Send the DTMF tone through the WebRTCSession object
    isInsert = session_->InsertDtmf(audio_track_id, ext_number, duration);
  }
  return isInsert;
}
Conclusion – The WRTC solution now provides stable one‑to‑one audio/video and IP‑phone capabilities; future work will integrate short‑video SDK features, multi‑party calls, and further optimizations on both capture and rendering sides.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.