Mastering SIP, SDP, RTP, and BFCP: A Deep Dive into VoIP Call Setup and Screen Sharing
This article provides a comprehensive technical overview of SIP signaling, SDP negotiation, RTP/RTCP media transport, and the Binary Floor Control Protocol (BFCP), including their interactions, message flows, deployment models, and practical screen‑sharing implementations for video‑conference systems.
1. SIP Protocol Basics
1.1 SIP Overview
SIP (Session Initiation Protocol) is an application‑layer protocol used to initiate, manage, and terminate voice/video sessions over IP networks. It defines User Agents (UAC/UAS) and servers, transports messages via UDP, TCP, or TLS, and carries media negotiation data in SDP within INVITE and other requests.
1.2 SIP Signaling Flow (INVITE/ACK/BYE/REGISTER)
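REGISTER lets a client inform a SIP registrar of its current contact address; the server may first reply with an authentication challenge (401 from a registrar, 407 from a proxy) and, once the client resends the request with valid credentials, returns 200 OK. Call setup relies on INVITE (typically carrying an SDP offer), a 200 OK response carrying the SDP answer, and ACK (confirming the session). Media streams are then exchanged via RTP/RTCP. UPDATE and re‑INVITE can modify an existing session, INFO carries mid‑session application data, OPTIONS queries a peer's capabilities, and BYE terminates the session.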
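The challenge/response step of REGISTER can be illustrated with a minimal sketch of the legacy Digest calculation (MD5, no qop), which is one scheme a server may use in a 401/407 challenge. The function names, user name, realm, password, and nonce below are hypothetical and only serve the illustration; a real client would also need to handle qop, cnonce, and nonce counts when the server asks for them.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Legacy Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # credentials hash
    ha2 = md5_hex(f"{method}:{uri}")                 # request hash
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def authorization_header(username, realm, password, method, uri, nonce,
                         header_name="Authorization"):
    """Build the header sent with the retried request.

    Use header_name="Proxy-Authorization" when answering a 407.
    """
    response = digest_response(username, realm, password, method, uri, nonce)
    return (f'{header_name}: Digest username="{username}", realm="{realm}", '
            f'nonce="{nonce}", uri="{uri}", response="{response}", '
            f'algorithm=MD5')


if __name__ == "__main__":
    # All values here are made up for the example.
    print(authorization_header("alice", "example.com", "secret",
                               "REGISTER", "sip:example.com", "abc123"))
```

The client copies the realm and nonce from the challenge, so the same password yields a different response for every new nonce.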
1.3 SIP and SDP Interaction
SIP carries SDP payloads that describe media parameters such as codecs, ports, and transport protocols. The Offer/Answer model works as follows: the caller sends an INVITE with an SDP offer, the callee replies with 200 OK containing an SDP answer, and ACK finalizes the signaling. RTP then transports the agreed media.
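As a rough sketch of the Offer/Answer step, the snippet below parses the media lines of an offer and keeps only the payload types whose encoding the callee supports. The helper names (`parse_media`, `answer_formats`) and the callee's supported set are assumptions for illustration; the matching is simplified and relies on `a=rtpmap` lines, so static payload types offered without an rtpmap are not handled.

```python
OFFER = """v=0
o=alice 2890844526 2890844526 IN IP4 alice.example.com
s=VideoCall
c=IN IP4 203.0.113.1
t=0 0
m=audio 49170 RTP/AVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
m=video 51372 RTP/AVP 99
a=rtpmap:99 H264/90000
"""


def parse_media(sdp: str) -> dict:
    """Map each media type to its port, transport, formats, and rtpmap."""
    media = {}
    current = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            kind, port, proto, *formats = line[2:].split()
            current = {"port": int(port), "proto": proto,
                       "formats": formats, "rtpmap": {}}
            media[kind] = current
        elif line.startswith("a=rtpmap:") and current is not None:
            payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            current["rtpmap"][payload_type] = encoding
    return media


def answer_formats(offer_media: dict, supported: set) -> list:
    """Keep offered payload types the callee supports, in offer order."""
    return [pt for pt in offer_media["formats"]
            if offer_media["rtpmap"].get(pt, "").upper() in supported]


if __name__ == "__main__":
    offer = parse_media(OFFER)
    callee_supports = {"PCMU/8000", "H264/90000"}  # hypothetical callee
    for kind, desc in offer.items():
        print(kind, answer_formats(desc, callee_supports))
```

Keeping the offer's ordering in the answer is a common choice because the offerer lists formats by preference.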
v=0
o=alice 2890844526 2890844526 IN IP4 alice.example.com
s=VideoCall
c=IN IP4 203.0.113.1
t=0 0
m=audio 49170 RTP/AVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
m=video 51372 RTP/AVP 99
a=rtpmap:99 H264/90000
...
1.4 RTP/RTCP Role in Media Transport
RTP (RFC 3550) carries time‑stamped audio/video packets, providing sequence numbers and synchronization source (SSRC) identifiers. It usually runs over UDP and offers no reliability or congestion control of its own, so loss handling and rate adaptation are left to the application. RTCP periodically sends control reports (SR, RR, SDES) that convey packet loss, jitter, and other statistics, enabling participants to adapt bitrate or take corrective actions.
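To make the header fields concrete, here is a small sketch that decodes the 12‑byte fixed RTP header defined in RFC 3550. The `RtpHeader` type and `parse_rtp_header` function are names chosen for this example; CSRC entries and header extensions that may follow the fixed header are not parsed.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed 12-byte RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=byte0 >> 6,              # top 2 bits
        padding=bool(byte0 & 0x20),
        extension=bool(byte0 & 0x10),
        csrc_count=byte0 & 0x0F,
        marker=bool(byte1 & 0x80),
        payload_type=byte1 & 0x7F,       # e.g. 99 for the H264 line above
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
    )


if __name__ == "__main__":
    # Hand-built sample packet: version 2, marker set, payload type 99.
    sample = bytes.fromhex("80e303e800015f901234abcd")
    print(parse_rtp_header(sample))
```

A receiver would typically use the sequence number to detect loss and reordering, and the timestamp together with the SSRC to schedule playout per source.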
2. BFCP Detailed Analysis
2.1 Purpose and Architecture
BFCP (Binary Floor Control Protocol, RFC 8855) manages shared resources (e.g., speaking rights, screen sharing) in multi‑party conferences. A central Floor Control Server coordinates requests from Floor Participants and actions from the Floor Chair. BFCP operates over UDP or reliable transports (TCP/TLS) and uses binary encoding for efficiency.
2.2 Basic Flow and Message Types
FloorRequest (value 1): participant asks for a floor.
FloorRelease (value 2): participant releases a previously granted floor.
FloorRequestStatus (value 4): server reports the status of a request (pending, accepted, etc.).
FloorQuery / FloorStatus (values 7/8): query current floor state.
ChairAction (value 9) and ChairActionAck (10): chair grants or denies a request.
Hello / HelloAck (11/12): handshake to negotiate capabilities.
Error (13): indicates protocol errors.
FloorRequestStatusAck / FloorStatusAck (14/15): client acknowledges status messages.
Goodbye / GoodbyeAck (16/17): terminates the BFCP session.
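The primitive values listed above travel in the BFCP common header. The sketch below packs and unpacks that 12‑byte header following my reading of the RFC 8855 layout (3‑bit version, R and F flags, 8‑bit primitive, payload length in 4‑octet words, conference ID, transaction ID, user ID). The function names are hypothetical, fragmentation (the F bit and its extra fields) is not handled, and the default version of 1 assumes a reliable transport.

```python
import struct
from enum import IntEnum


class Primitive(IntEnum):
    """Primitive values as listed in this section."""
    FLOOR_REQUEST = 1
    FLOOR_RELEASE = 2
    FLOOR_REQUEST_STATUS = 4
    FLOOR_QUERY = 7
    FLOOR_STATUS = 8
    CHAIR_ACTION = 9
    CHAIR_ACTION_ACK = 10
    HELLO = 11
    HELLO_ACK = 12
    ERROR = 13
    FLOOR_REQUEST_STATUS_ACK = 14
    FLOOR_STATUS_ACK = 15
    GOODBYE = 16
    GOODBYE_ACK = 17


def pack_common_header(primitive, conference_id, transaction_id, user_id,
                       payload_words=0, version=1, responder=False):
    """Build the 12-byte common header; F bit and reserved bits stay 0."""
    first = (version << 5) | (int(responder) << 4)
    return struct.pack("!BBHIHH", first, int(primitive), payload_words,
                       conference_id, transaction_id, user_id)


def unpack_common_header(data: bytes) -> dict:
    """Decode the common header from the start of a BFCP message."""
    if len(data) < 12:
        raise ValueError("BFCP common header needs at least 12 bytes")
    first, prim, words, conf, trans, user = struct.unpack("!BBHIHH", data[:12])
    return {
        "version": first >> 5,
        "responder": bool(first & 0x10),
        "fragmented": bool(first & 0x08),
        "primitive": Primitive(prim),
        "payload_words": words,
        "conference_id": conf,
        "transaction_id": trans,
        "user_id": user,
    }


if __name__ == "__main__":
    hello = pack_common_header(Primitive.HELLO, conference_id=1234,
                               transaction_id=1, user_id=100)
    print(hello.hex(), unpack_common_header(hello))
```

Attributes such as FLOOR-ID or FLOOR-REQUEST-INFORMATION would follow this header as TLVs; they are left out to keep the sketch short.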
2.3 BFCP Session Establishment Example
BFCP is typically negotiated inside SDP as an additional media line. Example SDP fragments for a SIP INVITE:
m=application 5900 TCP/TLS/BFCP *
a=floorctrl:c-only # client role
a=confid:1234 # conference ID
a=userid:100 # user ID
a=floorid:1 # requested floor resource
The answer SDP mirrors these attributes, possibly selecting the server role (s-only) and confirming the same conference and user IDs. After SDP negotiation, the BFCP client sends a Hello message, receives HelloAck, and then proceeds with FloorRequest, FloorRequestStatus, and other primitives as needed.
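A minimal sketch of how an endpoint might extract the BFCP parameters from such an SDP fragment is shown here. The function name `parse_bfcp_media` is hypothetical. Because SDP itself has no comment syntax, the sketch strips any trailing `#` annotation of the kind used in this article before parsing each line.

```python
BFCP_FRAGMENT = """m=application 5900 TCP/TLS/BFCP *
a=floorctrl:c-only
a=confid:1234
a=userid:100
a=floorid:1
"""


def parse_bfcp_media(sdp_fragment: str) -> dict:
    """Collect port, transport, and a= attributes of a BFCP media section."""
    info = {"port": None, "proto": None, "attributes": {}}
    for raw in sdp_fragment.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop article-style annotations
        if line.startswith("m=application"):
            fields = line.split()
            info["port"] = int(fields[1])
            info["proto"] = fields[2]
        elif line.startswith("a=") and ":" in line:
            name, value = line[2:].split(":", 1)
            info["attributes"][name] = value.strip()
    return info


if __name__ == "__main__":
    bfcp = parse_bfcp_media(BFCP_FRAGMENT)
    print(bfcp["proto"], bfcp["port"], bfcp["attributes"])
```

The conference ID and user ID recovered this way are the values a client would place in the common header of its Hello and FloorRequest messages.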
Typical request‑release sequence:
FloorRequest from participant (Transaction ID = 123).
FloorRequestStatus (pending) from server.
FloorRequestStatus (accepted, queue position 1).
FloorRequestStatus (granted).
FloorRelease using the granted Floor‑Request‑ID.
FloorRequestStatus (released) confirming release.
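The sequence above can be sketched as a small client‑side tracker that follows the request status reported in each FloorRequestStatus message. The status codes match the Request Status values in the BFCP specification as I understand them; the transition table and the `FloorRequestTracker` class are assumptions made for illustration, not rules taken from the RFC.

```python
from enum import IntEnum


class RequestStatus(IntEnum):
    PENDING = 1
    ACCEPTED = 2
    GRANTED = 3
    DENIED = 4
    CANCELLED = 5
    RELEASED = 6
    REVOKED = 7


S = RequestStatus

# Assumed transitions for this sketch; terminal states have no entry.
ALLOWED = {
    None: {S.PENDING, S.ACCEPTED, S.GRANTED, S.DENIED},
    S.PENDING: {S.ACCEPTED, S.GRANTED, S.DENIED, S.CANCELLED},
    S.ACCEPTED: {S.GRANTED, S.DENIED, S.CANCELLED},
    S.GRANTED: {S.RELEASED, S.REVOKED},
}


class FloorRequestTracker:
    """Track one floor request as status updates arrive from the server."""

    def __init__(self):
        self.status = None

    def update(self, status: RequestStatus) -> None:
        if status not in ALLOWED.get(self.status, set()):
            raise ValueError(f"unexpected transition {self.status} -> {status}")
        self.status = status

    @property
    def holds_floor(self) -> bool:
        return self.status == S.GRANTED


if __name__ == "__main__":
    tracker = FloorRequestTracker()
    for status in (S.PENDING, S.ACCEPTED, S.GRANTED):
        tracker.update(status)
    print("may start sharing:", tracker.holds_floor)
    tracker.update(S.RELEASED)
    print("after release:", tracker.holds_floor)
```

A screen‑sharing client could gate the start of its presentation stream on `holds_floor` and stop sending as soon as the status becomes Released or Revoked.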
2.5 Deployment Modes: MCU vs. SFU
In a centralized MCU architecture, a single conference unit mixes the media and also hosts the BFCP server, handling all floor‑control traffic. In an SFU setup, the forwarding unit relays each participant's streams to the other participants without mixing them (this is not peer‑to‑peer), while BFCP may run on a dedicated server or on one of the participants, coordinating only the control messages.
3. Screen‑Sharing Implementation in Video Conferencing
3.1 RTP‑Based Screen Sharing Flow
Screen content is captured, encoded as a video stream, and transmitted via RTP on a separate port. In SIP/BFCP environments, the screen share appears as an additional “presentation” media line in SDP, negotiated after the participant obtains floor control through BFCP.
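One way to express the extra "presentation" media line is sketched below: a second video section marked with the SDP content attribute (`a=content:slides`) and a label, plus a BFCP section whose `floorid` points at that label through `mstrm`. The function name, port numbers, payload type, codec, and label are hypothetical values picked for the example; real endpoints may use different attributes or ordering.

```python
def presentation_sdp(video_port: int, payload_type: int, label: str,
                     floor_id: int, bfcp_port: int) -> str:
    """Return SDP media sections for a floor-controlled presentation stream."""
    lines = [
        f"m=video {video_port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} H264/90000",
        "a=content:slides",            # marks the stream as presentation content
        f"a=label:{label}",            # label referenced by the BFCP floor
        f"m=application {bfcp_port} TCP/TLS/BFCP *",
        f"a=floorid:{floor_id} mstrm:{label}",
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(presentation_sdp(video_port=51374, payload_type=100,
                           label="10", floor_id=1, bfcp_port=5900))
```

Tying the floor to the media label lets the receiver know which stream a granted floor actually controls.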
3.2 Audio/Video and Mouse/Touch Event Coordination
Audio continues on the primary RTP channel, while the screen‑share video uses its own RTP stream. Interactive features (mouse clicks, touch events) can be carried over a WebRTC data channel or another application data channel negotiated through SDP, often encrypted and timestamped to stay synchronized with video frames.
4. Conclusion
The article dissected the SIP signaling process, RTP/RTCP media transport, BFCP floor‑control mechanisms, and practical screen‑sharing techniques, offering engineers with a VoIP background a thorough understanding of how these protocols interoperate and how to configure or troubleshoot them in real‑world video‑conference deployments.
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.
