What communication is used in WhatsApp?

WhatsApp is one of the most popular messaging apps in the world, with over 2 billion users globally. It allows people to communicate in a variety of ways, primarily through text messaging, voice calls, video calls, and media sharing. So what types of communication does WhatsApp utilize to enable all of these features?

Text Messaging

The core functionality of WhatsApp is text messaging. Users can send messages to individuals or groups and receive replies in the same conversation. This works much like standard SMS text messaging, but without SMS’s 160-character limit per message. Because WhatsApp messages travel over the internet rather than the carrier’s SMS channel, you can send long texts, emojis, links, and more.

Behind the scenes, WhatsApp’s text messaging grew out of the Extensible Messaging and Presence Protocol (XMPP). The service is widely reported to run a heavily customized, compact variant of XMPP rather than the standard protocol, and WhatsApp has not published its wire format. XMPP itself is an open XML-based communication protocol for real-time messaging, presence, and request-response services. It enables the near real-time exchange of structured data between any two or more endpoints on a network.

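In XMPP, each user has a unique address called a Jabber ID (JID); in WhatsApp’s variant, the address is derived from the user’s phone number. When you send a message, it is routed to the recipient’s JID, and the WhatsApp app on their device receives and displays it, usually within moments if the device is online. WhatsApp servers relay messages between users and queue them only until they are delivered; WhatsApp states that undelivered messages are deleted after 30 days. Chat backups are a separate feature and are stored with the user’s cloud provider rather than in the delivery queue.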
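To make the addressing concrete, here is a minimal sketch of a standard XMPP message stanza built and parsed with Python’s standard library. It illustrates public XMPP only; WhatsApp’s actual wire format is not documented, and the JIDs and helper names below are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_message_stanza(sender_jid: str, recipient_jid: str, body_text: str) -> str:
    """Build a standard XMPP <message/> stanza and return it as a string."""
    message = ET.Element(
        "message",
        {"from": sender_jid, "to": recipient_jid, "type": "chat"},
    )
    body = ET.SubElement(message, "body")
    body.text = body_text
    return ET.tostring(message, encoding="unicode")


def parse_message_stanza(stanza: str) -> dict:
    """Extract the routing attributes and body text from a <message/> stanza."""
    root = ET.fromstring(stanza)
    return {
        "from": root.get("from"),
        "to": root.get("to"),
        "type": root.get("type"),
        "body": root.findtext("body"),
    }


# Hypothetical JIDs for illustration only.
stanza = build_message_stanza("alice@example.org", "bob@example.org", "Hello, Bob!")
parsed = parse_message_stanza(stanza)
print(stanza)
```

A server routing this stanza only needs the `to` attribute; the body is opaque to it, which is what makes it possible to carry encrypted content in the same envelope.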

End-to-End Encryption

WhatsApp also implements end-to-end encryption for text messaging. This means the messages are fully encrypted from when you hit send until the message is received and decrypted by the recipient. Not even WhatsApp itself can read the encrypted message contents.

WhatsApp applies the Signal Protocol for its end-to-end encryption. This uses a combination of asymmetric cryptography (Curve25519 Diffie-Hellman) for key agreement, symmetric encryption (AES-256) for message contents, and message authentication codes (HMAC-SHA256) for integrity protection.

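Each user’s private keys are generated and stored locally on their device, while the matching public keys are registered with the WhatsApp server. When you first message someone, your app fetches their public keys and uses them to derive a shared secret that only the two devices can compute. Messages are not encrypted directly with the recipient’s public key; instead, each message is encrypted with a fresh symmetric key derived from that shared secret, and the key changes with every message. This ensures third parties cannot read the messages in transit, and that a single compromised message key does not expose the rest of the conversation.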
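One part of the Signal Protocol that is easy to show with the standard library is the symmetric ratchet, which turns a shared chain key into a new message key for every message. The sketch below follows the HMAC-SHA256 construction described in WhatsApp’s security whitepaper (constants 0x01 and 0x02). The starting chain key is a placeholder; in the real protocol it comes from the Diffie-Hellman key agreement, which is not shown here.

```python
import hashlib
import hmac
from typing import List, Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive one message key and the next chain key from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


def derive_message_keys(chain_key: bytes, count: int) -> List[bytes]:
    """Advance the ratchet `count` times and collect the message keys."""
    keys = []
    for _ in range(count):
        message_key, chain_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys


# Placeholder shared secret (assumption): stands in for the key-agreement output.
initial_chain_key = hashlib.sha256(b"placeholder shared secret").digest()

# Both devices start from the same chain key, so they derive the same keys.
sender_keys = derive_message_keys(initial_chain_key, 3)
receiver_keys = derive_message_keys(initial_chain_key, 3)
```

Because HMAC is one-way, learning one message key does not reveal the chain key that produced it or any other message key.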

Voice and Video Calls

In addition to texting, WhatsApp also allows users to make voice and video calls to connect in real-time. For these call features, WhatsApp utilizes Voice over IP (VoIP) technology and the Secure Real-time Transport Protocol (SRTP).

Voice over IP

VoIP stands for Voice over Internet Protocol and refers to making voice calls over the internet rather than traditional phone lines. With WhatsApp calls, your voice is digitized, compressed, and split into IP packets that travel either directly between the two devices or through WhatsApp relay servers when a direct connection is not possible.

Every VoIP system needs a signaling layer to set up and tear down call sessions. Session Initiation Protocol (SIP) is the standard protocol for that role, but WhatsApp has not documented using it; its call setup appears to travel over the same encrypted messaging channel as chats. Once a call is established, the voice data is encoded, reportedly with the Opus codec, and streamed over the Real-time Transport Protocol (RTP).
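To show what streaming over RTP means at the byte level, here is a minimal sketch that packs and unpacks the fixed 12-byte RTP header defined in RFC 3550. The payload type 111 is an assumption (a dynamic value often negotiated for Opus), and the SSRC and payload are placeholders; none of this reflects WhatsApp’s actual packet contents.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_FORMAT = "!BBHII"  # network byte order, 12 bytes in total


def pack_rtp_header(payload_type: int, sequence_number: int, timestamp: int,
                    ssrc: int, marker: bool = False) -> bytes:
    """Pack a fixed RTP header with no padding, extension, or CSRC entries."""
    first_byte = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        RTP_HEADER_FORMAT,
        first_byte,
        second_byte,
        sequence_number & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(packet: bytes) -> dict:
    """Read the fixed RTP header fields from the start of a packet."""
    first_byte, second_byte, seq, ts, ssrc = struct.unpack(RTP_HEADER_FORMAT, packet[:12])
    return {
        "version": first_byte >> 6,
        "marker": bool(second_byte >> 7),
        "payload_type": second_byte & 0x7F,
        "sequence_number": seq,
        "timestamp": ts,
        "ssrc": ssrc,
    }


# One 20 ms Opus frame advances the 48 kHz RTP clock by 960 ticks.
header = pack_rtp_header(payload_type=111, sequence_number=1, timestamp=960, ssrc=0x12345678)
packet = header + b"\x00" * 40  # placeholder bytes standing in for encoded audio
fields = unpack_rtp_header(packet)
```

The sequence number lets the receiver detect loss and reordering, while the timestamp drives playout timing independently of when packets happen to arrive.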

Secure Real-time Transport Protocol

SRTP is an extension of RTP that provides encryption, message authentication, and integrity verification of voice and video streams. WhatsApp leverages SRTP to secure real-time transport of VoIP calls.

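SRTP uses AES to encrypt the voice and video payloads and HMAC-SHA1 to authenticate each packet. SRTP does not define how the two sides agree on keys; protocols such as ZRTP or DTLS-SRTP fill that role in other systems. According to WhatsApp’s security whitepaper, the caller instead generates a random 32-byte SRTP master secret and sends it to the recipient inside an end-to-end encrypted message. This prevents VoIP calls from being intercepted and listened to by third parties.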
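The authentication half of SRTP can be sketched with the standard library alone. Following RFC 3711’s default transform, the tag is HMAC-SHA1 over the packet plus a 32-bit rollover counter, truncated to 80 bits. The key and packet below are placeholders; real SRTP derives the authentication key from the master secret and also encrypts the payload with AES, neither of which is shown.

```python
import hashlib
import hmac
import struct

TAG_LENGTH_BYTES = 10  # 80-bit tag, the RFC 3711 default


def srtp_auth_tag(auth_key: bytes, authenticated_portion: bytes, rollover_counter: int) -> bytes:
    """Compute a truncated HMAC-SHA1 tag over the packet and rollover counter."""
    data = authenticated_portion + struct.pack("!I", rollover_counter)
    return hmac.new(auth_key, data, hashlib.sha1).digest()[:TAG_LENGTH_BYTES]


def verify_srtp_auth_tag(auth_key: bytes, authenticated_portion: bytes,
                         rollover_counter: int, tag: bytes) -> bool:
    """Check a received tag in constant time."""
    expected = srtp_auth_tag(auth_key, authenticated_portion, rollover_counter)
    return hmac.compare_digest(expected, tag)


# Placeholder key and packet (assumptions), for illustration only.
auth_key = bytes(20)
rtp_packet = b"\x80\x6f\x00\x01" + bytes(8) + b"voice-payload"

tag = srtp_auth_tag(auth_key, rtp_packet, rollover_counter=0)
is_valid = verify_srtp_auth_tag(auth_key, rtp_packet, 0, tag)
is_valid_after_tamper = verify_srtp_auth_tag(auth_key, rtp_packet + b"x", 0, tag)
```

A receiver that sees a tag mismatch drops the packet, so an attacker on the path cannot alter audio without being detected.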

Media Sharing

In addition to calls and texts, WhatsApp supports sharing photos, videos, documents, and other media. When sending media files, WhatsApp needs to reliably and efficiently transfer potentially large files between users.

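Media Transfer

WhatsApp does not use the File Transfer Protocol (FTP) for media, although FTP is the classic network protocol for moving files from one host to another. According to WhatsApp’s security whitepaper, the sender’s app encrypts each attachment with one-time keys (AES-256 for encryption, HMAC-SHA256 for authentication) and uploads the encrypted blob to a WhatsApp storage server.

The sender then sends the recipient an ordinary encrypted message containing the keys, a SHA-256 hash of the encrypted blob, and a pointer to where it is stored. The recipient’s app downloads the blob, verifies the hash and the MAC, and decrypts it. As with any internet transfer, the file travels as a series of packets that are reassembled on the receiving device, and the transport layer adapts its sending rate to network conditions to maintain reliability.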
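The split, reassemble, and verify idea can be sketched in a few lines. This is a generic illustration rather than WhatsApp’s transfer code: the chunk size, function names, and sample data are all assumptions, and in practice the packet-level splitting is done by the transport layer rather than the app.

```python
import hashlib
from typing import List, Tuple


def split_into_chunks(data: bytes, chunk_size: int) -> List[Tuple[int, bytes]]:
    """Split data into (index, chunk) pairs so chunks can arrive in any order."""
    return [
        (offset // chunk_size, data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]


def reassemble(chunks: List[Tuple[int, bytes]]) -> bytes:
    """Put chunks back in index order and join them."""
    ordered = sorted(chunks, key=lambda item: item[0])
    return b"".join(chunk for _, chunk in ordered)


# Placeholder media bytes (assumption): 2,560 bytes standing in for a file.
media = bytes(range(256)) * 10
expected_digest = hashlib.sha256(media).hexdigest()

chunks = split_into_chunks(media, chunk_size=1000)
received = list(reversed(chunks))  # simulate out-of-order arrival
rebuilt = reassemble(received)
digest_matches = hashlib.sha256(rebuilt).hexdigest() == expected_digest
```

Comparing the digest of the rebuilt file against a digest sent separately is what lets the receiver detect a corrupted or substituted download.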

Media Compression

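Given that media files can be very large, WhatsApp utilizes compression techniques to reduce transfer sizes. Photos and videos are compressed on the sender’s device before upload. The compression is lossy, so the recipient’s device decodes the smaller file for display rather than restoring the original.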

WhatsApp uses advanced video codecs like H.264 to compress videos efficiently before packetizing. For photos, techniques like chroma subsampling and quantization matrix adjustments are applied. This compression allows large media files to be delivered smoothly even on slower connections.
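Chroma subsampling is simple enough to show directly. The sketch below applies 4:2:0-style subsampling to a single chroma plane by averaging each 2x2 block, which keeps one color sample for every four pixels. The sample values are made up, and the function assumes even dimensions; it is a toy illustration, not the encoder WhatsApp ships.

```python
from typing import List


def subsample_chroma_420(plane: List[List[int]]) -> List[List[int]]:
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[r][c] + plane[r][c + 1] + plane[r + 1][c] + plane[r + 1][c + 1]) // 4
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


# A made-up 4x4 chroma plane (one color-difference channel of an image).
chroma = [
    [100, 102, 50, 52],
    [104, 106, 54, 56],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]

subsampled = subsample_chroma_420(chroma)
samples_before = sum(len(row) for row in chroma)
samples_after = sum(len(row) for row in subsampled)
```

The brightness plane is left at full resolution, which is why the loss is hard to see: human vision is much less sensitive to fine color detail than to fine brightness detail.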

Media Download

To facilitate smooth media transfer, WhatsApp separates receiving a media message from downloading the file. Depending on the auto-download settings for mobile data, Wi-Fi, and roaming, incoming media is either downloaded automatically or fetched only when the user taps it. Downloaded media is saved to the phone’s local storage.

This allows users to conserve mobile data and prevent stalls from slower connections. Once the media is downloaded to the device, it can be viewed at any time without needing an internet connection.

Group Messaging

WhatsApp supports group chats with many participants. The limit was 256 for years and has since been raised to 1,024. Group messaging involves some additional technical considerations and protocols to manage all the users.

Group Management

WhatsApp groups rely on protocol extensions to handle group control functions. This includes group creation, user invites, administrator privileges, join/leave notices, and other group settings.

In standard XMPP, these functions are defined by XEP-0045, Multi-User Chat (MUC), which specifies how chat rooms are managed and how messages of type groupchat are routed to room occupants. WhatsApp’s own group protocol is proprietary, so MUC is best read as the closest public analogue rather than a description of WhatsApp’s implementation.

Message Relay

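To relay a message to all group participants, a copy has to reach every group member. Having the sender’s phone transmit a separate copy to each recipient does not scale well, especially for media and large groups.

Instead, WhatsApp uses server-side fan-out: the sender uploads the message once, and the server delivers a copy to each member. To keep this compatible with end-to-end encryption, the Signal Protocol’s Sender Keys scheme is used. The first time a member posts to a group, their app generates a sender key and distributes it to each other member over the existing pairwise encrypted sessions; later messages are encrypted once with a key derived from it. The fan-out happens at the application layer on WhatsApp’s servers, not through IP multicast.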
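The fan-out idea, where the sender uploads one message and the server queues a copy for every other member, can be sketched as a small in-memory model. The class and method names are hypothetical, the payload is treated as opaque bytes, and nothing here reflects WhatsApp’s server code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class FanOutServer:
    """Toy relay that copies one uploaded group message into each member's inbox."""

    def __init__(self) -> None:
        self.groups: Dict[str, Set[str]] = {}
        self.inboxes: Dict[str, List[Tuple[str, str, bytes]]] = defaultdict(list)

    def create_group(self, group_id: str, members: Iterable[str]) -> None:
        self.groups[group_id] = set(members)

    def send_to_group(self, group_id: str, sender: str, payload: bytes) -> int:
        """Queue one copy per member except the sender; return the copy count."""
        recipients = sorted(self.groups[group_id] - {sender})
        for member in recipients:
            self.inboxes[member].append((group_id, sender, payload))
        return len(recipients)


server = FanOutServer()
server.create_group("family", ["alice", "bob", "carol"])

# The sender uploads a single (already encrypted) payload.
delivered = server.send_to_group("family", "alice", b"opaque-ciphertext")
```

The sender’s upload cost stays constant as the group grows; only the server’s work scales with the number of members.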

Read Receipts

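Within groups, WhatsApp provides read receipts that indicate when messages have been read by recipients. One way to track this, and the model assumed here, is for the service to keep per-user data about the last time each member read the group.

As users read new group messages beyond that timestamp, their apps send receipts back through the server so that read statuses can be updated. In standard XMPP, the comparable extensions are XEP-0184 (Message Delivery Receipts) and XEP-0333 (Chat Markers); XEP-0334 (Message Processing Hints) serves a different purpose, telling servers how to store or copy a message. A group message is shown as read only once every participant has read it.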
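The per-user last-read model described above can be sketched as a small tracker. This is an assumed data structure for illustration, with hypothetical names and plain numeric timestamps; WhatsApp has not published how it stores receipt state.

```python
from typing import Dict, Iterable, List


class GroupReadTracker:
    """Track the latest read timestamp per member for one group."""

    def __init__(self, members: Iterable[str]) -> None:
        self.last_read: Dict[str, float] = {member: 0.0 for member in members}

    def mark_read(self, member: str, timestamp: float) -> None:
        """Record a read receipt, never moving a member's timestamp backwards."""
        self.last_read[member] = max(self.last_read[member], timestamp)

    def read_by(self, message_timestamp: float, sender: str) -> List[str]:
        """List members, other than the sender, who have read the message."""
        return sorted(
            member
            for member, seen in self.last_read.items()
            if member != sender and seen >= message_timestamp
        )

    def read_by_all(self, message_timestamp: float, sender: str) -> bool:
        """True once every member other than the sender has read the message."""
        return all(
            seen >= message_timestamp
            for member, seen in self.last_read.items()
            if member != sender
        )


tracker = GroupReadTracker(["alice", "bob", "carol"])

# Alice sends a message at t=100; Bob reads up to t=120, Carol only up to t=90.
tracker.mark_read("bob", 120)
tracker.mark_read("carol", 90)
readers_so_far = tracker.read_by(100, sender="alice")
all_read_early = tracker.read_by_all(100, sender="alice")

# Carol catches up, so the message can now be shown as read by everyone.
tracker.mark_read("carol", 130)
all_read_later = tracker.read_by_all(100, sender="alice")
```

Storing one timestamp per member keeps the state small regardless of how many messages the group contains.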

Conclusion

WhatsApp provides a rich messaging experience using a diverse set of protocols and technologies. For text messaging, it uses a customized XMPP-derived protocol with Signal Protocol end-to-end encryption for security. Voice and video calls stream over RTP secured with SRTP, with the session key exchanged over the encrypted messaging channel. Media sharing involves compression followed by encrypted upload and download. Groups rely on server-side fan-out combined with Sender Keys. Together, these techniques enable WhatsApp to be a ubiquitous and high-performance messaging platform.