Design WhatsApp

WhatsApp is a secure and popular messaging app with over 2 billion users worldwide. On average, each user spends 19.4 hours per month in the app, and together users send more than 100 billion messages per day, a 54% rise since 2018.

As system designers, we must think about how fast users are growing.

WhatsApp is a good example to study.

Some key questions are:

How is WhatsApp designed?
How does it actually work?
What components make up the system?
How can it support billions of users at the same time?
How does it keep user data safe and secure?

Functional Requirements

Conversations:

Support one-to-one chats.
Support group chats.

Acknowledgment:

Show message status: sent, delivered, and read.

Sharing:

Allow sharing of images, videos, and audio files.

Chat Storage:

Store chat messages even if the user is offline.
Deliver messages once the user comes online.

Push Notifications:

Notify users about new messages when they are offline.
Deliver notifications as soon as they are online.

Non-Functional Requirements

Low Latency

Messages should be delivered quickly with minimal delay.

Consistency

Messages must appear in the same order they were sent.
Chat history should stay the same across all devices.

Availability

The system should always be accessible.
In some cases, availability may be sacrificed to maintain consistency.

Security

All messages must use end-to-end encryption.
Only the sender and receiver can read the content; not even WhatsApp can.

Scalability

The system must handle a growing number of users.
It should support billions of messages per day.

Resource estimation

WhatsApp is the world’s most widely used messaging app.

It has over 2 billion users globally.

Users exchange more than 100 billion messages per day.

Storage estimation

Of the 100 billion daily messages, assume 90 billion are text, 5 billion are images, and 5 billion are videos.
Assume the average text message is 50 characters at 2 bytes per character: 50 * 2 = 100 bytes.
Text: 100 bytes * 90 billion = 9 TB per day (remember: 1 KB * 1 billion = 1 TB).
Images: assume an average size of 2 MB: 2 MB * 5 billion = 10 PB per day (1 MB * 1 billion = 1 PB).
Videos: assume an average size of 100 MB: 100 MB * 5 billion = 500 PB per day.
Total storage ~ 510 PB per day.
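The arithmetic above is easy to get wrong by a factor of a thousand, so here is a small sketch that recomputes it. It assumes decimal units (1 TB = 10^12 bytes) and the 90/5/5 split between text, images, and videos; with that split, text comes to about 9 TB and the total to roughly 510 PB per day. The variable names are made up for illustration.

```python
# Back-of-the-envelope daily storage estimate (decimal units).
BILLION = 10**9
MB = 10**6
TB = 10**12
PB = 10**15

# Assumed daily message mix from the estimate above.
text_count = 90 * BILLION
image_count = 5 * BILLION
video_count = 5 * BILLION

text_size = 50 * 2        # 50 characters at an assumed 2 bytes each
image_size = 2 * MB       # assumed average image size
video_size = 100 * MB     # assumed average video size

text_bytes = text_count * text_size
image_bytes = image_count * image_size
video_bytes = video_count * video_size
total_bytes = text_bytes + image_bytes + video_bytes

print(f"text:   {text_bytes / TB:.0f} TB")
print(f"images: {image_bytes / PB:.0f} PB")
print(f"videos: {video_bytes / PB:.0f} PB")
print(f"total:  {total_bytes / PB:.2f} PB")
```

Video dominates the total, so the average video size is the assumption worth checking first.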

Bandwidth Estimation

Assume each of the 100 billion daily messages is delivered to at least 2 people.
That gives 200 billion requests per day.
200 billion / 86,400 seconds ≈ 2,300K QPS (1 billion requests per day ≈ 11.6K QPS).
Assume a standard server handles 64K QPS: typically a 32-core CPU, 128 GB RAM, NVMe SSDs, and a 10 Gbps network, plus caching and load balancing.
Number of servers = 2,300K / 64K ≈ 36 servers.
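The same estimate can be sketched in a few lines. The figure of 2 recipients per message and the 64K requests-per-second server capacity are the assumptions from the estimate above, not measured values. Without rounding the intermediate QPS, the ratio is about 36.2, so rounding up to whole machines gives 37, in the same range as the hand estimate.

```python
import math

SECONDS_PER_DAY = 86_400

messages_per_day = 100 * 10**9
recipients_per_message = 2       # assumption: each message reaches at least 2 people
server_capacity_qps = 64_000     # assumption: per-server request capacity

requests_per_day = messages_per_day * recipients_per_message
qps = requests_per_day / SECONDS_PER_DAY

# Round up: a fraction of a server still needs a whole machine.
servers_needed = math.ceil(qps / server_capacity_qps)

print(f"requests/day: {requests_per_day:,}")
print(f"QPS:          {qps:,.0f}")
print(f"servers:      {servers_needed}")
```

This is only the average load. Real traffic has peaks, so an actual deployment would add headroom on top of this number.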

API Design

sendMessage(message_id, sender_id, receiver_id, type, text=none, media_object=none, document=none)

getMessage(user_id)

uploadFile(file_type, file)

downloadFile(user_id, file_id)
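To make the sendMessage signature more concrete, here is one way its parameters could be modeled as a typed request object. This is a sketch, not WhatsApp's actual API: the class name, the allowed values of type, and the rule that exactly one payload field must be set are all assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SendMessageRequest:
    """Hypothetical request object mirroring the sendMessage parameters."""
    message_id: str
    sender_id: str
    receiver_id: str
    type: str                            # e.g. "text", "media", "document" (assumed values)
    text: Optional[str] = None
    media_object: Optional[str] = None   # assumed to hold a file ID returned by uploadFile
    document: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed rule: a message carries exactly one kind of payload.
        payloads = [self.text, self.media_object, self.document]
        if sum(p is not None for p in payloads) != 1:
            raise ValueError("exactly one of text, media_object, document must be set")
```

Validating the request at the edge keeps malformed messages out of the Message Service.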

High-level Design

Detailed Design

What’s missing from the high-level design?

How do clients and servers create a communication channel?
How can the design scale to billions of users?
Where and how is user data stored?
How do we identify the correct receiver for a message?

Let’s dive into the high-level design and examine each component in detail.

When you open WhatsApp on your phone, something interesting happens behind the scenes. Your device doesn’t just send messages blindly—it first establishes a persistent connection with a WebSocket server using the WebSocket protocol. Unlike traditional HTTP, this connection stays open, allowing instant, two-way communication.

But here’s the catch: one server cannot handle billions of people chatting at the same time. That’s why WhatsApp has many WebSocket servers, each responsible for keeping connections alive. Every online user gets a port, and the information about which user is connected to which server and port is carefully maintained in a central place called the WebSocket Manager, which sits on top of a Redis cluster. Think of Redis as a super-fast phone directory that helps find where each user is currently connected.
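The "phone directory" idea can be sketched as a simple mapping from user to connection location. In this sketch a plain Python dictionary stands in for the Redis cluster, and the class and method names are hypothetical.

```python
from typing import Dict, Optional, Tuple


class WebSocketManager:
    """Tracks which (server, port) currently holds each online user's connection."""

    def __init__(self) -> None:
        # A dict stands in for the Redis cluster in this sketch.
        self._connections: Dict[str, Tuple[str, int]] = {}

    def register(self, user_id: str, server: str, port: int) -> None:
        """Record the connection when a user comes online."""
        self._connections[user_id] = (server, port)

    def unregister(self, user_id: str) -> None:
        """Forget the connection when a user goes offline."""
        self._connections.pop(user_id, None)

    def lookup(self, user_id: str) -> Optional[Tuple[str, int]]:
        """Return (server, port) for an online user, or None if offline."""
        return self._connections.get(user_id)
```

A lookup that returns None is how the rest of the system learns that a user is offline.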

Sending and Receiving Messages

Now, imagine User A wants to send a message to User B. Here’s how the story unfolds:

User A sends the message to their WebSocket server.
That server checks with the WebSocket Manager to find where User B is connected.
- If User B is online, the manager immediately points to their server.
- If User B is offline, the message takes a different route.
At the same time, the message is also stored in the Message Service, which sits on top of a special distributed database called Mnesia.
- Mnesia is designed for fast lookups, high fault tolerance, and quick deletion of old messages.
- Messages are stored temporarily (FIFO order) and deleted once delivered—or after 30 days if undelivered.
If B is online, their WebSocket server picks up the message and delivers it instantly. If B is offline, they’ll receive it when they come back online, often via a push notification.
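The steps above can be sketched as a tiny in-memory router. Dictionaries and queues stand in for the WebSocket Manager and the Mnesia-backed Message Service, all names are hypothetical, and the 30-day expiry of undelivered messages is left out to keep the sketch short.

```python
from collections import defaultdict, deque


class MessageRouter:
    """Stores every message first, then delivers it as soon as the receiver is online."""

    def __init__(self) -> None:
        self.online = {}                    # user_id -> server name (WebSocket Manager stand-in)
        self.pending = defaultdict(deque)   # user_id -> FIFO queue (Message Service stand-in)
        self.delivered = []                 # (server, receiver, sender, text), for illustration

    def connect(self, user_id: str, server: str) -> None:
        self.online[user_id] = server
        self._flush(user_id)                # deliver anything that arrived while offline

    def disconnect(self, user_id: str) -> None:
        self.online.pop(user_id, None)

    def send(self, sender: str, receiver: str, text: str) -> None:
        self.pending[receiver].append((sender, text))   # store before delivering
        if receiver in self.online:
            self._flush(receiver)

    def _flush(self, user_id: str) -> None:
        server = self.online[user_id]
        queue = self.pending[user_id]
        while queue:
            sender, text = queue.popleft()  # removed from storage once delivered
            self.delivered.append((server, user_id, sender, text))
```

Because offline messages wait in a FIFO queue, they are handed over in the order they were sent when the user reconnects.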

To make this smoother, each WebSocket server keeps a small cache of recent connections so it doesn’t always need to ask the WebSocket Manager. For example, if A and B are chatting continuously, the servers already know each other’s location.
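A minimal sketch of that per-server cache is shown below. The class name is hypothetical, and manager_lookup is an assumed callable that represents a round trip to the WebSocket Manager. The call counter exists only to show how many round trips the cache saves.

```python
class CachedLocator:
    """Small per-server cache in front of the WebSocket Manager."""

    def __init__(self, manager_lookup) -> None:
        self._manager_lookup = manager_lookup   # callable: user_id -> server name or None
        self._cache = {}
        self.manager_calls = 0                  # counts round trips, for illustration

    def locate(self, user_id: str):
        if user_id in self._cache:
            return self._cache[user_id]         # cache hit: no round trip needed
        self.manager_calls += 1
        server = self._manager_lookup(user_id)
        if server is not None:
            self._cache[user_id] = server       # only cache users who are online
        return server

    def invalidate(self, user_id: str) -> None:
        """Drop a stale entry, for example after a failed delivery."""
        self._cache.pop(user_id, None)
```

A cached entry can go stale when a user disconnects or moves to another server, so a real system would also need an expiry or invalidation policy; the invalidate method hints at one option.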

Sharing Media Files

Text is light, but media—like photos, videos, and documents—are heavy. To handle this, WhatsApp uses a dedicated Asset Service. Here’s the process:

The media file is compressed and encrypted on the sender’s phone.
It is uploaded to blob storage via the Asset Service.
- To avoid duplication, a hash is generated. If the file already exists, WhatsApp just reuses the existing copy.
The Asset Service generates a unique file ID and passes this ID to the receiver via the Message Service.
The receiver then downloads the media directly from storage using the ID.
If a particular file is requested too often, the Asset Service caches it in a CDN for faster delivery.
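The deduplication step can be sketched with a content hash. Here a dictionary stands in for blob storage, and the SHA-256 digest is reused as the file ID; that reuse is an assumption made to keep the example short, since the text does not say how the real file ID is derived.

```python
import hashlib


class AssetService:
    """Stores each distinct file once, keyed by the hash of its contents."""

    def __init__(self) -> None:
        self.blobs = {}   # file_id -> bytes; stands in for blob storage

    def upload(self, data: bytes) -> str:
        # Identical bytes always produce the same digest, so duplicates map to one entry.
        file_id = hashlib.sha256(data).hexdigest()
        if file_id not in self.blobs:
            self.blobs[file_id] = data
        return file_id

    def download(self, file_id: str) -> bytes:
        return self.blobs[file_id]
```

Uploading the same bytes twice returns the same ID and stores only one copy, which is the behavior the deduplication step describes.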

Group Messages

Groups are trickier. Not everyone in a group is online at the same time. Here’s how WhatsApp handles it:

When User A sends a group message, it goes first to the Message Service.

The message is then pushed into Kafka, which acts like a message bus. In Kafka terms:

The group = a topic.

Senders = producers.

Group members = consumers.

The Group Message Handler queries the Group Service, gets the list of members, and then delivers the message to each user, just like a WebSocket server would.

The Group Service (on top of a MySQL cluster with Redis caching) maintains full group details: IDs, members, icons, status, etc.
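The group flow can be sketched without a real Kafka cluster. In this sketch a queue per group stands in for a Kafka topic, a dictionary stands in for the Group Service, and the function names and the sample group are hypothetical.

```python
from collections import deque

# Group Service stand-in: group ID -> member list (sample data for illustration).
group_members = {"family": ["A", "B", "C"]}

# One queue per group, standing in for a Kafka topic.
topics = {"family": deque()}


def publish(group_id: str, sender: str, text: str) -> None:
    """Producer side: the sender's message is appended to the group's topic."""
    topics[group_id].append((sender, text))


def fan_out(group_id: str) -> list:
    """Group Message Handler: read the topic and deliver to every member but the sender."""
    deliveries = []
    while topics[group_id]:
        sender, text = topics[group_id].popleft()
        for member in group_members[group_id]:
            if member != sender:
                deliveries.append((member, sender, text))
    return deliveries
```

Each entry in the returned list would then go through the same online or offline delivery path as a one-to-one message.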

Validating Non-Functional Requirements

The system is designed to meet key non-functional requirements: low latency, consistency, availability, security, and scalability.

Low Latency
- Use geographically distributed WebSocket servers with caching.
- Add Redis cache clusters on top of MySQL clusters.
- Use CDNs for fast delivery of media and documents.
Consistency
- Ensure message order with a FIFO messaging queue.
- Use a Sequencer to assign IDs and maintain causality.
- Store offline messages in the Mnesia database queue and deliver in order when users reconnect.
Availability
- Deploy enough WebSocket servers with data replication.
- Re-create sessions via load balancer if a WebSocket server fails.
- Use Mnesia cluster with primary-secondary replication for durability and availability.
Security
- Apply end-to-end encryption so only sender and receiver can read messages.
Scalability
- One server can handle ~10 million connections.
- Add or remove servers dynamically as load changes.
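The Sequencer mentioned under Consistency can be sketched as a per-conversation counter. This is an illustrative assumption about how such a component might work, not a description of WhatsApp's implementation; the class, the helper function, and the tuple layout are all hypothetical.

```python
from collections import defaultdict
from itertools import count


class Sequencer:
    """Hands out increasing IDs, with an independent counter per conversation."""

    def __init__(self) -> None:
        # Each conversation lazily gets its own counter starting at 1.
        self._counters = defaultdict(lambda: count(1))

    def next_id(self, conversation_id: str) -> int:
        return next(self._counters[conversation_id])


def in_order(messages):
    """Receiver side: sort (sequence_id, text) pairs back into send order."""
    return sorted(messages, key=lambda m: m[0])
```

With an ID attached to every message, a receiver can restore the original order even if messages arrive out of order over the network.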

Trade-offs in WhatsApp’s Design

Even though the system meets functional and non-functional requirements, there are two key trade-offs:

Consistency vs. Availability
- CAP Theorem: During network failures, a system can guarantee either consistency or availability, but not both.
- WhatsApp’s Choice:
  - Message order is very important.
  - Prioritize consistency (messages must stay in order).
  - Accept reduced availability in rare failure cases.
Latency vs. Security
- Low Latency: Users expect real-time message delivery.
- Security Requirement: End-to-end encryption ensures messages are private.
- Trade-off:
  - Encryption/decryption of text, images, videos, and audio adds processing time.
  - This may increase latency, especially for large multimedia files.
- WhatsApp’s Choice:
  - Prioritize security over ultra-low latency.
  - Accept slight delays for safe message transmission.

Summary

We designed a WhatsApp messenger system.
Steps we covered:
- Identified functional and non-functional requirements.
- Estimated key resources (storage, bandwidth, servers).
- Designed both high-level and detailed architecture.
- Explained components and their roles in the system.
- Evaluated how the system meets non-functional requirements.
- Discussed important trade-offs (consistency vs. availability, latency vs. security).
Key takeaway:
- General-purpose servers can be optimized for large-scale systems.
