Designing Instagram
A photo-sharing platform for billions: media uploads, a personalized feed, and search. The defining decision is how you fan out a post to followers' timelines.
Updated 2026-06-09
Instagram lets users upload photos and videos, caption them, follow others, and engage through likes and comments — all served as a fast, personalized feed. We'll focus on the core loop: posting media, generating feeds, and search. (Direct messaging, Reels, and notifications are out of scope.)
1. Requirements
Functional: upload photos/videos (incl. multi-image carousels), captions, follow/unfollow, like/comment/share, a personalized feed of followed accounts, search by username and hashtag.
Non-functional: low feed latency (~100ms), 99.99% availability, eventual consistency (a small delay before a new post appears is fine), scale to millions of concurrent users and billions of posts, and durability — uploaded media must never be lost.
2. Capacity estimate
- 2B monthly users, ~500M daily.
- ~100M uploads/day → ~200M writes/day (media + metadata).
- Feed reads: 500M users × ~100 posts ≈ 50B feed reads/day; ~80% from cache → ~10B DB reads/day. The system is overwhelmingly read-heavy.
- Media storage: 80% photos (~1MB) + 20% videos (~10MB) ≈ 280 TB/day.
- Metadata: ~500 bytes/post × ~100M posts/day ≈ 50 GB/day ≈ 18 TB/year (roughly 90 TB over five years).
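The arithmetic behind these estimates can be reproduced in a few lines. This is a back-of-envelope sketch: the constants are the figures stated above, the function and variable names are illustrative, and decimal units (1 TB = 10^12 bytes) are assumed.

```python
# Back-of-envelope capacity figures. Inputs are this section's stated assumptions.
DAILY_USERS = 500_000_000
UPLOADS_PER_DAY = 100_000_000
POSTS_VIEWED_PER_USER = 100
CACHE_HIT_PERCENT = 80
PHOTO_PERCENT, PHOTO_MB = 80, 1
VIDEO_PERCENT, VIDEO_MB = 20, 10
METADATA_BYTES_PER_POST = 500


def estimate() -> dict:
    """Return daily read volume and storage growth under the assumptions above."""
    feed_reads = DAILY_USERS * POSTS_VIEWED_PER_USER
    # Only cache misses reach the database.
    db_reads = feed_reads * (100 - CACHE_HIT_PERCENT) // 100

    photos = UPLOADS_PER_DAY * PHOTO_PERCENT // 100
    videos = UPLOADS_PER_DAY * VIDEO_PERCENT // 100
    media_mb_per_day = photos * PHOTO_MB + videos * VIDEO_MB

    metadata_bytes_per_year = UPLOADS_PER_DAY * METADATA_BYTES_PER_POST * 365

    return {
        "feed_reads_per_day": feed_reads,
        "db_reads_per_day": db_reads,
        "media_tb_per_day": media_mb_per_day / 1_000_000,  # 1 TB = 10^6 MB
        "metadata_tb_per_year": metadata_bytes_per_year / 1e12,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,}")
```

Under these inputs, post metadata grows by about 18 TB per year, which is negligible next to the 280 TB of media added every day.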
3. High-level design
Clients ─▶ LB / API Gateway ─▶ ┌─ User Service ─────────── profiles, follows
                               ├─ Post Service ─────────── uploads, metadata
                               ├─ Feed Service ─────────── precomputed timelines
                               ├─ Engagement Service ───── likes/comments/shares
                               └─ Search Service ───────── Elasticsearch
                                            │
                       Kafka (decouples services, fans out events)
                                            │
                  Object Storage (S3) + CDN · Redis (hot feeds/counts)
- API Gateway / LB — single entry point; auth, rate limiting, routing.
- Post Service — coordinates media upload to object storage and writes metadata; emits a "new post" event to Kafka.
- Feed Service — precomputes and caches user feeds in Redis for fast reads.
- Engagement Service — writes likes/comments/shares asynchronously via Kafka.
- Search Service — indexes users/posts/hashtags in Elasticsearch.
- Object Storage + CDN — media lives in S3; a CDN serves it close to users.
4. Database design
A platform like this mixes storage engines by workload:
- Relational (PostgreSQL/MySQL) for structured data: Users, Posts, Media (metadata only, not the files), Comments, Shares, Followers.
- NoSQL (Cassandra/DynamoDB) for a denormalized feed table — precomputed timelines, updated asynchronously via Kafka and cached in Redis.
- Graph DB (Neo4j/Neptune) for relationship queries — mutual friends, "people you may know," influencer ranking — without painful SQL joins.
- Elasticsearch for full-text and prefix search over profiles, posts, hashtags.
- Object storage (S3) for the media files themselves, replicated across data centres and fronted by a CDN.
5. Photo/video upload — pre-signed URLs
Don't stream media through your backend. Instead:
1. Client requests upload ─▶ API Gateway ─▶ Post Service
2. Post Service returns pre-signed URLs (one per file)
3. Client uploads files DIRECTLY to S3, in parallel
4. Client confirms with media URLs ─▶ Post Service saves metadata
5. Post Service emits "new post" event ─▶ Kafka ─▶ Feed Service
Direct-to-storage uploads cut backend load and enable fast parallel transfers.
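To make the idea concrete, here is a toy sketch of what a pre-signed URL is: a URL carrying an expiry time and a signature that the storage layer can check without calling back to the Post Service. It uses a plain HMAC over the method, object key, and expiry. This is not S3's actual signing scheme; the secret, host name, and function names are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"                 # hypothetical secret known to signer and storage
STORAGE_HOST = "https://media.example.com"  # hypothetical storage endpoint


def _sign(method: str, key: str, expires: int) -> str:
    """HMAC-SHA256 over the parts of the request the URL is allowed to authorize."""
    message = f"{method}\n{key}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign_upload(key: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Post Service side: build a URL that authorizes a PUT to `key` until it expires."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + ttl_seconds)
    query = urlencode({"expires": expires, "signature": _sign("PUT", key, expires)})
    return f"{STORAGE_HOST}/{key}?{query}"


def verify_upload(method: str, url: str, now: Optional[float] = None) -> bool:
    """Storage side: accept only an unexpired URL whose signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _sign(method, parsed.path.lstrip("/"), expires)
    return hmac.compare_digest(signature, expected)


if __name__ == "__main__":
    # One URL per file in a carousel, as in step 2 of the flow.
    urls = [presign_upload(f"posts/42/img{i}.jpg") for i in range(3)]
    print(all(verify_upload("PUT", u) for u in urls))
```

Because the signature covers the method and object key, a client cannot reuse the URL to read a different object or overwrite another user's upload. In a real deployment the storage provider's SDK would generate these URLs.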
6. Newsfeed generation — the core decision
Users follow both ordinary accounts and celebrities, so the feed mixes two strategies.
Fan-out on write (push) — for normal users
When a user posts, push the post into each follower's cached timeline right away.
User A posts ─▶ Kafka ─▶ Feed Service ─▶ LPUSH into each of A's 500 followers' Redis feeds
Follower opens feed ─▶ LRANGE ─▶ instant (pre-computed)
- Pros: reads are super fast — feeds are already assembled.
- Cons: a celebrity with millions of followers causes massive write amplification — one post copied into millions of timelines.
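The push path can be sketched with an in-memory stand-in for Redis lists. The `FeedStore` class, the `followers` mapping, and the timeline cap are assumptions made for illustration. A real deployment would issue `LPUSH` (usually followed by a trim) against Redis instead.

```python
from collections import defaultdict, deque

FEED_LIMIT = 500  # hypothetical cap on how many post IDs each cached timeline keeps


class FeedStore:
    """In-memory stand-in for per-user Redis lists holding post IDs, newest first."""

    def __init__(self, limit: int = FEED_LIMIT):
        # A bounded deque drops the oldest entry once the cap is reached.
        self._feeds = defaultdict(lambda: deque(maxlen=limit))

    def lpush(self, user_id: str, post_id: str) -> None:
        self._feeds[user_id].appendleft(post_id)

    def lrange(self, user_id: str, start: int, stop: int) -> list:
        """Return items start..stop inclusive (non-negative indexes only)."""
        return list(self._feeds[user_id])[start:stop + 1]


def fan_out_on_write(store: FeedStore, followers: dict, author_id: str, post_id: str) -> int:
    """Push one post into every follower's timeline; return the number of writes."""
    targets = followers.get(author_id, ())
    for follower_id in targets:
        store.lpush(follower_id, post_id)
    return len(targets)


if __name__ == "__main__":
    followers = {"alice": ["bob", "carol"]}
    store = FeedStore()
    writes = fan_out_on_write(store, followers, "alice", "post-1")
    print(writes, store.lrange("bob", 0, 9))
```

The return value makes the cost visible: one post costs one write per follower, which is exactly why this path breaks down for accounts with millions of followers.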
Fan-out on read (pull) — for celebrities
For huge accounts, don't pre-push. Instead, fetch celebrity posts at read time and merge them with the user's precomputed feed.
User requests feed ─▶ Feed Service:
• normal posts ◀── Redis (precomputed)
• celebrity posts ◀── hot cache / DB, fetched now
• merge in real time ─▶ serve
- Pros: avoids the write storm — scalable.
- Cons: slightly higher read latency; relies on caching.
The hybrid model is the answer: push for normal accounts, pull for celebrities, merged at read time. This is the central insight of feed design.
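A minimal sketch of the read-time merge, assuming every source is a newest-first list of `(timestamp, post_id)` pairs. The follower threshold that separates push from pull is a made-up number, since the right cutoff depends on the workload.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 100_000  # hypothetical follower count above which posts are pulled


def fan_out_strategy(follower_count: int, threshold: int = CELEBRITY_THRESHOLD) -> str:
    """Decide at write time whether a post is pushed to timelines or pulled at read time."""
    return "pull" if follower_count >= threshold else "push"


def build_feed(precomputed: list, celebrity_timelines: list, limit: int = 20) -> list:
    """Merge newest-first (timestamp, post_id) lists into one newest-first page.

    `precomputed` is the user's pushed timeline. `celebrity_timelines` holds one
    recent-posts list per followed celebrity, fetched at read time.
    """
    sources = [precomputed, *celebrity_timelines]
    # heapq.merge streams the inputs lazily, so only `limit` items are consumed.
    merged = heapq.merge(*sources, key=lambda post: post[0], reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


if __name__ == "__main__":
    pushed = [(105, "friend_b"), (101, "friend_a")]
    pulled = [[(110, "celeb_x2"), (100, "celeb_x1")], [(103, "celeb_y1")]]
    print(build_feed(pushed, pulled, limit=3))
```

The merge cost grows with the number of celebrities a user follows rather than with any celebrity's follower count, which is what keeps the pull side affordable.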
7. Search
- Indexing: when a post/user is created, the service writes metadata to its DB and emits a Kafka event; the Search Service consumes it and updates Elasticsearch.
- Querying: check Redis for recent/cached queries first; on a miss, hit Elasticsearch (full-text, prefix matching, ranking by engagement), then cache the result back in Redis.
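The query path is a cache-aside lookup. The sketch below uses a small in-memory TTL cache in place of Redis and takes the search backend as a plain callable in place of an Elasticsearch client. The class, the key normalization, and the 60-second TTL are illustrative assumptions.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Tiny in-memory stand-in for Redis keys with an expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key: str) -> Optional[list]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: list) -> None:
        self._entries[key] = (value, self._clock() + self._ttl)


def search(query: str, cache: TTLCache, backend: Callable[[str], list]) -> list:
    """Cache-aside lookup: try the cache, fall back to the backend, then populate the cache."""
    key = query.strip().lower()  # normalize so equivalent queries share one cache entry
    cached = cache.get(key)
    if cached is not None:
        return cached
    results = backend(key)
    cache.set(key, results)
    return results


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)
    fake_index = lambda q: [f"post tagged {q}"]  # placeholder for a real search call
    print(search("#Sunset", cache, fake_index))
    print(search("#sunset", cache, fake_index))  # served from the cache
```

A short TTL bounds how stale cached results can get, which fits the eventual-consistency requirement stated earlier.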
8. Engagement (likes / comments / shares)
The Engagement Service emits a Kafka event and updates the DB asynchronously, so the user's action returns instantly. For hot posts, cache like/share counts and top comments to avoid hammering the DB.
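A rough sketch of that split between the request path and the persistence path, using `queue.Queue` as a stand-in for a Kafka topic and `Counter` objects as stand-ins for the Redis counters and the database. All class and attribute names are hypothetical, and a real consumer would run in a separate process.

```python
import queue
from collections import Counter


class EngagementService:
    """Request path enqueues and returns; a consumer applies events to the store later."""

    def __init__(self):
        self.events = queue.Queue()    # stand-in for a Kafka topic
        self.cached_likes = Counter()  # stand-in for Redis like counters
        self.db_likes = Counter()      # stand-in for the durable database

    def like(self, user_id: str, post_id: str) -> int:
        """Record the like as an event, bump the cached count, and return immediately."""
        self.events.put(("like", user_id, post_id))
        self.cached_likes[post_id] += 1
        return self.cached_likes[post_id]

    def drain(self) -> int:
        """Consumer side: apply all queued events to the durable store."""
        applied = 0
        while True:
            try:
                kind, _user_id, post_id = self.events.get_nowait()
            except queue.Empty:
                return applied
            if kind == "like":
                self.db_likes[post_id] += 1
            applied += 1


if __name__ == "__main__":
    service = EngagementService()
    service.like("u1", "p1")
    service.like("u2", "p1")
    print(service.cached_likes["p1"], service.db_likes["p1"])  # cache is ahead of the DB
    service.drain()
    print(service.db_likes["p1"])
```

Between `like` and `drain` the cached count runs ahead of the database. That gap is the eventual consistency the requirements accept in exchange for an instant response.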
9. Scalability, availability, durability
- Scalability — distribute data across nodes (Cassandra/DynamoDB); shard by user_id, post_id, follower_id; keep services independent and lean on message queues (Kafka) for async throughput.
- Availability — replicate DBs across regions, deploy across availability zones, use automatic failover (leader-follower) and circuit breakers for graceful degradation.
- Durability — store media in S3 (multi-location replication), use multi-region DB replication and backups, and write-ahead logging / event sourcing so state can be rebuilt.
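Sharding by an ID usually means hashing the key and mapping the hash to a shard. The sketch below shows the simplest variant, hash-then-modulo. The function name and the 16-shard count are illustrative, and stores such as Cassandra and DynamoDB handle partitioning internally rather than exposing a function like this.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key such as a user_id to a shard index using a stable hash.

    A cryptographic hash is used instead of Python's built-in hash(), which is
    randomized per process for strings and so would not route consistently.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for user_id in ("user:1", "user:2", "user:3"):
        print(user_id, "->", shard_for(user_id, 16))
```

Plain modulo sharding remaps most keys whenever the shard count changes. Consistent hashing is the usual alternative when shards are added or removed often.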
Takeaways
| Topic | Key idea |
|---|---|
| Upload | Pre-signed URLs — client uploads straight to S3 |
| Feed | Hybrid fan-out: push for normal users, pull for celebrities |
| Storage | Polyglot — relational + NoSQL feed + graph + Elasticsearch + S3 |
| Engagement | Decouple via Kafka; cache counts for hot posts |
| Reads | ~80% served from cache — optimise the read path |
The system is read-heavy, so the design pours effort into precomputing and caching feeds while keeping writes asynchronous. The make-or-break choice is the fan-out pattern.
