Design Problems
Designing a URL Shortener
Turn a long URL into a short, shareable code — and serve billions of redirects per day. A read-heavy system that teaches ID generation, caching, and graceful scaling.
Updated 2026-06-09
Designing a URL Shortener
A URL shortener takes a long link and hands back a compact alias (short.ly/a1b2c3).
When someone opens the short link, the service redirects them to the original
destination. Shorter links are easier to share, look cleaner, and let you attach
analytics. Think tinyurl.com or bit.ly.
It's a favourite interview problem because the core loop is tiny but the system is read-heavy at scale — every interesting decision is about making the redirect path fast while keeping writes simple and reliable.
1. Clarifying the requirements
Pin the scope before designing. A typical exchange lands on:
- ~10M new links/day, 100:1 read-to-write ratio (so ~1B redirects/day).
- Codes use Base62 characters (
a–z,A–Z,0–9). - Custom aliases are a nice-to-have; links can expire (default + override); basic click counts are wanted, full analytics aren't.
Functional: shorten a URL, redirect on access, optional custom aliases, optional expiry, basic click counts.
Non-functional: high availability (99.99%), redirect latency under ~50ms p99, each long URL maps to a stable code, durable mappings, scales to the 100:1 read skew.
2. Back-of-the-envelope
| Quantity | Estimate |
|---|---|
| Write QPS | 10M / 86,400 ≈ 115/s avg, ~350/s peak |
| Read QPS | 1B / 86,400 ≈ 11,500/s avg, ~35,000/s peak |
| Storage/record | ~300 bytes (code + long URL + user + metadata) |
| Storage/year | 3.65B links × 300B ≈ 1.1 TB/yr (~5.5 TB over 5 years) |
Code length with Base62: 62^6 ≈ 56.8B codes, 62^7 ≈ 3.5T. At 3.65B links/year,
7 characters buys decades of headroom.
The headline number is the 100:1 skew — optimise the read path aggressively, keep the write path simple.
3. Core APIs
POST /shorten { long_url, custom_alias?, expiry_date?, user_id? }
→ { short_url, long_url, expiry_date, created_at }
409 if alias taken · 400 if URL invalid
GET /{short_code} → 301/302 redirect to long_url
410 if expired · 404 if unknown
GET /analytics/{short_code} → { click_count, ... }
4. High-level design
Because reads dominate writes 100:1, split the write service from the read service so each scales independently.
Writing a short URL:
Client ──POST /shorten──▶ Load Balancer ──▶ Write Service
│ validate URL
│ generate unique code
▼
Database (short_code → long_url)
│
◀── return short URL ──
Redirecting (the hot path):
GET /abc123 ─▶ LB ─▶ Redirection Service ─▶ Redis ──hit──▶ check expiry ─▶ 302
│ miss
▼
Database ─▶ fill cache ─▶ 302
The redirection service is stateless and horizontally scalable; the cache absorbs the bulk of the traffic so the database only sees misses.
5. SQL vs NoSQL
We store billions of records, almost all access is a simple key lookup by short_code, reads vastly outnumber writes, and there are no joins. That profile points at a NoSQL key-value/wide-column store (DynamoDB, Cassandra) for horizontal scale and availability.
URL mappings: short_code (PK) · long_url · user_id · created_at ·
expires_at · click_count · is_custom. Secondary index on user_id to list a
user's links.
6. Unique code generation
The heart of the design. A good scheme is unique, compact, fast, and works across distributed servers. Three approaches, simple → scalable:
Approach 1 — Hash + encode (deterministic)
Canonicalize the URL (lowercase host, strip default ports, normalize trailing
slash), hash it (MD5/SHA-256), take the first ~6 bytes, and Base62-encode them.
Base62 (no +//) is URL-safe, unlike Base64.
- Deterministic: the same URL always yields the same code, so duplicates are free and any server can compute a code with no coordination.
- Collisions are inevitable once you truncate. On a unique-constraint violation, re-hash with a salt or append a suffix and retry — but that re-introduces a DB lookup on the write path, negating the stateless benefit.
Approach 2 — Global counter
Hand out a monotonically increasing integer from a central store (Redis INCR is
atomic), then Base62-encode it. INCR of 1,000,000,000 → 15ftgG (still under 7
chars).
- Pros: guaranteed unique, no collision logic, sub-ms.
- Cons: the counter is a bottleneck and single point of failure, and sequential IDs are guessable.
- Fixes: sharded counters — give each shard a range, e.g.
global_id = (shard_id << 20) | local_counter, so 1,024 shards each own ~1M IDs; and obfuscation — pass IDs through a reversible Feistel/XOR transform so codes look random.
Approach 3 — Distributed ID (Snowflake)
A 64-bit ID assembled per worker, no central counter:
| 41 bits timestamp | 10 bits worker ID | 12 bits sequence |
~69 yrs from a up to 1,024 4,096 IDs per
custom epoch workers ms per worker
Workers get their 10-bit ID from ZooKeeper/etcd at startup. IDs are generated in-memory (no network hop) and are roughly time-sortable (k-sortable), which makes them efficient primary keys and great for time-range queries. The catch: clock skew can break uniqueness, so NTP sync is mandatory.
| Strategy | Pros | Cons | Best for |
|---|---|---|---|
| Hashing | Deterministic, stateless | Collisions need resolution | Deduping identical URLs |
| Global counter | Unique, dead simple | Central bottleneck, guessable | Small/medium traffic |
| Distributed ID | Scalable, k-sortable | Complex, clock-sensitive | Large distributed systems |
7. Fast redirects
301 vs 302
- 301 Moved Permanently — the browser caches the destination and never comes back. Fastest for repeat visits, but you lose analytics and the ability to change or expire the link.
- 302 Found — the browser hits you every time. Slightly more load, but you keep click tracking and control.
Default to 302: the analytics and updatability are worth the extra reads for any real shortener.
The read waterfall
Try the fastest, closest layer first; fall through only on a miss:
Browser cache → CDN / edge → Redis (app cache) → Database (source of truth)
(301 only) ~10–50ms sub-ms hit key lookup, then fill cache
On a cache miss the app reads the DB, writes the result to Redis with a TTL, and the CDN caches the 302 for the next user (cache-aside).
8. Custom aliases
Accept an optional custom_alias; fall back to auto-generation if absent.
- Validate against
^[a-zA-Z0-9_-]+$, enforce length limits, and reject a blocklist of reserved routes (/api,/admin,/login) — all before any DB call. - Race condition: two users can both check "is
summer-salefree?", both see no, and both insert. Don't solve this in app logic — put aUNIQUEconstraint on the alias column and let the database be the arbiter. The second insert fails cleanly.
9. Link expiration
| Approach | How | Trade-off |
|---|---|---|
| Passive | Check expires_at at redirect time | Real-time accuracy, but rows linger forever |
| Active | Background job deletes expired rows | Clean DB, but scans are heavy and not real-time |
| Hybrid (recommended) | Passive check on reads + low-frequency cleanup job | Accurate and bounded growth |
Cache consistency: if a link expires in 5 min but its cache TTL is 24h, you'll
serve dead redirects. Always set cache_ttl = min(remaining_lifetime, default_ttl).
10. Analytics — click counts
A click = a successful redirect (filter bots / duplicate hits). Never block the redirect on the write — decouple it.
- Direct increment —
UPDATE … SET click_count = click_count + 1. Simple, always fresh, but a DB write per redirect causes row contention. Low scale only. - Buffered counting (recommended) — increment a Redis counter per click, flush aggregates to the DB periodically. Far fewer writes; UI lags slightly.
- Event streaming — emit a click event into Kafka → Flink → ClickHouse for real-time, sliceable analytics. Most powerful, most operational overhead.
11. High availability
- Multi-region: read replicas per region, latency-based DNS routing; e.g.
DynamoDB Global Tables or Cassandra
LOCAL_QUORUM. - Graceful degradation: if the DB is unreachable, keep serving from cache and queue writes for retry — return slightly-stale data rather than errors.
Takeaways
| Topic | Key idea |
|---|---|
| ID generation | Counter (simple) vs Snowflake (scalable) |
| Caching | Essential for a read-heavy load; cache-aside waterfall |
| Database | NoSQL for simple key-value access at scale |
| Expiration | Hybrid: passive check + lazy cleanup, with aligned cache TTLs |
| Analytics | Decouple from the redirect path via buffered counting |
The system is read-heavy (100:1) — optimise the read path hard, keep writes simple and durable. Many of the primitives here (consistent hashing for partitioning, caching strategies for the read path) show up in every large system.
