- DevOps Weekly
- Posts
- System Design 101: Understanding Database Sharding
System Design 101: Understanding Database Sharding
In this article, we'll explore Database Sharding—what it is, why it's crucial, and how companies use it to handle massive amounts of data. We'll use real-world examples from Spotify and Zoom to show how sharding solves scalability challenges.

Hello “👋”
Welcome to another week, another opportunity to become a great DevOps and Software Engineer
Today’s issue is brought to you by DevOpsWeekly→ A great resource for devops and backend engineers. We offer next-level devops and backend engineering resources.
PS: Before we dive into the topic of today, I have some very exciting news to share with you:
I am launching a platform called Mentoraura in March-designed to help you break into tech, grow your career, and become a world-class engineer.
Mentoraura is for:
✅ Beginners who want a structured, no-BS path to mastering DevOps and software engineering
✅ Career changers looking for practical, hands-on mentorship to transition into tech ✅ Engineers who want to level up their skills and build real-world expertise
With Mentoraura, I’ll guide you through solving real business challenges, mastering in-demand technologies, and becoming a highly valuable engineer in the industry.
🔥 Join the waitlist now and be the first to access the platform when it launches! 👉 mentoraura.com
In our last episode, we discussed Caching-how storing frequently accessed data in memory helps systems like YouTube and Twitter deliver lightning-fast performance. We saw how caching reduces database load and improves user experience.
Now, let's dive into Sharding, the technique that lets databases scale horizontally by splitting data across multiple machines.
What is Database Sharding?
Imagine Spotify's music library stored in a single database table with billions of songs. Every time you searched for a track, the system would need to scan this enormous table, making searches painfully slow.
Sharding solves this by breaking the database into smaller, more manageable pieces called shards. Each shard contains a subset of the data (like organizing a library by sections). For example:
Shard 1: Songs A-M
Shard 2: Songs N-Z
When you search for "Ed Sheeran," Spotify only needs to query the shard containing "S" artists, making your search instant.
Why is Sharding Important?
Let's look at Zoom, which hosts millions of concurrent meetings worldwide. Without sharding:
All meeting data (participants, chat logs, recordings) would be in one database
During peak hours, the database would become a bottleneck
Users might experience lag or dropped calls
With sharding:
Meeting data is split by region (e.g., North America, Europe, Asia)
When you join a call, Zoom queries only the relevant regional shard
The system scales seamlessly as more users join
How Companies Implement Sharding
Horizontal Partitioning: Splitting data by rows (e.g., user IDs 1-10M in Shard A, 10M-20M in Shard B)
Vertical Partitioning: Splitting by columns (e.g., user profiles in Shard A, payment data in Shard B)
Directory-Based: Using a lookup service to track which shard contains which data
Hash-Based: Applying a hash function to distribute data evenly (e.g., hashing user emails to assign to shards)
Real-World Example: Spotify
Challenge: Store 100M+ songs while keeping search fast
Solution:
Shards organized by artist name ranges
"Ed Sheeran" queries only hit the "S" shard
Results return in milliseconds instead of seconds
Impact: Instant search results, even with massive catalog growth
Real-World Example: Zoom
Challenge: Handle metadata for millions of concurrent meetings
Solution:
Shards organized by geographic region
Meeting in Japan? Only the Asia shard is queried
No single point of failure
Impact: Smooth performance even during global usage spikes
Tradeoffs to Consider
✔ Pros:
Enables near-infinite scalability
Improves query performance
Isolates failures (one shard going down doesn't crash everything)
✖ Cons:
Complex to implement and maintain
Joins across shards are challenging
Requires careful planning for even data distribution
Sharding is how Spotify serves 100M+ songs and Zoom hosts endless meetings without breaking a sweat. It transforms impossible scalability challenges into manageable solutions.
Next week, we'll explore Content Delivery Networks (CDNs)—how Netflix streams 4K video globally without buffering. Stay tuned!
P.S. If you found this helpful, share it with a friend or colleague who’s on their DevOps journey. Let’s grow together!
Got questions or thoughts? Reply to this newsletter-we’d love to hear from you!
See you on Next Week.
Remember to get Salezoft→ A great comprehensive cloud-based platform designed for business management, offering solutions for retail, online stores, barbershops, salons, professional services, and healthcare. It includes tools for point-of-sale (POS), inventory management, order management, employee management, invoicing, and receipt generation.
Weekly Backend and DevOps Engineering Resources
DevOps and Backend Engineering Basics by Akum Blaise Acha
DevOps Weekly, Explained by Akum Blaise Acha
Simplifying Operating System for Backend DevOps Engineers by Akum Blaise Acha
Why Engineers Should Embrace the Art of Writing by Akum Blaise Acha
From Good to Great: Backend Engineering by Akum Blaise Acha
Web Servers for Backend and DevOps Engineering by Akum Blaise Acha
Reply