Choosing the Right Database

The right tool for the right job

The Database Selection Challenge

Choosing the right database is one of the most critical decisions in system design. The wrong choice can lead to performance problems, scaling issues, and development headaches.

Decision Framework

Step 1: Analyze Your Data Structure

Questions to Ask:

Is your data structured (tables) or unstructured (documents)?
Do you have fixed relationships or flexible structures?
Do you need to store nested/hierarchical data?

Step 2: Analyze Query Patterns

Questions to Ask:

Do you need complex JOINs or simple lookups?
Are queries mostly by key or by complex conditions?
Do you need to traverse relationships?
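To make the difference between a simple lookup and a complex JOIN concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables (users, orders) and their contents are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER REFERENCES users(id),
                         total REAL);
    INSERT INTO users  VALUES (1, 'Ada'), (2, 'Lin');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Simple lookup: fetch one row by its primary key.
user = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()

# Complex query: JOIN two tables and aggregate per user.
totals = conn.execute("""
    SELECT u.name, SUM(o.total)
    FROM users u
    JOIN orders o ON o.user_id = u.id
    GROUP BY u.id, u.name
    ORDER BY u.id
""").fetchall()
```

If most of your queries look like the first one, a key-value or document store may be enough. If they look like the second, a relational database is usually the more natural fit.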

Step 3: Analyze Scale Requirements

Questions to Ask:

How many records do you expect?
What’s your expected QPS (queries per second)?
Do you need horizontal scaling?
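A rough back-of-envelope estimate is often enough to answer these questions. The sketch below shows the arithmetic; every input number (users, requests per user, record size, peak factor) is a made-up assumption, not a benchmark.

```python
SECONDS_PER_DAY = 86_400

def average_qps(daily_active_users: int, requests_per_user_per_day: int) -> float:
    """Average queries per second over a full day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY

def storage_gb(records: int, bytes_per_record: int) -> float:
    """Raw data size in gigabytes (ignores indexes and replication)."""
    return records * bytes_per_record / 1e9

# Hypothetical inputs: 1M daily users making 50 requests each.
avg = average_qps(1_000_000, 50)

# Hypothetical assumption: peak traffic is about 3x the daily average.
peak = avg * 3

# Hypothetical inputs: 500M records of about 1 KB each.
size = storage_gb(500_000_000, 1_000)
```

With these inputs the average works out to roughly 579 QPS and the raw data to 500 GB, which a single well-provisioned node can often handle. Numbers one or two orders of magnitude larger are where horizontal scaling starts to matter.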

Step 4: Analyze Consistency Requirements

Questions to Ask:

Do you need ACID transactions?
Can you tolerate eventual consistency?
Is data accuracy critical or is speed more important?

Decision Matrix

Use Case	Data Structure	Query Pattern	Scale	Consistency	Recommended
E-commerce	Structured	Complex JOINs	Medium	Strong	SQL (PostgreSQL)
Social Media	Semi-structured	Simple lookups	Large	Eventual	Document DB (MongoDB)
Caching	Simple	Key lookup	Large	Eventual	Key-Value (Redis)
Social Network	Relationships	Graph queries	Large	Eventual	Graph DB (Neo4j)
Time-Series	Structured	Column queries	Large	Eventual	Column-Family (Cassandra)
Financial	Structured	Complex queries	Medium	Strong	SQL (PostgreSQL)
Content Management	Semi-structured	Document queries	Medium	Eventual	Document DB (MongoDB)
Session Storage	Simple	Key lookup	Large	Eventual	Key-Value (Redis)
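The matrix can be read as a small set of rules. The sketch below encodes it as a function; the Requirements type, its field values, and the rule order are assumptions made for illustration, and the result is a first-cut suggestion rather than a final answer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Requirements:
    data_shape: str           # "structured", "semi-structured", "simple", "relationships"
    query_pattern: str        # "joins", "lookups", "key", "graph", "column", "document"
    strong_consistency: bool  # True if ACID / strong consistency is required

def recommend(req: Requirements) -> str:
    """Map requirements to a database category, following the matrix rows."""
    if req.strong_consistency:
        return "SQL"
    if req.data_shape == "relationships" or req.query_pattern == "graph":
        return "Graph DB"
    if req.data_shape == "simple" and req.query_pattern == "key":
        return "Key-Value"
    if req.query_pattern == "column":
        return "Column-Family"
    if req.data_shape == "semi-structured":
        return "Document DB"
    return "SQL"  # assumption: fall back to SQL when no rule matches

ecommerce = recommend(Requirements("structured", "joins", True))
caching = recommend(Requirements("simple", "key", False))
time_series = recommend(Requirements("structured", "column", False))
```

Real systems rarely fit one row cleanly, so treat the output as a starting point for discussion.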

Real-World Examples

Understanding how major companies choose databases helps illustrate the decision process:

E-commerce Platform: Polyglot Persistence

The Challenge: E-commerce platforms have diverse data needs: structured orders, flexible products, fast sessions, complex search.

The Solution: Use several databases, each matched to one workload. An illustrative split for an Amazon-scale store:

PostgreSQL: User accounts, orders, payments (ACID transactions, complex queries)
DynamoDB: Product catalog, inventory (high scale, simple queries)
Elasticsearch: Product search (full-text search, faceted search)
Redis: Shopping cart, sessions (fast lookups, temporary data)

Why Multiple Databases?

Different data has different requirements
Right tool for each job
Optimized performance

Example: User shopping flow:

Login → PostgreSQL (user account)
Browse products → DynamoDB (product catalog)
Search → Elasticsearch (product search)
Add to cart → Redis (shopping cart)
Checkout → PostgreSQL (order, payment)

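Impact: Each workload runs on a store tuned for it, so lookups stay fast as traffic grows to millions of users. The trade-off is that the team must operate several systems and keep data in sync between them.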

Social Media: Document + Graph Database

The Challenge: Social media platforms need flexible content (posts vary) and relationship queries (friends, followers).

The Solution: A hybrid approach. An illustrative stack for a Facebook-scale platform:

MySQL: User accounts, core data (ACID transactions)
MongoDB: Posts, comments, content (flexible schema)
Neo4j: Friend relationships, recommendations (graph traversals)
Cassandra: Messages, notifications (time-series, high scale)

Why Hybrid?

Posts have flexible structure → Document DB
Friend relationships → Graph DB
Messages over time → Column-Family DB

Impact: Relationship queries stay fast, content storage stays flexible, and each store can scale independently as the user base grows.

Financial Platform: SQL Database

The Challenge: Financial platforms need strong consistency, complex queries, and ACID transactions.

The Solution: Banks use SQL databases:

PostgreSQL/Oracle: All financial data (accounts, transactions, balances)
Why SQL? ACID transactions, complex queries, strong consistency

Example: Money transfer:

Begin transaction
Debit account A
Credit account B
Commit transaction
All or nothing (atomicity)
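The steps above can be sketched with Python's built-in sqlite3 module. The accounts table, the account IDs, and the transfer function are hypothetical; a production system would use its own schema and a server database such as PostgreSQL.

```python
import sqlite3

def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Debit src and credit dst; both changes commit together or not at all."""
    with conn:  # commits on success, rolls back if an exception escapes the block
        debited = conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?",
            (amount, src, amount),
        )
        if debited.rowcount != 1:
            raise ValueError("insufficient funds or unknown source account")
        credited = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        if credited.rowcount != 1:
            raise ValueError("unknown destination account")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 50)")
conn.commit()

transfer(conn, "A", "B", 30)  # succeeds: A is debited and B is credited

try:
    transfer(conn, "A", "Z", 10)  # the credit fails, so the debit is rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the failed second transfer, account A still holds the balance left by the first one, because the debit that had already run inside the transaction was undone.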

Impact: Atomic transfers and strong consistency. A failure midway leaves both balances unchanged, which is critical for financial systems.

Content Platform: Document Database

The Challenge: Content platforms store articles, blog posts, user-generated content with varying structures.

The Solution: A document database such as MongoDB, shown here as an illustrative choice for a Medium-style platform:

MongoDB: Articles, posts, user profiles (flexible schema)
Why Document DB? Content varies, nested data, no JOINs needed

Example: Article document:

{
  "id": 123,
  "title": "Article",
  "author": {...},
  "content": "...",
  "tags": [...],
  "comments": [...]
}

Impact: Flexible content storage. Fast reads. Scales horizontally.
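As a rough illustration of why no JOINs are needed, the sketch below stores a whole article as one nested structure and reads nested fields back in a single step. It uses only Python's json module as a stand-in for a document store, and the values inside author, tags, and comments are hypothetical placeholders.

```python
import json

article = {
    "id": 123,
    "title": "Article",
    "author": {"name": "Example Author"},               # hypothetical nested object
    "content": "...",
    "tags": ["databases", "design"],                    # hypothetical values
    "comments": [{"user": "reader1", "text": "Nice"}],  # hypothetical values
}

# A document store persists the whole structure as one unit.
stored = json.dumps(article)

# One read returns the author, tags, and comments together, with no JOINs.
loaded = json.loads(stored)
author_name = loaded["author"]["name"]
comment_count = len(loaded["comments"])
```

In a relational schema the same read would typically touch separate articles, authors, tags, and comments tables.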

Real-World Examples

Example 1: E-Commerce Platform

Why SQL?

Structured product/order/user data
Need complex queries (reports, analytics)
ACID transactions for checkout
Relationships between entities

Example 2: Social Media Feed

Why Document DB?

Flexible post structure (text, images, videos)
Simple queries (get user’s posts)
Need to scale horizontally
Fast reads more important than complex queries

Example 3: Recommendation Engine

Why Graph DB?

Complex relationships (users, products, purchases)
Need to traverse relationships
“Find similar users” queries
Relationship queries are the primary use case
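A "find similar users" query is a two-hop traversal: from a user to the products they bought, then from those products to other buyers. The sketch below does this over plain Python dictionaries; the purchase data and function names are hypothetical, and a graph database would express the same traversal in its own query language.

```python
from collections import Counter

# Hypothetical purchase edges: user -> set of products bought.
purchases = {
    "alice": {"laptop", "mouse", "keyboard"},
    "bob": {"laptop", "mouse"},
    "carol": {"keyboard", "monitor"},
    "dave": {"phone"},
}

def similar_users(user, purchases):
    """Rank other users by how many products they share with `user`."""
    # Invert the edges: product -> set of users who bought it.
    buyers = {}
    for u, products in purchases.items():
        for p in products:
            buyers.setdefault(p, set()).add(u)
    # Two-hop traversal: user -> product -> other user.
    shared = Counter()
    for p in purchases[user]:
        for other in buyers[p]:
            if other != user:
                shared[other] += 1
    return shared.most_common()

def recommend_products(user, purchases):
    """Suggest products that similar users bought but `user` has not."""
    owned = purchases[user]
    suggestions = set()
    for other, _count in similar_users(user, purchases):
        suggestions |= purchases[other] - owned
    return sorted(suggestions)

alice_neighbors = similar_users("alice", purchases)
alice_recs = recommend_products("alice", purchases)
```

In SQL the same traversal needs self-joins on a purchases table, which grow expensive as the number of hops increases. That is the main reason graph databases are suggested for this workload.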

Polyglot Persistence

Polyglot persistence means using multiple database types in the same system. Different parts use different databases optimized for their needs.

Example: E-commerce platform

PostgreSQL: Orders, payments, inventory (structured, ACID)
MongoDB: Product catalogs, reviews (flexible schema)
Redis: Shopping cart, sessions, cache (fast lookups)
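The split above can be sketched as one storage class that routes each kind of data to a different store. Everything here is a stand-in so the example runs on its own: sqlite3 plays the role of PostgreSQL, and plain dictionaries play the roles of MongoDB and Redis. The ShopStorage class and its methods are hypothetical.

```python
import sqlite3

class ShopStorage:
    """Routes each kind of data to the store suited to it (in-memory stand-ins)."""

    def __init__(self):
        self.sql = sqlite3.connect(":memory:")  # stands in for PostgreSQL
        self.sql.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
        )
        self.catalog = {}  # stands in for MongoDB: product_id -> document
        self.carts = {}    # stands in for Redis: user_id -> list of product ids

    def save_product(self, product_id, document):
        # Documents may have different fields per product (flexible schema).
        self.catalog[product_id] = document

    def add_to_cart(self, user_id, product_id):
        self.carts.setdefault(user_id, []).append(product_id)

    def checkout(self, user_id):
        items = self.carts.get(user_id, [])
        total = sum(self.catalog[p]["price"] for p in items)
        with self.sql:  # commit the order, or roll back on error
            cur = self.sql.execute(
                "INSERT INTO orders (user_id, total) VALUES (?, ?)", (user_id, total)
            )
        self.carts.pop(user_id, None)  # clear the cart only after the order commits
        return cur.lastrowid, total

store = ShopStorage()
store.save_product("p1", {"name": "Mug", "price": 8.5, "colors": ["red", "blue"]})
store.save_product("p2", {"name": "Desk", "price": 120.0, "dimensions": {"w": 140, "d": 70}})
store.add_to_cart(42, "p1")
store.add_to_cart(42, "p2")
order_id, total = store.checkout(42)
```

Note that checkout touches two stores and only the SQL part is transactional. Keeping separate stores consistent with each other is the main cost of polyglot persistence.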

LLD ↔ HLD Connection

How database choice affects your class design:

from abc import ABC, abstractmethod
from typing import Optional

# 'User' is the domain model. It is assumed to expose id, name, and email,
# plus from_row(), from_document(), and to_document() helpers.

class UserRepository(ABC):
    """Abstract repository - database agnostic"""
    @abstractmethod
    def find_by_id(self, user_id: int) -> Optional['User']:
        pass

    @abstractmethod
    def save(self, user: 'User') -> 'User':
        pass

class SQLUserRepository(UserRepository):
    """SQL implementation"""
    def __init__(self, db_connection):
        # Assumes a connection with execute() and "?" placeholders, such as sqlite3
        self.db = db_connection

    def find_by_id(self, user_id: int) -> Optional['User']:
        # SQL query
        cursor = self.db.execute("SELECT * FROM users WHERE id = ?", (user_id,))
        row = cursor.fetchone()
        return User.from_row(row) if row else None

    def save(self, user: 'User') -> 'User':
        # SQL insert/update: the ON CONFLICT upsert clause is supported by
        # SQLite 3.24+ and PostgreSQL
        self.db.execute(
            "INSERT INTO users (id, name, email) VALUES (?, ?, ?) "
            "ON CONFLICT(id) DO UPDATE SET name = excluded.name, email = excluded.email",
            (user.id, user.name, user.email)
        )
        self.db.commit()
        return user

class MongoDBUserRepository(UserRepository):
    """MongoDB implementation"""
    def __init__(self, mongo_collection):
        self.collection = mongo_collection

    def find_by_id(self, user_id: int) -> Optional['User']:
        # MongoDB query
        doc = self.collection.find_one({"_id": user_id})
        return User.from_document(doc) if doc else None

    def save(self, user: 'User') -> 'User':
        # MongoDB insert/update: replace the document, inserting it if absent
        self.collection.replace_one(
            {"_id": user.id},
            user.to_document(),
            upsert=True
        )
        return user

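// Each public type below belongs in its own .java file.
// User and the mapToUser(...) helpers are assumed to exist elsewhere.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.ReplaceOptions;
import org.bson.Document;

import static com.mongodb.client.model.Filters.eq;

public interface UserRepository {
    // Abstract repository - database agnostic
    Optional<User> findById(Integer userId);
    User save(User user);
}

public class SQLUserRepository implements UserRepository {
    // SQL implementation
    private final Connection connection;

    public SQLUserRepository(Connection connection) {
        this.connection = connection;
    }

    @Override
    public Optional<User> findById(Integer userId) {
        try (PreparedStatement stmt = connection.prepareStatement(
            "SELECT * FROM users WHERE id = ?"
        )) {
            stmt.setInt(1, userId);
            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    return Optional.of(mapToUser(rs));
                }
            }
        } catch (SQLException e) {
            // JDBC throws checked exceptions; wrap them so the interface stays database agnostic
            throw new IllegalStateException("Failed to load user " + userId, e);
        }
        return Optional.empty();
    }

    @Override
    public User save(User user) {
        // SQL insert/update
        // ...
        return user;
    }
}

public class MongoDBUserRepository implements UserRepository {
    // MongoDB implementation
    private final MongoCollection<Document> collection;

    public MongoDBUserRepository(MongoCollection<Document> collection) {
        this.collection = collection;
    }

    @Override
    public Optional<User> findById(Integer userId) {
        Document doc = collection.find(eq("_id", userId)).first();
        return doc != null ? Optional.of(mapToUser(doc)) : Optional.empty();
    }

    @Override
    public User save(User user) {
        // MongoDB insert/update: replace the document, inserting it if absent
        collection.replaceOne(eq("_id", user.getId()), user.toDocument(),
                              new ReplaceOptions().upsert(true));
        return user;
    }
}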
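The payoff of the repository abstraction is that business logic never mentions a specific database. The self-contained sketch below shows this with a third, in-memory implementation. The User dataclass, InMemoryUserRepository, and rename_user are hypothetical names introduced for illustration, and the interface is repeated so the example runs on its own.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, replace
from typing import Dict, Optional

@dataclass(frozen=True)
class User:
    id: int
    name: str
    email: str

class UserRepository(ABC):
    """Same interface as above, repeated so this sketch is self-contained."""
    @abstractmethod
    def find_by_id(self, user_id: int) -> Optional[User]: ...

    @abstractmethod
    def save(self, user: User) -> User: ...

class InMemoryUserRepository(UserRepository):
    """Dictionary-backed implementation, handy for unit tests."""
    def __init__(self) -> None:
        self._users: Dict[int, User] = {}

    def find_by_id(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)

    def save(self, user: User) -> User:
        self._users[user.id] = user
        return user

def rename_user(repo: UserRepository, user_id: int, new_name: str) -> Optional[User]:
    """Business logic that depends only on the abstract repository."""
    user = repo.find_by_id(user_id)
    if user is None:
        return None
    return repo.save(replace(user, name=new_name))

repo = InMemoryUserRepository()
repo.save(User(1, "Ada", "ada@example.com"))
renamed = rename_user(repo, 1, "Ada L.")
```

Because rename_user accepts any UserRepository, passing a SQL-backed or MongoDB-backed implementation instead requires no change to the function itself.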

Key Takeaways

What’s Next?

Now that you understand database selection, let’s dive deep into indexing strategies to optimize database performance:

Next up: Database Indexing Strategies — Learn about B-trees, LSM trees, and inverted indexes for designing searchable entities.
