Relational Databases Deep Dive

The foundation of structured data storage

What is a Relational Database?

A relational database organizes data into tables (relations) with rows (tuples) and columns (attributes). Tables are connected through relationships defined by keys.

ACID Properties: The Foundation

ACID is an acronym for four essential properties that guarantee reliable database transactions:

A: Atomicity

Atomicity means a transaction is all-or-nothing. Either all operations succeed, or all fail (rollback).

Example: Transferring money between accounts. If deducting from Account A succeeds but adding to Account B fails, both operations are rolled back. You can’t have money disappear!

C: Consistency

Consistency ensures the database always remains in a valid state. All constraints, rules, and relationships are maintained.

Example: If a table has a constraint “age must be > 0”, the database will reject any transaction that tries to insert a negative age, keeping the database consistent.

I: Isolation

Isolation ensures concurrent transactions don’t interfere with each other. Each transaction sees a consistent snapshot of the data.

Example: Two users updating the same account balance simultaneously. Isolation ensures each transaction sees a consistent view and updates don’t interfere.

D: Durability

Durability guarantees that once a transaction is committed, it persists even if the system crashes.

Example: After you commit a bank transfer, even if the server crashes immediately after, your transaction is safely stored and will be restored when the system recovers.

Real-World Examples

Understanding how major companies use relational databases helps illustrate ACID properties in practice:

ACID in Financial Systems: Banking Transactions

The Challenge: Banks process millions of transactions daily. A single transaction failure could mean lost money or incorrect balances.

The Solution: Banks use ACID transactions extensively:

Atomicity: Transfer $100 from Account A to Account B. Either both succeed (debit A, credit B) or both fail. No partial transfers.
Consistency: Account balances always sum correctly. No negative balances allowed (constraints).
Isolation: Two transfers happening simultaneously don’t interfere. Each sees consistent state.
Durability: Once committed, transaction persists even if database crashes.

Example: User transfers $100:

Begin transaction
Debit Account A: $1000 → $900
Credit Account B: $500 → $600
Commit transaction

If database crashes between steps 3 and 4, transaction rolls back. Account A stays at $1000, Account B stays at $500. No money lost.

Impact: Zero data loss incidents. All transactions are atomic and durable. Critical for financial systems.

Normalization: E-commerce Product Catalog

The Challenge: E-commerce platforms store product data. Without normalization, product information is duplicated across orders, causing inconsistencies.

The Solution: Amazon uses normalized database design:

Products table: Product ID, name, price, description (single source of truth)
Orders table: Order ID, user ID, product ID (references product)
Order Items table: Order ID, product ID, quantity (normalized)

Why Normalize? If product price changes:

Without normalization: Must update price in millions of order records (inconsistent, slow)
With normalization: Update price once in Products table, all orders reference it (consistent, fast)

Example: Product “iPhone 15” price changes from $999 to $899:

Normalized: Update 1 row in Products table
Denormalized: Update millions of rows in Orders table

Impact: Price updates instant and consistent. Reduced storage by 70% (no duplication). Faster queries.

The Challenge: Social media platforms need to find users quickly. Searching 2 billion users without indexes would be extremely slow.

The Solution: Facebook uses extensive indexing:

Primary index: User ID (fast user lookups)
Unique index: Email (fast login lookups)
Composite index: (name, location) for search
Full-text index: User posts for search

Example: User searches for “John Smith in New York”:

Without index: Scan 2 billion users (hours)
With composite index: Direct lookup (milliseconds)

Impact: User search: 50ms (with index) vs 10+ seconds (without index). 200x faster. Handles billions of users efficiently.

PostgreSQL: GitHub’s Database Choice

The Challenge: GitHub stores millions of repositories, users, and code. Needs strong consistency, complex queries, and reliability.

The Solution: GitHub uses PostgreSQL:

ACID transactions: Ensures repository operations are atomic
Complex queries: JOINs across repositories, users, issues, pull requests
Reliability: Zero data loss, handles millions of operations daily

Why PostgreSQL?

Strong ACID guarantees (critical for code repositories)
Complex queries (find repositories by language, stars, forks)
Mature and reliable (handles GitHub’s scale)

Impact: Handles 100+ million repositories. Zero data loss incidents. Complex queries execute in milliseconds.

MySQL: Facebook’s Early Choice

The Challenge: Facebook started with MySQL for user data, posts, and relationships. Needed to scale to billions of users.

The Solution: Facebook used MySQL extensively:

User data: Profiles, friends, posts
ACID transactions: Ensures data consistency
Replication: Read replicas for scaling reads

Why MySQL?

Mature and reliable
Good performance for structured data
Strong ACID guarantees

Impact: Handled billions of users. Strong consistency for user data. Foundation for Facebook’s growth.

Database Normalization

Normalization is the process of organizing data to reduce redundancy and eliminate anomalies. It involves splitting tables and creating relationships.

Why Normalize?

Normal Forms

Normalization follows normal forms (NF), each eliminating specific types of redundancy:

Normalization Example

Before Normalization (Problems):

order_id	customer_name	customer_email	product_name	price	quantity
1	Alice	[email protected]	Laptop	1000	1
2	Alice	[email protected]	Mouse	20	2
3	Bob	[email protected]	Keyboard	50	1

Problems:

Customer info repeated (redundancy)
If Alice changes email, need to update multiple rows
If we delete order #2, we lose Alice’s info

After Normalization (3NF):

Customers Table:

customer_id	name	email
1	Alice	[email protected]
2	Bob	[email protected]

Orders Table:

order_id	customer_id	product_name	price	quantity
1	1	Laptop	1000	1
2	1	Mouse	20	2
3	2	Keyboard	50	1

Benefits:

Customer info stored once
Update email in one place
Delete order without losing customer info

Database Indexing

An index is a data structure that speeds up data retrieval. It’s like a book’s index—instead of scanning every page, you look up the index to find the exact location.

How Indexes Work

Analogy: Finding a word in a dictionary:

Without index: Read every page (full table scan)
With index: Use alphabetical index to jump directly (index lookup)

Types of Indexes

Index Trade-offs

Aspect	Benefit	Cost
Read Speed	Much faster queries	Slower writes (must update index)
Storage	-	Extra storage space
Maintenance	-	Index must be maintained

LLD ↔ HLD Connection

How relational database concepts affect your class design:

Entity Classes Map to Tables

Repository Pattern for Data Access

ACID Transactions in Code

Key Takeaways

What’s Next?

Now that you understand relational databases, let’s explore how isolation levels control what concurrent transactions can see:

Next up: Database Isolation Levels — Learn how read committed, repeatable read, and serializable isolation levels prevent race conditions.

Request a feature or report an issue