SQL, NoSQL, CRUD, joins, indexing, replication, backup, cloud databases — everything a DevOps engineer needs to know about databases.
12
Chapters
50+
Queries
100%
Free
01🗄️
Introduction to Databases
Where All Data Lives
A database is an organized collection of data. Every application stores data — user accounts, orders, products, logs. As a DevOps engineer, you need to understand databases because you deploy them, back them up, replicate them, monitor them, and troubleshoot them when they go down at 3 AM.
📊
Relational (SQL)
Data in tables with rows and columns. Like Excel spreadsheets with relationships. MySQL, PostgreSQL, Oracle, SQL Server.
📦
Document (NoSQL)
Data in JSON-like documents. Flexible schema. MongoDB, CouchDB.
Optimized for time-stamped data. Metrics, logs, IoT. InfluxDB, TimescaleDB, Prometheus.
02⚖️
SQL vs NoSQL
When to Use Which
Feature
SQL (Relational)
NoSQL (Document)
Data Structure
Tables with fixed schema
Flexible JSON documents
Schema
Defined upfront (strict)
Schema-less (flexible)
Relationships
Strong (joins, foreign keys)
Weak (denormalized)
Scaling
Vertical (bigger server)
Horizontal (more servers)
ACID
Full ACID compliance
Eventually consistent (usually)
Best For
Financial, e-commerce, ERP
Social media, real-time, IoT
Examples
MySQL, PostgreSQL, Oracle
MongoDB, Cassandra, DynamoDB
DevOps Rule of Thumb
Most production apps use SQL (MySQL/PostgreSQL) as the primary database. NoSQL (MongoDB/Redis) is added when specific needs arise — caching, session storage, or document-heavy workloads. Start with SQL unless you have a specific reason for NoSQL.
03📝
SQL Basics
The Language of Databases
SQL (Structured Query Language) is how you talk to relational databases. Every DevOps engineer should know basic SQL for troubleshooting, checking data, and running maintenance queries.
SQL-- Create a database
CREATE DATABASE ecommerce;
USE ecommerce;
-- Create a table
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(150) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
product VARCHAR(200),
amount DECIMAL(10,2),
status ENUM("pending", "paid", "shipped", "delivered"),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id)
);
04✏️
CRUD Operations
Create, Read, Update, Delete
SQL-- CREATE (Insert data)
INSERT INTO users (name, email) VALUES ("Suresh", "suresh@example.com");
INSERT INTO orders (user_id, product, amount, status) VALUES (1, "Docker eBook", 499.00, "paid");
-- READ (Query data)
SELECT * FROM users; -- All users
SELECT name, email FROM users WHERE id = 1; -- Specific user
SELECT * FROM orders WHERE status = "pending"; -- Pending orders
SELECT COUNT(*) FROM orders; -- Count orders
SELECT status, COUNT(*) FROM orders GROUP BY status; -- Count per status
-- UPDATE (Modify data)
UPDATE orders SET status = "shipped" WHERE id = 1;
UPDATE users SET email = "new@email.com" WHERE id = 1;
-- DELETE (Remove data)
DELETE FROM orders WHERE id = 1;
DELETE FROM users WHERE created_at < "2023-01-01"; -- Old users
05🔗
Joins & Relationships
Connect Tables Together
Joins combine data from multiple tables. This is the power of relational databases — store data once, connect it everywhere.
SQL-- INNER JOIN — only matching rows from both tables
SELECT users.name, orders.product, orders.amount
FROM users
INNER JOIN orders ON users.id = orders.user_id;
-- Shows: Suresh | Docker eBook | 499.00
-- LEFT JOIN — all rows from left table, matching from right
SELECT users.name, COUNT(orders.id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
GROUP BY users.id;
-- Shows ALL users, even those with 0 orders
-- Real-world: Find users who spent more than 1000
SELECT users.name, SUM(orders.amount) as total_spent
FROM users
INNER JOIN orders ON users.id = orders.user_id
GROUP BY users.id
HAVING total_spent > 1000
ORDER BY total_spent DESC;
Join Type
What It Returns
When to Use
INNER JOIN
Only matching rows from both tables
Most common — get related data
LEFT JOIN
All left table + matching right
Show all users even without orders
RIGHT JOIN
All right table + matching left
Rarely used (just swap tables and use LEFT)
FULL OUTER JOIN
All rows from both tables
Show everything (MySQL does not support this natively)
06⚡
Indexes & Performance
Make Queries Fast
Without indexes, the database reads EVERY row to find your data (full table scan). With an index, it jumps directly to the right row — like a book index vs reading every page. A missing index on a production table can make a 10ms query take 30 seconds.
SQL-- Create an index
CREATE INDEX idx_user_email ON users(email);
CREATE INDEX idx_order_status ON orders(status);
CREATE INDEX idx_order_date ON orders(created_at);
-- Composite index (multiple columns)
CREATE INDEX idx_order_user_status ON orders(user_id, status);
-- Check query performance
EXPLAIN SELECT * FROM orders WHERE user_id = 1 AND status = "pending";
-- Shows whether the query uses an index or does a full table scan
-- Show all indexes on a table
SHOW INDEX FROM orders;
✓Index columns used in WHERE, JOIN, and ORDER BY clauses
✓Composite indexes: put the most-filtered column FIRST
✓Too many indexes slow down INSERT/UPDATE (index must be updated)
✓Use EXPLAIN before and after adding index to verify improvement
✓Never add indexes blindly — profile first, index second
✓Store backups in a DIFFERENT location (S3, another region)
✓Test restore process monthly — a backup you cannot restore is useless
✓Keep 14 days of daily backups + 4 weekly + 3 monthly
✓Enable binary logging for point-in-time recovery (MySQL)
08🔄
Replication & High Availability
No Single Point of Failure
Replication copies data from one database server (primary) to one or more servers (replicas). If the primary dies, a replica takes over. This is how production databases achieve 99.99% uptime.
🏠
Primary (Master)
Handles all WRITES (INSERT, UPDATE, DELETE). Sends changes to replicas. There is only ONE primary.
📋
Replica (Slave)
Handles READ queries. Receives changes from primary. You can have 1, 5, or 50 replicas.
🔄
Multi-AZ (Standby)
Exact copy in another data center. Auto-failover if primary goes down. For HIGH AVAILABILITY, not read scaling.
Type
Purpose
Failover
Read Scaling
Primary-Replica
Read scaling
Manual (promote replica)
Yes — read from replicas
Multi-AZ (Standby)
High availability
Automatic (<60 seconds)
No — standby is idle
Multi-Region
Disaster recovery
Manual (DNS failover)
Yes — read from nearest region
09☁️
Cloud Databases
RDS, Aurora, DynamoDB
Cloud databases are managed by AWS/Azure/GCP — they handle patches, backups, replication, and failover. You focus on your data, not database administration.
Service
Type
Best For
Key Feature
RDS
SQL (MySQL, PostgreSQL, etc.)
General purpose
Multi-AZ, automated backups, read replicas
Aurora
SQL (MySQL/PostgreSQL compatible)
High performance
5x faster than MySQL, auto-scaling storage
DynamoDB
NoSQL (Key-Value)
Serverless, high scale
Single-digit millisecond latency, auto-scales
ElastiCache
Redis/Memcached
Caching
In-memory, microsecond latency
DocumentDB
MongoDB-compatible
Document store
MongoDB API on managed infrastructure
10🍃
MongoDB Basics
Document Database
MongoDB stores data as JSON-like documents instead of table rows. No fixed schema — each document can have different fields. Great for rapidly changing data models and nested/hierarchical data.
Redis stores data IN MEMORY — reads take microseconds instead of milliseconds. Used for caching (store frequent query results), sessions (user login state), rate limiting, and queues. Think of it as a super-fast sticky note board.
REDIS# Connect to Redis
redis-cli
# Basic key-value operations
SET user:1:name "Suresh"
GET user:1:name # "Suresh"
SET session:abc123 "user_id:1" EX 3600 # Expires in 1 hour
# Increment (great for counters)
INCR page:views # 1
INCR page:views # 2
GET page:views # "2"
# Lists (for queues)
LPUSH jobs "deploy-order-service"
LPUSH jobs "build-frontend"
RPOP jobs # "deploy-order-service" (FIFO)
# Real-world caching pattern:
# 1. App checks Redis: GET product:123
# 2. If found (cache hit) → return immediately (1ms)
# 3. If not found (cache miss) → query MySQL → store in Redis
# 4. SET product:123 "{data}" EX 300 (cache for 5 minutes)
12💼
Interview Questions
Database Q&A
❓
SQL vs NoSQL?
SQL: structured, ACID, joins, fixed schema (MySQL, PostgreSQL). NoSQL: flexible, scalable, denormalized (MongoDB, Redis). SQL for transactions, NoSQL for scale and flexibility.
❓
What is ACID?
Atomicity (all or nothing), Consistency (valid state), Isolation (concurrent transactions safe), Durability (committed = permanent). Ensures data integrity.
❓
What is an index?
Data structure that speeds up queries. Like a book index — jump to the right page instead of reading every page. Without index = full table scan (slow).
❓
Primary vs Replica?
Primary handles writes, replicas handle reads. Replication copies data from primary to replicas. Provides read scaling and failover capability.
❓
Multi-AZ vs Read Replica?
Multi-AZ: standby in another AZ for automatic failover (HA). Read Replica: copy for read queries (performance). Multi-AZ for availability, replicas for scale.
❓
What is Redis used for?
In-memory cache, session storage, rate limiting, queues. Microsecond reads. Cache frequent MySQL queries in Redis to reduce database load by 80%.
❓
How to backup MySQL?
mysqldump for logical backups. Binary logs for point-in-time recovery. Automate with cron. Store in S3. Test restore monthly.
❓
INNER JOIN vs LEFT JOIN?
INNER: only matching rows from both tables. LEFT: all rows from left table + matching from right (NULLs for non-matching). LEFT JOIN when you need all records from one side.
❓
What is connection pooling?
Reuse database connections instead of creating new ones per request. Reduces connection overhead. Tools: PgBouncer (PostgreSQL), ProxySQL (MySQL). Essential for high-traffic apps.
❓
MongoDB vs MySQL?
MySQL: structured data, joins, ACID, transactions. MongoDB: flexible schema, nested documents, horizontal scaling. MySQL for e-commerce/finance, MongoDB for content/social/IoT.