Database Fundamentals to Production

Databases Complete Guide

SQL, NoSQL, CRUD, joins, indexing, replication, backup, cloud databases — everything a DevOps engineer needs to know about databases.

Chapters

50+

Queries

100%

Free

01🗄️

Introduction to Databases

Where All Data Lives

A database is an organized collection of data. Every application stores data — user accounts, orders, products, logs. As a DevOps engineer, you need to understand databases because you deploy them, back them up, replicate them, monitor them, and troubleshoot them when they go down at 3 AM.

📊

Relational (SQL)

Data in tables with rows and columns. Like Excel spreadsheets with relationships. MySQL, PostgreSQL, Oracle, SQL Server.

📦

Document (NoSQL)

Data in JSON-like documents. Flexible schema. MongoDB, CouchDB.

⚡

Key-Value

Simple key=value pairs. Extremely fast. Redis, DynamoDB, Memcached.

📈

Time-Series

Optimized for time-stamped data. Metrics, logs, IoT. InfluxDB, TimescaleDB, Prometheus.

02⚖️

SQL vs NoSQL

When to Use Which

Feature	SQL (Relational)	NoSQL (Document)
Data Structure	Tables with fixed schema	Flexible JSON documents
Schema	Defined upfront (strict)	Schema-less (flexible)
Relationships	Strong (joins, foreign keys)	Weak (denormalized)
Scaling	Vertical (bigger server)	Horizontal (more servers)
ACID	Full ACID compliance	Eventually consistent (usually)
Best For	Financial, e-commerce, ERP	Social media, real-time, IoT
Examples	MySQL, PostgreSQL, Oracle	MongoDB, Cassandra, DynamoDB

DevOps Rule of Thumb

Most production apps use SQL (MySQL/PostgreSQL) as the primary database. NoSQL (MongoDB/Redis) is added when specific needs arise — caching, session storage, or document-heavy workloads. Start with SQL unless you have a specific reason for NoSQL.

03📝

SQL Basics

The Language of Databases

SQL (Structured Query Language) is how you talk to relational databases. Every DevOps engineer should know basic SQL for troubleshooting, checking data, and running maintenance queries.

SQL-- Create a database CREATE DATABASE ecommerce; USE ecommerce; -- Create a table CREATE TABLE users ( id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(100) NOT NULL, email VARCHAR(150) UNIQUE NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); CREATE TABLE orders ( id INT AUTO_INCREMENT PRIMARY KEY, user_id INT NOT NULL, product VARCHAR(200), amount DECIMAL(10,2), status ENUM("pending", "paid", "shipped", "delivered"), created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, FOREIGN KEY (user_id) REFERENCES users(id) );

04✏️

CRUD Operations

Create, Read, Update, Delete

SQL-- CREATE (Insert data) INSERT INTO users (name, email) VALUES ("Suresh", "suresh@example.com"); INSERT INTO orders (user_id, product, amount, status) VALUES (1, "Docker eBook", 499.00, "paid"); -- READ (Query data) SELECT * FROM users; -- All users SELECT name, email FROM users WHERE id = 1; -- Specific user SELECT * FROM orders WHERE status = "pending"; -- Pending orders SELECT COUNT(*) FROM orders; -- Count orders SELECT status, COUNT(*) FROM orders GROUP BY status; -- Count per status -- UPDATE (Modify data) UPDATE orders SET status = "shipped" WHERE id = 1; UPDATE users SET email = "new@email.com" WHERE id = 1; -- DELETE (Remove data) DELETE FROM orders WHERE id = 1; DELETE FROM users WHERE created_at < "2023-01-01"; -- Old users

05🔗

Joins & Relationships

Connect Tables Together

Joins combine data from multiple tables. This is the power of relational databases — store data once, connect it everywhere.

SQL-- INNER JOIN — only matching rows from both tables SELECT users.name, orders.product, orders.amount FROM users INNER JOIN orders ON users.id = orders.user_id; -- Shows: Suresh | Docker eBook | 499.00 -- LEFT JOIN — all rows from left table, matching from right SELECT users.name, COUNT(orders.id) as order_count FROM users LEFT JOIN orders ON users.id = orders.user_id GROUP BY users.id; -- Shows ALL users, even those with 0 orders -- Real-world: Find users who spent more than 1000 SELECT users.name, SUM(orders.amount) as total_spent FROM users INNER JOIN orders ON users.id = orders.user_id GROUP BY users.id HAVING total_spent > 1000 ORDER BY total_spent DESC;

Join Type	What It Returns	When to Use
INNER JOIN	Only matching rows from both tables	Most common — get related data
LEFT JOIN	All left table + matching right	Show all users even without orders
RIGHT JOIN	All right table + matching left	Rarely used (just swap tables and use LEFT)
FULL OUTER JOIN	All rows from both tables	Show everything (MySQL does not support this natively)

06⚡

Indexes & Performance

Make Queries Fast

Without indexes, the database reads EVERY row to find your data (full table scan). With an index, it jumps directly to the right row — like a book index vs reading every page. A missing index on a production table can make a 10ms query take 30 seconds.

SQL-- Create an index CREATE INDEX idx_user_email ON users(email); CREATE INDEX idx_order_status ON orders(status); CREATE INDEX idx_order_date ON orders(created_at); -- Composite index (multiple columns) CREATE INDEX idx_order_user_status ON orders(user_id, status); -- Check query performance EXPLAIN SELECT * FROM orders WHERE user_id = 1 AND status = "pending"; -- Shows whether the query uses an index or does a full table scan -- Show all indexes on a table SHOW INDEX FROM orders;

✓Index columns used in WHERE, JOIN, and ORDER BY clauses

✓Composite indexes: put the most-filtered column FIRST

✓Too many indexes slow down INSERT/UPDATE (index must be updated)

✓Use EXPLAIN before and after adding index to verify improvement

✓Never add indexes blindly — profile first, index second

07💾

Backup & Restore

Never Lose Data

TERMINAL# MySQL Backup (mysqldump) mysqldump -u root -p ecommerce > backup_2024-06-01.sql mysqldump -u root -p --all-databases > full_backup.sql # Compressed backup (saves 70-90% space) mysqldump -u root -p ecommerce | gzip > backup_2024-06-01.sql.gz # Restore mysql -u root -p ecommerce < backup_2024-06-01.sql gunzip < backup_2024-06-01.sql.gz | mysql -u root -p ecommerce # PostgreSQL Backup pg_dump ecommerce > backup.sql pg_dump -Fc ecommerce > backup.dump # Custom format (smaller) # Restore PostgreSQL psql ecommerce < backup.sql pg_restore -d ecommerce backup.dump # Automate daily backup (cron) # 0 2 * * * mysqldump -u root ecommerce | gzip > /backups/db_$(date +\%Y\%m\%d).sql.gz

✓Automate backups with cron (daily at minimum)

✓Store backups in a DIFFERENT location (S3, another region)

✓Test restore process monthly — a backup you cannot restore is useless

✓Keep 14 days of daily backups + 4 weekly + 3 monthly

✓Enable binary logging for point-in-time recovery (MySQL)

08🔄

Replication & High Availability

No Single Point of Failure

Replication copies data from one database server (primary) to one or more servers (replicas). If the primary dies, a replica takes over. This is how production databases achieve 99.99% uptime.

🏠

Primary (Master)

Handles all WRITES (INSERT, UPDATE, DELETE). Sends changes to replicas. There is only ONE primary.

📋

Replica (Slave)

Handles READ queries. Receives changes from primary. You can have 1, 5, or 50 replicas.

🔄

Multi-AZ (Standby)

Exact copy in another data center. Auto-failover if primary goes down. For HIGH AVAILABILITY, not read scaling.

Type	Purpose	Failover	Read Scaling
Primary-Replica	Read scaling	Manual (promote replica)	Yes — read from replicas
Multi-AZ (Standby)	High availability	Automatic (<60 seconds)	No — standby is idle
Multi-Region	Disaster recovery	Manual (DNS failover)	Yes — read from nearest region

09☁️

Cloud Databases

RDS, Aurora, DynamoDB

Cloud databases are managed by AWS/Azure/GCP — they handle patches, backups, replication, and failover. You focus on your data, not database administration.

Service	Type	Best For	Key Feature
RDS	SQL (MySQL, PostgreSQL, etc.)	General purpose	Multi-AZ, automated backups, read replicas
Aurora	SQL (MySQL/PostgreSQL compatible)	High performance	5x faster than MySQL, auto-scaling storage
DynamoDB	NoSQL (Key-Value)	Serverless, high scale	Single-digit millisecond latency, auto-scales
ElastiCache	Redis/Memcached	Caching	In-memory, microsecond latency
DocumentDB	MongoDB-compatible	Document store	MongoDB API on managed infrastructure

10🍃

MongoDB Basics

Document Database

MongoDB stores data as JSON-like documents instead of table rows. No fixed schema — each document can have different fields. Great for rapidly changing data models and nested/hierarchical data.

MONGODB// Connect to MongoDB mongosh // Create database and collection use ecommerce // Insert document (no schema needed!) db.users.insertOne({ name: "Suresh", email: "suresh@example.com", skills: ["Docker", "Kubernetes", "AWS"], address: { city: "Guntur", state: "AP" } }) // Query db.users.find({ name: "Suresh" }) db.users.find({ "address.city": "Guntur" }) db.users.find({ skills: "Docker" }) // Update db.users.updateOne( { name: "Suresh" }, { $push: { skills: "Terraform" } } ) // Delete db.users.deleteOne({ name: "Suresh" })

11⚡

Redis & Caching

In-Memory Speed

Redis stores data IN MEMORY — reads take microseconds instead of milliseconds. Used for caching (store frequent query results), sessions (user login state), rate limiting, and queues. Think of it as a super-fast sticky note board.

REDIS# Connect to Redis redis-cli # Basic key-value operations SET user:1:name "Suresh" GET user:1:name # "Suresh" SET session:abc123 "user_id:1" EX 3600 # Expires in 1 hour # Increment (great for counters) INCR page:views # 1 INCR page:views # 2 GET page:views # "2" # Lists (for queues) LPUSH jobs "deploy-order-service" LPUSH jobs "build-frontend" RPOP jobs # "deploy-order-service" (FIFO) # Real-world caching pattern: # 1. App checks Redis: GET product:123 # 2. If found (cache hit) → return immediately (1ms) # 3. If not found (cache miss) → query MySQL → store in Redis # 4. SET product:123 "{data}" EX 300 (cache for 5 minutes)

12💼

Interview Questions

Database Q&A

SQL vs NoSQL?

SQL: structured, ACID, joins, fixed schema (MySQL, PostgreSQL). NoSQL: flexible, scalable, denormalized (MongoDB, Redis). SQL for transactions, NoSQL for scale and flexibility.

What is ACID?

Atomicity (all or nothing), Consistency (valid state), Isolation (concurrent transactions safe), Durability (committed = permanent). Ensures data integrity.

What is an index?

Data structure that speeds up queries. Like a book index — jump to the right page instead of reading every page. Without index = full table scan (slow).

Primary vs Replica?

Primary handles writes, replicas handle reads. Replication copies data from primary to replicas. Provides read scaling and failover capability.

Multi-AZ vs Read Replica?

Multi-AZ: standby in another AZ for automatic failover (HA). Read Replica: copy for read queries (performance). Multi-AZ for availability, replicas for scale.

What is Redis used for?

In-memory cache, session storage, rate limiting, queues. Microsecond reads. Cache frequent MySQL queries in Redis to reduce database load by 80%.

How to backup MySQL?

mysqldump for logical backups. Binary logs for point-in-time recovery. Automate with cron. Store in S3. Test restore monthly.

INNER JOIN vs LEFT JOIN?

INNER: only matching rows from both tables. LEFT: all rows from left table + matching from right (NULLs for non-matching). LEFT JOIN when you need all records from one side.

What is connection pooling?

Reuse database connections instead of creating new ones per request. Reduces connection overhead. Tools: PgBouncer (PostgreSQL), ProxySQL (MySQL). Essential for high-traffic apps.

MongoDB vs MySQL?

MySQL: structured data, joins, ACID, transactions. MongoDB: flexible schema, nested documents, horizontal scaling. MySQL for e-commerce/finance, MongoDB for content/social/IoT.