← Home🐳 Docker
🐧 Complete Linux for DevOps

Linux Complete Guide

Every command a DevOps engineer needs, from basic navigation to advanced system administration, user management, networking, and shell scripting, with real examples and outputs.

18 Chapters · 120+ Commands · 100% Free
01🐧

Why Linux for DevOps?

96% of Servers Run Linux

Linux runs the large majority of the world's servers (a commonly cited figure is roughly 96% of the top one million web servers), every system on the TOP500 supercomputer list, and most cloud instances on AWS, Azure, and GCP. As a DevOps engineer, you will spend most of your time in Linux terminals: deploying apps, debugging issues, writing scripts, and managing infrastructure. Linux is not optional; it IS your daily work environment.
Getting Help: The man Command
TERMINAL
# man = manual. Shows documentation for ANY command
man ls              # Full manual for the ls command
man grep            # Full manual for grep
man -k "copy"       # Search all manuals for keyword "copy"

# Quick help (shorter than man)
ls --help           # Brief usage info
curl --help         # Flags and options

# Common man sections:
#   man 1 command -> user commands
#   man 5 config  -> config file formats
#   man 8 admin   -> admin commands

# Pro tip: press / to search inside man, q to quit
💡 When Stuck

ALWAYS try man <command> or <command> --help first. 90% of your questions are answered right there. This is how experienced engineers learn new commands: they read the manual, not Google.

02 📁

Linux Directory Structure

Where Everything Lives

Linux organizes everything in a single tree starting from / (root). Unlike Windows with C: D: drives, Linux has ONE tree. Every file, device, process, and configuration has a specific place. Understanding this structure is essential for troubleshooting.
| Directory | What Lives Here | DevOps Examples |
|---|---|---|
| / | Root, the top of everything | Starting point of the entire filesystem |
| /home | User home directories | ~/ or /home/suresh: your personal files, SSH keys, .bashrc |
| /root | Root user's home directory | Only accessible by root. NOT the same as / |
| /etc | Configuration files | nginx.conf, sshd_config, hosts, resolv.conf, crontab |
| /var | Variable data, things that change | Logs (/var/log), databases, mail, cache, PID files |
| /var/log | All system and app logs | syslog, auth.log, nginx/access.log, journal/ |
| /tmp | Temporary files, typically cleared on reboot | Build artifacts, temp downloads, session data |
| /opt | Third-party software | Jenkins, Prometheus, custom apps installed outside the package manager |
| /usr | User programs and libraries | /usr/bin (commands), /usr/lib (libraries), /usr/share (docs) |
| /usr/local | Manually compiled/installed software | Software you built from source goes here |
| /bin | Essential user commands | ls, cp, mv, cat, grep; always available, even in recovery (a symlink to /usr/bin on most modern distros) |
| /sbin | System admin commands | iptables, fdisk, mkfs; most need root |
| /proc | Virtual filesystem for running processes | /proc/cpuinfo (CPU), /proc/meminfo (RAM), /proc/PID/ |
| /dev | Device files | sda (disk), null, random, tty (terminals) |
| /mnt | Temporary mount points | Mount external drives, NFS shares here |
| /srv | Service data | Web server files, FTP data |
💡 Everything is a File

In Linux, EVERYTHING is treated as a file: regular files, directories, devices (/dev/sda), processes (/proc/1234), even network sockets. This is a core Linux design principle and frequently asked in interviews.
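To make the "everything is a file" idea concrete, here is a small sketch that reads kernel and process state with ordinary text tools. It assumes a Linux machine with /proc mounted; the function names mem_total_kb and self_name are made up for this example.

```shell
# Sketch: kernel and process state read through ordinary file reads.
# Assumes a Linux system with /proc mounted.

# Total RAM in kB, read from a virtual file with a normal text tool
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' /proc/meminfo
}

# Name of the process doing the read (here, the awk helper itself),
# via the /proc/self shortcut that always points at the caller
self_name() {
    awk '/^Name:/ {print $2}' /proc/self/status
}

# Devices are files too: anything written to /dev/null is discarded
echo "discard me" > /dev/null

echo "MemTotal: $(mem_total_kb) kB"
echo "Reader process: $(self_name)"
```

Nothing here needs a special API: awk, cat, and grep work on /proc and /dev exactly as they do on a config file.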

03🧭

Navigation & Listing

Move Around Like a Pro

These are the commands you'll type 100 times a day. Master them until they're muscle memory.
TERMINAL
# Where am I right now?
$ pwd
/home/suresh/projects

# Go to a directory
$ cd /var/log       # Go to absolute path
$ cd ..             # Go up one level (parent directory)
$ cd ~              # Go to home directory (/home/suresh)
$ cd -              # Go to PREVIOUS directory (like browser back button)
$ cd                # Same as cd ~ (go home)

# List files
$ ls                # Basic listing
app.js  node_modules  package.json  README.md
$ ls -la            # ALL files (including hidden) with details
total 48
drwxr-xr-x  5 suresh suresh 4096 Jun  1 10:30 .
drwxr-xr-x  8 suresh suresh 4096 May 28 09:15 ..
-rw-r--r--  1 suresh suresh  245 Jun  1 10:30 .env
-rw-r--r--  1 suresh suresh 1024 Jun  1 10:15 app.js
drwxr-xr-x 50 suresh suresh 4096 Jun  1 10:00 node_modules
-rw-r--r--  1 suresh suresh  532 Jun  1 10:00 package.json

# Breakdown of ls -la output:
#   drwxr-xr-x = permissions (d=directory, rwx=owner, r-x=group, r-x=others)
#   5          = number of links
#   suresh     = owner
#   suresh     = group
#   4096       = size in bytes
#   Jun 1      = last modified date
#   .          = current directory name

$ ls -lh            # Human-readable sizes (1.5K, 4.2M, 1G)
$ ls -lt            # Sort by modification time (newest first)
$ ls -lS            # Sort by size (largest first)
$ ls -R             # Recursive: list subdirectories too

# Tree view (install: sudo apt install tree)
$ tree -L 2         # Show 2 levels deep
.
├── src
│   ├── app.js
│   └── routes
├── package.json
└── Dockerfile
04 📄

File Operations

Create, Copy, Move, View & Count

Create files, copy them, move them, view their contents, and count things. These operations form the backbone of every DevOps task.
Create & Delete
TERMINAL
# Create empty file
$ touch config.yml

# Create file with content
$ echo "server_port: 8080" > config.yml    # Overwrite (>)
$ echo "debug: true" >> config.yml         # Append (>>)

# Create directories
$ mkdir logs                               # Single directory
$ mkdir -p deploy/staging/configs          # Create nested (parent + child)

# Delete
$ rm file.txt                              # Delete file
$ rm -r old-folder/                        # Delete directory recursively
$ rm -rf build/                            # Force delete (no confirmation)
# WARNING: rm -rf has NO undo. Double-check before running!
Copy & Move
TERMINAL
# Copy
$ cp app.conf app.conf.backup      # Copy file
$ cp -r src/ src-backup/           # Copy directory recursively
$ cp -p file.txt /backup/          # Preserve permissions and timestamps

# Move / Rename
$ mv old-name.txt new-name.txt     # Rename file
$ mv config.yml /etc/myapp/        # Move to another directory
$ mv logs/*.log /archive/          # Move all .log files
View File Contents
TERMINAL
# Print entire file
$ cat config.yml
server_port: 8080
debug: true

# View with line numbers
$ cat -n app.js
     1  const express = require('express');
     2  const app = express();
     3  app.listen(3000);

# View long files (scrollable)
$ less /var/log/syslog      # Scroll up/down, search with /keyword, quit with q
$ more /var/log/syslog      # Older, simpler version of less

# First/Last lines
$ head -20 access.log       # First 20 lines
$ tail -20 access.log       # Last 20 lines
$ tail -f /var/log/syslog   # FOLLOW: shows new lines as they appear in real-time
# This is THE most used DevOps command for debugging!
# Ctrl+C to stop tail -f
Count: wc (word count)
TERMINAL
$ wc -l access.log          # Count LINES
15234 access.log
$ wc -w README.md           # Count WORDS
342 README.md
$ wc -c app.jar             # Count BYTES (file size)
45678901 app.jar

# Count specific patterns
$ grep -c "ERROR" app.log   # Count lines containing ERROR
47
$ cat access.log | grep "500" | wc -l    # Count 500 errors
12
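One caveat with grep "500": it matches any line containing those digits, including response sizes and URLs. A field-based count is more precise. The sketch below assumes the nginx/Apache "combined" log format, where the status code is the 9th whitespace-separated field; status_summary and the sample log lines are invented for illustration.

```shell
# Summarise HTTP status codes in a "combined" format access log.
# Prints "COUNT STATUS" pairs, most frequent first.
status_summary() {
    awk '{count[$9]++} END {for (code in count) print count[code], code}' "$1" | sort -rn
}

# Demo with a tiny made-up log
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [01/Jun/2024:10:30:00 +0530] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"
10.0.0.2 - - [01/Jun/2024:10:30:01 +0530] "GET /api HTTP/1.1" 200 1500 "-" "curl/8.0"
10.0.0.3 - - [01/Jun/2024:10:30:02 +0530] "POST /api HTTP/1.1" 500 87 "-" "curl/8.0"
EOF
summary=$(status_summary "$sample_log")
echo "$summary"
rm -f "$sample_log"
```

Note that the second sample line has a response size of 1500 bytes; a plain grep "500" would have counted it as an error.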
06 🗜️

Compression & Archives

tar, gzip, zip: Pack and Unpack

Compressing files saves storage and speeds up transfers. In DevOps, you'll compress log files, create backups, and package applications for deployment.
tar: The Swiss Army Knife
TERMINAL
# CREATE archive (tar = tape archive)
$ tar -cvf backup.tar /opt/myapp/          # Create tar (no compression)
# c=create, v=verbose, f=filename
$ tar -czvf backup.tar.gz /opt/myapp/      # Create tar + gzip compression
# z=gzip compression
$ tar -cjvf backup.tar.bz2 /opt/myapp/     # Create tar + bzip2 (smaller but slower)

# EXTRACT archive
$ tar -xvf backup.tar                          # Extract tar
$ tar -xzvf backup.tar.gz                      # Extract tar.gz
$ tar -xzvf backup.tar.gz -C /opt/restore/     # Extract to specific directory

# LIST contents without extracting
$ tar -tzvf backup.tar.gz                  # See what's inside

# Memory trick: tar -czvf = Create Ze Vucking File
#               tar -xzvf = eXtract Ze Vucking File
gzip, gunzip, zip, unzip
TERMINAL
# gzip: compress single files (replaces original)
$ gzip access.log                  # Creates access.log.gz, removes original
$ gunzip access.log.gz             # Decompress back to access.log
$ gzip -k access.log               # Keep original file (-k = keep)
$ gzip -9 large-file.log           # Maximum compression (-9)

# zip: create ZIP archives (Windows-compatible)
$ zip backup.zip file1.txt file2.txt       # Zip specific files
$ zip -r project.zip myproject/            # Zip entire directory recursively
$ unzip project.zip                        # Extract
$ unzip project.zip -d /opt/restore/       # Extract to specific directory
$ unzip -l project.zip                     # List contents without extracting
💡 When to Use What

tar.gz for Linux backups and deployments (most common). zip for cross-platform sharing (Windows compatibility). gzip for compressing single files (log rotation). bzip2 for smaller archives when slower compression is acceptable (large archives).
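A backup is only useful if it restores cleanly, so it is worth practising the full round trip. The sketch below archives a directory, lists it, restores it somewhere else, and compares the two trees. Every path lives under a throwaway temp directory and the names (myapp, restore) are placeholders.

```shell
# Sketch: archive, list, restore, and verify a directory.
work=$(mktemp -d)
mkdir -p "$work/myapp/conf" "$work/restore"
echo "server_port: 8080" > "$work/myapp/conf/config.yml"

# -C changes directory first, so the archive stores relative paths
tar -czf "$work/backup.tar.gz" -C "$work" myapp

# List contents without extracting
tar -tzf "$work/backup.tar.gz"

# Extract into a separate directory, then confirm nothing changed
tar -xzf "$work/backup.tar.gz" -C "$work/restore"
if diff -r "$work/myapp" "$work/restore/myapp" > /dev/null; then
    roundtrip="ok"
else
    roundtrip="mismatch"
fi
echo "round trip: $roundtrip"
rm -rf "$work"
```

Using -C on both the create and extract side keeps absolute paths out of the archive, which makes it safe to restore into any directory.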

07 🔒

Permissions & Ownership

Who Can Read, Write, and Execute

Every file in Linux has an owner, a group, and three sets of permissions: read (r), write (w), execute (x). Understanding permissions is the difference between a secure server and a hacked server.
TERMINAL
# Permission format explained:
$ ls -l app.sh
-rwxr-xr-- 1 suresh devops 1024 Jun 1 10:30 app.sh
 │││││││││
 ││││││└──── Others: r-- (read only)
 │││└─────── Group:  r-x (read + execute)
 └────────── Owner:  rwx (read + write + execute)

# Numeric (octal) representation:
# r=4, w=2, x=1 -> add them up
#   rwx = 4+2+1 = 7
#   r-x = 4+0+1 = 5
#   r-- = 4+0+0 = 4
# So rwxr-xr-- = 754
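If the arithmetic above feels abstract, it can be written down as a tiny function. This is only an illustration of the r=4, w=2, x=1 rule: perm_to_octal is a made-up helper, and it ignores the setuid, setgid, and sticky bits.

```shell
# Convert a 9-character permission string (e.g. rwxr-xr--) to octal.
# The body runs in a subshell, so its variables do not leak out.
perm_to_octal() (
    perms=$1
    result=""
    for start in 1 4 7; do
        # Pull out one rwx triplet: characters 1-3, 4-6, then 7-9
        triplet=$(printf '%s' "$perms" | cut -c"$start"-$((start + 2)))
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac
        case $triplet in ??x) digit=$((digit + 1)) ;; esac
        result="$result$digit"
    done
    echo "$result"
)

perm_to_octal rwxr-xr--    # prints 754
perm_to_octal rw-------    # prints 600
```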
chmod: Change Permissions
TERMINAL
# Numeric method (most common)
$ chmod 755 deploy.sh      # rwxr-xr-x (owner:all, group+others:read+exec)
$ chmod 644 config.yml     # rw-r--r-- (owner:rw, rest:read-only)
$ chmod 600 secrets.env    # rw------- (owner ONLY, for passwords and keys)
$ chmod 700 .ssh/          # rwx------ (private directory)
$ chmod 400 my-key.pem     # r-------- (SSH key, read-only by owner)

# Symbolic method
$ chmod +x script.sh       # Add execute permission for everyone
$ chmod u+w file.txt       # Add write for owner (u=user/owner)
$ chmod g-w file.txt       # Remove write from group
$ chmod o-rwx secret.txt   # Remove all permissions from others

# Recursive: apply to everything in a directory
# (note: 755 also marks every regular file as executable)
$ chmod -R 755 /opt/myapp/
| Octal | Permission | Use For |
|---|---|---|
| 755 | rwxr-xr-x | Scripts, executables, directories |
| 644 | rw-r--r-- | Config files, HTML, regular files |
| 600 | rw------- | Secrets, .env files, private keys |
| 700 | rwx------ | Private directories, .ssh/ |
| 400 | r-------- | SSH .pem key files |
| 777 | rwxrwxrwx | NEVER use in production! |
chown: Change Ownership
TERMINAL
# Change owner
$ sudo chown suresh file.txt

# Change owner AND group
$ sudo chown suresh:devops file.txt

# Change group only
$ sudo chgrp docker /var/run/docker.sock

# Recursive (entire directory)
$ sudo chown -R www-data:www-data /var/www/

# Common DevOps scenario:
$ sudo chown -R deploy:deploy /opt/myapp/    # App user owns the app directory
⚠️ 777 Permissions

chmod 777 gives EVERYONE full access to read, write, and execute. This is a massive security risk. If an interviewer hears you suggest 777, the interview is over. Use specific permissions like 755 for executables, 644 for files.
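A quick way to audit for this mistake is to search for files that "others" can write to. The sketch below uses find's -perm test; find_world_writable is a made-up helper name, and the demo runs in a throwaway directory rather than on real application files.

```shell
# List regular files under a directory that are writable by "others".
# -perm -0002 means "the others-write bit is set".
find_world_writable() {
    find "$1" -type f -perm -0002
}

# Demo: one safe file, one risky file
demo=$(mktemp -d)
touch "$demo/safe.conf" "$demo/risky.sh"
chmod 644 "$demo/safe.conf"
chmod 777 "$demo/risky.sh"

found=$(find_world_writable "$demo")
echo "world-writable: $found"

# Tighten whatever turned up, then re-check
chmod o-w "$demo/risky.sh"
after=$(find_world_writable "$demo")
rm -rf "$demo"
```

On a real server you would point it at something like /opt/myapp or /var/www and review each hit before changing it.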

08 👥

User & Group Management

Create Users, Manage Access

Every process, every file, every service runs as a specific user. As a DevOps engineer, you create users for applications, add team members, manage group permissions, and control who can run Docker or access specific servers.
User Management Commands
TERMINAL
# Create a new user
$ sudo useradd -m -s /bin/bash deploy
# -m = create home directory (/home/deploy)
# -s = set shell to bash

# Set password
$ sudo passwd deploy
New password: ****

# Create user with specific UID and home
$ sudo useradd -m -u 1500 -s /bin/bash -d /opt/jenkins jenkins

# Modify existing user
$ sudo usermod -aG docker suresh     # Add suresh to docker group
$ sudo usermod -aG sudo suresh       # Add to sudo group (admin access)
$ sudo usermod -s /bin/zsh suresh    # Change shell to zsh
$ sudo usermod -L suresh             # Lock the password (SSH key logins still work)
$ sudo usermod -U suresh             # Unlock the password

# Delete user
$ sudo userdel deploy                # Delete user (keep home dir)
$ sudo userdel -r deploy             # Delete user AND home directory

# Switch user
$ su - deploy                        # Switch to deploy user
$ sudo -u deploy whoami              # Run command as deploy user

# Who am I?
$ whoami                             # Current username
$ id                                 # UID, GID, groups
uid=1000(suresh) gid=1000(suresh) groups=1000(suresh),27(sudo),999(docker)
Group Management
TERMINAL
# Create a group
$ sudo groupadd devops

# Add user to group (IMPORTANT: use -aG, not just -G)
$ sudo usermod -aG devops suresh
# -a = APPEND to groups (without -a, it REPLACES all groups!)
# -G = supplementary group

# Real-world: Add user to Docker group
$ sudo usermod -aG docker suresh
# Now suresh can run docker commands without sudo
# MUST log out and back in for group change to take effect!

# List groups for a user
$ groups suresh
suresh : suresh sudo docker devops

# See all members of a group
$ getent group docker
docker:x:999:suresh,deploy

# Remove user from group
$ sudo gpasswd -d suresh docker
What Happens When You Install a Service?
TERMINAL
# When you install nginx, mysql, jenkins, etc.:
$ sudo apt install nginx

# The package's install scripts typically:
# 1. Create (or reuse) a system user that cannot log in
#    On Debian/Ubuntu, nginx runs as the existing www-data user;
#    RHEL-family packages create a dedicated nginx user instead
$ grep www-data /etc/passwd
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
# /usr/sbin/nologin = this user CANNOT log in (security!)

# 2. Create a matching system group
$ grep www-data /etc/group
www-data:x:33:

# 3. Set file ownership
$ ls -la /var/log/nginx/
-rw-r----- 1 www-data adm access.log
-rw-r----- 1 www-data adm error.log

# Why? Security principle of least privilege:
# Nginx workers run as the www-data (or nginx) user
# If someone hacks nginx, they only have that user's limited permissions
# They can NOT read other users' private files or change system files
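To see which service accounts exist on a machine, you can filter the passwd file by login shell. This is a sketch: list_nologin_users is a made-up helper, and the sample file below stands in for /etc/passwd so the example behaves the same everywhere.

```shell
# Print accounts whose login shell is nologin or false.
# Reads a passwd-format file; defaults to /etc/passwd when no file is given.
list_nologin_users() {
    awk -F: '$7 ~ /(nologin|false)$/ {print $1 " (uid " $3 ")"}' "${1:-/etc/passwd}"
}

# Demo against a small sample in /etc/passwd format
sample_passwd=$(mktemp)
cat > "$sample_passwd" <<'EOF'
root:x:0:0:root:/root:/bin/bash
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
suresh:x:1000:1000:Suresh:/home/suresh:/bin/bash
EOF
service_accounts=$(list_nologin_users "$sample_passwd")
echo "$service_accounts"    # prints: www-data (uid 33)
rm -f "$sample_passwd"
```

Run it with no argument to inspect the real /etc/passwd.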
Important User Files
| File | What It Contains | Example Line |
|---|---|---|
| /etc/passwd | All user accounts | suresh:x:1000:1000:Suresh:/home/suresh:/bin/bash |
| /etc/shadow | Hashed passwords | suresh:$6$xyz...:19200:0:99999:7::: |
| /etc/group | All groups and members | docker:x:999:suresh,deploy |
| /etc/sudoers | Who can use sudo | suresh ALL=(ALL:ALL) ALL |
💡 Interview Classic

"When you install a service like nginx or MySQL, the package creates (or reuses) a dedicated system user with the /usr/sbin/nologin shell. This is for security: if the service is compromised, the attacker only gets that user's limited permissions, not root access."

09⚑

Process Management

Monitor, Kill & Control Running Programs

Every running program is a process with a unique PID (Process ID). DevOps engineers need to find runaway processes, kill hung services, monitor CPU/memory usage, and run background tasks.
View Processes
TERMINAL
# List all processes
$ ps aux
USER     PID %CPU %MEM    VSZ   RSS TTY STAT START TIME COMMAND
root       1  0.0  0.1 168940 11420 ?   Ss   Jun01 0:05 /sbin/init
nginx   1234  0.1  0.5  52340 41200 ?   S    10:30 0:12 nginx: worker
suresh  5678  2.3  1.2 743200 98432 ?   Sl   10:45 1:30 java -jar app.jar

# Find a specific process
$ ps aux | grep nginx
$ ps aux | grep java

# Process tree (who started whom)
$ pstree -p
systemd(1)─┬─nginx(1230)─┬─nginx(1231)
           │             └─nginx(1232)
           ├─sshd(800)───sshd(1500)───bash(1501)
           └─java(5678)

# Real-time monitoring
$ top       # Real-time process viewer
$ htop      # Better version (install: apt install htop)
# In top/htop: press M to sort by memory, P by CPU, q to quit
Kill Processes
TERMINAL
# Graceful kill (SIGTERM: asks nicely)
$ kill 5678                        # Send SIGTERM to PID 5678

# Force kill (SIGKILL: no mercy)
$ kill -9 5678                     # Forcefully terminate

# Kill by name
$ pkill nginx                      # Kill all nginx processes
$ pkill -f "java -jar app.jar"     # Kill by full command line match

# Kill all processes of a user
$ pkill -u deploy                  # Kill all processes owned by deploy user

# What signal numbers mean:
#   kill -15 = SIGTERM (graceful, default)
#   kill -9  = SIGKILL (force, last resort)
#   kill -1  = SIGHUP  (reload config, like nginx reload)
Background & Foreground
TERMINAL
# Run in background
$ ./long-running-script.sh &       # & puts it in background
[1] 12345                          # Job number and PID

# Run and survive logout
$ nohup ./script.sh > output.log 2>&1 &
# nohup        = don't stop when terminal closes
# > output.log = redirect stdout
# 2>&1         = redirect stderr to same place
# &            = run in background

# Job control
$ jobs          # List background jobs
$ fg %1         # Bring job 1 to foreground
$ bg %1         # Send job 1 to background
# Ctrl+Z = pause current foreground job
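The "try SIGTERM first, SIGKILL as a last resort" habit is easy to script. The sketch below is one way to do it, assuming you already know the PID; stop_gracefully is a made-up helper and the 10-second default is an arbitrary choice.

```shell
# Ask a process to exit with SIGTERM, wait, then fall back to SIGKILL.
# Usage: stop_gracefully PID [TIMEOUT_SECONDS]
stop_gracefully() {
    _pid=$1
    _left=${2:-10}
    # If the signal cannot be delivered, the process is already gone
    kill -TERM "$_pid" 2>/dev/null || return 0
    while [ "$_left" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the PID still exists
        kill -0 "$_pid" 2>/dev/null || return 0
        sleep 1
        _left=$((_left - 1))
    done
    echo "PID $_pid ignored SIGTERM, sending SIGKILL" >&2
    kill -KILL "$_pid" 2>/dev/null
}

# Demo: start a long sleep in the background and stop it
sleep 300 &
demo_pid=$!
stop_gracefully "$demo_pid" 3
```

For anything managed by systemd, prefer systemctl stop, which applies the same TERM-then-KILL sequence for you.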
10 🔄

Service Management: systemd

Start, Stop, Enable & Create Services

systemd manages all services on modern Linux. Every time you install nginx, docker, or jenkins, systemd controls it. You use systemctl to start, stop, enable, and check services.
Essential systemctl Commands
TERMINAL
$ sudo systemctl start nginx       # Start service NOW
$ sudo systemctl stop nginx        # Stop service NOW
$ sudo systemctl restart nginx     # Stop + Start (brief downtime)
$ sudo systemctl reload nginx      # Reload config without stopping (zero downtime!)
$ sudo systemctl status nginx      # Check if running, see recent logs
● nginx.service - A high performance web server
     Active: active (running) since Sat 2024-06-01 10:30:00 IST; 2h ago
   Main PID: 1234 (nginx)
      Tasks: 3 (limit: 4677)
     Memory: 8.5M
$ sudo systemctl enable nginx      # Start automatically on boot
$ sudo systemctl disable nginx     # Don't start on boot
$ sudo systemctl is-active nginx   # Just check: active or inactive
$ sudo systemctl is-enabled nginx  # Check: enabled or disabled

# List all services
$ systemctl list-units --type=service --state=running
Create Your Own Service
SERVICE FILE
# /etc/systemd/system/myapp.service
# systemd only treats a line as a comment when it STARTS with # or ;
# so every comment sits on its own line, never after a value.
[Unit]
Description=My Node.js Application
# Start after network is ready
After=network.target
# Prefer PostgreSQL to be running
Wants=postgresql.service

[Service]
Type=simple
# Run as deploy user (not root!)
User=deploy
Group=deploy
WorkingDirectory=/opt/myapp
ExecStart=/usr/bin/node server.js
# Auto-restart if it crashes
Restart=always
# Wait 5 seconds before restart
RestartSec=5
# Load environment variables
EnvironmentFile=/opt/myapp/.env
# Send logs to journalctl
StandardOutput=journal
StandardError=journal

[Install]
# Start when system boots normally
WantedBy=multi-user.target

# After creating the file:
$ sudo systemctl daemon-reload     # Tell systemd about new service
$ sudo systemctl start myapp       # Start it
$ sudo systemctl enable myapp      # Start on boot
$ journalctl -u myapp -f           # Watch logs in real-time
💡 Restart=always

This is one of the most important lines in a service file for DevOps. If your app crashes at 3 AM, systemd restarts it automatically after RestartSec (5 seconds here), with no manual intervention. Production services should normally set Restart=always, or Restart=on-failure if a clean exit should stay stopped.

11 📦

Package Management

Install, Update & Remove Software

Package managers download, install, update, and remove software with all dependencies handled automatically. Know BOTH apt (Ubuntu/Debian) and yum/dnf (RHEL/CentOS); you'll encounter both in the field.
TERMINAL
# ═══ APT (Ubuntu / Debian) ═══
$ sudo apt update                      # Refresh package list (always do first!)
$ sudo apt install nginx               # Install
$ sudo apt install nginx=1.24.0-1      # Install a specific version
$ sudo apt-mark hold nginx             # Hold (pin) it so apt upgrade leaves it alone
$ sudo apt remove nginx                # Remove (keep config files)
$ sudo apt purge nginx                 # Remove + delete config files
$ sudo apt autoremove                  # Remove unused dependencies
$ sudo apt upgrade                     # Upgrade ALL packages
$ sudo apt search redis                # Search for packages
$ apt list --installed                 # List what's installed
$ apt show nginx                       # Show package details

# ═══ YUM/DNF (RHEL / CentOS / Amazon Linux) ═══
$ sudo yum update                      # Update all packages
$ sudo yum install nginx               # Install
$ sudo yum remove nginx                # Remove
$ sudo yum list installed              # List installed
$ sudo yum search redis                # Search
$ sudo dnf install nginx               # dnf = modern replacement for yum
💡 Pin Versions in Production

Always install specific versions in production (apt install nginx=1.24.0-1) and hold them with apt-mark hold nginx. Installing a version on its own does not pin it: a later apt upgrade can still move nginx to a release with breaking changes. Your CI/CD pipeline should pin every dependency.

12 💾

Disk & Storage Management

Check Space, Mount Drives, Monitor I/O

Running out of disk space is one of the most common production incidents. Know how to check usage, find large files, mount external storage, and monitor disk I/O.
TERMINAL
# ═══ Check Disk Space ═══
$ df -h                     # Disk usage of all mounted filesystems
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda1    50G   32G   16G  67% /
/dev/sdb1   100G   45G   55G  45% /data
tmpfs       3.9G     0  3.9G   0% /dev/shm

# ═══ Check Directory Sizes ═══
$ du -sh /var/log/*         # Size of each item in /var/log
2.1G  /var/log/journal
450M  /var/log/nginx
120M  /var/log/syslog
$ du -sh /opt/myapp         # Total size of a directory
1.2G  /opt/myapp

# ═══ Find Largest Files ═══
$ du -ah / | sort -rh | head -20                     # Top 20 largest files/dirs
$ find / -type f -size +100M -exec ls -lh {} \;      # Files over 100MB

# ═══ Disk I/O Statistics ═══
$ iostat -x 1 5             # Disk I/O stats every 1 sec, 5 times
# Look for: %util (near 100% = bottleneck), await (high = slow disk)

# ═══ List Block Devices ═══
$ lsblk                     # Show all disks and partitions
NAME    SIZE TYPE MOUNTPOINT
sda      50G disk
├─sda1   49G part /
└─sda2    1G part [SWAP]
sdb     100G disk
└─sdb1  100G part /data
Mount & Unmount
TERMINAL
# Mount a new EBS volume (AWS)
$ sudo mkfs -t ext4 /dev/xvdf      # Format a brand-new, EMPTY volume only (erases all data!)
$ sudo mkdir /data                 # Create mount point
$ sudo mount /dev/xvdf /data       # Mount the volume
$ df -h /data                      # Verify

# Make it permanent (survives reboot)
$ sudo blkid /dev/xvdf             # Get UUID
/dev/xvdf: UUID="abc-123" TYPE="ext4"
$ sudo nano /etc/fstab             # Add this line:
UUID=abc-123 /data ext4 defaults,nofail 0 2

# Unmount
$ sudo umount /data                # Unmount (must not be in use)
$ sudo umount -l /data             # Lazy unmount: detach now, clean up once no longer busy
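Since a full disk is such a common incident, a small threshold check is worth having in cron or a monitoring hook. The sketch below parses df -P, which prints stable POSIX columns; check_disk and the 80% threshold are illustrative choices, not a standard tool.

```shell
# Warn when a filesystem crosses a usage threshold (in percent).
# Usage: check_disk MOUNT_POINT THRESHOLD
check_disk() {
    # Line 2 of df -P is the filesystem; column 5 is the capacity, e.g. "67%"
    _usage=$(df -P "$1" | awk 'NR == 2 {gsub("%", "", $5); print $5}')
    if [ "$_usage" -ge "$2" ]; then
        echo "WARNING: $1 is ${_usage}% full (threshold ${2}%)"
        return 1
    fi
    echo "OK: $1 is ${_usage}% full"
}

check_disk / 80 || echo "time to clean up or grow the volume"
```

The non-zero return code on a breach makes it easy to chain into an alert command.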
13 🌐

Networking

IPs, Ports, DNS & Troubleshooting

Network troubleshooting makes up a large share of DevOps debugging. When your app can't connect to the database, or users can't reach your website, these commands help you pinpoint where the connection is failing.
Check IPs & Interfaces
TERMINAL
$ ip addr show              # Show all network interfaces and IPs
$ ip addr show eth0         # Specific interface
$ ip route show             # Show routing table
default via 10.0.0.1 dev eth0    # Default gateway
$ hostname -I               # Quick way to get your IP
Check Open Ports: ss and netstat
TERMINAL
# ss = modern replacement for netstat (faster, part of iproute2 on most distros)
$ ss -tlnp                  # TCP listening ports with process names
State   Local Address:Port   Process
LISTEN  0.0.0.0:22           sshd
LISTEN  0.0.0.0:80           nginx
LISTEN  0.0.0.0:8080         java
LISTEN  127.0.0.1:3306       mysqld
# t=TCP, l=listening, n=numeric (don't resolve names), p=process
# Run with sudo to see process names for sockets you don't own

# netstat (older but still used)
$ netstat -tlnp                       # Similar output to ss -tlnp
$ netstat -an | grep ESTABLISHED      # Active connections
$ netstat -an | grep :8080            # Who's connected to port 8080?

# What process is using a specific port?
$ sudo lsof -i :8080        # Show process on port 8080
$ sudo fuser 8080/tcp       # PID using port 8080
Test Connectivity: ping, nc, curl
TERMINAL
# Ping: is the host reachable?
$ ping -c 4 google.com      # Send 4 packets
$ ping -c 4 10.0.1.50       # Ping internal server

# nc (netcat): is the PORT open?
$ nc -zv 10.0.1.50 3306     # Test if MySQL port is open
Connection to 10.0.1.50 3306 port [tcp/mysql] succeeded!
$ nc -zv db.server.com 5432 # Test PostgreSQL port

# Test a range of ports
$ nc -zv 10.0.1.50 8080-8090    # Scan ports 8080 to 8090

# curl: test HTTP endpoints
$ curl -v https://api.myapp.com/health
$ curl -I https://api.myapp.com     # Headers only (check status code)
HTTP/2 200
$ curl -o /dev/null -s -w "%{http_code}" https://myapp.com
200                                 # Just the status code

# wget: download files
$ wget https://releases.app.com/v2.1/app.tar.gz

# DNS lookups
$ dig myapp.com             # Full DNS query
$ nslookup myapp.com        # Simple DNS lookup
$ dig +short myapp.com      # Just the IP
52.66.123.45

# Trace network path
$ traceroute google.com     # Show every hop to destination
💡 Troubleshooting Order

1) ping: can I reach the host? (ICMP is often blocked, so a failed ping alone is not conclusive.) 2) nc -zv host port: is the port open? 3) curl: does the HTTP endpoint respond? 4) ss -tlnp on the server: is the service actually listening? This sequence resolves most connectivity issues.
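The first two steps of that sequence can be wrapped in a small script. This sketch assumes bash is installed (it uses bash's /dev/tcp pseudo-device, so it works even when nc is missing) along with timeout from coreutils; port_open and diagnose are made-up helper names, and the host and port in the last line are placeholders.

```shell
# Check whether a TCP port accepts connections.
# Usage: port_open HOST PORT
port_open() {
    # Inside bash -c, $0 is the host and $1 is the port
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Walk the checks in order and report where the chain breaks.
# Usage: diagnose HOST PORT
diagnose() {
    if ! ping -c 1 -W 2 "$1" > /dev/null 2>&1; then
        echo "ping failed (host down, or ICMP blocked by a firewall)"
    fi
    if port_open "$1" "$2"; then
        echo "port $2 on $1 is open"
    else
        echo "port $2 on $1 is closed or filtered"
        return 1
    fi
}

diagnose 127.0.0.1 22 || echo "next step: run ss -tlnp on the server"
```

A failed ping is reported but does not stop the run, because the port check is the more reliable signal.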

14 🔥

Firewall: UFW & iptables

Control Network Access

Firewalls block unwanted traffic. UFW is the simple frontend, iptables is the powerful backend. In cloud environments, you usually use Security Groups, but knowing Linux firewalls is essential for on-premises servers.
TERMINAL
# ═══ UFW (Ubuntu: Simple) ═══
$ sudo ufw allow 22/tcp               # Allow SSH FIRST, or enabling may lock you out
$ sudo ufw enable                     # Turn on firewall
$ sudo ufw status verbose             # Check status and rules
$ sudo ufw allow 80/tcp               # Allow HTTP
$ sudo ufw allow 443/tcp              # Allow HTTPS
$ sudo ufw allow from 10.0.0.0/8      # Allow entire internal network
$ sudo ufw deny from 203.0.113.50     # Block specific IP
$ sudo ufw delete allow 80/tcp        # Remove a rule

# ═══ iptables (Advanced: Low Level) ═══
$ sudo iptables -L -n                                                 # List all rules
$ sudo iptables -A INPUT -i lo -j ACCEPT                              # Allow loopback traffic
$ sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT   # Allow replies to existing connections
$ sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT                  # Allow HTTP
$ sudo iptables -A INPUT -p tcp --dport 22 -s 10.0.0.0/8 -j ACCEPT    # SSH from internal only
$ sudo iptables -A INPUT -j DROP                                      # Drop everything else (add LAST!)
15 📜

Shell Scripting

Automate Everything with Bash

Every DevOps engineer writes bash scripts for deployments, backups, health checks, cleanup, and monitoring. A well-written script replaces hours of manual work.
BASH SCRIPT
#!/bin/bash
# deploy.sh - Production deployment script
set -euo pipefail   # Exit on error, undefined vars, pipe fails
# set -e          = stop if ANY command fails
# set -u          = stop if you use an undefined variable
# set -o pipefail = catch errors in pipes

APP_DIR="/opt/myapp"
BACKUP_DIR="/opt/backups"
DATE=$(date +%Y%m%d_%H%M%S)

echo "[$(date)] Starting deployment..."

# Step 1: Backup current version
if [ -d "$APP_DIR" ]; then
    echo "Creating backup..."
    mkdir -p "$BACKUP_DIR"
    tar -czf "$BACKUP_DIR/app_$DATE.tar.gz" "$APP_DIR"
fi

# Step 2: Pull latest code
cd "$APP_DIR"
git pull origin main

# Step 3: Build
npm ci
npm run build

# Step 4: Restart service
sudo systemctl restart myapp

# Step 5: Health check
sleep 5
if curl -sf http://localhost:3000/health > /dev/null; then
    echo "[$(date)] Deployment successful!"
else
    echo "[$(date)] HEALTH CHECK FAILED! Rolling back..."
    # tar stored the paths without the leading /, so extract relative to /
    tar -xzf "$BACKUP_DIR/app_$DATE.tar.gz" -C /
    sudo systemctl restart myapp
    exit 1
fi
Key Bash Concepts
BASH
# Variables
NAME="suresh"
echo "Hello $NAME"                  # Output: Hello suresh
echo "Path is ${HOME}/projects"     # Use {} for clarity

# Conditionals
if [ -f "/opt/app/config.yml" ]; then
    echo "Config file exists"
elif [ -d "/opt/app" ]; then
    echo "Directory exists but no config"
else
    echo "Nothing exists"
fi
# Tests: -f file exists, -d directory exists, -z string is empty
#        -eq equal, -ne not equal, -gt greater than

# Loops
# (assumes the deploy user may run systemctl through sudo without a password)
for server in web1 web2 web3; do
    echo "Deploying to $server..."
    ssh "deploy@$server" 'cd /opt/app && git pull && sudo systemctl restart app'
done

# While loop
while ! curl -sf http://localhost:8080/health; do
    echo "Waiting for app to start..."
    sleep 2
done
echo "App is ready!"

# Functions
function deploy() {
    local app_name=$1
    echo "Deploying $app_name"
    ssh deploy@server "sudo systemctl restart $app_name"
}
deploy "order-service"
deploy "user-service"
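A while loop that waits for a health check will spin forever if the app never comes up. A bounded version is safer inside deployment scripts. The sketch below is one possible shape; wait_for is a made-up helper, and the attempt count and delay are arbitrary.

```shell
# Retry a command until it succeeds or the attempt limit is reached.
# Usage: wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    _max=$1
    _delay=$2
    shift 2
    _attempt=1
    while ! "$@"; do
        if [ "$_attempt" -ge "$_max" ]; then
            echo "Giving up after $_attempt attempts: $*" >&2
            return 1
        fi
        _attempt=$((_attempt + 1))
        sleep "$_delay"
    done
}

# Example: up to 30 tries, 2 seconds apart (URL taken from the loop above)
# wait_for 30 2 curl -sf http://localhost:8080/health
```

Because it returns non-zero on timeout, a script running under set -e stops instead of deploying on top of a dead service.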
⚠️ set -euo pipefail

ALWAYS start scripts with this. Without it, errors are silently ignored β€” a failing command doesn't stop the script. Your deployment continues with corrupt state. This line has saved millions of production incidents.
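The difference is easy to see for yourself. This sketch runs each case in its own bash process so the options do not leak into your current shell; UNDEFINED_VAR_EXAMPLE is a deliberately unset, made-up variable name.

```shell
# A pipeline whose FIRST command fails, with and without pipefail
without=$(bash -c 'false | true; echo $?')
with=$(bash -c 'set -o pipefail; false | true; echo $?')
echo "without pipefail: $without"   # 0: only the last command's status counts
echo "with pipefail:    $with"      # 1: the failing command is reported

# set -u turns a typo in a variable name into a hard error
if bash -c 'set -u; echo "$UNDEFINED_VAR_EXAMPLE"' 2>/dev/null; then
    unset_result="ignored"
else
    unset_result="caught"
fi
echo "unset variable: $unset_result"
```

Without pipefail, something like a failing curl piped into tar would still look like a success to the script.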

16 📝

Log Management

Find Problems Before Users Complain

Logs are the first thing you check when something breaks. Linux has a standard location for all logs, and systemd has journalctl for structured log queries.
Important Log Locations
| Log File | What It Contains | When to Check |
|---|---|---|
| /var/log/syslog | General system messages | System issues, service failures |
| /var/log/auth.log | Authentication attempts | SSH logins, sudo usage, failed logins |
| /var/log/kern.log | Kernel messages | Hardware errors, driver issues, OOM kills |
| /var/log/nginx/access.log | Nginx HTTP requests | Traffic analysis, 404s, slow requests |
| /var/log/nginx/error.log | Nginx errors | Config errors, upstream failures |
| /var/log/apt/history.log | Package installations | What was installed/updated and when |
journalctl: systemd Log Viewer
TERMINAL
# View logs for a specific service
$ journalctl -u nginx                          # All nginx logs
$ journalctl -u nginx --since today            # Today's nginx logs
$ journalctl -u nginx --since "1 hour ago"     # Last hour
$ journalctl -u nginx -f                       # Follow in real-time (like tail -f)
$ journalctl -u nginx -n 50                    # Last 50 lines

# View system-wide
$ journalctl -b                                # Logs since last boot
$ journalctl -p err                            # Only errors and above
$ journalctl --since "2024-06-01 10:00" --until "2024-06-01 11:00"

# Search across all logs
$ journalctl | grep "Out of memory"            # Find OOM kills
$ journalctl -u myapp --no-pager | grep ERROR
Log Rotation: logrotate
LOGROTATE
# /etc/logrotate.d/myapp
# Comments are kept on their own lines, which is the safest form for logrotate
/var/log/myapp/*.log {
    # Rotate every day
    daily
    # Keep 14 rotated files
    rotate 14
    # Compress old logs with gzip
    compress
    # Don't compress yesterday's (in case needed)
    delaycompress
    # Don't error if log file is missing
    missingok
    # Don't rotate empty files
    notifempty
    # Run after rotation (reload needs ExecReload= in the unit; otherwise use restart)
    postrotate
        systemctl reload myapp
    endscript
}
💡 Always tail -f First

When debugging a live issue, your first command should be: tail -f /var/log/myapp/error.log. Watch the errors flow in real-time while you reproduce the problem. This is the #1 debugging technique for DevOps engineers.

17 🔧

System Administration

Cron Jobs, System Info & Performance

Daily sysadmin tasks: schedule automated jobs, check system performance, monitor resources, and manage hostnames/time.
Cron: Schedule Automated Tasks
CRONTAB
# Edit crontab (task scheduler)
$ crontab -e

# Format: minute hour day month weekday command
# ┌───────────── minute (0-59)
# │ ┌───────────── hour (0-23)
# │ │ ┌───────────── day of month (1-31)
# │ │ │ ┌───────────── month (1-12)
# │ │ │ │ ┌───────────── day of week (0-7, Sun=0 or 7)
# * * * * * command

# Examples (comments go on their own line; crontab treats trailing text as part of the command):
# Every day at 2:00 AM
0 2 * * * /opt/scripts/backup.sh
# Every 5 minutes
*/5 * * * * curl -sf http://localhost:8080/health
# Every Sunday at midnight
0 0 * * 0 /opt/scripts/weekly-cleanup.sh
# Mon-Fri at 9 AM
0 9 * * 1-5 /opt/scripts/daily-report.sh

# List scheduled jobs
$ crontab -l

# View cron logs
$ grep CRON /var/log/syslog
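One thing cron will not do for you is stop a job from overlapping with its own previous run, which matters for frequent jobs like the every-5-minutes example. The sketch below uses mkdir as an atomic lock; run_once and the lock path are made-up names. Where it is installed, flock(1) is a sturdier alternative because the lock disappears automatically if the job is killed.

```shell
# Run a command only if no previous run is still active.
# Usage: run_once LOCK_DIR COMMAND [ARGS...]
run_once() {
    _lock=$1
    shift
    # mkdir either creates the directory or fails, atomically
    if ! mkdir "$_lock" 2>/dev/null; then
        echo "Previous run still active, skipping" >&2
        return 0
    fi
    "$@"
    _status=$?
    rmdir "$_lock"
    return "$_status"
}

# Example (script path taken from the crontab above; lock path is a placeholder):
# run_once /tmp/backup.lock /opt/scripts/backup.sh
```

A crash between mkdir and rmdir leaves a stale lock behind, so pair this with an alert or a cleanup rule if you use it for real.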
System Information
TERMINAL
# System overview
$ uname -a                  # Kernel version, architecture
Linux web1 5.15.0-1043-aws #48-Ubuntu SMP x86_64 GNU/Linux
$ uptime                    # How long running + load average
10:30:15 up 45 days, 3:21, 2 users, load average: 0.75, 0.82, 0.68
# Load average: 1-min, 5-min, 15-min
# On 4-core: load 4.0 = fully used, >4.0 = overloaded

$ free -h                   # Memory usage
       total  used  free  shared  buff/cache  available
Mem:    7.7G  3.2G  1.1G    120M        3.4G       4.1G
Swap:   2.0G  100M  1.9G
# "available" = actual usable memory (includes reclaimable cache)

# CPU info
$ nproc                     # Number of CPU cores
4
$ lscpu                     # Detailed CPU information

# System performance snapshot
$ vmstat 1 5                # Virtual memory stats every 1 sec, 5 times
# Watch: r (run queue), si/so (swap in/out, should be 0)
18 💼
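The "load versus core count" rule of thumb above can be turned into a single number. This sketch reads /proc/loadavg (Linux only) and divides the 1-minute load by the output of nproc; load_per_core is a made-up helper name.

```shell
# Express the 1-minute load average as a fraction of available cores.
# Around 1.00 means the CPUs are fully used; well above 1.00 means work is queuing.
load_per_core() {
    awk -v cores="$(nproc)" '{printf "%.2f\n", $1 / cores}' /proc/loadavg
}

echo "load per core: $(load_per_core)"
```

With the sample uptime output above (load 0.75 on 4 cores), this would report 0.19.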

Interview Questions

40+ Linux Q&A for DevOps

These Linux questions come up regularly in DevOps interviews, from fresher to senior positions.
Commands & Files
❓
Difference between find and grep?
find searches for FILES by name, size, date, permissions. grep searches INSIDE files for text patterns. find locates the file, grep reads its content.
❓
find -mtime vs -ctime?
mtime = content modification time (file was edited). ctime = metadata change time (permissions, ownership changed). mtime -7 = modified in last 7 days.
❓
How to find large files?
find / -type f -size +100M lists files larger than 100MB. Combine with du -sh /var/log/* | sort -rh | head to find largest directories.
❓
head vs tail?
head -20 shows first 20 lines. tail -20 shows last 20 lines. tail -f follows a file in real-time (essential for debugging live logs).
Users & Permissions
❓
chmod 755 vs 644?
755 (rwxr-xr-x): for scripts and directories. 644 (rw-r--r--): for config files and regular files. 600 (rw-------): for secrets and private keys.
❓
How to add user to docker group?
sudo usermod -aG docker username. The -a flag APPENDS to groups. Without -a, it REPLACES all groups (dangerous!). Must log out and back in.
❓
What happens when you install a service?
Linux creates a system user with /usr/sbin/nologin shell, creates a group, sets file ownership. Security: if service is hacked, attacker only gets limited permissions.
❓
What is /etc/passwd vs /etc/shadow?
passwd: user accounts (username, UID, shell). shadow: encrypted passwords (only readable by root). Separated for security.
Processes & Services
❓
How to find which process uses a port?
ss -tlnp | grep :8080 or sudo lsof -i :8080 or sudo fuser 8080/tcp. Shows the PID and process name.
❓
kill vs kill -9?
kill (SIGTERM): asks process to shut down gracefully (save data, close connections). kill -9 (SIGKILL): force kills immediately (no cleanup). Always try kill first.
❓
nohup purpose?
nohup command & runs a process that survives when you close the terminal. Without nohup, background processes die when your SSH session disconnects.
❓
systemctl reload vs restart?
reload: reads new config without stopping service (zero downtime). restart: stops and starts (brief downtime). Always use reload for Nginx in production.
Networking & Troubleshooting
❓
ss vs netstat?
ss is modern and faster (uses kernel netlink). netstat is older, being deprecated. Use ss -tlnp to show listening TCP ports with process names.
❓
nc (netcat) usage?
nc -zv host port tests if a port is open. Essential for checking: can my app server reach the database on port 3306? Faster than telnet.
❓
How to troubleshoot connectivity?
1) ping host (reachable?), 2) nc -zv host port (port open?), 3) curl endpoint (HTTP working?), 4) ss -tlnp on server (service listening?).
❓
What is /etc/hosts?
Local DNS override file. Maps hostnames to IPs. Checked BEFORE DNS servers. Use for testing: add 10.0.1.50 api.myapp.com to test against a specific server.
Storage & Disk
❓
df vs du?
df shows filesystem-level usage (how full is /dev/sda1). du shows directory-level usage (how big is /var/log). df for overview, du for drilling down.
❓
How to mount a volume?
mkdir /data, mount /dev/xvdf /data, add to /etc/fstab for permanent mount. In AWS, after attaching EBS volume, you must format (mkfs) and mount it.
❓
iostat purpose?
Shows disk I/O statistics. %util near 100% = disk bottleneck. High await = slow disk. Essential for diagnosing slow database or application performance.
❓
What is swap?
Virtual memory on disk. When RAM is full, Linux moves inactive pages to swap. High swap usage = need more RAM. Check with free -h and swapon --show.
βœ“Master cd, ls, grep, find, cat, tail -f β€” they're 80% of your daily work
βœ“Know chmod numbers: 755 (scripts), 644 (files), 600 (secrets)
βœ“Always use set -euo pipefail in bash scripts
βœ“Use ss -tlnp (not netstat) for port checking
βœ“Know systemctl start/stop/restart/reload/enable/status
βœ“Know user management: useradd, usermod -aG, groups, /etc/passwd
βœ“Know journalctl -u service -f for live log monitoring
βœ“Know cron syntax: minute hour day month weekday command
βœ“Know tar -czvf (create) and tar -xzvf (extract)
βœ“Understand Linux directory structure (/etc, /var/log, /opt, /proc)