From your first script to production automation — variables, loops, functions, error handling, real-world scripts, and interview prep.
15 Chapters · 50+ Scripts · 100% Free
01📜
Introduction to Shell Scripting
Why Every DevOps Engineer Writes Scripts
Shell scripting means writing a series of commands in a file that the shell executes one by one. Instead of typing 10 commands manually every day, you put them in a script and run it with one command. That is the point of shell scripting: automating repetitive tasks. DevOps engineers rely on shell scripts for deployments, backups, monitoring, and cleanup.
Your First Script
#!/bin/bash
# my_first_script.sh — The shebang line tells Linux which shell to use
echo "Hello! I am a shell script"
echo "Today is: $(date)"
echo "You are logged in as: $(whoami)"
echo "Current directory: $(pwd)"
# Make it executable and run:
# chmod +x my_first_script.sh
# ./my_first_script.sh
What is #!/bin/bash?
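Called the "shebang" or "hashbang", this must be the very first line of the script. It tells Linux: "use the bash shell to run this file." Without it, a script started as ./script.sh runs under whatever shell the caller falls back to (often plain sh), so bash-only features may break. Always include it.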
02📦
Variables
Store and Use Data
Variables are like labeled boxes — you put a value inside and use the label to get it back. In bash, you create a variable with NAME=value (no spaces around =) and access it with $NAME.
#!/bin/bash
# Variables — NO spaces around the = sign!
APP_NAME="order-service"
VERSION="2.1.0"
PORT=8080
DEPLOY_DIR="/opt/apps"
echo "Deploying $APP_NAME version $VERSION on port $PORT"
echo "Install directory: $DEPLOY_DIR"
# Command output as variable
CURRENT_DATE=$(date +%Y-%m-%d)
HOST_NAME=$(hostname) # bash already sets HOSTNAME itself, so use a different name
FREE_MEM=$(free -m | awk '/Mem:/ {print $4}')
echo "Date: $CURRENT_DATE"
echo "Host: $HOST_NAME"
echo "Free memory: ${FREE_MEM}MB"
# Read-only variable (cannot be changed)
readonly DB_HOST="prod-db.company.com"
# Environment variable (visible to child processes)
export API_KEY="abc123"
Type | Syntax | Example
String | NAME="value" | APP="nginx"
Number | COUNT=5 | RETRIES=3
Command output | VAR=$(command) | TODAY=$(date)
Read-only | readonly VAR="val" | readonly DB="prod-db"
Environment | export VAR="val" | export PATH="/usr/local/bin:$PATH"
⚠️ No Spaces Around =
APP_NAME = "nginx" is WRONG (bash thinks APP_NAME is a command). APP_NAME="nginx" is correct. This is the #1 beginner mistake.
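To see what "visible to child processes" means in practice, here is a minimal sketch. The names DEMO_LOCAL and DEMO_SHARED are made up for this illustration; the child process is simply another bash started with bash -c.

```shell
#!/bin/bash
# DEMO_LOCAL and DEMO_SHARED are hypothetical names used only in this sketch.
DEMO_LOCAL="only in this shell"
export DEMO_SHARED="visible to children"

# bash -c starts a child process. The single quotes stop the parent shell
# from expanding the variables, so the child has to look them up itself.
child_view=$(bash -c 'echo "local=[${DEMO_LOCAL:-}] shared=[${DEMO_SHARED:-}]"')

echo "$child_view"
# Prints: local=[] shared=[visible to children]
```

The plain variable never reaches the child, which is why tools started by your script only see values you exported.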
03🖥️
Input & Output
Echo, Read, Printf
Scripts need to show output (echo) and sometimes ask for input (read). These are the basic building blocks for interactive scripts.
#!/bin/bash
# Echo — print output
echo "Simple message"
echo -n "No newline at end" # stays on same line
echo -e "Line 1\nLine 2" # \n = newline, \t = tab
# Printf — formatted output (more control)
printf "Name: %-15s Age: %d\n" "Suresh" 28
printf "Price: $%.2f\n" 49.9
# Read — get user input
echo -n "Enter server name: "
read SERVER_NAME
echo "You entered: $SERVER_NAME"
# Read with prompt (cleaner)
read -p "Enter environment (dev/prod): " ENV
read -sp "Enter password: " PASSWORD # -s = silent (hidden)
echo "" # newline after hidden input
# Read with timeout
read -t 10 -p "Confirm deploy? (y/n): " CONFIRM
# Waits max 10 seconds for input
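User input is rarely clean, so it usually helps to apply a default and validate the answer. The sketch below is one possible way to do that; ask_env is a hypothetical helper name, and the allowed values (dev, prod) are taken from the prompt in the example above.

```shell
#!/bin/bash
# ask_env is a hypothetical helper, not a standard command.
ask_env() {
    local answer
    # -r keeps backslashes literal. The prompt is shown only on a terminal.
    read -r -p "Enter environment (dev/prod) [dev]: " answer || true
    # Fall back to "dev" when the user just presses Enter or input ends.
    answer=${answer:-dev}
    case "$answer" in
        dev|prod) printf '%s\n' "$answer" ;;
        *) echo "Unknown environment: $answer" >&2; return 1 ;;
    esac
}

# Feed the answer from a here-string instead of the keyboard.
ask_env <<< "prod"
```

Because the function prints the validated value, a caller can capture it with ENV=$(ask_env) and check the exit status before going further.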
04🔀
Conditionals
If, Elif, Else — Make Decisions
Conditionals let your script make decisions. "If the server is running, do nothing. If it is down, restart it." This is the brain of every automation script.
#!/bin/bash
# Basic if-else
SERVER_STATUS=$(systemctl is-active nginx)
if [ "$SERVER_STATUS" = "active" ]; then
echo "Nginx is running — all good"
elif [ "$SERVER_STATUS" = "inactive" ]; then
echo "Nginx is down — restarting..."
sudo systemctl start nginx
else
echo "Nginx status unknown: $SERVER_STATUS"
fi
# File checks
if [ -f "/etc/nginx/nginx.conf" ]; then
echo "Config file exists"
fi
if [ -d "/var/log/nginx" ]; then
echo "Log directory exists"
fi
if [ ! -f "/tmp/deploy.lock" ]; then
echo "No deploy in progress — safe to deploy"
fi
# Number comparisons
DISK_USAGE=$(df / | tail -1 | awk "{print \$5}" | tr -d "%")
if [ "$DISK_USAGE" -gt 80 ]; then
echo "WARNING: Disk usage is ${DISK_USAGE}%!"
elif [ "$DISK_USAGE" -gt 60 ]; then
echo "Disk usage is moderate: ${DISK_USAGE}%"
else
echo "Disk usage OK: ${DISK_USAGE}%"
fi
Test | Meaning | Example
-f file | File exists and is a regular file | [ -f /etc/hosts ]
-d dir | Directory exists | [ -d /var/log ]
-z string | String is empty | [ -z "$VAR" ]
-n string | String is not empty | [ -n "$VAR" ]
-eq | Numbers equal | [ "$A" -eq 5 ]
-gt / -lt | Greater / less than | [ "$A" -gt 10 ]
= | Strings equal | [ "$A" = "yes" ]
!= | Strings not equal | [ "$A" != "no" ]
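The tests in the table combine naturally inside a small function. This is a sketch only: classify_usage is a hypothetical name, and the 80 and 60 thresholds are simply reused from the disk usage example above.

```shell
#!/bin/bash
# classify_usage is a hypothetical helper; thresholds mirror the disk example.
classify_usage() {
    local pct=${1:-}
    if [ -z "$pct" ]; then          # -z: no value was passed
        echo "UNKNOWN"
    elif [ "$pct" -gt 80 ]; then    # -gt: numeric greater-than
        echo "WARNING"
    elif [ "$pct" -gt 60 ]; then
        echo "MODERATE"
    else
        echo "OK"
    fi
}

classify_usage 85   # WARNING
classify_usage 61   # MODERATE
classify_usage 12   # OK
```

Note that -gt is strict, so a value of exactly 80 lands in MODERATE, not WARNING.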
05🔄
Loops
Repeat Actions Automatically
Loops repeat commands multiple times. Deploy to 5 servers? Loop. Check 10 services? Loop. Process 100 log files? Loop. This is where scripts become powerful.
#!/bin/bash
# For loop — iterate over a list
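for SERVER in web1 web2 web3 db1; do
    echo "Checking $SERVER..."
    # Test the command directly instead of checking $? afterwards;
    # this also keeps working when the script runs under set -e.
    if ping -c 1 -W 2 "$SERVER" > /dev/null 2>&1; then
        echo "  $SERVER is UP"
    else
        echo "  $SERVER is DOWN!"
    fi
done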
# For loop — iterate over files
for FILE in /var/log/*.log; do
SIZE=$(du -sh "$FILE" | cut -f1)
echo "$FILE: $SIZE"
done
# For loop — number range
for i in {1..5}; do
echo "Attempt $i of 5"
done
# While loop — repeat until condition is false
RETRIES=0
MAX_RETRIES=5
while [ "$RETRIES" -lt "$MAX_RETRIES" ]; do
    if curl -sf http://localhost:8080/health > /dev/null; then
        echo "App is healthy!"
        break
    fi
    RETRIES=$((RETRIES + 1))
    echo "Retry $RETRIES/$MAX_RETRIES, waiting 5 seconds..."
    sleep 5
done
# While read — process file line by line
while IFS= read -r LINE; do
echo "Processing: $LINE"
done < servers.txt
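The retry pattern above shows up so often that it is worth wrapping in a function. The following is a sketch under assumptions: retry is a hypothetical helper name, and its argument order (max attempts, delay in seconds, then the command) is just one reasonable choice.

```shell
#!/bin/bash
# retry is a hypothetical helper: retry MAX DELAY COMMAND [ARGS...]
retry() {
    local max=$1 delay=$2
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max" ]; do
        # "$@" runs the remaining arguments as a command.
        if "$@"; then
            return 0
        fi
        echo "Attempt $attempt/$max failed" >&2
        attempt=$((attempt + 1))
        # Sleep only when another attempt will follow.
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

retry 3 0 true && echo "command succeeded"
```

In a real script the call might look like retry 5 5 curl -sf http://localhost:8080/health, which reuses the health URL from the loop above.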
06🧩
Functions
Reusable Code Blocks
Functions let you write code once and call it many times. Instead of copying the same 10 lines in 5 places, put them in a function and call it. When you fix a bug, you fix it in one place.
#!/bin/bash
# Define a function
log() {
echo "[$(date +"%Y-%m-%d %H:%M:%S")] $1"
}
# Use it
log "Starting deployment"
log "Building application"
log "Deployment complete"
# Output: [2024-06-01 10:30:15] Starting deployment
# Function with return value
is_service_running() {
local SERVICE=$1
systemctl is-active --quiet "$SERVICE"
return $? # 0 if the service is active (true), non-zero otherwise (false)
}
if is_service_running "nginx"; then
log "Nginx is running"
else
log "Nginx is down — starting it"
sudo systemctl start nginx
fi
# Function with multiple parameters
deploy() {
local APP=$1
local ENV=$2
local VERSION=$3
log "Deploying $APP v$VERSION to $ENV"
# deployment logic here
}
deploy "order-service" "staging" "2.1.0"
deploy "user-service" "production" "1.5.2"
local keyword
Always use local for variables inside functions. Without local, the variable is GLOBAL and can accidentally overwrite variables outside the function. local APP=value keeps it contained.
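A short experiment makes the difference visible. The function names with_local and without_local are invented for this sketch.

```shell
#!/bin/bash
APP="global-app"

# Hypothetical functions that both assign to APP.
with_local() {
    local APP="inner-value"    # exists only while the function runs
}
without_local() {
    APP="overwritten"          # changes the global variable
}

with_local
after_local=$APP
without_local
after_global=$APP

echo "after with_local:    $after_local"     # global-app
echo "after without_local: $after_global"    # overwritten
```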
07✂️
String Operations
Cut, Replace, Measure
#!/bin/bash
STR="Hello World DevOps"
# Length
echo ${#STR} # 18
# Substring
echo ${STR:0:5} # Hello (from position 0, take 5 chars)
echo ${STR:6} # World DevOps (from position 6 to end)
# Replace
FILE="app-v1.2.3.tar.gz"
echo ${FILE%.tar.gz} # app-v1.2.3 (remove suffix)
echo ${FILE##*.} # gz (get extension)
echo ${FILE/v1/v2} # app-v2.2.3.tar.gz (replace first match)
# Default values (super useful!)
ENV=${1:-"development"} # Use $1 if provided, otherwise "development"
echo "Environment: $ENV"
DB_PORT=${DB_PORT:-5432} # Use env var if set, otherwise 5432
echo "DB Port: $DB_PORT"
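These expansions chain well. As a sketch, here is how a release file name could be split into name and version without calling sed or awk; it assumes the file follows a name-vX.Y.Z.tar.gz pattern like the example above.

```shell
#!/bin/bash
# Assumes a "<name>-v<version>.tar.gz" naming pattern.
FILE="app-v1.2.3.tar.gz"

BASE=${FILE%.tar.gz}     # strip the suffix        -> app-v1.2.3
NAME=${BASE%%-v*}        # drop from "-v" onwards  -> app
VERSION=${BASE##*-v}     # drop up to "-v"         -> 1.2.3
MAJOR=${VERSION%%.*}     # keep the first number   -> 1

echo "name=$NAME version=$VERSION major=$MAJOR"
# Prints: name=app version=1.2.3 major=1
```

A single % or # removes the shortest match and a doubled %% or ## removes the longest, which is why %%-v* cuts everything from the first "-v".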
08📋
Arrays
Work with Lists of Items
#!/bin/bash
# Create array
SERVERS=("web1" "web2" "web3" "db1" "db2")
PORTS=(80 443 8080 3306)
# Access elements
echo ${SERVERS[0]} # web1 (first element)
echo ${SERVERS[2]} # web3 (third element)
echo ${SERVERS[@]} # all elements
echo ${#SERVERS[@]} # 5 (count)
# Loop through array
for SERVER in "${SERVERS[@]}"; do
echo "Pinging $SERVER..."
done
# Add element
SERVERS+=("web4")
# Real-world: deploy to multiple environments
ENVIRONMENTS=("dev" "staging" "production")
for ENV in "${ENVIRONMENTS[@]}"; do
echo "Deploying to $ENV..."
kubectl apply -f "k8s/$ENV/" 2>/dev/null
done
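Arrays can also be built up while looping, which is handy for filtering. The sketch below picks the web servers out of a mixed list; the server names and the web* naming convention are assumptions for illustration.

```shell
#!/bin/bash
SERVERS=("web1" "web2" "db1" "db2")
WEB_SERVERS=()

for SERVER in "${SERVERS[@]}"; do
    case "$SERVER" in
        web*) WEB_SERVERS+=("$SERVER") ;;   # keep names starting with "web"
    esac
done

# ${#ARR[@]} is the element count, ${ARR[*]} joins elements with spaces.
echo "${#WEB_SERVERS[@]} web servers: ${WEB_SERVERS[*]}"
# Prints: 2 web servers: web1 web2
```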
Without error handling, a failing command does not stop the script. Your deployment continues with corrupt state. Error handling is the difference between a toy script and a production script.
#!/bin/bash
set -euo pipefail
# set -e → Stop if ANY command fails
# set -u → Stop if undefined variable is used
# set -o pipefail → Catch errors in piped commands
# Lock file: prevent two deploys at the same time
LOCKFILE="/tmp/deploy.lock"
if [ -f "$LOCKFILE" ]; then
    echo "ERROR: Deploy already in progress!"
    exit 1
fi
touch "$LOCKFILE"
# Trap: run cleanup on exit (success or failure).
# Registered only after the lock is ours, so a refused run
# does not delete the lock of the deploy that is still running.
cleanup() {
    echo "Cleaning up temp files..."
    rm -f /tmp/deploy_*.tmp
    rm -f "$LOCKFILE"
}
trap cleanup EXIT # EXIT also fires when set -e aborts, so a separate ERR trap is not needed
# Exit codes
check_service() {
if ! systemctl is-active --quiet "$1"; then
echo "ERROR: $1 is not running"
return 1 # return non-zero = failure
fi
return 0 # return 0 = success
}
check_service "nginx" || exit 1
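You can watch an EXIT trap clean up after a failure without touching a real service. In this sketch the failing step is simulated with false, and workdir is a throwaway directory created just for the demo.

```shell
#!/bin/bash
# workdir is a throwaway directory made only for this demonstration.
workdir=$(mktemp -d)

# The parentheses run the steps in a subshell, so set -e and the trap
# stay contained and the outer script can inspect the result.
(
    set -euo pipefail
    trap 'rm -rf "$workdir"' EXIT
    touch "$workdir/deploy_step1.tmp"
    false                  # simulated failing step: set -e stops here
    echo "never reached"
)
status=$?

echo "subshell exit status: $status"
if [ -e "$workdir" ]; then
    echo "temp files left behind"
else
    echo "temp files cleaned up"
fi
```

The subshell stops at the failing step, yet the trap still removes the directory on the way out.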
⚠️ ALWAYS use set -euo pipefail
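Without this line, a failed command is silently ignored and the script continues. Your database backup failed? The script keeps going and deletes the old backup, and now you have NO backup. set -euo pipefail stops the script at the first failure instead. It is not magic, though: commands tested in an if condition or chained with && and || are exempt from set -e, so important steps still deserve explicit checks.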
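The effect of each option is easy to check in isolation. The sketch below runs tiny throwaway scripts through bash -c with and without the options and compares what happens.

```shell
#!/bin/bash
# Without pipefail the pipeline reports the status of the LAST command only.
plain_pipe=$(bash -c 'false | true; echo $?')
strict_pipe=$(bash -c 'set -o pipefail; false | true; echo $?')

# Without set -e the script carries on after a failure.
plain_run=$(bash -c 'false; echo "kept going"')
strict_run=$(bash -c 'set -e; false; echo "kept going"' || true)

echo "pipeline status without pipefail: $plain_pipe"    # 0
echo "pipeline status with pipefail:    $strict_pipe"   # 1
echo "output without set -e: [$plain_run]"              # [kept going]
echo "output with set -e:    [$strict_run]"             # []
```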
11🔤
Text Processing
awk, sed, cut, tr — Transform Data
#!/bin/bash
# awk — extract columns (like a spreadsheet)
# Print 1st column (IP addresses) from access log
awk "{print \$1}" access.log | sort | uniq -c | sort -rn | head -10
# Print specific fields
echo "John:28:Engineer" | awk -F: "{print \$1, \$3}"
# Output: John Engineer
# sed — find and replace in files
sed -i "s/old-version/new-version/g" config.yml # replace in file
sed -i "s/DEBUG=true/DEBUG=false/" .env # toggle debug off
sed -n "10,20p" logfile.txt # print lines 10-20
# cut — extract by delimiter
echo "user:password:uid:gid" | cut -d: -f1,3
# Output: user:uid
# tr — translate/delete characters
echo "HELLO WORLD" | tr "A-Z" "a-z" # hello world
echo "extra    spaces" | tr -s " " # extra spaces (repeated spaces squeezed into one)
# sort + uniq — count unique values
cat access.log | awk "{print \$1}" | sort | uniq -c | sort -rn
# Shows: count IP_address (most frequent first)
# Combine them (the UNIX philosophy)
# Find top 5 error types in last 1000 lines of log
tail -1000 app.log | grep "ERROR" | awk -F":" "{print \$4}" | sort | uniq -c | sort -rn | head -5
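If you have no access log at hand, you can try the same pipeline on inline sample data. The log lines below are invented for this sketch and only loosely resemble a real access log.

```shell
#!/bin/bash
# Invented sample data: IP address, method, path.
log_data='10.0.0.1 GET /
10.0.0.2 GET /login
10.0.0.1 POST /login
10.0.0.3 GET /
10.0.0.1 GET /health'

# Count requests per IP, most frequent first.
printf '%s\n' "$log_data" | awk '{print $1}' | sort | uniq -c | sort -rn

# Keep only the busiest IP address.
top_ip=$(printf '%s\n' "$log_data" | awk '{print $1}' | sort | uniq -c | sort -rn | head -1 | awk '{print $2}')
echo "Top IP: $top_ip"
# Prints: Top IP: 10.0.0.1
```

uniq -c only merges adjacent duplicate lines, which is why the first sort has to come before it.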
12🚀
Real-World Scripts
Production Automation
These are scripts that DevOps engineers use daily in production. Copy them, modify them, use them.
Deployment Script with Rollback
#!/bin/bash
set -euo pipefail
APP="order-service"
DEPLOY_DIR="/opt/$APP"
BACKUP_DIR="/opt/backups"
DATE=$(date +%Y%m%d_%H%M%S)
log() { echo "[$(date +%H:%M:%S)] $1"; }
log "Starting deployment of $APP"
# Step 1: Backup (no "|| true": if the backup fails, abort, because the rollback below depends on it)
log "Creating backup..."
mkdir -p "$BACKUP_DIR"
tar -czf "$BACKUP_DIR/${APP}_${DATE}.tar.gz" "$DEPLOY_DIR"
# Step 2: Pull latest code
log "Pulling latest code..."
cd "$DEPLOY_DIR"
git pull origin main
# Step 3: Build
log "Building..."
npm ci && npm run build
# Step 4: Restart
log "Restarting service..."
sudo systemctl restart "$APP"
# Step 5: Health check
sleep 5
if curl -sf http://localhost:3000/health > /dev/null; then
log "Deployment successful!"
else
log "HEALTH CHECK FAILED! Rolling back..."
tar -xzf "$BACKUP_DIR/${APP}_${DATE}.tar.gz" -C /
sudo systemctl restart "$APP"
log "Rolled back to previous version"
exit 1
fi
Automated Backup Script
#!/bin/bash
set -euo pipefail
# Run daily via cron: 0 2 * * * /opt/scripts/backup.sh
BACKUP_DIR="/opt/backups"
DATE=$(date +%Y%m%d)
RETENTION_DAYS=14
# Backup database
mysqldump -u root mydb | gzip > "$BACKUP_DIR/db_${DATE}.sql.gz"
# Backup application
tar -czf "$BACKUP_DIR/app_${DATE}.tar.gz" /opt/myapp/
# Upload to S3
aws s3 cp "$BACKUP_DIR/db_${DATE}.sql.gz" s3://my-backups/db/
aws s3 cp "$BACKUP_DIR/app_${DATE}.tar.gz" s3://my-backups/app/
# Delete old local backups
find "$BACKUP_DIR" -name "*.gz" -mtime +"$RETENTION_DAYS" -delete
echo "[$(date)] Backup complete"
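Before old backups are deleted, it can be worth confirming that the new archive is actually readable. This is a sketch of one possible check, not part of the script above; it builds a small archive in temporary directories so it can run anywhere.

```shell
#!/bin/bash
# Temporary stand-ins for the real application and backup directories.
src=$(mktemp -d)
dest=$(mktemp -d)
echo "sample data" > "$src/data.txt"

archive="$dest/app_$(date +%Y%m%d).tar.gz"
tar -czf "$archive" -C "$src" .

# gzip -t checks the compressed stream; tar -tzf proves the listing is readable.
if gzip -t "$archive" && tar -tzf "$archive" > /dev/null; then
    verified="yes"
    echo "Archive verified: safe to prune old backups"
else
    verified="no"
    echo "Archive is damaged: keep the old backups" >&2
fi

rm -rf "$src" "$dest"
```

In the backup script, the find ... -delete step could then run only when the check succeeds.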
13📅
Daily Useful Scripts
Copy and Use Immediately
Disk Space Alert
#!/bin/bash
# disk_alert.sh — Alert if disk usage exceeds threshold
THRESHOLD=80
USAGE=$(df / | tail -1 | awk "{print \$5}" | tr -d "%")
if [ "$USAGE" -gt "$THRESHOLD" ]; then
echo "ALERT: Disk usage is ${USAGE}% (threshold: ${THRESHOLD}%)"
# Send alert to Slack/email here
fi
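The script above checks only the root filesystem. As a sketch, the same idea can cover every mount by letting awk read the whole table; over_threshold is a hypothetical helper, and it assumes the six-column layout that df -P prints.

```shell
#!/bin/bash
# over_threshold is a hypothetical helper: reads `df -P` style lines on stdin.
over_threshold() {
    local threshold=$1
    # NR > 1 skips the header. $5 is the usage column, $6 the mount point.
    awk -v limit="$threshold" 'NR > 1 {
        gsub(/%/, "", $5)
        if ($5 + 0 > limit) print $6, $5 "%"
    }'
}

# Invented sample input so the result is predictable.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda1 100 91 9 91% /
/dev/sdb1 100 40 60 40% /data'

alerts=$(printf '%s\n' "$sample" | over_threshold 80)
echo "$alerts"
# Prints: / 91%
```

On a real machine the call would be df -P | over_threshold 80.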
Service Health Monitor
#!/bin/bash
# health_monitor.sh — Check multiple services
SERVICES=("nginx" "docker" "mysql")
for SVC in "${SERVICES[@]}"; do
if systemctl is-active --quiet "$SVC"; then
echo "[ OK ] $SVC"
else
echo "[FAIL] $SVC — attempting restart..."
sudo systemctl start "$SVC"
fi
done
Multi-Repo Git Deploy
#!/bin/bash
# git_deploy.sh — Pull, build, deploy across multiple repos
REPOS=("frontend" "backend" "worker")
BASE_DIR="/opt/apps"
for REPO in "${REPOS[@]}"; do
echo "Updating $REPO..."
cd "$BASE_DIR/$REPO" || { echo "Cannot enter $BASE_DIR/$REPO, skipping" >&2; continue; }
git pull origin main
if [ -f "package.json" ]; then
npm ci && npm run build
elif [ -f "pom.xml" ]; then
mvn clean package -DskipTests
fi
sudo systemctl restart "$REPO"
echo "$REPO updated and restarted"
done
14🏆
Best Practices
Write Scripts Like a Senior Engineer
✓ Always start with #!/bin/bash and set -euo pipefail
✓ Quote all variables: "$VAR" not $VAR (prevents word splitting on spaces)
✓ Use local variables inside functions to avoid global pollution
✓ Add a log() function for consistent timestamped output
✓ Use trap cleanup EXIT for cleanup on exit (remove temp files, lock files)
✓ Use lock files to prevent concurrent execution
✓ Add comments explaining WHY, not WHAT (code shows what, comments show why)
✓ Use meaningful variable names: DEPLOY_DIR not D, SERVER_NAME not S
✓ Test scripts with bash -x script.sh (shows every command as it runs)
✓ Use shellcheck (linter) to catch common bash mistakes
✓ Store scripts in Git to version control your automation
✓ Add --help flag to scripts for documentation
✓ Use functions for any logic repeated more than twice
✓ Handle edge cases: what if the file does not exist? What if the service is already stopped?
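Several of these practices fit into one small skeleton. Treat it as a starting sketch, not a standard: the script name deploy_sketch.sh, the --env option, and the dev default are all assumptions made for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical script name and options, shown only as a template.
usage() {
    cat <<'EOF'
Usage: deploy_sketch.sh [--env dev|prod] [--help]
EOF
}

log() {
    printf '[%s] %s\n' "$(date +%H:%M:%S)" "$*"
}

main() {
    local env="dev"
    while [ $# -gt 0 ]; do
        case "$1" in
            --help) usage; return 0 ;;
            --env)  env=${2:?"--env needs a value"}; shift 2 ;;
            *)      echo "Unknown option: $1" >&2; usage >&2; return 2 ;;
        esac
    done
    log "Would deploy to $env"
}

main "$@"
```

Keeping the logic inside main means every variable can be local, and the last line is the only code that runs at the top level.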
15💼
Interview Questions
Shell Scripting Q&A
❓
What is #!/bin/bash?
Shebang line: tells the OS which interpreter to use when the script is executed directly. Without it, the script runs under whatever shell the caller falls back to, which may not be bash. Always the first line.
❓
set -euo pipefail?
set -e stops on error, -u stops on undefined variables, -o pipefail catches pipe failures. Production scripts MUST have this.
❓
$? meaning?
Exit code of the last command. 0 = success, non-zero = failure. Used in if conditions: if [ $? -eq 0 ]; then echo success; fi
❓
Single vs double quotes?
Double quotes "..." allow variable expansion ($VAR becomes its value). Single quotes '...' treat everything as literal text ($VAR stays as $VAR).
❓
$0 $1 $2 $@ $#?
$0=script name, $1=first argument, $2=second, $@=all arguments, $#=count of arguments. Called positional parameters.
❓
How to read a file?
while IFS= read -r line; do echo "$line"; done < file.txt — reads line by line safely handling spaces and special chars.
❓
cron syntax?
minute hour day month weekday command. Example: 0 2 * * * /opt/backup.sh runs at 2:00 AM every day.
❓
trap command?
Catches signals and runs cleanup. trap cleanup EXIT runs the cleanup function when script exits (success or error). Essential for temp file cleanup.
❓
Difference: > vs >>?
> overwrites the file (creates new). >> appends to existing file. echo "log" > file.txt = fresh. echo "log" >> file.txt = add to end.
❓
What is $() ?
Command substitution: runs the command inside and replaces it with the command's output. TODAY=$(date) puts today's date into the variable.
❓
How to debug scripts?
bash -x script.sh prints every command before it is executed. Adding set -x at the top of the script has the same effect, and set +x turns the tracing off again.
❓
awk vs sed vs grep?
grep FINDS lines matching a pattern. sed REPLACES text in files. awk PROCESSES structured data (columns, fields). All three are essential.
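Two of the answers above, quoting and positional parameters, are easy to verify yourself. In this sketch count_args and show_counts are hypothetical helper names.

```shell
#!/bin/bash
NAME="world"
double="Hello $NAME"     # double quotes: $NAME is expanded
single='Hello $NAME'     # single quotes: text stays literal

echo "$double"           # Hello world
echo "$single"           # Hello $NAME

# Hypothetical helpers showing why "$@" should be quoted.
count_args() { echo "$#"; }
show_counts() {
    echo "quoted: $(count_args "$@")"
    # Unquoted on purpose: word splitting breaks "a b" into two arguments.
    echo "unquoted: $(count_args $@)"
}

counts=$(show_counts "a b" c)
echo "$counts"
# Prints:
# quoted: 2
# unquoted: 3
```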