DevOpsClicks
#!/bin/bash — Automate Everything

Shell Scripting Complete Guide

From your first script to production automation — variables, loops, functions, error handling, real-world scripts, and interview prep.

15 Chapters | 50+ Scripts | 100% Free
01📜

Introduction to Shell Scripting

Why Every DevOps Engineer Writes Scripts

Shell scripting is writing a series of commands in a file that the computer executes one by one. Instead of typing 10 commands manually every day, you put them in a script and run it with one command. That is shell scripting — automating repetitive tasks. Every DevOps engineer writes shell scripts daily for deployments, backups, monitoring, and cleanup.
Your First Script
BASH
#!/bin/bash
# my_first_script.sh: the shebang line above tells Linux which shell to use

echo "Hello! I am a shell script"
echo "Today is: $(date)"
echo "You are logged in as: $(whoami)"
echo "Current directory: $(pwd)"

# Make it executable and run:
#   chmod +x my_first_script.sh
#   ./my_first_script.sh
What is #!/bin/bash?

The first line of the script, called the "shebang" or "hashbang". It tells Linux: "use the bash shell to run this file." Without it, the script is run by whichever shell the caller falls back to, which may not be bash, so bash-specific syntax can break. Always include it.

02📦

Variables

Store and Use Data

Variables are like labeled boxes — you put a value inside and use the label to get it back. In bash, you create a variable with NAME=value (no spaces around =) and access it with $NAME.
BASH
#!/bin/bash
# Variables: NO spaces around the = sign!
APP_NAME="order-service"
VERSION="2.1.0"
PORT=8080
DEPLOY_DIR="/opt/apps"

echo "Deploying $APP_NAME version $VERSION on port $PORT"
echo "Install directory: $DEPLOY_DIR"

# Command output as variable
CURRENT_DATE=$(date +%Y-%m-%d)
HOSTNAME=$(hostname)
FREE_MEM=$(free -m | awk '/Mem:/ {print $4}')

echo "Date: $CURRENT_DATE"
echo "Host: $HOSTNAME"
echo "Free memory: ${FREE_MEM}MB"

# Read-only variable (cannot be changed)
readonly DB_HOST="prod-db.company.com"

# Environment variable (visible to child processes)
export API_KEY="abc123"
Type | Syntax | Example
String | NAME="value" | APP="nginx"
Number | COUNT=5 | RETRIES=3
Command output | VAR=$(command) | TODAY=$(date)
Read-only | readonly VAR="val" | readonly DB="prod-db"
Environment | export VAR="val" | export PATH="/usr/local/bin:$PATH"
⚠️ No Spaces Around =

APP_NAME = "nginx" is WRONG (bash thinks APP_NAME is a command). APP_NAME="nginx" is correct. This is the #1 beginner mistake.
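To see the mistake in action, here is a minimal sketch that runs the wrong form in a subshell and records its exit status. It assumes there is no program called APP_NAME on your PATH; the values are illustrative only.

```shell
#!/bin/bash
# Correct: no spaces, so bash treats this as an assignment.
APP_NAME="nginx"
echo "APP_NAME is: $APP_NAME"

# Wrong: with spaces, bash looks for a command called APP_NAME.
# Run it in a subshell with stderr hidden so the demo keeps going.
STATUS=0
( APP_NAME = "apache" ) 2>/dev/null || STATUS=$?
echo "Exit status of the wrong form: $STATUS"   # 127 means "command not found"

# The variable was never reassigned.
echo "APP_NAME is still: $APP_NAME"
```

Without the `2>/dev/null`, bash would also print a "command not found" message, which is the usual symptom of this mistake.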

03🖥️

Input & Output

Echo, Read, Printf

Scripts need to show output (echo) and sometimes ask for input (read). These are the basic building blocks for interactive scripts.
BASH
#!/bin/bash
# Echo: print output
echo "Simple message"
echo -n "No newline at end"     # stays on same line
echo -e "Line 1\nLine 2"        # \n = newline, \t = tab

# Printf: formatted output (more control)
printf 'Name: %-15s Age: %d\n' "Suresh" 28
printf 'Price: $%.2f\n' 49.9    # single quotes keep the $ literal

# Read: get user input (-r keeps backslashes literal)
echo -n "Enter server name: "
read -r SERVER_NAME
echo "You entered: $SERVER_NAME"

# Read with prompt (cleaner)
read -r -p "Enter environment (dev/prod): " ENV
read -r -s -p "Enter password: " PASSWORD   # -s = silent (hidden)
echo ""                                     # newline after hidden input

# Read with timeout
read -r -t 10 -p "Confirm deploy? (y/n): " CONFIRM
# Waits max 10 seconds for input
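Interactive prompts are awkward to test, so one possible approach is to read from standard input and fall back to a default. The sketch below uses a hypothetical helper named ask_env and feeds it input with a here-string instead of a keyboard.

```shell
#!/bin/bash
# ask_env (hypothetical helper): read one line from stdin, fall back to "dev".
ask_env() {
    local default="dev"
    local answer=""
    # -r keeps backslashes literal; "|| true" tolerates end-of-input
    read -r answer || true
    echo "${answer:-$default}"
}

# Simulate a user typing "prod", then a user just pressing Enter
ENV_A=$(ask_env <<< "prod")
ENV_B=$(ask_env <<< "")

printf '%-10s %s\n' "Typed:" "$ENV_A"
printf '%-10s %s\n' "Default:" "$ENV_B"
```

Because the helper only reads stdin, the same function works with a real terminal, a pipe, or a file redirect.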
04🔀

Conditionals

If, Elif, Else — Make Decisions

Conditionals let your script make decisions. "If the server is running, do nothing. If it is down, restart it." This is the brain of every automation script.
BASH
#!/bin/bash
# Basic if-else
SERVER_STATUS=$(systemctl is-active nginx)

if [ "$SERVER_STATUS" = "active" ]; then
    echo "Nginx is running - all good"
elif [ "$SERVER_STATUS" = "inactive" ]; then
    echo "Nginx is down - restarting..."
    sudo systemctl start nginx
else
    echo "Nginx status unknown: $SERVER_STATUS"
fi

# File checks
if [ -f "/etc/nginx/nginx.conf" ]; then
    echo "Config file exists"
fi

if [ -d "/var/log/nginx" ]; then
    echo "Log directory exists"
fi

if [ ! -f "/tmp/deploy.lock" ]; then
    echo "No deploy in progress - safe to deploy"
fi

# Number comparisons
DISK_USAGE=$(df / | tail -1 | awk '{print $5}' | tr -d "%")

if [ "$DISK_USAGE" -gt 80 ]; then
    echo "WARNING: Disk usage is ${DISK_USAGE}%!"
elif [ "$DISK_USAGE" -gt 60 ]; then
    echo "Disk usage is moderate: ${DISK_USAGE}%"
else
    echo "Disk usage OK: ${DISK_USAGE}%"
fi
Test | Meaning | Example
-f file | File exists | [ -f /etc/hosts ]
-d dir | Directory exists | [ -d /var/log ]
-z string | String is empty | [ -z "$VAR" ]
-n string | String is not empty | [ -n "$VAR" ]
-eq | Numbers equal | [ "$A" -eq 5 ]
-gt / -lt | Greater / less than | [ "$A" -gt 10 ]
= | Strings equal | [ "$A" = "yes" ]
!= | Strings not equal | [ "$A" != "no" ]
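The tests in the table combine naturally inside a function. This is a small sketch, with a hypothetical helper named classify_usage, that mixes a string test (-z) with numeric tests (-gt) using the same thresholds as the disk check above.

```shell
#!/bin/bash
# classify_usage (hypothetical helper): map a disk-usage percentage to a label.
classify_usage() {
    local usage="${1:-}"
    if [ -z "$usage" ]; then          # string test: nothing was passed
        echo "unknown"
    elif [ "$usage" -gt 80 ]; then    # numeric test
        echo "critical"
    elif [ "$usage" -gt 60 ]; then
        echo "moderate"
    else
        echo "ok"
    fi
}

for VALUE in 95 70 10; do
    echo "$VALUE% -> $(classify_usage "$VALUE")"
done
```

Note that -gt is strict: a value of exactly 80 falls into "moderate", not "critical".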
05🔄

Loops

Repeat Actions Automatically

Loops repeat commands multiple times. Deploy to 5 servers? Loop. Check 10 services? Loop. Process 100 log files? Loop. This is where scripts become powerful.
BASH
#!/bin/bash
# For loop: iterate over a list
for SERVER in web1 web2 web3 db1; do
    echo "Checking $SERVER..."
    if ping -c 1 -W 2 "$SERVER" > /dev/null 2>&1; then
        echo "  $SERVER is UP"
    else
        echo "  $SERVER is DOWN!"
    fi
done

# For loop: iterate over files
for FILE in /var/log/*.log; do
    SIZE=$(du -sh "$FILE" | cut -f1)
    echo "$FILE: $SIZE"
done

# For loop: number range
for i in {1..5}; do
    echo "Attempt $i of 5"
done

# While loop: repeat until condition is false
RETRIES=0
MAX_RETRIES=5
while [ "$RETRIES" -lt "$MAX_RETRIES" ]; do
    if curl -sf http://localhost:8080/health > /dev/null; then
        echo "App is healthy!"
        break
    fi
    RETRIES=$((RETRIES + 1))
    echo "Retry $RETRIES/$MAX_RETRIES - waiting 5 seconds..."
    sleep 5
done

# While read: process file line by line
while IFS= read -r LINE; do
    echo "Processing: $LINE"
done < servers.txt
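The retry loop above can be wrapped into a reusable helper. The following is a sketch only: retry and flaky are hypothetical names, and flaky stands in for a real command such as a health check.

```shell
#!/bin/bash
# retry (hypothetical helper): run a command up to N times, pausing between attempts.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
    local max=$1 delay=$2
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "Attempt $attempt/$max failed" >&2
        attempt=$((attempt + 1))
        # Only sleep if another attempt is coming
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Demo command: fails twice, succeeds on the third call
CALLS=0
flaky() {
    CALLS=$((CALLS + 1))
    [ "$CALLS" -ge 3 ]
}

if retry 5 0 flaky; then
    echo "Succeeded after $CALLS calls"
fi
```

In a real script the call might look like `retry 5 5 curl -sf http://localhost:8080/health`, matching the while loop shown earlier.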
06🧩

Functions

Reusable Code Blocks

Functions let you write code once and call it many times. Instead of copying the same 10 lines in 5 places, put them in a function and call it. When you fix a bug, you fix it in one place.
BASH
#!/bin/bash
# Define a function
log() {
    echo "[$(date +"%Y-%m-%d %H:%M:%S")] $1"
}

# Use it
log "Starting deployment"
log "Building application"
log "Deployment complete"
# Output: [2024-06-01 10:30:15] Starting deployment

# Function with return value
is_service_running() {
    local SERVICE=$1
    # The function returns the exit status of its last command:
    # 0 (true) if the service is active, non-zero (false) otherwise
    systemctl is-active --quiet "$SERVICE"
}

if is_service_running "nginx"; then
    log "Nginx is running"
else
    log "Nginx is down - starting it"
    sudo systemctl start nginx
fi

# Function with multiple parameters
deploy() {
    local APP=$1
    local ENV=$2
    local VERSION=$3
    log "Deploying $APP v$VERSION to $ENV"
    # deployment logic here
}

deploy "order-service" "staging" "2.1.0"
deploy "user-service" "production" "1.5.2"
local keyword

Always use local for variables inside functions. Without local, the variable is GLOBAL and can accidentally overwrite variables outside the function. local APP=value keeps it contained.
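A short sketch makes the difference visible. The function names leaky and contained are made up for this demo.

```shell
#!/bin/bash
APP="order-service"

# Without local: the assignment overwrites the global APP
leaky() {
    APP="user-service"
}

# With local: the change stays inside the function
contained() {
    local APP="payment-service"
}

contained
AFTER_CONTAINED=$APP
leaky
AFTER_LEAKY=$APP

echo "After contained: $AFTER_CONTAINED"   # order-service
echo "After leaky:     $AFTER_LEAKY"       # user-service
```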

07✂️

String Operations

Cut, Replace, Measure

BASH
#!/bin/bash
STR="Hello World DevOps"

# Length
echo "${#STR}"            # 18

# Substring
echo "${STR:0:5}"         # Hello (from position 0, take 5 chars)
echo "${STR:6}"           # World DevOps (from position 6 to end)

# Trim and replace
FILE="app-v1.2.3.tar.gz"
echo "${FILE%.tar.gz}"    # app-v1.2.3 (remove suffix)
echo "${FILE##*.}"        # gz (get extension)
echo "${FILE/v1/v2}"      # app-v2.2.3.tar.gz (replace first match)

# Default values (super useful!)
ENV=${1:-"development"}   # Use $1 if provided, otherwise "development"
echo "Environment: $ENV"

DB_PORT=${DB_PORT:-5432}  # Use env var if set, otherwise 5432
echo "DB Port: $DB_PORT"
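These expansions chain well. As an illustration, the sketch below splits an archive name into its parts, assuming names follow the pattern name-vVERSION.tar.gz; that naming scheme is an assumption made for the example.

```shell
#!/bin/bash
# Assumed naming scheme: <name>-v<version>.tar.gz
FILE="order-service-v2.1.0.tar.gz"

BASE=${FILE%.tar.gz}      # strip the suffix        -> order-service-v2.1.0
NAME=${BASE%-v*}          # strip "-v..." from end  -> order-service
VERSION=${BASE##*-v}      # strip up to last "-v"   -> 2.1.0
MAJOR=${VERSION%%.*}      # strip from first "."    -> 2

echo "Name:    $NAME"
echo "Version: $VERSION"
echo "Major:   $MAJOR"
```

A single % or # removes the shortest match, while %% or ## removes the longest, which is why MAJOR uses %% to cut everything after the first dot.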
08📋

Arrays

Work with Lists of Items

BASH
#!/bin/bash
# Create array
SERVERS=("web1" "web2" "web3" "db1" "db2")
PORTS=(80 443 8080 3306)

# Access elements
echo "${SERVERS[0]}"      # web1 (first element)
echo "${SERVERS[2]}"      # web3 (third element)
echo "${SERVERS[@]}"      # all elements
echo "${#SERVERS[@]}"     # 5 (count)

# Loop through array
for SERVER in "${SERVERS[@]}"; do
    echo "Pinging $SERVER..."
done

# Add element
SERVERS+=("web4")

# Real-world: deploy to multiple environments
ENVIRONMENTS=("dev" "staging" "production")
for ENV in "${ENVIRONMENTS[@]}"; do
    echo "Deploying to $ENV..."
    kubectl apply -f "k8s/$ENV/" 2>/dev/null
done
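Two more array features are worth a quick sketch: looping over indexes, so that two arrays can be walked in step, and slicing. The server names and ports here are sample data only.

```shell
#!/bin/bash
SERVERS=("web1" "web2" "db1")
PORTS=(80 443 3306)

# "${!SERVERS[@]}" expands to the indexes (0 1 2), not the values
for i in "${!SERVERS[@]}"; do
    echo "${SERVERS[$i]} listens on ${PORTS[$i]}"
done

# Slice: 2 elements starting at index 1
SLICE=("${SERVERS[@]:1:2}")
echo "Slice: ${SLICE[*]}"
echo "Slice length: ${#SLICE[@]}"
```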
09📁

File Operations

Read, Write, Check Files

BASH
#!/bin/bash
# Check if file/directory exists
[ -f "config.yml" ] && echo "Config exists" || echo "Config missing!"
[ -d "/opt/app" ] || mkdir -p /opt/app

# Read file line by line
while IFS= read -r LINE; do
    # Skip empty lines and comments
    [[ -z "$LINE" || "$LINE" =~ ^# ]] && continue
    echo "Server: $LINE"
done < servers.txt

# Write to file
echo "Deployment log" > deploy.log        # overwrite
echo "$(date): Started" >> deploy.log     # append

# Create config file from script
cat > /tmp/nginx.conf << EOF
server {
    listen 80;
    server_name myapp.com;
    location / {
        proxy_pass http://localhost:3000;
    }
}
EOF

# Find and process files
find /var/log -name "*.log" -mtime +30 -exec gzip {} \;
echo "Old logs compressed"
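When experimenting with file operations, a throwaway directory from mktemp keeps everything contained. The sketch below combines a heredoc, the line-by-line read, and the comment filter from above; the host names are invented sample data.

```shell
#!/bin/bash
# Work in a temporary directory so nothing outside it is touched
WORK_DIR=$(mktemp -d)

# Quoted 'EOF' means the heredoc body is written exactly as typed
cat > "$WORK_DIR/servers.txt" << 'EOF'
# production web tier
web1.example.com

web2.example.com
EOF

COUNT=0
while IFS= read -r LINE; do
    # Skip empty lines and comments
    [[ -z "$LINE" || "$LINE" =~ ^# ]] && continue
    echo "$LINE" >> "$WORK_DIR/active.txt"
    COUNT=$((COUNT + 1))
done < "$WORK_DIR/servers.txt"

FIRST=$(head -n 1 "$WORK_DIR/active.txt")
echo "Kept $COUNT servers, first is $FIRST"

# Clean up the temporary directory
rm -rf "$WORK_DIR"
```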
10🛡️

Error Handling

Write Bulletproof Scripts

Without error handling, a failing command does not stop the script. Your deployment continues with corrupt state. Error handling is the difference between a toy script and a production script.
BASH
#!/bin/bash
set -euo pipefail
# set -e          → Stop when a command fails (commands tested by if, && or || are exempt)
# set -u          → Stop if undefined variable is used
# set -o pipefail → Catch errors in piped commands

# Lock file: prevent two deploys at same time
LOCKFILE="/tmp/deploy.lock"
if [ -f "$LOCKFILE" ]; then
    echo "ERROR: Deploy already in progress!"
    exit 1
fi
touch "$LOCKFILE"

# Trap: run cleanup on exit (even on error)
# Registered only after this run owns the lock, so a rejected
# second run cannot delete the first run's lock file
cleanup() {
    echo "Cleaning up temp files..."
    rm -f /tmp/deploy_*.tmp
    rm -f "$LOCKFILE"
}
trap cleanup EXIT   # Runs when the script exits, success or failure
                    # (with set -e this covers errors, so no separate ERR trap)

# Exit codes
check_service() {
    if ! systemctl is-active --quiet "$1"; then
        echo "ERROR: $1 is not running"
        return 1   # return non-zero = failure
    fi
    return 0       # return 0 = success
}

check_service "nginx" || exit 1
⚠️ ALWAYS use set -euo pipefail

This single line prevents a whole class of production disasters. Without it, a failed command is ignored and the script continues. Your database backup failed? The script keeps going and deletes the old backup. Now you have NO backup. With set -euo pipefail, the script stops at the failed backup instead.
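The effect of each option is easy to check for yourself. This sketch runs tiny one-line scripts in separate bash processes, so the options do not leak into the current shell, and records what happens.

```shell
#!/bin/bash
# Without pipefail: a pipeline reports the status of its LAST command (cat), so 0
WITHOUT=0
bash -c 'false | cat' || WITHOUT=$?

# With pipefail: the failing first command makes the whole pipeline fail
WITH=0
bash -c 'set -o pipefail; false | cat' || WITH=$?

# With set -e: the script stops at "false", so the echo never runs
OUTPUT=$(bash -c 'set -e; false; echo "still running"') || true

# With set -u: using an undefined variable is an error
UNSET_STATUS=0
bash -c 'set -u; echo "$NOT_DEFINED_ANYWHERE"' 2>/dev/null || UNSET_STATUS=$?

echo "pipeline without pipefail: exit $WITHOUT"
echo "pipeline with pipefail:    exit $WITH"
echo "output after failure under set -e: '$OUTPUT'"
echo "undefined variable under set -u: exit $UNSET_STATUS"
```

The `|| VAR=$?` pattern is used so that the demo itself keeps running even if it is pasted into a script that already has set -e enabled.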

11🔤

Text Processing

awk, sed, cut, tr — Transform Data

BASH
#!/bin/bash
# awk: extract columns (like a spreadsheet)
# Count requests per IP address (1st column of the access log), top 10
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10

# Print specific fields
echo "John:28:Engineer" | awk -F: '{print $1, $3}'
# Output: John Engineer

# sed: find and replace in files (-i without a suffix is GNU sed syntax)
sed -i "s/old-version/new-version/g" config.yml   # replace in file
sed -i "s/DEBUG=true/DEBUG=false/" .env           # toggle debug off
sed -n "10,20p" logfile.txt                       # print lines 10-20

# cut: extract by delimiter
echo "user:password:uid:gid" | cut -d: -f1,3
# Output: user:uid

# tr: translate/delete characters
echo "HELLO WORLD" | tr "A-Z" "a-z"       # hello world
echo "extra    spaces" | tr -s " "        # extra spaces (repeats squeezed to one)

# sort + uniq: count unique values
awk '{print $1}' access.log | sort | uniq -c | sort -rn
# Shows: count IP_address (most frequent first)

# Combine them (the UNIX philosophy)
# Find top 5 error types in last 1000 lines of log
tail -1000 app.log | grep "ERROR" | awk -F":" '{print $4}' | sort | uniq -c | sort -rn | head -5
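To try these pipelines without a real server log, you can generate a small sample first. The log format below (IP, method, path, status) is a simplified assumption, not a real access-log layout.

```shell
#!/bin/bash
# Made-up sample log: IP, method, path, status
LOG=$(mktemp)
cat > "$LOG" << 'EOF'
10.0.0.1 GET /index.html 200
10.0.0.2 GET /login 500
10.0.0.1 GET /about 200
10.0.0.3 POST /api 200
10.0.0.1 GET /api 500
10.0.0.2 GET /index.html 200
EOF

# Busiest client: count requests per IP, highest first, keep the IP of the top row
TOP_IP=$(awk '{print $1}' "$LOG" | sort | uniq -c | sort -rn | head -1 | awk '{print $2}')

# Count server errors with a condition on the 4th column
# (n+0 prints 0 instead of an empty string when nothing matched)
ERRORS=$(awk '$4 >= 500 {n++} END {print n+0}' "$LOG")

echo "Top IP: $TOP_IP"
echo "Server errors: $ERRORS"

rm -f "$LOG"
```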
12🚀

Real-World Scripts

Production Automation

These are scripts that DevOps engineers use daily in production. Copy them, modify them, use them.
Deployment Script with Rollback
DEPLOYMENT
#!/bin/bash
set -euo pipefail

APP="order-service"
DEPLOY_DIR="/opt/$APP"
BACKUP_DIR="/opt/backups"
DATE=$(date +%Y%m%d_%H%M%S)

log() { echo "[$(date +%H:%M:%S)] $1"; }

log "Starting deployment of $APP"

# Step 1: Backup (if this fails, set -e stops the deploy: no backup, no rollback)
log "Creating backup..."
tar -czf "$BACKUP_DIR/${APP}_${DATE}.tar.gz" "$DEPLOY_DIR"

# Step 2: Pull latest code
log "Pulling latest code..."
cd "$DEPLOY_DIR"
git pull origin main

# Step 3: Build (separate lines so set -e also stops on a failing npm ci)
log "Building..."
npm ci
npm run build

# Step 4: Restart
log "Restarting service..."
sudo systemctl restart "$APP"

# Step 5: Health check
sleep 5
if curl -sf http://localhost:3000/health > /dev/null; then
    log "Deployment successful!"
else
    log "HEALTH CHECK FAILED! Rolling back..."
    tar -xzf "$BACKUP_DIR/${APP}_${DATE}.tar.gz" -C /
    sudo systemctl restart "$APP"
    log "Rolled back to previous version"
    exit 1
fi
Automated Backup Script
BACKUP
#!/bin/bash
set -euo pipefail
# Run daily via cron: 0 2 * * * /opt/scripts/backup.sh

BACKUP_DIR="/opt/backups"
DATE=$(date +%Y%m%d)
RETENTION_DAYS=14

# Backup database
mysqldump -u root mydb | gzip > "$BACKUP_DIR/db_${DATE}.sql.gz"

# Backup application
tar -czf "$BACKUP_DIR/app_${DATE}.tar.gz" /opt/myapp/

# Upload to S3
aws s3 cp "$BACKUP_DIR/db_${DATE}.sql.gz" s3://my-backups/db/
aws s3 cp "$BACKUP_DIR/app_${DATE}.tar.gz" s3://my-backups/app/

# Delete old local backups
find "$BACKUP_DIR" -name "*.gz" -mtime "+$RETENTION_DAYS" -delete

echo "[$(date)] Backup complete"
13📅

Daily Useful Scripts

Copy and Use Immediately

Disk Space Alert
BASH
#!/bin/bash
# disk_alert.sh: alert if disk usage exceeds threshold
THRESHOLD=80
USAGE=$(df / | tail -1 | awk '{print $5}' | tr -d "%")

if [ "$USAGE" -gt "$THRESHOLD" ]; then
    echo "ALERT: Disk usage is ${USAGE}% (threshold: ${THRESHOLD}%)"
    # Send alert to Slack/email here
fi
Service Health Monitor
BASH
#!/bin/bash
# health_monitor.sh: check multiple services
SERVICES=("nginx" "docker" "mysql")

for SVC in "${SERVICES[@]}"; do
    if systemctl is-active --quiet "$SVC"; then
        echo "[ OK ] $SVC"
    else
        echo "[FAIL] $SVC - attempting restart..."
        sudo systemctl start "$SVC"
    fi
done
Log Cleanup
BASH
#!/bin/bash
# cleanup_logs.sh: compress older logs, delete very old compressed ones
LOG_DIR="/var/log/myapp"

# Compress logs older than 1 day
find "$LOG_DIR" -name "*.log" -mtime +1 -exec gzip {} \;

# Delete compressed logs older than 30 days
find "$LOG_DIR" -name "*.gz" -mtime +30 -delete

echo "Cleaned up: $(du -sh "$LOG_DIR" | cut -f1) remaining"
Git Automation
BASH
#!/bin/bash
# git_deploy.sh: pull, build, deploy across multiple repos
set -euo pipefail   # stop if cd, git pull, or a build step fails

REPOS=("frontend" "backend" "worker")
BASE_DIR="/opt/apps"

for REPO in "${REPOS[@]}"; do
    echo "Updating $REPO..."
    cd "$BASE_DIR/$REPO"
    git pull origin main

    if [ -f "package.json" ]; then
        npm ci
        npm run build
    elif [ -f "pom.xml" ]; then
        mvn clean package -DskipTests
    fi

    sudo systemctl restart "$REPO"
    echo "$REPO updated and restarted"
done
14🏆

Best Practices

Write Scripts Like a Senior Engineer

Always start with #!/bin/bash and set -euo pipefail
Quote all variables: "$VAR" not $VAR (prevents word splitting on spaces and accidental wildcard expansion)
Use local variables inside functions to avoid global pollution
Add a log() function for consistent timestamped output
Use trap cleanup EXIT for cleanup on exit (remove temp files, lock files)
Use lock files to prevent concurrent execution
Add comments explaining WHY, not WHAT (code shows what, comments show why)
Use meaningful variable names: DEPLOY_DIR not D, SERVER_NAME not S
Test scripts with bash -x script.sh (shows every command as it runs)
Use shellcheck (linter) to catch common bash mistakes
Store scripts in Git — version control your automation
Add --help flag to scripts for documentation
Use functions for any logic repeated more than twice
Handle edge cases: what if the file does not exist? What if the service is already stopped?
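Several of these practices fit into a small starting template. The sketch below is one possible layout, not a standard: the script name deploy.sh, its single environment argument, and the exit code 2 for a usage error are all assumptions chosen for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical script name and argument, for illustration only
usage() {
    echo "Usage: deploy.sh [--help] <environment>"
}

log() {
    echo "[$(date +%H:%M:%S)] $1"
}

main() {
    local env="${1:-}"      # ${1:-} avoids an "unbound variable" error under set -u
    case "$env" in
        -h|--help) usage; return 0 ;;
        "")        usage >&2; return 2 ;;   # edge case: no argument given
    esac
    log "Deploying to $env"
}

# Demo calls; a real script would end with:  main "$@"
main --help
main staging
```

Putting the logic in main keeps every variable local and makes the script easy to read from top to bottom.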
15💼

Interview Questions

Shell Scripting Q&A

What is #!/bin/bash?
Shebang line — tells the OS which interpreter to use for the script. Without it, the OS does not know to use bash. Always the first line.
set -euo pipefail?
set -e stops on error, -u stops on undefined variables, -o pipefail catches pipe failures. Production scripts MUST have this.
$? meaning?
Exit code of the last command. 0 = success, non-zero = failure. Used in if conditions: if [ $? -eq 0 ]; then echo success; fi
Single vs double quotes?
Double quotes "..." allow variable expansion ($VAR becomes its value). Single quotes '...' treat everything as literal text ($VAR stays as $VAR).
$0 $1 $2 $@ $#?
$0=script name, $1=first argument, $2=second, $@=all arguments, $#=count of arguments. Called positional parameters.
How to read a file?
while IFS= read -r line; do echo "$line"; done < file.txt — reads line by line safely handling spaces and special chars.
cron syntax?
minute hour day month weekday command. Example: 0 2 * * * /opt/backup.sh runs at 2:00 AM every day.
trap command?
Catches signals and runs cleanup. trap cleanup EXIT runs the cleanup function when script exits (success or error). Essential for temp file cleanup.
Difference: > vs >>?
> overwrites the file (creates new). >> appends to existing file. echo "log" > file.txt = fresh. echo "log" >> file.txt = add to end.
What is $() ?
Command substitution: runs the command inside and replaces it with the command's output. TODAY=$(date) puts today's date into the variable.
How to debug scripts?
bash -x script.sh shows every command before execution. Add set -x at top of script for same effect. Remove with set +x.
awk vs sed vs grep?
grep FINDS lines matching a pattern. sed REPLACES text in files. awk PROCESSES structured data (columns, fields). All three are essential.
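Two of the answers above, quoting and positional parameters, are easier to remember after seeing them run. In this sketch show_args is a hypothetical helper; inside a function, $# and "$@" refer to the function's own arguments, which behave the same way as a script's.

```shell
#!/bin/bash
# show_args (hypothetical helper): report how the arguments arrived
show_args() {
    echo "count=$#"
    local arg
    for arg in "$@"; do
        echo "arg=[$arg]"
    done
}

NAME="order service"

show_args "$NAME" prod      # quoted: the space stays inside one argument
show_args $NAME prod        # unquoted: word splitting turns it into two

echo 'Single quotes: $NAME'   # prints $NAME literally
echo "Double quotes: $NAME"   # prints the value
```

The first call reports count=2 and the second count=3, which is the word-splitting problem that quoting prevents.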