
Safe Operational Mindset

What You'll Learn

The habits that separate scripts that "work on my machine once" from scripts that are safe to run repeatedly on real systems.

Why This Matters

A Python script with a bug can:

  • Delete files it shouldn't
  • Overwrite data without backup
  • Hang forever waiting for a response
  • Crash mid-operation leaving things in a broken state

The six habits below make these failures less likely and easier to recover from.

Habit 1: Validate Inputs Before Acting

Never assume inputs are correct. Check them first.

import sys
from pathlib import Path

def process_file(filepath: str) -> None:
    path = Path(filepath)

    # Validate before acting
    if not path.exists():
        print(f"Error: file not found: {filepath}", file=sys.stderr)
        sys.exit(1)

    if not path.is_file():
        print(f"Error: not a file: {filepath}", file=sys.stderr)
        sys.exit(1)

    # Safe to proceed
    content = path.read_text(encoding="utf-8")
    print(f"Read {len(content)} characters from {filepath}")
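A variation on the same idea is to separate the checks from the reaction to them. The sketch below returns a list of problems instead of exiting, which makes the validation easy to test and lets the caller decide what to do. The function name validate_input_file and the allowed_suffixes parameter are illustrative assumptions, not part of the lesson's code.

```python
from pathlib import Path


def validate_input_file(
    filepath: str,
    allowed_suffixes: tuple[str, ...] = (".txt", ".csv"),  # hypothetical default
) -> list[str]:
    """Return a list of problems with filepath; an empty list means it looks usable."""
    problems: list[str] = []

    # An empty string would otherwise be treated as the current directory
    if not filepath.strip():
        problems.append("path is empty")
        return problems

    path = Path(filepath)
    if not path.exists():
        problems.append(f"file not found: {filepath}")
    elif not path.is_file():
        problems.append(f"not a file: {filepath}")
    elif path.suffix.lower() not in allowed_suffixes:
        problems.append(f"unexpected file type: {path.suffix or '(none)'}")
    return problems
```

A caller can print each problem to stderr and exit with a non-zero code if the list is not empty.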

Habit 2: Dry Run Before Making Changes

A --dry-run flag makes a script report what it would do without actually doing it:

import argparse
from pathlib import Path

def rename_files(directory: str, dry_run: bool = False) -> None:
    # sorted() takes a snapshot of the listing before any file is renamed
    for f in sorted(Path(directory).glob("*.txt")):
        new_name = f.with_suffix(".bak")
        if dry_run:
            print(f"[DRY RUN] Would rename: {f} → {new_name}")
        else:
            f.rename(new_name)
            print(f"Renamed: {f} → {new_name}")

parser = argparse.ArgumentParser()
parser.add_argument("directory")
parser.add_argument("--dry-run", action="store_true")
args = parser.parse_args()

rename_files(args.directory, dry_run=args.dry_run)

Run safely first:

python3 rename.py /data/logs --dry-run
# [DRY RUN] Would rename: /data/logs/app.txt → /data/logs/app.bak

Only then run for real:

python3 rename.py /data/logs
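One way to keep the dry run honest is to build the list of changes once and feed the same list to both modes, so the preview cannot drift from the real behavior. This is a minimal sketch under that assumption; plan_renames and apply_renames are hypothetical names, and defaulting dry_run to True is a design choice made here, not something the lesson prescribes.

```python
from pathlib import Path


def plan_renames(directory: str) -> list[tuple[Path, Path]]:
    """Build the (source, target) pairs without touching the filesystem."""
    return [(f, f.with_suffix(".bak")) for f in sorted(Path(directory).glob("*.txt"))]


def apply_renames(plan: list[tuple[Path, Path]], dry_run: bool = True) -> int:
    """Report each step; rename only when dry_run is False.

    Returns the number of files actually renamed.
    """
    renamed = 0
    for src, dst in plan:
        if dst.exists():
            # Never clobber an existing file, in either mode
            print(f"Skipping {src}: target already exists: {dst}")
            continue
        if dry_run:
            print(f"[DRY RUN] Would rename: {src} -> {dst}")
        else:
            src.rename(dst)
            print(f"Renamed: {src} -> {dst}")
            renamed += 1
    return renamed
```

With this split, the real run has to be requested explicitly with dry_run=False.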

Habit 3: Handle Errors Explicitly

Don't let errors pass silently. Either handle them or let them propagate clearly.

import json
import sys

def load_config(path: str) -> dict:
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"Config file not found: {path}", file=sys.stderr)
        sys.exit(1)
    except json.JSONDecodeError as e:
        print(f"Invalid JSON in {path}: {e}", file=sys.stderr)
        sys.exit(1)

What not to do:

# ❌ Silently swallowing errors
try:
    config = load_config("settings.json")
except Exception:
    pass  # This hides bugs; never do this
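The other option mentioned above, letting errors propagate clearly, might look like the following sketch. Instead of exiting inside the helper, it raises one exception type with a readable message and keeps the original error attached through "raise ... from". ConfigError and read_config are hypothetical names introduced for this example.

```python
import json


class ConfigError(Exception):
    """Hypothetical error type: the configuration could not be loaded."""


def read_config(path: str) -> dict:
    """Load a JSON object from path, or raise ConfigError explaining why not."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except FileNotFoundError as e:
        # "from e" keeps the original traceback available for debugging
        raise ConfigError(f"config file not found: {path}") from e
    except json.JSONDecodeError as e:
        raise ConfigError(f"invalid JSON in {path}: {e}") from e

    if not isinstance(data, dict):
        raise ConfigError(f"expected a JSON object in {path}, got {type(data).__name__}")
    return data
```

The top-level code can then catch ConfigError in one place, print the message, and choose the exit code.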

Habit 4: Use Exit Codes

Programs communicate success or failure through exit codes:

  • 0 = success
  • non-zero = failure

import sys

def main() -> int:
    try:
        # do work
        return 0  # success
    except Exception as e:
        print(f"Failed: {e}", file=sys.stderr)
        return 1  # failure

if __name__ == "__main__":
    sys.exit(main())

Checking exit codes in shell:

python3 script.py
echo "Exit code: $?"
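The same check can be made from Python when one script launches another. The sketch below uses subprocess.run and also sets a timeout, so a child process that hangs cannot block the caller forever. The run_step name, the 30-second default, and the use of 124 for a timeout (borrowed from the GNU timeout utility) are assumptions made for this example.

```python
import subprocess
import sys


def run_step(args: list[str], timeout_seconds: float = 30.0) -> int:
    """Run a command and return its exit code, or 124 if it timed out."""
    try:
        result = subprocess.run(args, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        print(f"Timed out after {timeout_seconds}s: {args}", file=sys.stderr)
        return 124
    return result.returncode
```

For example, run_step([sys.executable, "script.py"]) returns whatever exit code script.py passed to sys.exit().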

Habit 5: Log What Happened

For scripts that run unattended (cron, CI/CD, servers), use logging instead of print:

import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",  # matches the timestamps shown below
)
log = logging.getLogger(__name__)

def process(item: str) -> None:
    log.info("Processing: %s", item)
    # do work
    log.info("Done: %s", item)

Output:

2024-01-15 10:30:01 INFO Processing: report.csv
2024-01-15 10:30:02 INFO Done: report.csv

Use levels appropriately:

  • log.debug() — detailed info for debugging
  • log.info() — normal progress
  • log.warning() — something unexpected but recoverable
  • log.error() — an operation failed
  • log.critical() — system-level failure
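The configured level decides which of these calls produce output. As a rough sketch, a script could switch between INFO and DEBUG with a flag; the verbose parameter (for instance fed by a --verbose option), the configure_logging name, and the sample messages are assumptions for illustration.

```python
import logging
import sys


def configure_logging(verbose: bool = False) -> logging.Logger:
    """Configure root logging; verbose=True also shows debug messages."""
    logging.basicConfig(
        level=logging.DEBUG if verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        stream=sys.stderr,  # keep log lines out of stdout so output can be piped
        force=True,         # replace handlers set up earlier (Python 3.8+)
    )
    return logging.getLogger("example")


log = configure_logging(verbose=False)
log.debug("Parsed 3 rows")            # hidden while the level is INFO
log.info("Processing: report.csv")    # shown
log.warning("Skipping empty line 7")  # shown
```

Calling configure_logging(verbose=True) instead makes the debug line appear as well.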

Habit 6: Keep State Changes Small and Reversible

  • Process one item at a time, not all at once
  • Write to a temp file first, then rename
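  • Keep backups before overwriting

from pathlib import Path

def safe_write(path: Path, content: str) -> None:
    """Write to a temp file in the same directory, then replace the target."""
    # Append ".tmp" to the full name so "data.json" becomes "data.json.tmp"
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(content, encoding="utf-8")
    # replace() overwrites an existing target on every platform; the swap is
    # atomic on POSIX when both paths are on the same filesystem
    tmp.replace(path)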
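The third bullet, keeping a backup before overwriting, could be sketched as follows. The overwrite_with_backup name and the ".bak" naming scheme are assumptions for this example; a real script might prefer timestamped backups so that older copies are not lost.

```python
import shutil
from pathlib import Path
from typing import Optional


def overwrite_with_backup(path: Path, content: str) -> Optional[Path]:
    """Copy an existing file to '<name>.bak', then write the new content.

    Returns the backup path, or None if there was nothing to back up.
    """
    backup: Optional[Path] = None
    if path.exists():
        backup = path.with_name(path.name + ".bak")
        shutil.copy2(path, backup)  # copy2 also preserves timestamps where possible
    path.write_text(content, encoding="utf-8")
    return backup
```

If the new content turns out to be wrong, copying the .bak file back restores the previous state.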

Safety Checklist

Before running a script on real data or a real server:

☐ Does it validate its inputs?
☐ Does it have a --dry-run mode?
☐ Does it handle errors and print useful messages?
☐ Does it return a proper exit code?
☐ Does it log what it did?
☐ Is the state change reversible, or can you test on a copy first?

Common Mistakes

Mistake                      | Consequence                    | Fix
No input validation          | Crash with cryptic error       | Validate at the start
No dry-run mode              | Accidental destructive changes | Add --dry-run flag
except Exception: pass       | Silent failures, hard to debug | At least log the error
Writing directly to original | Data loss if interrupted       | Write to .tmp, then rename
No exit code                 | CI/CD can't detect failures    | Return 0/1 from main()

What's Next

Lesson 4: Python Project Workflow