pathlib, os, and shutil
Learning Focus
Use this lesson as a reference for practical Python work. Focus on code that is readable, testable, and safe to run more than once.
Why This Matters
pathlib, os, and shutil help turn Python knowledge into dependable scripts and applications. The goal is not just to make code run once, but to keep it understandable when requirements, inputs, or environments change.
Use this topic when you need to:
- Make behavior explicit instead of relying on guesses.
- Validate input before it reaches important logic.
- Keep examples small enough to test and adapt.
- Produce output that explains what happened.
- Leave code that another developer can safely change.
Core Pattern
from __future__ import annotations

from dataclasses import dataclass

@dataclass(frozen=True)
class LessonResult:
    topic: str
    passed: bool
    detail: str

def run_lesson_04_pathlib_os_and_shutil_check(value: str) -> LessonResult:
    cleaned = value.strip()
    if not cleaned:
        return LessonResult(topic="pathlib, os, and shutil", passed=False, detail="empty input")
    return LessonResult(topic="pathlib, os, and shutil", passed=True, detail=cleaned)

result = run_lesson_04_pathlib_os_and_shutil_check("example")
print(result)
Practical Checklist
| Step | Question | Good Habit |
|---|---|---|
| Identify input | What values does the code receive? | Name arguments and config clearly |
| Validate | What can be missing, empty, or wrong? | Check before doing work |
| Transform | What should the program produce? | Keep transformations small |
| Report | How will success or failure be visible? | Return values, logs, or exit codes |
| Test | What could break later? | Add one focused example or assertion |
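The checklist steps can be sketched with a small file-copy helper. This is a minimal illustration, not a prescribed implementation: the names `CopyResult` and `copy_into` and the sample file names are hypothetical, and the example runs inside a temporary directory so it is safe to repeat.

```python
from __future__ import annotations

import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class CopyResult:
    ok: bool
    detail: str


def copy_into(source: Path, dest_dir: Path) -> CopyResult:
    # Validate: check the input before doing any work.
    if not source.is_file():
        return CopyResult(ok=False, detail=f"not a file: {source}")
    # Transform: one small step, creating the target directory if needed.
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = Path(shutil.copy2(source, dest_dir))
    # Report: return a value the caller can inspect or log.
    return CopyResult(ok=True, detail=str(target))


# Test: one focused example in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sample = root / "report.txt"
    sample.write_text("hello\n", encoding="utf-8")
    good = copy_into(sample, root / "backup")
    bad = copy_into(root / "missing.txt", root / "backup")
    copied_text = (root / "backup" / "report.txt").read_text(encoding="utf-8")
    print(f"good: ok={good.ok}")
    print(f"bad: ok={bad.ok} detail={bad.detail}")
```

Because `mkdir` uses `exist_ok=True` and `shutil.copy2` overwrites an existing target file, calling `copy_into` twice with the same arguments should give the same result.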
Common Mistakes
| Mistake | Why It Hurts | Better Approach |
|---|---|---|
| Hiding too much logic in one block | Bugs become hard to isolate | Extract one named function |
| Trusting raw strings | Input often has spaces, casing, or missing values | Normalize and validate early |
| Printing only happy-path output | Failures become unclear | Include useful failure details |
| Mixing setup and behavior | Tests become difficult | Keep configuration separate from work |
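For path handling, "normalize and validate early" might look like the sketch below. The helper name `normalize_path` and the choice to anchor relative input to an explicit base directory are assumptions made for illustration.

```python
from __future__ import annotations

import os
from pathlib import Path


def normalize_path(raw: str, base: Path) -> Path | None:
    """Turn a raw string into an absolute path, or None if it is blank."""
    cleaned = raw.strip()
    if not cleaned:
        return None
    # Expand "$VAR" and "~" before working with the path.
    expanded = Path(os.path.expandvars(cleaned)).expanduser()
    # Relative input is anchored to an explicit base directory.
    if not expanded.is_absolute():
        expanded = base / expanded
    return expanded


base = Path.cwd()
print(normalize_path("  data/input.csv  ", base))
print(normalize_path("   ", base))
```

Passing `base` as an argument keeps configuration separate from the work, which also makes the function easy to test with any directory.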
Practice Tasks
- Copy the core pattern into a temporary file.
- Change the validation rule to match this lesson topic.
- Add one failing input and one passing input.
- Print a short summary that a teammate could understand.
- Turn one example into an assertion or pytest test.
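One possible answer to these tasks, offered as a sketch rather than the expected solution: the validation rule below assumes "valid" means "names an existing directory", and the function name `check_existing_directory` is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path

TOPIC = "pathlib, os, and shutil"


@dataclass(frozen=True)
class LessonResult:
    topic: str
    passed: bool
    detail: str


def check_existing_directory(value: str) -> LessonResult:
    cleaned = value.strip()
    if not cleaned:
        return LessonResult(topic=TOPIC, passed=False, detail="empty input")
    path = Path(cleaned)
    # Assumed rule for this sketch: the input must name an existing directory.
    if not path.is_dir():
        return LessonResult(topic=TOPIC, passed=False, detail=f"not a directory: {cleaned}")
    return LessonResult(topic=TOPIC, passed=True, detail=str(path.resolve()))


# One passing input and one failing input, each checked with an assertion.
passing = check_existing_directory(".")
failing = check_existing_directory("no/such/directory/here")
assert passing.passed
assert not failing.passed
print(f"passing: {passing.detail}")
print(f"failing: {failing.detail}")
```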
Review Questions
- What input assumptions does this lesson make explicit?
- What failure should be handled before production use?
- Which part of the example would you reuse in a real script?
- How would you test this behavior with bad input?
- What output proves the code worked correctly?
Extended Examples
Example 1: Parameterized Validation
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ValidationResult:
    valid: bool
    value: Optional[str] = None
    error: Optional[str] = None

def validate_and_normalize(
    value: str,
    min_length: int = 1,
    max_length: int = 100,
    strip_whitespace: bool = True,
) -> ValidationResult:
    if strip_whitespace:
        value = value.strip()
    if not value:
        return ValidationResult(valid=False, error="empty value after processing")
    if len(value) < min_length:
        return ValidationResult(valid=False, error=f"too short: {len(value)} < {min_length}")
    if len(value) > max_length:
        return ValidationResult(valid=False, error=f"too long: {len(value)} > {max_length}")
    return ValidationResult(valid=True, value=value)

# Test different scenarios
test_cases = [" hello ", "a", "x" * 200, "", "validinput"]
for case in test_cases:
    result = validate_and_normalize(case)
    print(f"input={case!r} -> valid={result.valid}, error={result.error}")
Example 2: Structured Result with Context
from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    PARTIAL = "partial"

@dataclass(frozen=True)
class OperationResult:
    status: Status
    message: str
    timestamp: datetime
    details: dict

    def is_success(self) -> bool:
        return self.status == Status.SUCCESS

    def to_dict(self) -> dict:
        return {
            "status": self.status.value,
            "message": self.message,
            "timestamp": self.timestamp.isoformat(),
            "details": self.details,
        }

def run_operation(value: str) -> OperationResult:
    if not value or len(value) < 3:
        return OperationResult(
            status=Status.FAILURE,
            message="value too short",
            timestamp=datetime.now(),
            details={"value_length": len(value) if value else 0},
        )
    return OperationResult(
        status=Status.SUCCESS,
        message="operation completed",
        timestamp=datetime.now(),
        details={"processed_length": len(value)},
    )

result = run_operation("test")
print(result.to_dict())
Integration Patterns
Combining with argparse
import argparse
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    input_value: str
    verbose: bool
    output_file: str

def parse_args() -> Config:
    parser = argparse.ArgumentParser(description="Process values with validation")
    parser.add_argument("input_value", help="Value to process")
    parser.add_argument("-v", "--verbose", action="store_true", help="Enable verbose output")
    parser.add_argument("-o", "--output", default="output.txt", help="Output file")
    args = parser.parse_args()
    return Config(
        input_value=args.input_value,
        verbose=args.verbose,
        output_file=args.output,
    )

def main() -> int:
    config = parse_args()
    if config.verbose:
        print(f"Processing: {config.input_value}")
        print(f"Output to: {config.output_file}")
    print(f"Result: {config.input_value.upper()}")
    return 0

if __name__ == "__main__":
    raise SystemExit(main())
Combining with logging
import logging
from dataclasses import dataclass

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(message)s",
)
LOG = logging.getLogger(__name__)

@dataclass(frozen=True)
class LoggedResult:
    topic: str
    success: bool
    detail: str

def process_with_logging(value: str) -> LoggedResult:
    LOG.debug("Processing value: %r", value)
    # Strip before checking so whitespace-only input is also rejected.
    cleaned = value.strip()
    if not cleaned:
        LOG.warning("Empty value received")
        return LoggedResult(topic="process", success=False, detail="empty")
    result = cleaned.upper()
    LOG.info("Processed to: %s", result)
    return LoggedResult(topic="process", success=True, detail=result)

result = process_with_logging("hello")
LOG.debug("Final result: %s", result)
Operational Considerations
When using this pattern in production environments:
- Environment Variables: Read configuration from environment variables with sensible defaults
- Exit Codes: Return non-zero exit codes when operations fail
- Error Messages: Write useful error messages to stderr, not stdout
- Dry Runs: Support a --dry-run flag for state-changing operations
- Logging: Use structured logging with appropriate log levels
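A dry-run flag, a non-zero exit code, and stderr reporting can be combined in one small command. The sketch below assumes a hypothetical cleanup tool; `clean_directory`, the `target` argument, and the `build` directory name are illustrative only. It removes a directory with `shutil.rmtree` unless `--dry-run` is given, and the demonstration runs inside a temporary directory.

```python
from __future__ import annotations

import argparse
import shutil
import sys
import tempfile
from pathlib import Path


def clean_directory(target: Path, dry_run: bool) -> int:
    """Remove target and its contents; return a process exit code."""
    if not target.is_dir():
        # Errors go to stderr so they do not mix with normal output.
        print(f"error: not a directory: {target}", file=sys.stderr)
        return 1
    if dry_run:
        print(f"dry run: would remove {target}")
        return 0
    shutil.rmtree(target)
    print(f"removed {target}")
    return 0


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(description="Remove a directory tree")
    parser.add_argument("target", type=Path, help="Directory to remove")
    parser.add_argument("--dry-run", action="store_true", help="Show what would happen")
    args = parser.parse_args(argv)
    return clean_directory(args.target, args.dry_run)


# Demonstration in a throwaway directory: dry run, real run, then a failure.
with tempfile.TemporaryDirectory() as tmp:
    build = Path(tmp) / "build"
    build.mkdir()
    dry_code = main([str(build), "--dry-run"])
    still_there = build.is_dir()
    real_code = main([str(build)])
    gone = not build.exists()
    missing_code = main([str(build)])
```

In a real script, `main()` would be called without arguments under an `if __name__ == "__main__":` guard and its return value passed to `SystemExit`, as in the argparse example earlier in this lesson.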
Production Snippet
import os
import sys
import logging
from pathlib import Path

LOGGING_LEVEL = os.environ.get("LOG_LEVEL", "INFO").upper()
logging.basicConfig(
    level=getattr(logging, LOGGING_LEVEL, logging.INFO),
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
LOG = logging.getLogger(__name__)

def get_working_directory() -> Path:
    """Get working directory with validation."""
    cwd = os.environ.get("WORK_DIR")
    if cwd:
        path = Path(cwd)
        if path.is_dir():
            return path
        LOG.warning(f"WORK_DIR not a directory: {cwd}")
    return Path.cwd()

def main() -> int:
    LOG.info("Starting application")
    work_dir = get_working_directory()
    LOG.info(f"Working directory: {work_dir}")
    try:
        # Main logic here
        pass
    except Exception as e:
        LOG.error(f"Failed: {e}")
        return 1
    LOG.info("Completed successfully")
    return 0

if __name__ == "__main__":
    raise SystemExit(main())
Extended Troubleshooting
| Problem | Likely Cause | Solution |
|---|---|---|
| Function not found | Import error or typo in name | Check imports and function definition |
| Type error on result | Wrong return type or None handling | Add type hints and handle None |
| File not found | Wrong path or unexpected working directory | Use absolute paths; check permissions separately, since those raise PermissionError |
| Tests fail | Edge cases not handled | Add more test cases and validation |
| Works locally, fails on server | Environment differences | Check Python version, paths, and dependencies |
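When a path problem is hard to pin down, a small diagnostic helper can show what the script actually sees. The sketch below is one possible approach; `describe_path` is a hypothetical name, and the example paths are placeholders.

```python
from __future__ import annotations

import os
from pathlib import Path


def describe_path(raw: str) -> str:
    """Explain, in one line, whether a path exists and can be read."""
    path = Path(raw).expanduser().resolve()
    if not path.exists():
        # Walk up to the nearest ancestor that does exist.
        parent = path.parent
        while not parent.exists() and parent != parent.parent:
            parent = parent.parent
        return f"missing: {path} (nearest existing directory: {parent})"
    if not os.access(path, os.R_OK):
        return f"not readable: {path}"
    kind = "directory" if path.is_dir() else "file"
    return f"ok: {kind} {path}"


# Relative paths depend on the current directory, so print it first.
print(f"cwd: {Path.cwd()}")
print(describe_path("."))
print(describe_path("no_such_dir/no_such_file.txt"))
```

Printing the resolved absolute path often reveals the "works locally, fails on server" case where the script was started from a different working directory.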
Debugging Steps
- Run with the -v or --verbose flag
- Print the input values and types
- Check the Python version: python3 --version
- Verify dependencies: python3 -c "import module_name"
- Run with dry-run mode if available
- Check logs for error messages
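The version and dependency checks can also be done from inside Python, which helps when comparing a local machine with a server. This is a minimal sketch: `environment_report` is a hypothetical helper, and `module_name` is the same placeholder used in the steps above.

```python
from __future__ import annotations

import importlib.util
import platform
import sys


def environment_report(modules: list[str]) -> dict[str, str]:
    """Collect the interpreter version and whether each module can be found."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        found = importlib.util.find_spec(name) is not None
        report[name] = "available" if found else "missing"
    return report


for key, value in environment_report(["pathlib", "shutil", "module_name"]).items():
    print(f"{key}: {value}")
```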
Additional Practice
- Modify the core pattern to accept multiple values
- Add a retry mechanism for failed operations
- Implement a config file loading pattern
- Add a simple progress indicator for long operations
- Create a CLI with subcommands using argparse
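As a starting point for the retry task, the sketch below retries a callable that raises `OSError`. The helper name `retry`, the default attempt count, and the fixed delay are assumptions; a real script might prefer backoff or a narrower exception type.

```python
from __future__ import annotations

import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, delay: float = 0.1) -> T:
    """Call operation until it succeeds or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:
            # Re-raise on the last attempt so the failure is not hidden.
            if attempt == attempts:
                raise
            print(f"attempt {attempt} failed: {exc}; retrying")
            time.sleep(delay)
    raise AssertionError("unreachable")


# A stand-in operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky_read() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("resource temporarily unavailable")
    return "contents"


value = retry(flaky_read, attempts=3, delay=0.0)
print(value, calls["count"])
```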
Field Notes
This pattern forms the foundation of reliable Python automation. The key principles:
- Explicit is better than implicit - Name variables and functions clearly
- Errors should never pass silently - Handle failures explicitly
- Readability counts - Code is read more than written
- Flat is better than nested - Avoid deep nesting
- Small is better - Many small functions are easier to test
Build confidence by applying this pattern to real automation tasks, then expand to handle more complex scenarios.
Summary
- Validate early, validate often
- Return structured results, not just values
- Log usefully without overwhelming
- Handle errors explicitly
- Test the failure paths, not just success
What's Next
- Return to the module index and connect this lesson with the previous examples.