Overview
A small CLI utility that watches one or more directories for file system events — creates, modifications, deletions, and moves — and logs them to a structured TSV file with timestamps. Built using Python's watchdog library.
Motivation
I wanted a lightweight, minimal-dependency audit trail for a directory containing generated build artifacts. Existing solutions were either too heavy (inotifywait requires system packages, osquery is an entire platform) or too GUI-focused. A roughly 100-line Python script with a single pip dependency was the right tool.
Usage
# Watch a single directory
python monitor.py ./build --output events.tsv
# Watch multiple directories, recursive
python monitor.py ./src ./config --recursive --output events.tsv
# Tail the log in real time
tail -f events.tsv
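The flags used in the commands above could be parsed with argparse. This is a minimal sketch rather than the script's actual parser: build_parser is a hypothetical name, the events.tsv default for --output is an assumption, and --include-dirs is the flag described in the Implementation section.

```python
import argparse


def build_parser():
    """Hypothetical parser covering the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="monitor.py",
        description="Log file system events to a TSV file.",
    )
    # One or more directories to watch.
    parser.add_argument("paths", nargs="+", help="directories to watch")
    # Assumed default; the usage examples always pass --output explicitly.
    parser.add_argument("--output", default="events.tsv", help="TSV log file")
    parser.add_argument("--recursive", action="store_true",
                        help="also watch subdirectories")
    parser.add_argument("--include-dirs", action="store_true",
                        help="log directory events as well as file events")
    return parser


# Parse the second usage example from an explicit argument list.
args = build_parser().parse_args(
    ["./src", "./config", "--recursive", "--output", "events.tsv"]
)
print(args.paths, args.recursive, args.output, args.include_dirs)
```

argparse exposes --include-dirs as args.include_dirs, so the value can be passed straight to whatever builds the handler.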
Event Log Format
Events are written as tab-separated values for easy parsing with pandas, awk, or any spreadsheet tool:
timestamp event_type src_path dst_path
2024-03-15T14:22:01 created ./build/main.o
2024-03-15T14:22:02 modified ./build/main.o
2024-03-15T14:22:03 moved ./build/main.o.tmp ./build/main.o
2024-03-15T14:22:05 deleted ./build/old_output.o
dst_path is only populated for move events.
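As an illustration of how the log can be consumed, the sketch below reads it with the standard library's csv module. It assumes the file starts with the header row shown above and that rows without a destination still carry an empty fourth field; read_events is a hypothetical helper name, and the sample rows are the ones from the example above.

```python
import csv
import io

# Sample log content, tab-separated, copied from the example rows above.
SAMPLE = (
    "timestamp\tevent_type\tsrc_path\tdst_path\n"
    "2024-03-15T14:22:01\tcreated\t./build/main.o\t\n"
    "2024-03-15T14:22:02\tmodified\t./build/main.o\t\n"
    "2024-03-15T14:22:03\tmoved\t./build/main.o.tmp\t./build/main.o\n"
    "2024-03-15T14:22:05\tdeleted\t./build/old_output.o\t\n"
)


def read_events(stream):
    """Return the log rows as dicts keyed by the header names."""
    return list(csv.DictReader(stream, delimiter="\t"))


events = read_events(io.StringIO(SAMPLE))

# Only move events have a non-empty dst_path.
moves = [e for e in events if e["event_type"] == "moved"]
for move in moves:
    print(move["src_path"], "->", move["dst_path"])
```

For a real log, the StringIO object would be replaced by open("events.tsv", newline="").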
Implementation
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import csv, datetime, sys

class TSVHandler(FileSystemEventHandler):
    """Write one TSV row per file event; directory events are skipped."""

    def __init__(self, writer):
        # writer is a csv.writer configured with a tab delimiter
        self.writer = writer

    def _log(self, event_type, src, dst=""):
        self.writer.writerow([
            datetime.datetime.now().isoformat(timespec='seconds'),
            event_type,
            src,
            dst,
        ])

    def on_created(self, event):
        if not event.is_directory:
            self._log("created", event.src_path)

    def on_modified(self, event):
        if not event.is_directory:
            self._log("modified", event.src_path)

    def on_deleted(self, event):
        if not event.is_directory:
            self._log("deleted", event.src_path)

    def on_moved(self, event):
        # Only move events carry a destination path.
        if not event.is_directory:
            self._log("moved", event.src_path, event.dest_path)
Directory events are filtered out by default — most use cases care about files, and directory events are noisy during batch operations. A --include-dirs flag re-enables them.
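The handler above filters directories unconditionally, so the flag has to reach the filtering check somehow. One way this could look is sketched below without watchdog, using stand-in event objects that expose the same is_directory, src_path, and dest_path attributes. FilteringLogger and its include_dirs parameter are hypothetical names, not the script's actual code.

```python
import csv
import datetime
import io
from types import SimpleNamespace


class FilteringLogger:
    """Hypothetical helper: logs an event unless it is a filtered directory."""

    def __init__(self, writer, include_dirs=False):
        self.writer = writer
        self.include_dirs = include_dirs

    def log(self, event_type, event):
        # Skip directory events unless the caller opted in.
        if event.is_directory and not self.include_dirs:
            return False
        self.writer.writerow([
            datetime.datetime.now().isoformat(timespec="seconds"),
            event_type,
            event.src_path,
            getattr(event, "dest_path", ""),  # empty for non-move events
        ])
        return True


# Stand-in events with the attributes the logger reads.
dir_event = SimpleNamespace(is_directory=True, src_path="./build/obj")
file_event = SimpleNamespace(is_directory=False, src_path="./build/main.o")

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")

default_logger = FilteringLogger(writer)
default_logger.log("created", dir_event)   # filtered out
default_logger.log("created", file_event)  # written

verbose_logger = FilteringLogger(writer, include_dirs=True)
verbose_logger.log("created", dir_event)   # written
```

In the real handler, each on_* method would delegate to a check like this instead of repeating the is_directory test.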
Graceful Shutdown
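The main thread sleeps in a loop and catches KeyboardInterrupt. On Ctrl-C it stops the observer, waits for the observer thread to finish, and only then flushes the output buffer:
import time  # required by the loop; not in the import list shown earlier

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
# Join first: the observer thread may still be writing rows while it shuts down.
observer.join()
outfile.flush()
Flushing after the join means that events handled just before shutdown reach the file. Ctrl-C interrupts only the sleeping main thread, so a row being written on the observer thread is not cut off.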
Limitations & Extensions
The current implementation has a few known limitations:
- No deduplication: rapid saves (e.g., editor auto-save) produce multiple modified events for the same file. A debounce window would collapse these.
- Platform differences: watchdog uses inotify on Linux, FSEvents (with kqueue as a fallback) on macOS, and ReadDirectoryChangesW on Windows. Move detection reliability varies.
- Log rotation: long-running sessions accumulate unbounded log files. A --rotate-mb flag would address this.
These are intentional omissions for a tool with a narrow scope — adding them would cross the line from "utility" to "daemon."
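For reference, the debounce window mentioned in the list could be as small as the sketch below. It is not part of the tool: Debouncer, should_log, and the 0.5-second default are hypothetical. It logs the first event for a given (event type, path) pair and suppresses repeats that arrive within the window of the previous one.

```python
import time


class Debouncer:
    """Hypothetical helper: suppress repeated events inside a time window."""

    def __init__(self, window_seconds=0.5):
        self.window = window_seconds
        self._last_seen = {}  # (event_type, path) -> time of the latest event

    def should_log(self, event_type, path, now=None):
        # now can be injected for testing; defaults to a monotonic clock.
        now = time.monotonic() if now is None else now
        key = (event_type, path)
        last = self._last_seen.get(key)
        # Record every event, so a steady stream of repeats stays suppressed.
        self._last_seen[key] = now
        return last is None or (now - last) >= self.window


debouncer = Debouncer(window_seconds=1.0)
first = debouncer.should_log("modified", "./build/main.o", now=0.0)
repeat = debouncer.should_log("modified", "./build/main.o", now=0.5)
later = debouncer.should_log("modified", "./build/main.o", now=2.0)
print(first, repeat, later)
```

The dictionary grows with the number of distinct paths seen, so a long-running session would also need to evict old entries.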
Lessons Learned
watchdog's cross-platform abstraction works well for common cases but surfaces OS differences for edge cases — notably, macOS's kqueue backend doesn't always distinguish between move-to and move-from events. Testing on the target OS matters more than for pure Python logic.