Footage reliability audit

A 48-agent audit of every code path that could lose footage. Verdict: footage is roughly 80% solid. Found three real problems including a missing automated backup. The most uncomfortable finding is the most important to publish.

Backend · reliability · ops

I ran a focused audit on the question that matters most for an NVR: can it lose footage? Every code path that writes a recording, evicts a recording, reconciles state, or restarts a worker was examined in parallel.

Headline finding

Footage handling is roughly 80% solid. Most paths are correct, the recording loop is robust, the segment index is the single source of truth, and the restart story is sound. The 20% that isn't solid is exactly what an honest audit is for.

The three real problems

  1. No automated backup of the recorder DB. The Postgres database that indexes every segment had no automated backup. If the DB is gone, the files on disk are still there, but the index that lets you find them isn't. Now on the immediate fix list.

  2. No-fsync inversion. A perf optimization on segment writes had been applied in the wrong order: in one path, the write was being acked before the data hit the disk. Fixed.

  3. Reconcile-on-boot race. A narrow window during recorder startup where the reconcile job could mark in-flight segments as orphans. Fixed; the reconcile now waits for the in-flight set to stabilize before pruning.
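
To make the two fixed problems concrete, here is a minimal Python sketch of the ordering involved. It is not the recorder's actual code: the function names (write_segment_durably, wait_for_stable_inflight, find_orphans), the ack callback, and the polling parameters are all hypothetical, chosen only for illustration. The first function acks a segment only after fsync returns; the second waits until the in-flight set has been unchanged for several consecutive reads before anything is treated as an orphan.

```python
import os
import time
from typing import Callable, FrozenSet, Set


def write_segment_durably(path: str, data: bytes, ack: Callable[[str], None]) -> None:
    """Write a segment, force it to disk, and only then acknowledge it."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push the data to stable storage
    ack(path)                 # ack strictly after fsync has returned


def wait_for_stable_inflight(
    read_inflight: Callable[[], Set[str]],
    stable_polls: int = 3,    # hypothetical: identical reads required in a row
    interval_s: float = 0.5,  # hypothetical: delay between reads
    max_polls: int = 60,      # hypothetical: give up after this many reads
) -> FrozenSet[str]:
    """Poll the in-flight set until it is unchanged for `stable_polls` reads."""
    previous = frozenset(read_inflight())
    unchanged = 1
    polls = 1
    while unchanged < stable_polls:
        if polls >= max_polls:
            # Refusing to prune is the safe failure mode for footage.
            raise TimeoutError("in-flight set never stabilized; not pruning")
        time.sleep(interval_s)
        current = frozenset(read_inflight())
        polls += 1
        unchanged = unchanged + 1 if current == previous else 1
        previous = current
    return previous


def find_orphans(on_disk: Set[str], indexed: Set[str], inflight: FrozenSet[str]) -> Set[str]:
    """Files that are neither in the index nor still being written."""
    return set(on_disk) - set(indexed) - set(inflight)
```

In this sketch a reconcile pass would call wait_for_stable_inflight first and pass its result to find_orphans, so a segment that is still being written can never land in the prune set. A production version of the write path would likely also fsync the containing directory so that the new file's directory entry survives a crash.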

Why publish this

Because a side-project NVR that won't tell you what's wrong with itself isn't worth running. The full, unredacted findings are in docs/FOOTAGE-RELIABILITY-AUDIT-2026-06-19.md. Two of the three problems are already fixed; the DB backup is the active work.
