Self-hosted · Coming soon

Follow the trail.

Crumb is a self-hosted video management system, built by someone who's been running cameras for decades and finally got tired of working around the gaps. Operator-grade timeline scrubbing. A fast multi-camera wall. Polished native clients. Yours.

Opens your mail client with a draft to [email protected]. No spam list, just a direct email.

16 cameras under test
Your disk = your retention
0 cloud, no account
100% your footage
Crumb desktop client showing the Clips view for the Backyard camera. A grid of nine motion clips from the last 24 hours, each thumbnail with a recording icon and timestamp.
Why this exists

I built this for myself first.

I'd been running the same operator-grade NVR at home for years. It was, and still is, the gold standard at what it does. Multi-camera wall, jog-shuttle scrubbing, an export list that actually works, keyboard shortcuts I knew in my sleep. Genuinely great software.


Then the free 8-camera tier I'd been on was retired. The next step up is priced for businesses, not for someone running an at-home setup. From where they sit, that's reasonable. They make serious software for serious operations. But it left a lot of us looking for something we couldn't find.

There was another reason, too. Frigate is excellent at what it does. Object detection done right, on hardware most of us already own. I run it. I rely on it. But Frigate can only flag what it sees. When the thing you actually care about is a small figure in a far corner of the frame, or in a stretch the detector never picked up at all, you want the raw timeline in front of you and you want to do real detective work yourself, frame by frame. The web viewer wasn't built for that. H.265, now the default on serious cameras, didn't always play smoothly in the browser either. Detection is one job. Operator-grade playback is another.

And it wasn't just about codecs and scrubbing. The way I actually use this stuff matters. I want a tile wall up on a monitor 24/7 for ambient awareness, glance-and-know, which is a different surface than a detection feed. I want a mobile client capable enough that when I'm away from home and something happens, I can pull up the footage and export the clip from a parking lot. I've done it dozens of times; it has to work. And I want a desktop client polished enough to sit down at and do a real investigation. The source of truth, the one where the hard work happens. Three surfaces, one backend. That's the whole job.

So I started building it. Not as a replacement for either of them. I have no interest in competing where someone already does their job well. Just a real operator-grade video management layer that a power user can self-host on their own hardware, on their own terms, served to every surface they care about, with whichever detection engine they want feeding it.

If you've been in the same spot, I think you'll feel at home here.

Full transparency

The whole picture, no spin.

Where it is today

I'm an IT engineer with thirty years in the field, building Crumb on my own time. It runs my own home right now: eleven cameras, multiple storage volumes, recording day in and day out for months. About 90% of where I want v1 to be. The remaining 10% is the boring, important part: hardware variety, which is where I need you.


What's working well today: the recording server, the Windows desktop client, and the Android app. What's still in progress: a macOS desktop and an iOS app are both built, but each is about 50% there and needs more time. Right now I'm focused on my personal stack: getting the server, the Windows client, and Android genuinely polished before I circle back to finish the Apple side.

It might not come to fruition. Side projects don't always. But I'm trying my hardest, and I'm writing this page because the work is real and I want help getting it to ship.

The 10% gap is real. Different cameras. Different CPUs. Different drives. Different network topologies. That's where testers come in.

If you sign up to test

Honestly, I don't fully know what the delivery will look like yet. To start, it's the source repo plus a Docker Compose stack. You clone, run a script that generates strong secrets, bring it up with docker compose up -d, then open a browser to /admin. A first-run wizard walks you through the rest: admin account, server address, storage, your first camera, optional Frigate. Updates come as git pull and rebuild.
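
For a sense of what the "generates strong secrets" step could involve, here is a minimal sketch. It is not Crumb's actual script: the key names (`POSTGRES_PASSWORD`, `JWT_SECRET`) and the `.env`-style output are assumptions made purely for illustration.

```python
import secrets

# Hypothetical key names; the real script may generate different ones.
SECRET_KEYS = ("POSTGRES_PASSWORD", "JWT_SECRET")

def render_env(keys=SECRET_KEYS, nbytes=32):
    """Return .env-style text with one random URL-safe secret per key."""
    lines = [f"{key}={secrets.token_urlsafe(nbytes)}" for key in keys]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(render_env(), end="")
```

Each value comes from the operating system's cryptographic random source, so nothing is shared between installs.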

Pre-built images, signed installers, a one-click setup, an auto-updater. None of that exists yet. Part of what testers will help me figure out is what that workflow should actually look like. If "spin up a Docker stack from a git repo on your own Linux box" sounds doable, you're the audience.

Schedule and risk

I'm not promising a release date. I'm not promising weekly updates. There will be stretches where life gets in the way and nothing ships for a few weeks. Bugs will happen. Some migrations between versions may need a manual step. I'll document what I know.

On the data side, the design intent is that your footage stays portable no matter what happens to Crumb. Recordings are written as plain MP4 files in a predictable folder layout on the disk you chose, and the Postgres schema is open. If Crumb ever pauses, or you decide to move off it, the recordings on disk are still yours and still playable in any video player. That's the architecture. I can't promise it's bug-free, because nothing this early can be.

Built with AI, openly

Full transparency: I'm using AI to build Crumb itself, and I used AI to build this very page you're reading. The words, the decisions, the architecture, the engineering judgment, and the testing are mine. AI is the power tool that lets a side project move at this pace. I'd rather say that out loud and clear before being called a vibe-coder. I code with intent.

About the long term

My eventual goal is to make Crumb installable by anyone. Click an installer, point it at your cameras, done. No Docker, no command line, no terminal. We're nowhere near that today, and I think it's important to put that in writing. Right now, Crumb is for technical users: people comfortable bringing up a Docker container on a Linux host. From there, a browser-driven first-run wizard handles the actual configuration (admin account, server address, storage, your first camera, optional Frigate). If even the Docker part is new to you, the right move is to wait. If it isn't, I'd love to hear from you.

Project direction, I genuinely don't know yet. Open source, closed source, or something in between. I haven't decided. I'm also an underemployed IT engineer, so if Crumb ever helps me keep the lights on, that would be welcome. But it's not why I'm building it, and I'd rather say that plainly than dress it up. There's a small coffee link tucked in the footer for anyone moved to use it. That's the whole ask.

Timeline

Scrub the day. Click an event. Find the moment.

Recording segments, motion events, and per-object detection icons all live on one bar. Hour ticks at top, transport at the right, jump-by-time chips for fast triage. From a 24-hour overview down to a 60-second slice. Same interaction.

Crumb timeline strip showing recording bars, hour ticks, transport controls, jump and zoom controls, and a vertical playhead with a detection icon cluster (person plus shield badges) at 16:28:14.
What it is

The operator layer your cameras have been waiting for.

Crumb is the recorder, the storage policy engine, the timeline, the live wall, the export pipeline, and the clients you actually use day-to-day. It does the unglamorous work that turns "I have cameras" into "I can find the moment I'm looking for in ten seconds."

Record everything, lose nothing

Rust-based recorder with motion-aware segments, drain-residual eviction, per-policy size caps, and a Postgres-backed segment index that's the single source of truth.
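
A simplified picture of a per-policy size cap: when a policy goes over its cap, the oldest unprotected segments go first until it fits again. This sketch uses plain oldest-first eviction, which is not necessarily how drain-residual eviction works in the recorder. The `Segment` fields and `evict_to_cap` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    path: str
    start_ts: int            # Unix seconds
    size_bytes: int
    protected: bool = False  # e.g. covered by a bookmark

def evict_to_cap(segments, cap_bytes):
    """Return the segments to delete, oldest first, so the total fits cap_bytes.

    Protected segments are never selected, so the total can stay over the
    cap if protected footage alone is larger than the cap.
    """
    total = sum(s.size_bytes for s in segments)
    victims = []
    for seg in sorted(segments, key=lambda s: s.start_ts):
        if total <= cap_bytes:
            break
        if seg.protected:
            continue
        victims.append(seg)
        total -= seg.size_bytes
    return victims
```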

Scrub the day in one stroke

Hour-by-hour timeline with crumb-trail recording segments, motion event dots, per-object detection icons, and a playhead that snaps to the next event with one keypress.
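
Snap-to-next-event is, at heart, a search over sorted timestamps. A minimal sketch, assuming events are kept as a sorted list of seconds since midnight (the function name and data shape are illustrative, not taken from the client):

```python
from bisect import bisect_right

def next_event(event_times, playhead):
    """Return the first event strictly after playhead, or None if there is none.

    event_times must be sorted ascending.
    """
    i = bisect_right(event_times, playhead)
    return event_times[i] if i < len(event_times) else None
```

Using `bisect_right` means that a playhead already sitting on an event jumps to the following one rather than staying put.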

Native clients on every screen

Tauri + libmpv on desktop, Kotlin/Compose + Media3 on Android, iOS planned, plus a web admin console served by the API. One footage source, every surface.

Real screens

From the actual client. No mockups.

Captured this week from the desktop build I run on my own hardware.

View Setup dialog. Designing a custom layout with 6 panes assigned to cameras (Front Yard, LPR, Driveway, Front Door, Side Yard, Backyard), columns and rows controls, quick-layout presets, and an icon picker.

Build the wall you want

Drag cameras into panes. Mix in carousels, hotspots, PTZ tiles, clocks, web pages. Save each layout, switch with one click, name it and give it an icon. Per-device.

Export tab. Build a list of clips (one camera + one time range each), choose MP4 H.264 output, optional timestamp burn-in, optional AES-256 password-protected ZIP, then export the whole batch at once.

Export a list. Encrypt it if you want.

Add multiple clips from multiple cameras, then export the whole list at once. Each clip is one camera and one time range; the batch can span as many as you need. Output as plain MP4 files or as a single AES-256 ZIP with a password. Burn the timestamp in. Lands in your Downloads folder.

Motion tuning panel. Live camera preview with drawable exclusion zones over trees and edges, grid overlay, detector picker (Census, Frame diff, MOG2, Optical flow, Ensemble), grid size and threshold sliders.

Tune motion in the actual scene

Draw exclusion zones right on the live image. Swap detectors (Census, Frame diff, MOG2, Optical flow, Ensemble) per camera. Watch the sensitivity and floor live as you tune.

Server management. Left sidebar shows Cameras with online status, a Motion Cameras group, Recording Profiles (Default, Motion), Storage volumes (16TB Spinner, 2TB NVMe), Roles (Administrator, Viewer), Users, Integrations, Notifications, and Server & Streaming. The main panel shows a custom Viewer role with capability checkboxes and per-camera access.

Roles, groups, and per-camera access

Custom roles with explicit capabilities: Playback, Clips, Export, PTZ control, Manage views. Camera access by individual camera or by group. Bookmarks scoped per-user.
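
One way to picture how a capability plus camera access could combine into a single yes/no decision. This is a sketch under assumptions: the `Role` shape, the lowercase capability strings, and the `can` helper are hypothetical and do not describe Crumb's schema.

```python
from dataclasses import dataclass, field

@dataclass
class Role:
    name: str
    capabilities: set = field(default_factory=set)  # e.g. {"playback", "export"}
    cameras: set = field(default_factory=set)       # individually granted cameras
    groups: set = field(default_factory=set)        # granted camera groups

def can(role, capability, camera, camera_groups):
    """True if role holds capability and reaches camera directly or via a group.

    camera_groups maps group name -> set of camera names.
    """
    if capability not in role.capabilities:
        return False
    if camera in role.cameras:
        return True
    member_of = {g for g, cams in camera_groups.items() if camera in cams}
    return bool(member_of & role.groups)
```

For example, a Viewer granted `playback` on an "Outdoor" group could watch the Driveway camera but would be refused an export from it.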

Timeline detail. Selection highlighted in amber between 16:22 and 16:24 with two detection icons (shield badges and a person glyph), right-click context menu with Set export start here, Set export end here, Export selection, Clear selection.

Right-click the timeline. Done.

Drop a start and an end on the timeline, send the selection straight to the export list. Per-object detection icons from Frigate stay on the bar (person, vehicle, animal, package) so you can see exactly what you're picking before you pick it.

Reference setup

My hardware. My real numbers.

Use this as a sanity check, not a requirement. Crumb is designed to scale down to weaker boxes too. See the "less powerful clients" note below.

Recorder host

The box doing the work

CPU
Intel Core Ultra 7 265K · 20 cores / 20 threads
Memory
32 GB
GPU
NVIDIA Quadro RTX 4000 (8 GB) · Intel UHD 770 iGPU · Coral USB TPU
GPU usage
Not in use for motion decode. CPU wins on power at this scale (see benchmark below).
Cameras
11 mixed ONVIF / RTSP, indoor and outdoor, includes 4K mains
Storage
Multiple disks. H.265 footage at roughly half the bitrate of H.264.
OS
Ubuntu, Docker Compose stack
Windows desktop client

What I sit in front of

CPU
Intel Core i5-14500 · 14 cores (6 P + 8 E) / 20 threads
Memory
64 GB
GPU
NVIDIA GeForce RTX 3070 (8 GB) · Intel UHD 770 iGPU
Wall load
11 sub-streams (the everyday view): ~1% CPU, ~22% NVDEC, ~1 GB RAM
Notes
libmpv hands H.264 and H.265 straight to NVDEC. The wall defaults to sub-streams; main only loads when a tile is maximized.
Power benchmark · 2026-06-23

CPU motion decode beat NVDEC by 30%, at the wall meter.

I ran a controlled A/B on the recorder host: motion-detection decode on CPU versus NVDEC. 11 low-resolution sub-streams, fps-capped to 5, with adaptive throttling down to ~2.5 when the scene is quiet. Recording itself is stream-copy with no decode in either mode, so footage kept recording throughout both runs.

Metric          | NVDEC (cuda) | CPU          | Delta
Wall meter      | 105 W        | 73 W         | −32 W (−30%)
GPU power       | 36.6 W       | 8.8 W (idle) | −27.8 W
Recorder CPU    | ~26%         | ~33%         | +7% (about 0.08 of one core)
GPU temperature | 51 °C        | 35 °C        | −16 °C

Why CPU wins at this scale: NVDEC has a fixed activation cost. Waking the decode block costs roughly 28 W regardless of how trivial the workload is. At 11 fps-capped sub-streams the real decode work is about 0.08 cores; the CPU does it for free, while the GPU has to spin up just to do the same thing. The wall delta (32 W) is larger than the component delta (~28 W DC) because the PSU is ~90% efficient and the GPU fans wind down as it cools.
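
The wall-versus-component arithmetic can be checked in a few lines. The inputs are the figures quoted above; the 90% PSU efficiency is the approximate value stated there, so the result is an estimate rather than a measurement.

```python
gpu_nvdec_w = 36.6      # GPU power with NVDEC active
gpu_idle_w = 8.8        # GPU power with decode on CPU
psu_efficiency = 0.90   # approximate

dc_delta = gpu_nvdec_w - gpu_idle_w           # about 27.8 W at the component
wall_from_gpu = dc_delta / psu_efficiency     # about 30.9 W at the wall
wall_measured = 105 - 73                      # 32 W at the meter
remainder = wall_measured - wall_from_gpu     # about 1.1 W not explained by the GPU
wall_saving_pct = 100 * wall_measured / 105   # about 30.5 %

print(f"{dc_delta:.1f} W DC -> {wall_from_gpu:.1f} W at the wall, "
      f"{remainder:.1f} W left over, {wall_saving_pct:.1f}% saved")
```

PSU losses account for most of the gap between 28 W and 32 W; the remaining watt or so is the net of everything else, fans included.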

When NVDEC wins again: many more cameras, higher fps, higher analysis resolution, or when the GPU is already awake for another job (a streaming server, a local LLM). Decode mode is admin-selectable per server: auto / cpu / cuda / vaapi. VAAPI lets you use an Intel iGPU instead of dedicating a discrete card. Changes hot-reload without a full restart.

Less powerful clients

I'm not building this only for boxes like mine.

The recorder restreams every camera as both main and sub through go2rtc, and clients pick which one to play. The desktop wall defaults to sub-streams: low-resolution H.264 at the camera's sub-stream frame rate, decoded by whatever the client has, hardware preferred, software as a fallback. That's why 11 cameras cost about 1% CPU on a 20-thread machine. It's also why a much weaker client can still run the same wall.

For boxes that struggle with even that, the Android app already runs a low-bandwidth snapshot-poll mode for cellular: a still frame per tile, decoded as a JPEG, refreshed at whatever cadence the network can support. Not pretty for live triage, but it's the right escape hatch when bandwidth or CPU is the constraint. The same idea ports cleanly to any constrained surface (a low-end mini PC, a tablet, a Raspberry Pi class device).
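
"Whatever cadence the network can support" can be approximated by letting the time the last snapshot took set the wait for the next one. This is a sketch of that idea only: the function name, the clamp limits, and the 50% duty target are assumptions, not the Android app's actual logic.

```python
def next_poll_interval(last_fetch_s, min_s=0.5, max_s=10.0, duty=0.5):
    """Period between snapshot requests so fetching takes about `duty` of the time.

    A slow fetch (weak link) stretches the period; a fast one shrinks it.
    The result is clamped to [min_s, max_s].
    """
    interval = last_fetch_s / duty
    return max(min_s, min(max_s, interval))
```

On a good connection a 0.1 s fetch lands on the 0.5 s floor; on a poor one a 2 s fetch backs off to a 4 s period.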

Testers wanted

Help me find what works, and what doesn't.

The first round is targeting Linux-host owners running Windows or Linux on the desktop and Android on the phone. Those clients are the furthest along; iOS and a polished macOS desktop installer aren't ready yet.

I'm specifically looking for two things in a tester: you're comfortable with the possibility of losing footage (I've run 30 days without an incident on my own setup, but I can't promise that for yours), and you're happy to share feedback about what worked, what didn't, and what surprised you. That feedback is the whole point of the early round.

What you'd need

  • Linux x86-64 host for the Docker stack. A NAS, a mini PC, an old desktop, a Proxmox VM, all fine.
  • Docker + Compose v2, with your user able to run docker ps without sudo
  • Storage. Cameras eat terabytes; ~2 TB is a reasonable starting point. Crumb will use whatever you give it.
  • A few ONVIF / RTSP cameras. A mix of brands is even more useful than one consistent set.
  • LAN only to start. Remote access via Tailscale or WireGuard later if you want it. Never open ports to the public internet.
  • A Windows or Linux desktop to run the Smart Client, and/or an Android phone for the mobile app

NVIDIA GPU is optional (CPU motion detection is the default and benchmarks fine for typical residential loads). ARM hosts are on the roadmap but not yet tested. x86-64 + Docker is the supported floor today.

What I'd ask in return

  • Run it for a few weeks. Notice when something feels weird. Tell me.
  • Be patient with bugs. There will be some. There may be migrations between releases.
  • Send a quick note when something works really well too. That's useful signal.
  • Be honest if it's not for you. I'd rather hear that than have you ghost.

Type your email below and hit Apply as a tester. Your mail client will open with a draft to [email protected] — a few prompts about your hardware are already in the body, fill in whatever you feel like, then send. I read every one.

No NDA, no sign-up form theatre. Just an email to me, directly.

Every platform

One backend. Native clients everywhere.

🖥
Desktop
macOS · Windows · Linux · the source of truth

The investigation surface. Sit down, find the moment. Tauri shell, native libmpv playback (H.264 and H.265 both smooth), embedded admin console with token-fragment SSO. Customizable views, on-video PTZ panel, digital zoom, jog-shuttle scrubbing. Designed to live full-screen on a monitor 24/7.

  • libmpv-backed multi-pane wall, always-on friendly
  • Customizable layouts & Setup mode
  • Per-camera PTZ + focus/iris controls
  • Export List with batch archive
📱
Android
Native · Kotlin / Compose / Media3 · in the field

For when something happens and you're not home. Real native app, not a WebView. Pull up footage, scrub the timeline, export the clip on cellular from a parking lot. Low-bandwidth snapshot mode, biometric app-lock, server auto-discovery on the LAN.

  • Pinch-zoom playback, PTZ wheel
  • Stall watchdog & keep-screen-on
  • 10-year saved logins
  • Find-my-server LAN scan
📲
iOS
Planned · SwiftUI · AVPlayer

A first-class native iOS client is in active design. Same backend, same timeline, same biometric lock. Tracking the work in the public roadmap.

  • SwiftUI + AVPlayer
  • Background-friendly live tiles
  • Shared bookmarks across devices
  • Push notifications for events
🌐
Web Admin Console
Served at /admin · Next.js

The configuration plane. Add cameras, define groups and policies, tune motion detection live, manage users, monitor recorder health, all in the browser. Built into the API container.

  • First-run setup wizard
  • Live motion-tuner with sensitivity preview
  • Per-camera, per-policy, per-group config
  • Snapshot & bookmark management
The full feature set

Everything you'd expect. A few things you wouldn't.

Smooth live and smooth playback, including H.265

Most web viewers struggle with H.265, which is now the default on serious cameras. Crumb hands frames straight to libmpv on desktop and Media3 on Android, so 4K H.265 plays without judder, without server-side transcoding, in both live and recorded playback. This is the thing the project spent the most time getting right.

Operator-grade timeline

Hour ticks, recording segments as crumbs, motion events as dots, per-object detection icons (person / car / animal / package), pinch-to-zoom, and a snap-to-event playhead.

Adaptive motion detection

Illumination-invariant census-transform foreground with a percentile-over-decaying-histogram floor and diurnal EMA. It learns your scene. No more 3 AM false triggers from wind in the trees.
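
The illumination-invariant part comes from the census transform itself, which encodes only whether each neighbour is darker than the centre pixel. Below is a minimal pure-Python sketch of a 3x3 census transform. It covers that one building block and nothing else: the percentile floor and the diurnal EMA are not shown, and this is not the recorder's implementation.

```python
def census_3x3(img):
    """8-bit census signature for each interior pixel of a 2D grayscale list.

    Each bit records whether a neighbour is darker than the centre pixel,
    so only local brightness ordering matters, not absolute values.
    """
    h, w = len(img), len(img[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            bits = 0
            for dy, dx in offsets:
                bits = (bits << 1) | (img[y + dy][x + dx] < img[y][x])
            row.append(bits)
        out.append(row)
    return out

def hamming(a, b):
    """Number of differing bits between two census signatures."""
    return bin(a ^ b).count("1")
```

Brightening every pixel by the same amount leaves the signatures unchanged, while a real change in local structure flips bits; the Hamming distance between frames is what a foreground test would threshold.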

Pluggable detectors

Census (default), FrameDiff, MOG2, Optical Flow, Ensemble, or use Frigate as the motion source. Per-camera selection. Golden-replay benchmarks ship with the recorder.

Storage policies + groups

Named retention policies (live / archive tiers), camera groups with inheritance, per-policy free-space headroom, spill buffers, and a tiered cascade: accounting, archive, alert.

Clips, Exports, Bookmarks

Source-abstracted Clips (yours or Frigate's), an Export List with batch archive (AES or Stored zip), and protected-from-auto-delete bookmarks that survive eviction.

Customizable views

Dedicated Setup mode. Drag any camera into any pane, save the layout per device, switch with one click. Tab persistence so the wall comes back the way you left it.

On-video PTZ panel

Place ONVIF d-pad, zoom, home, focus, iris, and preset buttons anywhere on the live tile. Resizable, snap-aligning, renamable. Per-camera, persisted on the desktop client.

Auto-hotspot wall

One tile that auto-follows the most-recently-moved camera in a configured set, with a 4-second dwell. Like the spotter in a security room, without the security room.
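
The follow-with-dwell behaviour can be sketched as a tiny state machine. One assumption is baked in: "dwell" is read here as the minimum time a camera stays on the tile before another may replace it. The `Hotspot` class is hypothetical.

```python
class Hotspot:
    """Follow the most recently moved camera, holding each for a minimum dwell."""

    def __init__(self, cameras, dwell_s=4.0):
        self.cameras = set(cameras)
        self.dwell_s = dwell_s
        self.current = None
        self.switched_at = None

    def on_motion(self, camera, now):
        """Handle a motion event at time `now` (seconds); return the camera shown."""
        if camera not in self.cameras:
            return self.current  # not part of the configured set
        held = self.switched_at is None or now - self.switched_at >= self.dwell_s
        if camera != self.current and held:
            self.current = camera
            self.switched_at = now
        return self.current
```

In this sketch an event that arrives during the dwell is simply ignored; a fuller version would likely remember it and switch once the dwell expires.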

Clip motion-highlight

Clips auto-zoom to the motion bbox for the first few seconds, then ease back out. Pinch-zoom on top of that for the parts that matter.

Snapshot confirmation

Hit snapshot, get a toast with a clickable link to the location on disk. No "where did that go" archaeology.

GPU-optional motion compute

This is server-side, separate from playback. Motion detection runs on CPU by default, with an optional NVDEC overlay for hardware-accelerated decode on NVIDIA. CPU has been benchmarked head-to-head and wins on power for typical residential loads. Your call.

Biometric app-lock

Opt-in fingerprint / face / PIN gate on cold-start of the mobile app. The footage stays private even if the phone doesn't.

Features wanted

The next big things on my list.

A big part of why I started this project was to eventually carry most of the operator-grade features I leaned on for years. The kind of stuff that lives in the enterprise tier of commercial VMS products. Not promises, not committed roadmap dates. Just where the project is pointed.

Wanted

Storage tiering: transcode-down for older footage

My 4K mains don't need to stay 4K for three weeks. After a configurable window (a week, say) older recordings transcode down to 1080p or 720p, freeing the bytes for the next week of high-res. Same scene, smaller bucket, same operator surface.

Wanted

Privacy masks, per camera and per role

Draw black-out zones over neighbors' windows, license plates, or anywhere the camera shouldn't keep. Apply globally or scoped per role, so a Viewer sees the redacted footage and an Admin sees the raw. Critical for any household-shared install.

Wanted

Floor-plan / map view

Drop cameras on a floor plan or property map. Click a camera on the plan to bring up its live tile or jump its timeline. Useful when you have more cameras than you can hold in your head, and absolutely required at scale.

Wanted

Smart Search: motion in a region

Draw a box on the scene, ask "when did motion happen here in the last week," get the hits ranked. Compresses hours of scrubbing into seconds. Detection-engine optional; this is geometry over the recorder's own motion data.
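
Since this feature is still on the wish list, the following only illustrates what "geometry over motion data" could mean: filter stored motion boxes by time, keep those that intersect the drawn region, and rank by overlap. The `MotionHit` shape and ranking by overlap area are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MotionHit:
    ts: int      # Unix seconds
    box: tuple   # (x1, y1, x2, y2), normalised to 0..1

def overlap_area(a, b):
    """Area of intersection between two (x1, y1, x2, y2) boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def search_region(hits, region, since_ts):
    """Hits at or after since_ts that overlap region, largest overlap first."""
    scored = [(overlap_area(h.box, region), h) for h in hits if h.ts >= since_ts]
    scored = [(area, h) for area, h in scored if area > 0]
    return [h for area, h in sorted(scored, key=lambda p: p[0], reverse=True)]
```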

Wanted

Federated multi-server

When one box isn't enough: federate two or more Crumb hosts, see all their cameras in one wall, query timelines across all of them at once. The boring distributed-systems work that makes Crumb tenable for properties with too many cameras for one host.

Wanted

SSO, LDAP, and 2FA

Bring-your-own identity provider. SAML/OIDC for SSO, optional LDAP/AD for traditional directories, hardware-key 2FA on the admin role. Needed before any household with shared admin trust would deploy this comfortably.

Want one of these sooner than the others? Sign up as a tester (above) and tell me which. Demand shapes order.

Detection

Bring your own Frigate.

Crumb does not bundle Frigate and does not detect objects itself. Detection is Frigate's job. If you already run Frigate, point Crumb at your MQTT broker and map each camera to its Frigate name. That's the entire integration.

  • FRIGATE_MQTT_URL in .env or the admin UI
  • Per-camera source_camera_name in the admin editor
  • Frigate events flow in over MQTT; Crumb tolerates schema drift (we live-fixed the sub_label regression in production)
  • No broker? The bundled mosquitto is right there; point Frigate at it
  • Detection icons land on the timeline; clips can be sourced from Frigate per-camera

If FRIGATE_MQTT_URL is empty, the entire detection subsystem stays disabled. Zero magic, zero coupling.
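
A sketch of the two behaviours described above: detection stays off when `FRIGATE_MQTT_URL` is empty, and incoming events are parsed defensively. The payload fields used (`after`, `camera`, `label`, `sub_label`, `start_time`) follow Frigate's published event messages. Everything else is an assumption: the helper names, the `camera_map` shape, and in particular the choice to accept `sub_label` as either a string or a `[name, score]` pair, which illustrates tolerance in general and is not a description of the regression mentioned in the list.

```python
import json
import os

def detection_enabled(env=os.environ):
    """Detection stays off unless FRIGATE_MQTT_URL is set to a non-empty value."""
    return bool(env.get("FRIGATE_MQTT_URL", "").strip())

def parse_event(payload, camera_map):
    """Map one Frigate event payload to a timeline marker, or None to skip it.

    camera_map is {frigate_camera_name: crumb_camera_name} (hypothetical shape).
    """
    try:
        after = json.loads(payload).get("after") or {}
    except (ValueError, TypeError, AttributeError):
        return None  # not JSON, or not a JSON object
    if not isinstance(after, dict):
        return None
    camera = camera_map.get(after.get("camera"))
    if camera is None or "label" not in after:
        return None
    sub = after.get("sub_label")
    if isinstance(sub, (list, tuple)):  # tolerate a [name, score] shape
        sub = sub[0] if sub else None
    return {"camera": camera, "label": after["label"],
            "sub_label": sub, "ts": after.get("start_time")}
```

Unknown cameras and malformed messages are dropped rather than raised, so one odd event cannot stall the rest of the feed.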

Self-hosted by design

No cloud. No account. No telemetry. No surprise.

Crumb is a Docker stack you bring up on a Linux box, a NAS, a mini PC. Anywhere you can run a container. Footage is files on a disk you own. The configuration plane is a web UI on your LAN. Native clients talk to your server. Nothing leaves your network unless you make it.

Secrets are generated for you on first boot. The admin console walks you through creating your administrator account. Every key is documented. The Postgres schema is open. Your footage is yours.

Under the hood

An honest stack, no surprises.

Recorder
Rust
API
Rust · axum
Database
PostgreSQL
Restreaming
go2rtc
Detection (BYO)
Frigate · MQTT
Desktop
Tauri · libmpv
Android
Kotlin · Compose · Media3
Admin
Next.js
Deploy
docker compose up -d
Latest

What we shipped this week.

A running log of releases, fixes, and design notes. RSS · all updates

Built

Auto-hotspot tile that follows motion

A wall tile that can auto-follow the most-recently-moved camera in a configured set, with a 4-second dwell. Like the spotter in a security room, without the security room.

Desktop
live-wall · ui
Shipped

Customizable on-video PTZ panel

Place ONVIF d-pad, zoom, home, focus, iris, and preset buttons anywhere on a PTZ camera's live tile. Resizable, snap-aligning, renamable, per-camera.

Desktop
ptz · ui · onvif
Shipped

ONVIF focus + iris control (Imaging service)

New `POST /cameras/:id/imaging` endpoint drives ONVIF focus near/far, autofocus, and iris from the desktop PTZ tile. Verified on a Uniview LPR camera.

Backend · Desktop
ptz · onvif · imaging
Shipped

"Find my server" auto-discovery on login

New users can tap "Find my server" on the Android login screen and Crumb subnet-scans for `/health`. No IP-typing, no mDNS quirks, no router config.
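
The first half of a subnet scan is just enumerating candidate addresses. A minimal sketch, assuming the phone knows its own address and prefix length; the port (8080) and the `health_urls` helper are hypothetical, and only the `/health` path comes from the note above.

```python
import ipaddress

def health_urls(cidr, port=8080, scheme="http"):
    """Yield a /health URL for every usable host address in the subnet.

    strict=False lets the caller pass a host address with a prefix,
    e.g. "192.168.1.17/24", rather than the network address itself.
    """
    for host in ipaddress.ip_network(cidr, strict=False).hosts():
        yield f"{scheme}://{host}:{port}/health"
```

A client would then probe these concurrently with a short timeout and keep whichever answers; that part is left out here.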

Android
setup · onboarding

Be there when Crumb ships.

Send a quick email and I'll let you know when v1.0 drops. No drip campaign, no marketing pipeline, no anything else.

crumbvms.com · self-hosted video management