Architecture Overview

Open Genie is a monolithic, self-hosted, full-stack platform built around a single Node.js process that serves a React SPA, a REST API, two WebSocket endpoints, scheduled jobs, file watchers, AI inference orchestration, and a sandboxed plugin runtime — all from one repository, one deploy, and one binary.

This page is the canonical orientation for new contributors. By the end of it you should be able to:

  • Trace a request, a chat message, or a cron tick through every layer.
  • Know which file owns which concept, and why each piece exists.
  • Understand the lifecycle of the server, devices, plugins, and skills.
  • Identify the right entry point for any feature you want to add.

1. The 30-second mental model

Picture Open Genie as a smart-home household hub that runs on hardware you control:

  1. A single Node.js server (apps/web/server.ts) boots the world: HTTP, WebSocket, database, scheduler, file watchers, and plugins.
  2. Native client apps (iPhone, iPad, Apple TV) and a web SPA connect to the hub. Browsers/native devices ride HTTP + WebSocket; ESP32 hardware rides a separate /ws/iot channel.
  3. The hub talks to a local LLM (Ollama by default; optional Anthropic) and orchestrates Actions — the universal unit of work. Actions can be triggered by AI tool calls, cron jobs, webhooks, or REST.
  4. Persistent state lives in PostgreSQL (Drizzle ORM, 30 tables) and a filesystem data root (uploads, media, skills, plugins, secrets, SOUL/MEMORY markdown).
  5. Skills and Plugins layer extensibility on top of the core: skills register actions; plugins register actions, cron, widgets, settings pages, OAuth integrations, and event-bus subscribers.

The whole stack is local-first: AI inference, vector storage, and media processing run on your hardware. No cloud dependency is required.


2. Repository topology

open-genie/
├── apps/
│ ├── web/ # The hub: server + SPA + Electron shell
│ │ ├── server.ts # CLI entry — thin signal/exit wrapper
│ │ ├── server/ # Hono API (createApiApp + routes/)
│ │ │ ├── index.ts # Mounts all /api/* sub-routes
│ │ │ ├── auth/ # Web-session, OAuth, QR pairing
│ │ │ ├── middleware/ # Auth, kill-switch
│ │ │ └── routes/ # 28 REST sub-routers (chat, plugins, …)
│ │ ├── client/ # Vite SPA (React 19 + React Router v7)
│ │ │ ├── pages/ # 20 route components
│ │ │ ├── components/ # AppShell, NavRail, AiPanel, widgets
│ │ │ └── routes.tsx # Lazy-loaded route table
│ │ ├── lib/ # All non-HTTP server logic
│ │ │ ├── boot/ # start-server, preflight, migrations, PIN
│ │ │ ├── ws/ # WebSocket router, registry, handlers
│ │ │ ├── auth/ # JWT, sessions, scopes, revocation cache
│ │ │ ├── actions/ # Built-in actions + registry
│ │ │ ├── chat/ # Pipeline, router, prompt-builder, RAG
│ │ │ ├── providers/ # Ollama / Anthropic LLM adapters
│ │ │ ├── memory/ # Embeddings, retrieval, persons, speaker
│ │ │ ├── camera/ # Worker, vision, rules, capture
│ │ │ ├── media/ # Watcher, indexer, thumbnails
│ │ │ ├── plugins/ # Loader, registry, SDK, OAuth, secrets
│ │ │ ├── skills/ # Filesystem-loaded action bundles
│ │ │ ├── notifications/# Expo + FCM push, categories
│ │ │ ├── iot/ # ESP32 WS handler, Tuya, intent
│ │ │ ├── storage/ # Local + S3 adapters (camera sync)
│ │ │ ├── audit/ # Append-only audit log
│ │ │ ├── net/ # Public URL, rate limiting, kill-switch
│ │ │ ├── db/ # Drizzle client + schema (30 tables)
│ │ │ └── config/ # Runtime .env loader, opengenie.json
│ │ ├── drizzle/ # SQL migrations (0000…0009 + meta)
│ │ ├── electron/ # Optional desktop shell
│ │ └── public/, dist/ # Static assets, build output
│ ├── phone/ # Expo iPhone app
│ ├── tablet/ # Expo iPad app
│ └── tv/ # Apple TV (react-native-tvos)
├── packages/ # @genie/* shared libraries
│ ├── shared/ # Cross-app primitives
│ ├── connection/ # WS client + auth helpers
│ ├── chat/, media/, # Per-feature client libraries
│ │ notifications/, #
│ │ camera-sync/, ui/ #
│ └── tokens/ # Brand colors / typography
├── docs/ # Docusaurus site (this page lives here)
└── workplans/ # Internal RFCs / TODO bundles (not shipped)

The repo is an npm workspaces monorepo (apps/* and packages/*). The hub is a single apps/web package; the native clients pull from packages/* so the wire protocol, types, and brand stay consistent.


3. High-level diagram

┌─────────────────────────────────────────────────────────────────┐
│ Client Devices │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │ iPhone │ │ Tablet │ │ Apple TV │ │ Web Browser │ │
│ │ (Expo) │ │ (Expo) │ │ (RN-tvos)│ │ (Vite SPA) │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └──────┬───────┘ │
│ │ │ │ │ │
│ └─────────────┴──────┬───────┴──────────────┘ │
│ WebSocket /ws HTTP /api │
│ │
│ ┌──────────────────────────┐ │
│ │ ESP32 / IoT hardware │ ── WebSocket /ws/iot │
│ │ (Tuya, music box, …) │ │
│ └──────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘

┌────────────────────────────┴────────────────────────────────────┐
│ apps/web/server.ts (Entry) │
│ delegates to lib/boot/start-server.ts │
│ │
│ ┌─────────────────────┐ ┌────────────────────────────────┐ │
│ │ http.createServer │ │ ws.WebSocketServer (noServer) │ │
│ │ /api/* → Hono │ │ /ws → device handlers │ │
│ │ static → Vite SPA │ │ /ws/iot → ESP32 handler │ │
│ │ fallback → index.html │ Bearer or sec-websocket-proto │ │
│ │ (client-side route)│ │ 30s ping/pong heartbeat │ │
│ └─────────────────────┘ │ ConnectionRegistry (in-mem) │ │
│ │ MessageRouter (domain-based) │ │
│ └────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

┌────────────────────────────┴────────────────────────────────────┐
│ Core Services │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌─────────────────┐ │
│ │ Chat │ │ Actions │ │ Scheduler │ │ Camera Worker │ │
│ │ Pipeline │ │ Registry │ │ (cron) │ │ (capture/vision)│ │
│ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └────────┬────────┘ │
│ │ │ │ │ │
│ ┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐ ┌────────┴────────┐ │
│ │ Provider │ │ Skill │ │ Job Runs │ │ Media Watcher │ │
│ │ Router │ │ Loader │ │ + Audit │ │ + Thumbnails │ │
│ │(Ollama/ │ │ │ │ Trim │ │ (chokidar) │ │
│ │ Anthropic)│ │ │ │ │ │ │ │
│ └───────────┘ └───────────┘ └───────────┘ └─────────────────┘ │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌─────────────────┐ │
│ │ Memory │ │ Push │ │ Auth │ │ Audit Log │ │
│ │ Retrieval │ │ Notifs │ │ Sessions │ │ + Rate Limit │ │
│ │ (RAG + │ │ (Expo, │ │ (JWT + │ │ + Kill-switch │ │
│ │ vectors) │ │ FCM) │ │ refresh) │ │ │ │
│ └───────────┘ └───────────┘ └───────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

┌────────────────────────────┴────────────────────────────────────┐
│ Plugin Runtime │
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌───────────┐ │
│ │ Loader │ │ Registry │ │ SDK │ │ Widgets │ │
│ │(reconcile, │ │(slug→state │ │(genie.* │ │ (cache + │ │
│ │ load, │ │ + cleanup) │ │ surface, │ │ WS push │ │
│ │ unload, │ │ │ │ perm- │ │ + auto- │ │
│ │ watcher) │ │ │ │ gated) │ │ refresh) │ │
│ └────────────┘ └────────────┘ └────────────┘ └───────────┘ │
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ Event Bus │ │ OAuth │ │ Secrets │ │
│ │(EventEmit. │ │ (Google, │ │ (AES-256- │ │
│ │ memory.*, │ │ Slack, │ │ GCM, │ │
│ │ chat.*, │ │ GitHub, │ │ per- │ │
│ │ iot.*, …) │ │ MS) │ │ plugin) │ │
│ └────────────┘ └────────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────────────┘

┌────────────────────────────┴────────────────────────────────────┐
│ Data Layer │
│ │
│ ┌──────────────────────┐ ┌────────────────────────────────┐ │
│ │ PostgreSQL │ │ File System (data root) │ │
│ │ (Drizzle ORM) │ │ │ │
│ │ │ │ data/.env (runtime) │ │
│ │ 30+ tables: │ │ data/SOUL.md (personality)│ │
│ │ users, devices, │ │ data/MEMORY.md (knowledge) │ │
│ │ conversations, │ │ data/files/ (uploads) │ │
│ │ messages, │ │ data/media/ (library) │ │
│ │ memory_chunks │ │ data/skills/ (skills) │ │
│ │ (pgvector), │ │ data/plugins/ (plugins) │ │
│ │ plugins, cron_jobs,│ │ data/camera-sync/ (uploads) │ │
│ │ audit_log, │ │ │ │
│ │ device_sessions, …│ │ │ │
│ └──────────────────────┘ └────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

┌────────────────────────────┴────────────────────────────────────┐
│ External Services (all optional) │
│ │
│ ┌──────────┐ ┌──────────────┐ ┌────────────┐ ┌──────────────┐ │
│ │ Ollama │ │ Anthropic │ │ Expo Push │ │ Firebase FCM │ │
│ │ (local) │ │ (cloud LLM) │ │ (RN apps) │ │ (Android) │ │
│ └──────────┘ └──────────────┘ └────────────┘ └──────────────┘ │
│ │
│ ┌──────────┐ ┌──────────────┐ ┌────────────┐ ┌──────────────┐ │
│ │ Google │ │ Tuya Cloud │ │ S3-compat. │ │ Caddy/ │ │
│ │ OAuth + │ │ (smart plug) │ │ (camera- │ │ Tailscale/ │ │
│ │ Calendar │ │ │ │ sync,opt.)│ │ Tunnel │ │
│ └──────────┘ └──────────────┘ └────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘

4. The boot sequence (line by line)

The CLI entry server.ts is intentionally tiny — it only handles signals and exit codes. The real work lives in start-server.ts, which is also reused by the Electron main process so the embedded server runs the exact same boot path as the standalone CLI.

The order matters; each step depends on the prior steps' side effects:

| # | Step | What it does | Why the order matters |
| --- | --- | --- | --- |
| 1 | loadRuntimeConfig() | Merges data/.env into process.env. Process-supplied vars always win. | Modules loaded later (DB, Ollama, JWT) cache env at module-load time. |
| 2 | runMigrations() | Applies any pending Drizzle migrations using a short-lived postgres client. Failures are logged but never block boot. | The long-lived pool opened by lib/db/index.ts must connect to a fully migrated schema. |
| 3 | Dynamic imports | The boot file uses await import(...) for everything env-sensitive. | Static imports would defeat step 1. |
| 4 | registerAllHandlers() | Wires WebSocket message domains (ping, chat, notification, voice, plugin) to handlers in lib/ws/handlers/. | Connections opened later must find a registered handler. |
| 5 | registerBuiltInActions() | Registers 11 built-in actions (chat, memory, reminders, camera, …) into the in-memory ActionRegistry. | The chat pipeline and scheduler resolve actions by name at call time. |
| 6 | runPreflight() | Checks DB connectivity, required secrets, optional services (Ollama, push), and produces a PreflightReport. | Surfaces actionable warnings; drives the /setup wizard. |
| 7 | HTTP + WS server | Creates http.createServer, mounts the Hono app on /api/*, serves SPA static files for everything else, and attaches two WebSocketServer instances (/ws, /ws/iot) via noServer upgrade routing. | Single TCP port, multi-protocol fan-out. |
| 8 | ensureBootstrapPin() | If no admin user exists, prints a one-time PIN to stdout for first-time pairing. | Solves the chicken-and-egg "no auth yet, can't reach /setup" problem. |
| 9 | httpServer.listen(port, host) | Binds the socket. After this point the dashboard is reachable. | Anything below this line is non-blocking: the user can already interact with the app. |
| 10 | bootstrapIot() | Auto-seeds the Tuya plug row, schedules an event-pruning cron. | Idempotent; safe to re-run on every boot. |
| 11 | loadSkills() | Scans data/skills/, parses manifest.json per folder, dynamically imports actions.ts/.js, and registers their actions. | Fire-and-forget; failures don't block. |
| 12 | reconcilePlugins() | Scans data/plugins/, upserts DB rows for every folder, then loads the ones marked enabled. | Each plugin is sandboxed; one failing plugin doesn't bring others down. |
| 13 | startPluginWatcher() | Starts a chokidar watcher on data/plugins/ so installs are picked up without a restart. | Enables hot-add of plugins. |
| 14 | scheduler.start() | Loads enabled rows from cron_jobs and schedules them via node-cron; also schedules audit-log trim and pairing-request trim. | Each job ticks an action by name. |
| 15 | startMediaWatcher() + fullScan() | Begins file-system monitoring of data/media/ and data/backups/; the full scan rebuilds the index. | The scan is async; the watcher catches changes that happen during it. |
| 16 | cameraWorker.start() | Loads enabled rows from camera_configs and starts the per-camera capture loop, staggered so simultaneous captures don't pin Ollama. | Each camera runs independently; one failure doesn't disable the others. |

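The entry point returns a StartResult with a stop() closure. The CLI binds it to SIGTERM/SIGINT; Electron calls it on app quit. Teardown stops the background services first (scheduler → media → cameras), then disconnects WS clients, and finally closes the HTTP server, so the network layer that came up first goes down last.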
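The precedence rule in step 1 (process-supplied variables win over data/.env) can be sketched as a pure merge. This is a minimal illustration under stated assumptions, not the project's loadRuntimeConfig(): the names parseEnvFile and mergeRuntimeEnv, the simplified KEY=VALUE parser, and the sample keys and values are all hypothetical.

```typescript
// Hypothetical sketch of step 1's precedence rule; not the real loadRuntimeConfig().
type Env = Record<string, string | undefined>;

// Parse simple KEY=VALUE lines; blank lines and "#" comments are skipped.
// Assumption: no quoting or multi-line values, unlike a full dotenv parser.
function parseEnvFile(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" at all, or an empty key
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

// File values only fill gaps: anything the process already set wins.
function mergeRuntimeEnv(fileText: string, env: Env): Env {
  const merged: Env = { ...env };
  for (const [key, value] of Object.entries(parseEnvFile(fileText))) {
    if (merged[key] === undefined) merged[key] = value;
  }
  return merged;
}

// Sample keys are placeholders chosen for the example.
const mergedEnv = mergeRuntimeEnv(
  "# runtime config\nPORT=3000\nEXAMPLE_URL=http://localhost:11434\n",
  { PORT: "8080" },
);
console.log(mergedEnv["PORT"], mergedEnv["EXAMPLE_URL"]);
```

Because the merge never overwrites an existing key, a value exported in the shell or passed by Docker takes priority over whatever the Settings page wrote to the file.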


5. The HTTP layer

The HTTP server is a plain http.createServer instance with two routing rules:

  1. Anything starting with /api/ is forwarded to the Hono app via @hono/node-server's getRequestListener.
  2. Everything else falls through to a static-file handler that serves dist/client/ and falls back to index.html for client-side routing.

In development the static handler returns 404; clients hit Vite on :5173 instead, and Vite proxies /api calls back to the Node server on :3000.

Hono app structure

createApiApp() mounts 28 sub-routers on a /api base path; the listing below shows 26 of them:

app.use("*", killSwitchMiddleware); // 503 if local kill-switch tripped

app.route("/auth", authRoutes);
app.route("/_health", healthRoute);
app.route("/status", statusRoute);
app.route("/settings", settingsRoute);
app.route("/chat", chatRoute); // streaming SSE responses
app.route("/memory", memoryRoute);
app.route("/calendar", calendarRoute);
app.route("/reminders", remindersRoute);
app.route("/notifications", notificationsRoute);
app.route("/brief", briefRoute);
app.route("/jobs", jobsRoute);
app.route("/skills", skillsRoute);
app.route("/ollama", ollamaRoute);
app.route("/actions", actionsRoute);
app.route("/household", householdRoute);
app.route("/webhooks", webhooksRoute);
app.route("/paired-devices", pairedDevicesRoute);
app.route("/playlists", playlistsRoute);
app.route("/voice", voiceRoute);
app.route("/iot", iotRoute);
app.route("/cameras", camerasRoute);
app.route("/camera-sync", cameraSyncRoute);
app.route("/files", filesRoute);
app.route("/media", mediaRoute);
app.route("/plugins", pluginsRoute);
app.route("/audit", auditRoute);

Routes that need authentication call authenticateRequest() (bearer token) or requireWebSession() (cookie session) and may further escalate via requireScope("admin").


6. The WebSocket layer

Two WebSocketServer instances run without their own listeners (noServer: true); the HTTP server's upgrade event picks the right one based on the URL path:

/ws — paired devices (phone, tablet, TV, browser)

  • Auth: bearer token. Two delivery channels are accepted to support both native SDKs and browsers:
    • Authorization: Bearer <jwt> header (mobile SDK).
    • Sec-WebSocket-Protocol: opengenie.bearer.<jwt> (browsers can't set arbitrary headers on WS).
  • Verification: authenticateToken() validates the JWT signature and checks the session-revocation cache.
  • Heartbeat: 30s ping/pong; missed pong terminates the socket.
  • Registration: on connect, the device is added to connections (a Map<deviceId, ConnectedDevice> in connections.ts) used for fan-out.
  • Routing: every incoming JSON message goes to the MessageRouter (router.ts), which splits the type on . and dispatches to the registered domain handler.
  • Audit: every connect, disconnect, heartbeat-fail, and scope-deny is appended to the audit_log table.
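The two token delivery channels described above can be sketched as one extraction helper. This is an illustration only: the header names and the opengenie.bearer. prefix come from the text, while the function name extractBearer and the lower-cased header map are assumptions, and signature verification (authenticateToken()) is out of scope here.

```typescript
// Hypothetical helper: pull a bearer token out of a WebSocket upgrade request.
// Header names are expected lower-cased, as Node's http module delivers them.
type UpgradeHeaders = Record<string, string | undefined>;

const SUBPROTOCOL_PREFIX = "opengenie.bearer.";

function extractBearer(headers: UpgradeHeaders): string | null {
  // Channel 1 (native SDKs): "Authorization: Bearer <jwt>"
  const auth = headers["authorization"];
  if (auth) {
    const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    if (match) return match[1] ?? null;
  }
  // Channel 2 (browsers): the token rides in the comma-separated
  // Sec-WebSocket-Protocol list, because browsers cannot set custom headers.
  const protocols = headers["sec-websocket-protocol"];
  if (protocols) {
    for (const entry of protocols.split(",")) {
      const candidate = entry.trim();
      if (
        candidate.startsWith(SUBPROTOCOL_PREFIX) &&
        candidate.length > SUBPROTOCOL_PREFIX.length
      ) {
        return candidate.slice(SUBPROTOCOL_PREFIX.length);
      }
    }
  }
  return null; // caller should reject the upgrade
}
```

A JWT only contains base64url characters and dots, all of which are legal in a subprotocol token, which is why the second channel works without extra encoding.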

/ws/iot — ESP32 / hardware

  • Auth: a clientId query parameter; an unknown ID is auto-paired as a new iot_devices row when IOT_AUTO_PAIR != "false" (default on for LAN-only deploys).
  • Handler: handleIotConnection speaks a hardware-specific protocol (button events, BT scans, voice streams, playback state). It does NOT use the generic message router.
  • State: per-device state is held in a process-global Map keyed by iotDeviceId.

Message contract on /ws

interface GenieMessage {
  id: string; // client-generated; used to correlate replies
  type: string; // "domain.subtype"; the first segment routes to a handler
  payload: unknown;
  timestamp: number;
  deviceId: string;
}

Domains and their minimum scopes:

| Domain | Min scope | Handler | Used for |
| --- | --- | --- | --- |
| ping | guest | ping.ts | Connection liveness check |
| chat | member | chat.ts | Streaming chat over WS |
| notification | member | notification.ts | Receive push delivery acks |
| voice | member | voice.ts | Voice session lifecycle |
| plugin | member | plugin.ts | Subscribe to widget data |

Scope is enforced inside the router via hasScope(actual, required); insufficient scope → error reply + audit entry.
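Domain routing plus the scope check can be sketched in a few lines. This is a simplified stand-in for router.ts, not its actual code: the numeric ranking behind hasScope, the Reply shape, and the synchronous handlers are assumptions made to keep the example small, and audit logging is omitted.

```typescript
type Scope = "guest" | "member" | "admin";

// Assumed ranking for the admin > member > guest hierarchy.
const SCOPE_RANK: Record<Scope, number> = { guest: 0, member: 1, admin: 2 };

function hasScope(actual: Scope, required: Scope): boolean {
  return SCOPE_RANK[actual] >= SCOPE_RANK[required];
}

interface GenieMessage {
  id: string;
  type: string;
  payload: unknown;
  timestamp: number;
  deviceId: string;
}

type Reply = { ok: true; result: unknown } | { ok: false; error: string };
type Handler = (msg: GenieMessage) => unknown;

class MessageRouter {
  private domains = new Map<string, { minScope: Scope; handler: Handler }>();

  register(domain: string, minScope: Scope, handler: Handler): void {
    this.domains.set(domain, { minScope, handler });
  }

  dispatch(msg: GenieMessage, scope: Scope): Reply {
    // "chat.send" routes on its first segment, "chat".
    const domain = msg.type.split(".")[0] ?? "";
    const entry = this.domains.get(domain);
    if (!entry) return { ok: false, error: `unknown domain: ${domain}` };
    if (!hasScope(scope, entry.minScope)) {
      return { ok: false, error: `scope ${scope} cannot use ${domain}` };
    }
    return { ok: true, result: entry.handler(msg) };
  }
}

const router = new MessageRouter();
router.register("ping", "guest", () => "pong");
router.register("chat", "member", (msg) => msg.payload);

const baseMessage = { id: "1", payload: "hi", timestamp: 0, deviceId: "d1" };
console.log(router.dispatch({ ...baseMessage, type: "chat.send" }, "guest"));
```

Registering the minimum scope alongside the handler keeps the check in one place, so an individual handler never has to re-validate the caller.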


7. The Action system (the universal verb)

Actions are the spine of Open Genie. Anything the AI can do, the scheduler can do, a webhook can do, or a plugin can expose is an action. There is one registry, one schema, one execution path.

interface ActionDefinition {
name: string;
description: string;
parameters: { type: "object"; properties: ...; required?: string[] };
execute: (params) => Promise<ActionResult>;
}

Triggers:

  • AI tool calls: the chat pipeline serializes the registry as Ollama/Anthropic tools and feeds tool-result rounds back into the LLM until it stops calling tools.
  • Cron jobs: cron_jobs rows specify actionType + actionPayload; the scheduler resolves and invokes them.
  • Webhooks: incoming POSTs map a webhooks row to an action.
  • REST API: /api/actions exposes the registry directly for testing and for plugin admin UIs.

Built-in actions live in lib/actions/:

add-note, capture-camera, create-reminder, fetch-url, find-person,
generate-daily-brief, ollama-query, play-media, read-memory,
send-notification, update-memory

Plugins can add more via genie.actions.register() from the SDK; skills can add more via manifest.json + actions.ts.
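The "one registry, one execution path" idea can be sketched as a small in-memory map. Treat this as an illustration rather than the contents of lib/actions/: the ActionResult shape, the required-parameter check, and the echo-text action are assumptions introduced for the example.

```typescript
// Assumed result shape; the source names ActionResult but does not define it.
interface ActionResult {
  success: boolean;
  data?: unknown;
  error?: string;
}

interface ActionDefinition {
  name: string;
  description: string;
  parameters: { type: "object"; properties: Record<string, unknown>; required?: string[] };
  execute: (params: Record<string, unknown>) => Promise<ActionResult>;
}

class ActionRegistry {
  private actions = new Map<string, ActionDefinition>();

  register(def: ActionDefinition): void {
    if (this.actions.has(def.name)) throw new Error(`duplicate action: ${def.name}`);
    this.actions.set(def.name, def);
  }

  unregister(name: string): boolean {
    return this.actions.delete(name);
  }

  // Every trigger (AI tool call, cron tick, webhook, REST) funnels through here.
  async execute(name: string, params: Record<string, unknown>): Promise<ActionResult> {
    const def = this.actions.get(name);
    if (!def) return { success: false, error: `unknown action: ${name}` };
    const missing = (def.parameters.required ?? []).filter((key) => !(key in params));
    if (missing.length > 0) {
      return { success: false, error: `missing parameters: ${missing.join(", ")}` };
    }
    try {
      return await def.execute(params);
    } catch (err) {
      // A throwing action becomes a failed result instead of crashing the caller.
      return { success: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}

const registry = new ActionRegistry();
registry.register({
  name: "echo-text", // hypothetical action, not one of the built-ins
  description: "Returns the text it was given.",
  parameters: { type: "object", properties: { text: { type: "string" } }, required: ["text"] },
  execute: async (params) => ({ success: true, data: params["text"] }),
});
```

Because lookup happens by name at call time, unregistering an action (for example when a plugin unloads) takes effect on the next tick without restarting the scheduler.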


8. The Chat pipeline (the AI brain)

The chat pipeline (lib/chat/pipeline.ts) is the heart of conversational AI in Open Genie. It is provider-agnostic and follows this flow:

User message


1. resolveRole() — pick provider+model from opengenie.json roles
2. createConversation() — DB row + persisted history
3. buildSystemPrompt() — stable+volatile sections (SOUL, household, time, …)
4. retrieveMemory() — RAG: vector search over memory_chunks (if enabled)
5. inferSpeaker() — map deviceId → person (so "remind me" knows who)
6. provider.chat({…}) — Ollama or Anthropic, streaming, with tools
│ (tool-call loop, max 5 rounds)

7. addMessage(assistant) — persist tokens as they stream
8. extractMemories() — async: distill durable facts into memory_chunks
9. updateConversationTitle()— on first turn


ReadableStream<Uint8Array> ─→ HTTP SSE (REST) or WS frames (paired devices)

Provider routing

opengenie.json defines named providers (ollama, anthropic) and named roles (primary, fallback, vision). Each role points to a providerName/modelId pair. chat/router.ts resolves a role to a concrete LLMProvider instance and caches it, so clients are not rebuilt on every turn. The resolved role includes the provider, model id, and max-token budget.

Tool calls

The LLM emits tool calls as part of its response. The pipeline:

  1. Stops streaming after a tool-call boundary.
  2. Looks each call up in actionRegistry.
  3. Executes them in parallel (where safe).
  4. Feeds the results back as a tool_results message.
  5. Resumes streaming. Loop up to DEFAULT_MAX_TOOL_ROUNDS (5).
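The five steps above can be sketched as a bounded loop. This is a simplified model, not pipeline.ts: streaming is replaced by whole-turn responses, and the Provider, ToolCall, and ChatMessage types plus the get-temp tool are hypothetical. Only DEFAULT_MAX_TOOL_ROUNDS and its value of 5 come from the text.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type ChatMessage = { role: "user" | "assistant" | "tool_results"; content: string };
type Provider = (history: ChatMessage[]) => Promise<ModelTurn>;
type Tools = Record<string, (args: Record<string, unknown>) => Promise<string>>;

const DEFAULT_MAX_TOOL_ROUNDS = 5;

async function runToolLoop(
  provider: Provider,
  tools: Tools,
  userText: string,
  maxRounds: number = DEFAULT_MAX_TOOL_ROUNDS,
): Promise<{ text: string; rounds: number }> {
  const history: ChatMessage[] = [{ role: "user", content: userText }];
  let text = "";
  for (let round = 0; ; round++) {
    const turn = await provider(history);
    text += turn.text;
    history.push({ role: "assistant", content: turn.text });
    // Stop when the model is done, or when the round budget is spent.
    if (turn.toolCalls.length === 0 || round >= maxRounds) return { text, rounds: round };
    // Run this round's calls in parallel; unknown tools produce an error string.
    const results = await Promise.all(
      turn.toolCalls.map(async (call) => {
        const tool = tools[call.name];
        return tool ? await tool(call.args) : `error: unknown tool ${call.name}`;
      }),
    );
    history.push({ role: "tool_results", content: JSON.stringify(results) });
  }
}

// Fake provider: asks for one tool call, then answers once results are present.
const fakeProvider: Provider = async (history) =>
  history.some((m) => m.role === "tool_results")
    ? { text: "It is 21 C.", toolCalls: [] }
    : { text: "", toolCalls: [{ name: "get-temp", args: {} }] };

const demoTools: Tools = { "get-temp": async () => "21" };
```

The round cap matters because a model that keeps requesting tools would otherwise loop forever; with the cap, the loop returns whatever text has accumulated after the fifth tool round.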

RAG memory retrieval

When GENIE_RAG_ENABLED=1, before the LLM is called the pipeline embeds the user message and queries memory_chunks (a pgvector table) for top-K relevant snippets. Hits are formatted into the volatile system-prompt section. Memory chunks come from three sources: durable conversation memories, ingested files, and IoT/camera events.
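Conceptually, the retrieval step ranks stored chunks by similarity to the embedded user message and keeps the best K. The sketch below does that in memory with cosine similarity. It is only a stand-in: in Open Genie the ranking happens inside PostgreSQL via pgvector, the choice of cosine as the metric is an assumption, and the two-dimensional vectors are toy data rather than real embeddings.

```typescript
interface Chunk {
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every chunk against the query and keep the k best, highest first.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy data: 2-D vectors standing in for real embedding output.
const demoChunks: Chunk[] = [
  { text: "Alice is allergic to peanuts", embedding: [1, 0] },
  { text: "The garage code changed in May", embedding: [0, 1] },
  { text: "Alice prefers oat milk", embedding: [0.9, 0.1] },
];

console.log(topK([1, 0], demoChunks, 2).map((c) => c.text));
```

A linear scan like this is fine for a sketch but scales with table size; that is the work a vector index in the database is meant to avoid.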


9. The Plugin runtime (extensibility, sandboxed)

Plugins are first-class extensions that can register actions, cron jobs, settings pages, dashboard widgets, OAuth integrations, and event listeners — without touching core code.

Anatomy of a plugin

data/plugins/my-plugin/
├── manifest.json # slug, version, permissions, widget spec, oauth, …
├── server.js # init({genie}) + dispose() — runs in main process
├── widget.js # optional client-side widget renderer
└── icon.svg # optional

Lifecycle (per plugin, per server boot)

reconcilePlugins() ← scans data/plugins/ at boot

├─ for each folder:
│ parse manifest.json
│ UPSERT plugins row (slug, version, manifest)

└─ for each enabled row:
loadPlugin(slug)

├─ resolve entryPath (rejected if it escapes the plugin dir)
├─ buildContext() → genie.{log,fetch,settings,secrets,oauth,
│ cron,actions,notifications,memory,
│ calendar,todos,events,tablet,widget}
├─ dynamic import → call exported init(genie)
├─ register collected cronJobIds, actionNames, eventUnsubs
└─ if widget.refreshIntervalSec → start auto-refresh interval

unloadPlugin(slug) ← on disable / uninstall

├─ call exported dispose() if any
├─ clear refreshInterval
├─ unregister all cron jobs (DB + scheduler)
├─ unregister all action names from actionRegistry
├─ run all eventUnsubs()
├─ removeWidgetProvider(slug)
└─ pluginRegistry.unregister(slug)

Permission model

The manifest declares permissions: PluginPermission[]. The SDK builders gate every cross-cutting capability behind a permission check at call time:

network, notifications,
memory.read, memory.write,
calendar.read, calendar.write,
todos.read, todos.write,
events.subscribe, tablet.sound, actions.register

A plugin that calls genie.memory.write({...}) without memory.write granted raises an error. Admins approve permissions on install via the Plugins page.
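The call-time permission check can be sketched as a guard that every SDK builder invokes before doing any work. This is an illustration, not the SDK source: the permission names are taken from the list above, but requirePermission, buildMemoryApi, and the in-memory array standing in for the memory store are assumptions.

```typescript
type PluginPermission =
  | "network" | "notifications"
  | "memory.read" | "memory.write"
  | "calendar.read" | "calendar.write"
  | "todos.read" | "todos.write"
  | "events.subscribe" | "tablet.sound" | "actions.register";

// Throws when the plugin's granted set does not include the needed permission.
function requirePermission(
  granted: ReadonlySet<PluginPermission>,
  needed: PluginPermission,
  slug: string,
): void {
  if (!granted.has(needed)) {
    throw new Error(`plugin "${slug}" lacks permission "${needed}"`);
  }
}

// Hypothetical builder for the genie.memory surface. Each method checks its
// own permission when called, not when the context is built.
function buildMemoryApi(slug: string, granted: ReadonlySet<PluginPermission>, store: string[]) {
  return {
    read(): string[] {
      requirePermission(granted, "memory.read", slug);
      return [...store];
    },
    write(entry: string): void {
      requirePermission(granted, "memory.write", slug);
      store.push(entry);
    },
  };
}

const demoStore: string[] = [];
const readOnlyMemory = buildMemoryApi(
  "my-plugin",
  new Set<PluginPermission>(["memory.read"]),
  demoStore,
);
```

Checking at call time rather than at load time means a plugin with a partial grant still loads and can use the capabilities it was given; only the ungranted call fails.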

Secrets and OAuth

  • Secrets (lib/plugins/secrets.ts) are encrypted with AES-256-GCM using the key in the GENIE_SECRETS_KEY env var (32 bytes, hex-encoded, so 64 hex characters). They are persisted in plugin_secrets and scoped per plugin.
  • OAuth (lib/plugins/oauth/) supports Google, Slack, GitHub, Microsoft providers. Tokens stored encrypted in plugin_oauth_tokens. Auto-refresh on use.
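For readers unfamiliar with AES-256-GCM, the sketch below shows a round trip using Node's built-in crypto module. It is not the contents of lib/plugins/secrets.ts: the function names, the 12-byte IV, and the base64(iv | tag | ciphertext) storage layout are assumptions, and the all-zero key is a demo value that must never be used for real data.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// AES-256 needs a 32-byte key, which is 64 characters when hex-encoded.
function parseKey(hex: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) {
    throw new Error("key must be 64 hex characters (32 bytes)");
  }
  return Buffer.from(hex, "hex");
}

// Assumed layout: base64( iv[12] | authTag[16] | ciphertext ).
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh IV per message; never reuse with one key
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

function decryptSecret(encoded: string, key: Buffer): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not match, i.e. the data was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const demoKey = parseKey("00".repeat(32)); // demo key only
const sealed = encryptSecret("api-token-123", demoKey);
console.log(decryptSecret(sealed, demoKey));
```

GCM is an authenticated mode, so a modified or truncated row fails loudly on decrypt instead of yielding garbage.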

Widgets

A plugin manifest may declare a widget.spec (a small declarative DSL: stack/row/grid + text/list/icon/badge blocks). The plugin's init registers a getData() provider; the host caches the latest payload in lib/plugins/widgets.ts and broadcasts updates via connections.broadcast("plugin.widget-data", …) so paired devices update in real time. Refresh can be polled (refreshIntervalSec) or imperative (genie.widget.invalidate()).

Event bus

lib/plugins/events.ts is a typed EventEmitter instance with channels like memory.created, chat.message.user, iot.event, camera.alert. Core code emits, plugins (with events.subscribe permission) consume.


10. The Skill system (lighter than plugins)

Skills are a filesystem-loaded, action-only extension format. They predate plugins and are kept for simple cases that don't need widgets, settings UI, OAuth, or events.

data/skills/my-skill/
├── manifest.json # name, version, actions[], cron?[], config?{}
└── actions.ts # exports: ActionDefinition[]

loadSkills() scans the directory at boot, parses each manifest, dynamically imports actions.ts (using the Function constructor trick to bypass bundler analysis), and registers each exported action with actionRegistry.
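The Function constructor trick mentioned above looks roughly like this. It is a sketch under assumptions, not the loadSkills() source: the helper names, the LoadedAction shape, and the filtering rule are hypothetical. The core idea is that import() written inside a string is invisible to bundlers and to TypeScript's CommonJS down-leveling, so it survives as a native dynamic import at runtime.

```typescript
// The body is a string, so static analysis never sees the import() call.
const dynamicImport = new Function("specifier", "return import(specifier)") as (
  specifier: string,
) => Promise<Record<string, unknown>>;

interface LoadedAction {
  name: string;
}

function looksLikeAction(value: unknown): value is LoadedAction {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { name?: unknown }).name === "string"
  );
}

// Hypothetical loader: import a module and keep the entries of the named
// array export that look like actions. Anything else yields an empty list.
async function loadActionsFrom(
  specifier: string,
  exportName: string = "default",
): Promise<LoadedAction[]> {
  const mod = await dynamicImport(specifier);
  const exported = mod[exportName];
  return Array.isArray(exported) ? exported.filter(looksLikeAction) : [];
}
```

One caveat is that code created this way behaves like eval: it can fail in environments that disallow dynamic code evaluation, so it suits a trusted server process rather than a locked-down sandbox.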

Rule of thumb: if you need only new actions and maybe a cron, write a skill. If you need a settings page, widget, OAuth, or event subscriptions, write a plugin.


11. The Scheduler

lib/scheduler.ts wraps node-cron:

  • At boot, loads every cron_jobs row with enabled=true and schedules it.
  • Each tick resolves actionType against the action registry and calls it with actionPayload.
  • Successful and failed runs are written to job_runs for observability.
  • Two system jobs run unconditionally:
    • __audit-log-trim — daily at 03:15, deletes rows older than GENIE_AUDIT_RETENTION_DAYS (default 90).
    • __pairing-requests-trim — every 5 minutes, deletes expired pairing requests.

Plugins register cron jobs via genie.cron.register({schedule, handler}); the loader tracks IDs so unload is clean.


12. The Camera worker

CameraWorker drives the per-camera capture-and-analyze loop:

For each camera_configs row (enabled):
schedule a setInterval at captureIntervalSec
└─ captureSnapshot(streamUrl, protocol) → JPEG blob to /data/files/cameras/
└─ analyzeImage(blob, ollamaModel, prompt) → vision-LLM caption
└─ evaluateRules(caption, alertRules) → triggered alerts
└─ for each alert: notifyAll/notifyDevices via push + (optionally) chat
└─ insert camera_events row

Cameras are staggered at boot so multiple cameras don't hit Ollama at the same instant. Per-camera lastAlerts deduplicates burst alerts. A processing set prevents overlap when capture takes longer than the interval.
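The three safeguards in that paragraph (staggered starts, an overlap guard, and alert deduplication) can be sketched as follows. This is not the CameraWorker source: the even-spacing formula, the dedupe window, and the class and method names are assumptions, and the clock is passed in explicitly so the logic can be followed without timers.

```typescript
// Spread N cameras evenly across one interval so their first captures
// do not all land on the same instant.
function staggerOffsetsMs(cameraCount: number, intervalMs: number): number[] {
  return Array.from({ length: cameraCount }, (_, i) =>
    Math.floor((i * intervalMs) / cameraCount),
  );
}

class CaptureGuard {
  private processing = new Set<string>(); // cameras with a capture in flight
  private lastAlerts = new Map<string, number>(); // "camera:rule" -> last alert time
  private dedupeWindowMs: number;

  constructor(dedupeWindowMs: number) {
    this.dedupeWindowMs = dedupeWindowMs;
  }

  // Returns false when the previous capture for this camera is still running.
  tryBegin(cameraId: string): boolean {
    if (this.processing.has(cameraId)) return false;
    this.processing.add(cameraId);
    return true;
  }

  end(cameraId: string): void {
    this.processing.delete(cameraId);
  }

  // Suppresses repeats of the same rule on the same camera within the window.
  shouldAlert(cameraId: string, rule: string, nowMs: number): boolean {
    const key = `${cameraId}:${rule}`;
    const last = this.lastAlerts.get(key);
    if (last !== undefined && nowMs - last < this.dedupeWindowMs) return false;
    this.lastAlerts.set(key, nowMs);
    return true;
  }
}

const guard = new CaptureGuard(60 * 1000); // assumed one-minute dedupe window
console.log(staggerOffsetsMs(3, 30000));
```

In a real loop, end() belongs in a finally block so that a failed capture does not leave the camera permanently marked as busy.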


13. The Media pipeline

startMediaWatcher() uses chokidar to watch data/media/ and data/backups/ for adds, changes, and deletes. Each event is debounced 2s (so atomic writes settle), then indexFile() runs:

  1. Read EXIF / audio metadata.
  2. Detect MIME and media type.
  3. Generate a thumbnail (sharp / fluent-ffmpeg).
  4. Upsert into media_files.

fullScan() walks the entire tree at boot; subsequent watcher events keep the index live. Rows in media_files are pruned when their files disappear.


14. Authentication (devices and the dashboard)

Open Genie has two authentication surfaces that share an underlying session table.

Device sessions (paired clients)

  • JWT access token signed with GENIE_JWT_SECRET (HS256 by default). Carries deviceId, userId, deviceType, scope, sessionId.
  • Refresh token stored hashed in device_sessions. Rotates on every refresh; a 30-second grace cache lets a retried refresh return the same pair instead of generating a third.
  • Revocation cache (lib/auth/revocation-cache.ts) is a small in-memory + DB-backed lookup so a POST /api/auth/sessions/:id/revoke is honored on the next request without scanning the table.
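Refresh-token rotation with a grace cache can be sketched like this. It is an in-memory illustration, not the code behind device_sessions: the class name, the random hex strings standing in for a real JWT, and the Maps standing in for database rows are all assumptions. The 30-second grace period and the hashed storage come from the text.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface TokenPair {
  accessToken: string; // a real deployment would issue a signed JWT here
  refreshToken: string;
}

const GRACE_MS = 30 * 1000;
const sha256 = (value: string): string => createHash("sha256").update(value).digest("hex");

class RefreshRotator {
  // sessionId -> hash of the one refresh token that is currently valid
  private currentHash = new Map<string, string>();
  // hash of a just-rotated token -> the pair it was exchanged for
  // (entries are never pruned in this sketch)
  private grace = new Map<string, { pair: TokenPair; expiresAt: number }>();

  issue(sessionId: string): TokenPair {
    const pair: TokenPair = {
      accessToken: randomBytes(16).toString("hex"),
      refreshToken: randomBytes(32).toString("hex"),
    };
    this.currentHash.set(sessionId, sha256(pair.refreshToken)); // only the hash is stored
    return pair;
  }

  refresh(sessionId: string, presented: string, nowMs: number): TokenPair | null {
    const hash = sha256(presented);
    // A retry of a refresh that already succeeded gets the same pair back.
    const recent = this.grace.get(hash);
    if (recent && nowMs < recent.expiresAt) return recent.pair;
    // Otherwise the token must be the session's current one.
    if (this.currentHash.get(sessionId) !== hash) return null;
    const pair = this.issue(sessionId);
    this.grace.set(hash, { pair, expiresAt: nowMs + GRACE_MS });
    return pair;
  }
}
```

The grace cache exists for flaky networks: if the response to a refresh is lost and the client retries with the old token, it receives the pair that was already minted instead of being locked out or forcing a third rotation.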

Web sessions (browser dashboard)

  • Cookie-based, signed with AUTH_SECRET. Set by Google OAuth login (/api/auth/login → /api/auth/callback) and gated by an allow-list (isAllowedUser). Local-only deploys can skip OAuth.
  • webSession middleware populates the request context; routes that need admin call requireWebSession(c).

QR pairing (the new-device handshake)

Dashboard admins create a pairing_request (random code + secretHash) via POST /api/auth/pair/start. The dashboard renders it as a QR code. The mobile/TV app scans, posts the code+secret, and receives a fresh device_sessions row + token pair. Pairing requests expire after 5 minutes; a cron sweeps stale ones.

Scopes

Three-level hierarchy: admin > member > guest. Hierarchy lives in lib/auth/scopes.ts. Both REST routes (via requireScope) and WS handlers (via the router config) enforce it.

Audit and rate limiting

Every authentication event (success, failure, scope-deny, revocation) writes to audit_log via audit(). Hot endpoints (login, callback, pair) go through enforceRateLimit(c, authLimiter).


15. The Database

Connection

lib/db/index.ts opens one long-lived postgres client and wraps it with Drizzle. Migrations run via a separate short-lived client during boot step #2 so schema changes never collide with the pool.

Tables (30 total)

Grouped by purpose:

Identity & access: users · devices · pairing_codes · pairing_requests · device_sessions

Conversations & memory: conversations · messages · memory_entries · memory_chunks (pgvector) · households · persons

Automation: cron_jobs · job_runs · webhooks · reminders

Media & cameras: media_files · playlists · playlist_items · camera_configs · camera_events · camera_sync_uploads

Notifications & briefs: notifications · daily_briefs

Voice & IoT: voice_sessions · iot_devices · iot_events

Plugin runtime: plugins · plugin_oauth_tokens · plugin_secrets

Operational: audit_log

Migrations

Drizzle stores migrations in drizzle/ with sequential filenames (0000_…0009_…). db:generate produces a new migration from schema diff; db:migrate applies them. CI never auto-pushes; production migrations run inside runMigrations() on boot.


16. Memory (the persistent personality)

Open Genie has three layers of memory:

| Layer | Storage | Speed | Updated by |
| --- | --- | --- | --- |
| Working set (system prompt) | In-memory, rebuilt every turn | Fast | buildPrompt() reads data/SOUL.md, data/MEMORY.md, households/persons |
| Structured memory | memory_entries (key/value/category) | Fast | update-memory action; manual UI |
| Long-tail RAG memory | memory_chunks + pgvector | Vector search | extractMemories() distills durable facts after each conversation; ingested files are chunked + embedded |

SOUL.md is the personality: voice, persona, and household-specific tone. MEMORY.md is declarative knowledge: facts the user wants surfaced unconditionally. Both files are watched and hot-reloaded into the system prompt.

The persons and households tables let the assistant disambiguate "remind me" vs "remind Alice" by mapping deviceId → primaryPersonId and resolving aliases.


17. Notifications

lib/notifications/ is the unified push surface:

  • Expo Push for the iPhone/iPad/TV apps (delivers FCM under the hood on Android, APNs on iOS).
  • Direct FCM as an optional escape hatch (env-gated).
  • WebSocket fan-out — for online devices, notifyDevices first tries connections.send(deviceId, …); if the socket is closed, it falls back to push.
  • Categories map to actionable buttons on the device (reply, dismiss, snooze).

notifications rows persist what was sent (for the dashboard's notification center) and expose delivery state (pending, sent, delivered, failed).
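The socket-first, push-second delivery order can be sketched with two injected transports. This is an illustration rather than the code in lib/notifications/: the Transports interface, its boolean return contract, and the Channel result are assumptions, and persistence to the notifications table is left out.

```typescript
type Channel = "ws" | "push" | "none";

interface Transports {
  // Assumed contract: false when the device has no open socket.
  sendWs(deviceId: string, payload: unknown): boolean;
  // Assumed contract: false when no push token is registered or the send fails.
  sendPush(deviceId: string, payload: unknown): boolean;
}

// Try the live socket first; fall back to push only when that fails.
function notifyDevice(deviceId: string, payload: unknown, transports: Transports): Channel {
  if (transports.sendWs(deviceId, payload)) return "ws";
  if (transports.sendPush(deviceId, payload)) return "push";
  return "none";
}

function notifyDevices(
  deviceIds: string[],
  payload: unknown,
  transports: Transports,
): Map<string, Channel> {
  const outcome = new Map<string, Channel>();
  for (const id of deviceIds) outcome.set(id, notifyDevice(id, payload, transports));
  return outcome;
}

// Fake transports: the TV is online, the phone only has a push token.
const onlineDevices = new Set(["tv"]);
const pushTokens = new Set(["phone"]);
const fakeTransports: Transports = {
  sendWs: (deviceId) => onlineDevices.has(deviceId),
  sendPush: (deviceId) => pushTokens.has(deviceId),
};

console.log(notifyDevices(["tv", "phone", "tablet"], { title: "Door open" }, fakeTransports));
```

Returning the channel that was used gives the caller something to record, which is one plausible way to drive the delivery states listed above.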


18. Frontend (the dashboard SPA)

apps/web/client/ is a React 19 SPA built with Vite and React Router v7 (file-style lazy routes in routes.tsx).

Top-level pages: Home, Chat, Calendar, Todos, Files, Media, Cameras, CameraDetail, PairedDevices, Devices, DeviceDetail, Events, Automation, Memory, Plugins, Settings, SettingsBySlug, Setup, Audit.

Shared components live in components/: AppShell (chrome), NavRail (sidebar), AiPanel (slide-in chat), GlobalSearch, plus design-system primitives (Button, Card, Toggle, Toast, …) from @genie/tokens.

Plugin widgets are rendered by components/plugin-widgets/. The host subscribes via WS (plugin.subscribe) and re-renders on every plugin.widget-data push.

Auth flow in the browser: web-session cookie established by Google OAuth (or skipped for LAN-only). For pages that talk to /ws (Chat, Voice), the client pulls a short-lived bearer from /api/auth/refresh and includes it as Sec-WebSocket-Protocol: opengenie.bearer.<token>.


19. Deployment shapes

The same apps/web package supports four deploy modes:

  1. Development: npm run web:dev starts Vite on :5173 and the Node server on :3000 with tsx watch.
  2. Production: npm run web:build && npm run web:start makes a production build, then Node serves SPA + API on one port.
  3. Docker: apps/web/Dockerfile produces a single image; companion compose files exist for Caddy (TLS reverse proxy), Tailscale (mesh), and Tunnel (Cloudflare Tunnel).
  4. Electron: apps/web/electron/ embeds the same startOpenGenie() entry inside a desktop shell. Settings, data root, and ports are wired up to the user-data dir so a single click "just works".

The runtime config file data/.env is the single source of truth for environment vars across all four modes; the in-app Settings page writes to it via /api/settings.


20. Observability and ops

  • Audit log: audit() is called wherever something security-relevant happens (connect, disconnect, scope-deny, refresh, revoke, pair, plugin install/enable/disable, settings change). The log is trimmed daily.
  • Health & status: /api/_health is a flat liveness probe; /api/status returns the full preflight report and live counts (devices online, jobs scheduled, plugins loaded).
  • Kill-switch: a local-only emergency mode (/data/.killswitch) makes every HTTP request and WS upgrade return 503. Useful when something is misbehaving and you need to stop accepting traffic without kill -9.
  • Rate limiting: lib/net/rate-limit.ts is a sliding-window counter, applied to login, OAuth callback, and pairing endpoints via presets.
  • Public-URL detection: detectPublicUrl() figures out the canonical externally reachable URL by inspecting GENIE_PUBLIC_URL, headers from upstream proxies, and the local LAN interfaces. Surfaced in the boot banner and the /setup wizard.
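A sliding-window limiter like the one named above can be sketched with a per-key timestamp log. This is the simplest sliding-window variant, chosen for clarity; the real lib/net/rate-limit.ts may use a different bookkeeping scheme. The class name, the explicit clock parameter, and the limits in the example are assumptions.

```typescript
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>(); // key -> timestamps of accepted requests
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the hit if the key is under its limit.
  allow(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(nowMs);
    this.hits.set(key, recent);
    return allowed;
  }
}

// Example preset: 3 attempts per second per client key (values are illustrative).
const loginLimiter = new SlidingWindowLimiter(3, 1000);
for (const t of [0, 100, 200, 300]) {
  console.log(t, loginLimiter.allow("203.0.113.7", t));
}
```

Unlike a fixed window that resets on the clock boundary, the window here moves with each request, so a burst straddling two boundaries cannot double the effective limit.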

21. Design principles (and why)

  1. Local-first. Inference, vector storage, and media never leave the box unless you explicitly opt in to a cloud provider. The default install has zero external dependencies beyond Postgres.
  2. One process, no microservices. The whole hub is one Node process. This keeps the deployment story trivial (Docker run, Electron pack, or node server.ts) and makes cross-feature work straightforward — no RPC, no service discovery.
  3. Registry pattern, everywhere. Actions, WS handlers, plugins, skills, cron jobs, widget data providers — all are registered into in-memory maps at boot and looked up by name at runtime. This makes hot enable/disable cheap and testing easy.
  4. Database is the source of truth. Cameras, jobs, webhooks, devices, plugin installs all live in Postgres. Config files (opengenie.json) only seed empty tables and provide model routing — they never override DB state.
  5. Graceful degradation. Boot never fails because of a downstream outage. Preflight reports failures, the user fixes them in /setup, and the affected subsystem reconciles on next start. Cameras, plugins, and skills each load independently.
  6. Streaming. Chat tokens stream to the client over SSE or WS as soon as the LLM emits them. File operations use Node streams. The UI shows progress instead of spinners.
  7. Permission-gated extension. Plugins declare what they touch; the SDK refuses calls without a granted permission. Admins approve permissions on install.
  8. WebSocket-native. Devices stay connected and get push instantly. Polling is reserved for cases where there is genuinely no event source (e.g., camera capture intervals).
  9. Audit everything sensitive. Every authentication, scope check, plugin lifecycle change, and settings write is logged. Cheap in normal operation; invaluable when something goes wrong.
  10. Reuse, don't reinvent. Plugins schedule cron via the same scheduler core uses; widgets ride the same WS that chat does; OAuth tokens use the same secrets crypto. New surface = new SDK builder, not new infrastructure.

22. Where to go next