WebSocket Protocol

Open Genie exposes two WebSocket endpoints on the same port:

Path | Clients | Auth
/ws | Paired phones, tablets, TVs | JWT token in query string
/ws/iot | ESP32 firmware and IoT hardware | clientId + LAN-only binding

Paired-Device WebSocket (/ws)

All real-time communication between paired devices and the server uses a JSON-based WebSocket protocol over the /ws endpoint.

Connection

ws://localhost:3000/ws?token=<jwt-token>

The token is a JWT containing { deviceId, userId, deviceType }. See Authentication for how to obtain one.
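
As a minimal sketch of the connection step, the helper below assembles the /ws URL from a server origin and a token. The function name buildPairedWsUrl and the placeholder token are hypothetical, used only for illustration; the only details taken from this page are the /ws path and the token query parameter.

```typescript
// Hypothetical helper (not part of Open Genie): builds the paired-device URL.
function buildPairedWsUrl(serverOrigin: string, token: string): string {
  // Strip trailing slashes so the result never contains "//ws".
  const base = serverOrigin.replace(/\/+$/, "");
  // Encode the token so it is safe to place in a query string.
  return `${base}/ws?token=${encodeURIComponent(token)}`;
}

// Placeholder token; a real one comes from the Authentication flow.
const exampleUrl = buildPairedWsUrl("ws://localhost:3000/", "header.payload.signature");
console.log(exampleUrl);
```

In a browser or any runtime with a WebSocket client, the resulting string can be passed straight to the WebSocket constructor.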

Message Format

Every message (client → server and server → client) follows this structure:

interface WsMessage {
  id: string;          // Unique message ID (for request/response correlation)
  type: string;        // Domain-qualified type, e.g. "chat.message"
  payload: unknown;    // Domain-specific data
  deviceId?: string;   // Sender's device ID (set by server on incoming)
  timestamp?: number;  // Unix timestamp in milliseconds
}
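
A client-side sketch of building an outgoing message in this shape might look like the following. The makeMessage helper and its counter-based IDs are assumptions for illustration; any scheme that yields unique IDs would serve for request/response correlation.

```typescript
interface WsMessage {
  id: string;
  type: string;
  payload: unknown;
  deviceId?: string;
  timestamp?: number;
}

// Hypothetical ID source: a simple counter, unique per client session.
let nextMessageId = 0;

// Hypothetical helper: fills in id and timestamp for an outgoing message.
function makeMessage(type: string, payload: unknown): WsMessage {
  nextMessageId += 1;
  return {
    id: `msg-${nextMessageId}`,
    type,
    payload,
    timestamp: Date.now(), // Unix time in milliseconds
  };
}

const pingMessage = makeMessage("ping.request", {});
const pingWire = JSON.stringify(pingMessage); // the text frame to send
console.log(pingWire);
```

deviceId is left unset here because, per the interface comment, the server fills it in on incoming messages.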

Type Convention

Message types use dot notation with the first segment as the domain:

chat.message → routed to "chat" handler
chat.delta → routed to "chat" handler
ping.request → routed to "ping" handler
notification.ack → routed to "notification" handler
media.play → routed to "media" handler (if registered)

Router

The router (lib/ws/router.ts) dispatches messages by extracting the domain from type.split(".")[0]:

Incoming JSON → parse → extract domain → find handler → execute
                                              ↓ (not found)
                                         send error response

If no handler is registered for a domain, the router sends back an error:

{
  "type": "error",
  "payload": {
    "message": "No handler for domain: unknown",
    "originalType": "unknown.something"
  }
}
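
The dispatch rule described above can be sketched as a small in-memory router. This is not the code in lib/ws/router.ts; MiniRouter, the Send callback, and the synchronous handler signature are assumptions chosen to keep the example self-contained. Malformed JSON is deliberately not handled here.

```typescript
interface WsMessage {
  id: string;
  type: string;
  payload: unknown;
}

type Send = (data: string) => void;
type Handler = (message: WsMessage, send: Send) => void;

// Hypothetical stand-in for the real router.
class MiniRouter {
  private handlers = new Map<string, Handler>();

  register(domain: string, handler: Handler): void {
    this.handlers.set(domain, handler);
  }

  dispatch(raw: string, send: Send): void {
    const message = JSON.parse(raw) as WsMessage;
    // The domain is the first dot-separated segment of the type.
    const domain = message.type.split(".")[0] ?? "";
    const handler = this.handlers.get(domain);
    if (!handler) {
      // Mirror the error shape documented above.
      send(JSON.stringify({
        type: "error",
        payload: {
          message: `No handler for domain: ${domain}`,
          originalType: message.type,
        },
      }));
      return;
    }
    handler(message, send);
  }
}

const sentFrames: string[] = [];
const collect: Send = (data) => { sentFrames.push(data); };

const miniRouter = new MiniRouter();
miniRouter.register("ping", (message, send) => {
  send(JSON.stringify({ id: message.id, type: "ping.response", payload: { time: Date.now() } }));
});

miniRouter.dispatch(JSON.stringify({ id: "abc123", type: "ping.request", payload: {} }), collect);
miniRouter.dispatch(JSON.stringify({ id: "x1", type: "unknown.something", payload: {} }), collect);
```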

Built-in Message Types

Ping

Lightweight connectivity check.

Client → Server:

{
  "id": "abc123",
  "type": "ping.request",
  "payload": {}
}

Server → Client:

{
  "id": "abc123",
  "type": "ping.response",
  "payload": { "time": 1712937600000 }
}

Chat

Send a message

Client → Server:

{
  "id": "msg-1",
  "type": "chat.message",
  "payload": {
    "content": "What's the weather like?",
    "conversationId": "conv-uuid" // optional, creates new if omitted
  }
}

Streaming response

The server responds with multiple chat.delta messages followed by a chat.done:

Server → Client:

{
  "type": "chat.delta",
  "payload": {
    "conversationId": "conv-uuid",
    "content": "Based on ",
    "role": "assistant"
  }
}

{
  "type": "chat.delta",
  "payload": {
    "conversationId": "conv-uuid",
    "content": "the current data...",
    "role": "assistant"
  }
}

{
  "type": "chat.done",
  "payload": {
    "conversationId": "conv-uuid",
    "messageId": "msg-uuid"
  }
}
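
On the client, the delta stream above has to be reassembled into a single reply. One way to do that, sketched here under the assumption that deltas for a conversation arrive in order, is to buffer content per conversationId until chat.done arrives. ChatAccumulator is a hypothetical name; the payload fields are the ones shown in the examples.

```typescript
interface ChatDelta {
  type: "chat.delta";
  payload: { conversationId: string; content: string; role: string };
}

interface ChatDone {
  type: "chat.done";
  payload: { conversationId: string; messageId: string };
}

type ChatEvent = ChatDelta | ChatDone;

// Hypothetical client-side helper: buffers deltas per conversation.
class ChatAccumulator {
  private buffers = new Map<string, string>();

  // Returns the full text when chat.done arrives, otherwise null.
  handle(event: ChatEvent): string | null {
    const { conversationId } = event.payload;
    if (event.type === "chat.delta") {
      const soFar = this.buffers.get(conversationId) ?? "";
      this.buffers.set(conversationId, soFar + event.payload.content);
      return null;
    }
    // chat.done: hand back the buffered text and clear the buffer.
    const fullText = this.buffers.get(conversationId) ?? "";
    this.buffers.delete(conversationId);
    return fullText;
  }
}

const accumulator = new ChatAccumulator();
const afterFirst = accumulator.handle({
  type: "chat.delta",
  payload: { conversationId: "conv-uuid", content: "Based on ", role: "assistant" },
});
const afterSecond = accumulator.handle({
  type: "chat.delta",
  payload: { conversationId: "conv-uuid", content: "the current data...", role: "assistant" },
});
const finalText = accumulator.handle({
  type: "chat.done",
  payload: { conversationId: "conv-uuid", messageId: "msg-uuid" },
});
```

Keying the buffer by conversationId keeps interleaved streams from different conversations separate.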

Tool execution during chat

When the AI decides to call a tool, the server executes it internally and continues the conversation. The client may receive status updates:

{
  "type": "chat.tool_call",
  "payload": {
    "conversationId": "conv-uuid",
    "toolName": "capture_camera",
    "status": "executing"
  }
}

Notifications

Receive notification

Server → Client:

{
  "type": "notification.push",
  "payload": {
    "id": "notif-uuid",
    "title": "Camera Alert",
    "body": "Motion detected in backyard",
    "level": "warning",
    "category": "CAMERA_ALERT"
  }
}

Acknowledge

Client → Server:

{
  "id": "ack-1",
  "type": "notification.ack",
  "payload": { "notificationId": "notif-uuid" }
}

Mark as read

Client → Server:

{
  "id": "read-1",
  "type": "notification.read",
  "payload": { "notificationId": "notif-uuid" }
}

Media Control

Server → Client:

{
  "type": "media.play",
  "payload": {
    "mediaId": "file-uuid",
    "url": "/api/media/file-uuid",
    "title": "Movie.mp4",
    "type": "video"
  }
}

Connection Registry

The ConnectionRegistry (lib/ws/connections.ts) is a singleton that tracks all active WebSocket connections.

API

// Register a device connection
connections.register(deviceId: string, info: DeviceConnection): void

// Remove a device connection
connections.unregister(deviceId: string): void

// Send a typed message to a specific device
connections.send(deviceId: string, type: string, payload: unknown): boolean

// Send to all connected devices
connections.broadcast(type: string, payload: unknown): void

// Send to all devices of a specific type
connections.broadcastToType(deviceType: string, type: string, payload: unknown): void

// Check if a device is connected
connections.isConnected(deviceId: string): boolean

// Get connection count
connections.count(): number

// Get a specific connection
connections.get(deviceId: string): DeviceConnection | undefined

// Get all devices of a type
connections.getByType(deviceType: string): DeviceConnection[]

DeviceConnection

interface DeviceConnection {
  ws: WebSocket;
  deviceId: string;
  userId: string;
  deviceType: string;
  connectedAt: Date;
  lastPong?: Date;
}
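
To make the registry's behavior concrete, here is a reduced in-memory sketch covering a few of the methods listed above. It is not the implementation in lib/ws/connections.ts: MiniRegistry and SocketLike are hypothetical, the outgoing envelope is assumed to contain only type and payload, and the deviceType values "tv" and "phone" are illustrative.

```typescript
// Minimal stand-in for a WebSocket; only send() is needed here.
interface SocketLike {
  send(data: string): void;
}

interface DeviceConnection {
  ws: SocketLike;
  deviceId: string;
  userId: string;
  deviceType: string;
  connectedAt: Date;
  lastPong?: Date;
}

class MiniRegistry {
  private byId = new Map<string, DeviceConnection>();

  register(deviceId: string, info: DeviceConnection): void {
    this.byId.set(deviceId, info);
  }

  unregister(deviceId: string): void {
    this.byId.delete(deviceId);
  }

  count(): number {
    return this.byId.size;
  }

  // Returns false when the device is not connected.
  send(deviceId: string, type: string, payload: unknown): boolean {
    const conn = this.byId.get(deviceId);
    if (!conn) return false;
    conn.ws.send(JSON.stringify({ type, payload }));
    return true;
  }

  broadcastToType(deviceType: string, type: string, payload: unknown): void {
    this.byId.forEach((conn) => {
      if (conn.deviceType === deviceType) {
        conn.ws.send(JSON.stringify({ type, payload }));
      }
    });
  }
}

const tvOutbox: string[] = [];
const phoneOutbox: string[] = [];
const registry = new MiniRegistry();
registry.register("tv-1", {
  ws: { send: (data) => { tvOutbox.push(data); } },
  deviceId: "tv-1", userId: "user-1", deviceType: "tv", connectedAt: new Date(),
});
registry.register("phone-1", {
  ws: { send: (data) => { phoneOutbox.push(data); } },
  deviceId: "phone-1", userId: "user-1", deviceType: "phone", connectedAt: new Date(),
});

registry.broadcastToType("tv", "media.play", { mediaId: "file-uuid" });
const deliveredToMissing = registry.send("missing-device", "notification.push", {});
```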

Adding Custom Handlers

To add a new message domain, create a handler file and register it:

// lib/ws/handlers/my-domain.ts
import { router } from "../router";

router.register("mydomain", async (message, ws) => {
  const { type, payload, id } = message;

  switch (type) {
    case "mydomain.action":
      // Handle the action
      ws.send(JSON.stringify({
        id,
        type: "mydomain.result",
        payload: { success: true },
      }));
      break;
  }
});

Then import it in lib/ws/handlers/index.ts:

import "./my-domain";

IoT WebSocket (/ws/iot)

The ESP32 firmware connects on boot, using a clientId derived from its MAC address:

ws://<lan-ip>:3000/ws/iot?clientId=esp32-aabbccddeeff

No JWT is required. Auth is gated by:

  • LAN-only binding — the server only accepts /ws/iot upgrades from the local network.
  • clientId matching — the clientId must match a row in iot_devices.externalId, or IOT_AUTO_PAIR must be enabled (default: on).

If auto-pair is on and the device is unknown, the server inserts a new iot_devices row named "ESP32 ({clientId.slice(-4)})" and continues the handshake.

Connection flow

ESP32 → ws://server:3000/ws/iot?clientId=aabbccddeeff
Server: look up iot_devices WHERE external_id = 'aabbccddeeff'
├─ Found → proceed
├─ Not found + IOT_AUTO_PAIR=true → INSERT row → proceed
└─ Not found + IOT_AUTO_PAIR=false → close(4401, "not paired")
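
The three branches of this flow can be expressed as a pure decision function. The sketch below replaces the iot_devices table with an in-memory Set, so decidePairing and PairingDecision are hypothetical names; the close code, reason string, and auto-pair device name follow the text above.

```typescript
type PairingDecision =
  | { action: "proceed"; created: boolean; name?: string }
  | { action: "close"; code: number; reason: string };

// knownExternalIds stands in for the iot_devices.externalId column.
function decidePairing(
  clientId: string,
  knownExternalIds: Set<string>,
  autoPair: boolean,
): PairingDecision {
  if (knownExternalIds.has(clientId)) {
    return { action: "proceed", created: false };
  }
  if (autoPair) {
    // Equivalent of inserting a new row before continuing the handshake.
    knownExternalIds.add(clientId);
    return { action: "proceed", created: true, name: `ESP32 (${clientId.slice(-4)})` };
  }
  return { action: "close", code: 4401, reason: "not paired" };
}

const knownIds = new Set<string>(["aabbccddeeff"]);
const foundDecision = decidePairing("aabbccddeeff", knownIds, false);
const autoPairedDecision = decidePairing("112233445566", knownIds, true);
const rejectedDecision = decidePairing("ffeeddccbbaa", knownIds, false);
```

Keeping the decision separate from the socket and database calls makes each branch easy to unit test.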

Messages (ESP32 → Server)

The IoT WebSocket uses the same binary-capable JSON envelope as /ws, but with device-specific types defined in lib/iot/types.ts:

type | Meaning
iot.status | Heartbeat with current state (playing, BT connected, volume, …)
iot.button | Physical mic button pressed/released
iot.bt_scan | Bluetooth scan results
iot.audio_data | PCM chunk from the mic (for Whisper transcription)

Messages (Server → ESP32)

type | Meaning
iot.play | Play a song (includes stream URL)
iot.stop | Stop playback
iot.volume | Set volume (0–100)
iot.bt_connect | Connect to a Bluetooth audio sink by MAC
iot.plug_state | Tuya plug state change (informational)
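
The two message tables can be captured as literal types so that direction mistakes surface at compile time. This is a sketch only: the actual definitions live in lib/iot/types.ts and may look different, payload shapes are not documented here and are therefore omitted, and isDeviceToServerType and clampVolume are hypothetical helpers.

```typescript
const DEVICE_TO_SERVER_TYPES = [
  "iot.status",
  "iot.button",
  "iot.bt_scan",
  "iot.audio_data",
] as const;

const SERVER_TO_DEVICE_TYPES = [
  "iot.play",
  "iot.stop",
  "iot.volume",
  "iot.bt_connect",
  "iot.plug_state",
] as const;

type DeviceToServerType = (typeof DEVICE_TO_SERVER_TYPES)[number];
type ServerToDeviceType = (typeof SERVER_TO_DEVICE_TYPES)[number];

// Type guard: true only for types the ESP32 is expected to send.
function isDeviceToServerType(type: string): type is DeviceToServerType {
  return (DEVICE_TO_SERVER_TYPES as readonly string[]).indexOf(type) !== -1;
}

// Keeps a requested volume inside the documented 0 to 100 range.
function clampVolume(value: number): number {
  return Math.min(100, Math.max(0, Math.round(value)));
}

const commandType: ServerToDeviceType = "iot.volume";
const safeVolume = clampVolume(150);
```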

Event logging

Every message in either direction is persisted to iot_events via onIotEvent(). This drives the /events page and the recent-events table on each device's detail page.