WebSocket Protocol
Open Genie exposes two WebSocket endpoints on the same port:
| Path | Clients | Auth |
|---|---|---|
| /ws | Paired phones, tablets, TVs | JWT token in query string |
| /ws/iot | ESP32 firmware and IoT hardware | clientId + LAN-only binding |
Paired-Device WebSocket (/ws)
All real-time communication between paired devices and the server uses a JSON-based WebSocket protocol over the /ws endpoint.
Connection
ws://localhost:3000/ws?token=<jwt-token>
The token is a JWT containing { deviceId, userId, deviceType }. See Authentication for how to obtain one.
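As a minimal sketch of assembling the connection URL, the helper below appends the token as a query parameter. The function name `buildPairedDeviceUrl` is hypothetical and not part of Open Genie; only the `/ws` path and the `token` parameter come from this document.

```typescript
// Hypothetical helper (not part of Open Genie): builds the /ws URL
// from a server origin and a JWT obtained during pairing.
function buildPairedDeviceUrl(serverOrigin: string, token: string): string {
  // Strip trailing slashes so "ws://host:3000/" and "ws://host:3000" behave the same.
  const base = serverOrigin.replace(/\/+$/, "");
  // Encode the token defensively in case it contains reserved characters.
  return `${base}/ws?token=${encodeURIComponent(token)}`;
}

const pairedUrl = buildPairedDeviceUrl("ws://localhost:3000", "header.payload.signature");
console.log(pairedUrl); // ws://localhost:3000/ws?token=header.payload.signature
```

The resulting string can be passed to whichever WebSocket client the device platform provides.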
Message Format
Every message (client → server and server → client) follows this structure:
interface WsMessage {
  id: string;         // Unique message ID (for request/response correlation)
  type: string;       // Domain-qualified type, e.g. "chat.message"
  payload: unknown;   // Domain-specific data
  deviceId?: string;  // Sender's device ID (set by server on incoming)
  timestamp?: number; // Unix timestamp in milliseconds
}
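A client will usually want to validate frames before acting on them. The sketch below is one possible runtime check, not the project's own code. Note one assumption: several server-pushed examples in this document omit `id`, so the guard treats `id` as optional; the names `IncomingFrame`, `isIncomingFrame`, and `parseFrame` are hypothetical.

```typescript
// Assumption: same fields as WsMessage, but `id` is optional because the
// server-pushed examples in this document do not always include it.
interface IncomingFrame {
  id?: string;
  type: string;
  payload: unknown;
  deviceId?: string;
  timestamp?: number;
}

// Narrow an arbitrary parsed value by checking required and optional field types.
function isIncomingFrame(value: unknown): value is IncomingFrame {
  if (typeof value !== "object" || value === null) return false;
  const m = value as Record<string, unknown>;
  return (
    typeof m.type === "string" &&
    "payload" in m &&
    (m.id === undefined || typeof m.id === "string") &&
    (m.deviceId === undefined || typeof m.deviceId === "string") &&
    (m.timestamp === undefined || typeof m.timestamp === "number")
  );
}

// Parse a raw text frame; returns null for invalid JSON or an unexpected shape.
function parseFrame(raw: string): IncomingFrame | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isIncomingFrame(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

If every message in a deployment carries an `id`, the check can be tightened to require it.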
Type Convention
Message types use dot notation with the first segment as the domain:
chat.message → routed to "chat" handler
chat.delta → routed to "chat" handler
ping.request → routed to "ping" handler
notification.ack → routed to "notification" handler
media.play → routed to "media" handler (if registered)
Router
The router (lib/ws/router.ts) dispatches messages by extracting the domain from type.split(".")[0]:
Incoming JSON → parse → extract domain → find handler → execute
↓ (not found)
send error response
If no handler is registered for a domain, the router sends back an error:
{
  "type": "error",
  "payload": {
    "message": "No handler for domain: unknown",
    "originalType": "unknown.something"
  }
}
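The dispatch behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of lib/ws/router.ts: the class name `SketchRouter` and the `Sender` interface are hypothetical, and invalid-JSON handling is omitted.

```typescript
interface RoutedMessage { id?: string; type: string; payload: unknown; }
interface Sender { send(data: string): void; }
type Handler = (message: RoutedMessage, ws: Sender) => void | Promise<void>;

class SketchRouter {
  private handlers = new Map<string, Handler>();

  // One handler per domain, e.g. "chat" handles every "chat.*" type.
  register(domain: string, handler: Handler): void {
    this.handlers.set(domain, handler);
  }

  // Parse the frame, extract the domain, and run the handler or reply with an error.
  dispatch(raw: string, ws: Sender): void | Promise<void> {
    const message = JSON.parse(raw) as RoutedMessage; // throws on invalid JSON
    const [domain = ""] = message.type.split(".");
    const handler = this.handlers.get(domain);
    if (!handler) {
      ws.send(JSON.stringify({
        type: "error",
        payload: {
          message: `No handler for domain: ${domain}`,
          originalType: message.type,
        },
      }));
      return;
    }
    return handler(message, ws);
  }
}
```

Keying handlers by domain rather than by full type keeps registration to one call per feature area; each handler then switches on the full `type` itself.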
Built-in Message Types
Ping
Lightweight connectivity check.
Client → Server:
{
  "id": "abc123",
  "type": "ping.request",
  "payload": {}
}
Server → Client:
{
  "id": "abc123",
  "type": "ping.response",
  "payload": { "time": 1712937600000 }
}
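Because the response echoes the request `id`, a client can match replies to outstanding requests. The sketch below shows one way to do that; `PendingRequests` is a hypothetical client-side helper, not an Open Genie API.

```typescript
interface CorrelatedMessage { id: string; type: string; payload: unknown; }

// Tracks outstanding requests by id so a reply can be matched to its request.
class PendingRequests {
  private waiting = new Map<string, (reply: CorrelatedMessage) => void>();

  // Remember a callback for the given request id.
  track(id: string, onReply: (reply: CorrelatedMessage) => void): void {
    this.waiting.set(id, onReply);
  }

  // Feed incoming messages here; returns true if one answered a tracked request.
  resolve(reply: CorrelatedMessage): boolean {
    const onReply = this.waiting.get(reply.id);
    if (!onReply) return false;
    this.waiting.delete(reply.id); // each request is answered at most once
    onReply(reply);
    return true;
  }
}
```

A client would call `track("abc123", ...)` just before sending the `ping.request`, then pass every incoming message that carries an `id` to `resolve`. Messages for which `resolve` returns false are unsolicited pushes and can go to the normal message handling path.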
Chat
Send a message
Client → Server:
{
  "id": "msg-1",
  "type": "chat.message",
  "payload": {
    "content": "What's the weather like?",
    "conversationId": "conv-uuid" // optional, creates new if omitted
  }
}
Streaming response
The server responds with multiple chat.delta messages followed by a chat.done:
Server → Client:
{
  "type": "chat.delta",
  "payload": {
    "conversationId": "conv-uuid",
    "content": "Based on ",
    "role": "assistant"
  }
}

{
  "type": "chat.delta",
  "payload": {
    "conversationId": "conv-uuid",
    "content": "the current data...",
    "role": "assistant"
  }
}

{
  "type": "chat.done",
  "payload": {
    "conversationId": "conv-uuid",
    "messageId": "msg-uuid"
  }
}
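On the client, the deltas have to be concatenated until `chat.done` arrives. The following is a minimal sketch of that accumulation, assuming at most one in-flight response per conversation; `ChatStreamAssembler` is a hypothetical name, and the payload fields are taken from the examples above.

```typescript
interface ChatDelta {
  type: "chat.delta";
  payload: { conversationId: string; content: string; role: string };
}
interface ChatDone {
  type: "chat.done";
  payload: { conversationId: string; messageId: string };
}
type ChatStreamEvent = ChatDelta | ChatDone;

class ChatStreamAssembler {
  // Partial text per conversation, keyed by conversationId.
  private buffers = new Map<string, string>();

  // Returns the completed text when chat.done arrives, otherwise null.
  accept(event: ChatStreamEvent): { messageId: string; text: string } | null {
    const { conversationId } = event.payload;
    if (event.type === "chat.delta") {
      const soFar = this.buffers.get(conversationId) ?? "";
      this.buffers.set(conversationId, soFar + event.payload.content);
      return null;
    }
    // chat.done: hand back the accumulated text and clear the buffer.
    const text = this.buffers.get(conversationId) ?? "";
    this.buffers.delete(conversationId);
    return { messageId: event.payload.messageId, text };
  }
}
```

A UI that renders text as it streams can read the growing buffer after each delta instead of waiting for the final result.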
Tool execution during chat
When the AI decides to call a tool, the server executes it internally and continues the conversation. The client may receive status updates:
{
  "type": "chat.tool_call",
  "payload": {
    "conversationId": "conv-uuid",
    "toolName": "capture_camera",
    "status": "executing"
  }
}
Notifications
Receive notification
Server → Client:
{
  "type": "notification.push",
  "payload": {
    "id": "notif-uuid",
    "title": "Camera Alert",
    "body": "Motion detected in backyard",
    "level": "warning",
    "category": "CAMERA_ALERT"
  }
}
Acknowledge
Client → Server:
{
  "id": "ack-1",
  "type": "notification.ack",
  "payload": { "notificationId": "notif-uuid" }
}
Mark as read
Client → Server:
{
  "id": "read-1",
  "type": "notification.read",
  "payload": { "notificationId": "notif-uuid" }
}
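The ack and read messages differ only in their `type`, so a client can build both with one helper. This is a sketch only: `buildNotificationReply` and the counter-based `id` scheme are assumptions for illustration, since the document does not prescribe how message IDs are generated.

```typescript
type NotificationReplyType = "notification.ack" | "notification.read";

interface NotificationReply {
  id: string;
  type: NotificationReplyType;
  payload: { notificationId: string };
}

// Assumption: a simple per-process counter is enough to keep ids unique.
let replyCounter = 0;

// Build an ack or read message for the notification with the given id.
function buildNotificationReply(
  type: NotificationReplyType,
  notificationId: string,
): NotificationReply {
  replyCounter += 1;
  const prefix = type === "notification.ack" ? "ack" : "read";
  return { id: `${prefix}-${replyCounter}`, type, payload: { notificationId } };
}
```

On receiving a `notification.push`, a client could send `JSON.stringify(buildNotificationReply("notification.ack", payload.id))` and follow up with a `notification.read` once the user has opened the notification.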
Media Control
Server → Client:
{
  "type": "media.play",
  "payload": {
    "mediaId": "file-uuid",
    "url": "/api/media/file-uuid",
    "title": "Movie.mp4",
    "type": "video"
  }
}
Connection Registry
The ConnectionRegistry (lib/ws/connections.ts) is a singleton that tracks all active WebSocket connections.
API
// Register a device connection
connections.register(deviceId: string, info: DeviceConnection): void
// Remove a device connection
connections.unregister(deviceId: string): void
// Send a typed message to a specific device
connections.send(deviceId: string, type: string, payload: unknown): boolean
// Send to all connected devices
connections.broadcast(type: string, payload: unknown): void
// Send to all devices of a specific type
connections.broadcastToType(deviceType: string, type: string, payload: unknown): void
// Check if a device is connected
connections.isConnected(deviceId: string): boolean
// Get connection count
connections.count(): number
// Get a specific connection
connections.get(deviceId: string): DeviceConnection | undefined
// Get all devices of a type
connections.getByType(deviceType: string): DeviceConnection[]
DeviceConnection
interface DeviceConnection {
  ws: WebSocket;
  deviceId: string;
  userId: string;
  deviceType: string;
  connectedAt: Date;
  lastPong?: Date;
}
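To make the API's semantics concrete, here is a minimal in-memory sketch with the same method names. It is not the code in lib/ws/connections.ts. Assumptions: `ws` is reduced to a hypothetical `SocketLike` with only `send`, the interface is renamed `SketchConnection` to avoid confusion with the real one, and outgoing frames carry only `type` and `payload`.

```typescript
interface SocketLike { send(data: string): void; }

interface SketchConnection {
  ws: SocketLike;
  deviceId: string;
  userId: string;
  deviceType: string;
  connectedAt: Date;
  lastPong?: Date;
}

class InMemoryRegistry {
  private byDevice = new Map<string, SketchConnection>();

  register(deviceId: string, info: SketchConnection): void { this.byDevice.set(deviceId, info); }
  unregister(deviceId: string): void { this.byDevice.delete(deviceId); }
  get(deviceId: string): SketchConnection | undefined { return this.byDevice.get(deviceId); }
  isConnected(deviceId: string): boolean { return this.byDevice.has(deviceId); }
  count(): number { return this.byDevice.size; }

  getByType(deviceType: string): SketchConnection[] {
    return Array.from(this.byDevice.values()).filter((c) => c.deviceType === deviceType);
  }

  // Returns false when the device is not connected, true once the frame is handed to the socket.
  send(deviceId: string, type: string, payload: unknown): boolean {
    const conn = this.byDevice.get(deviceId);
    if (!conn) return false;
    conn.ws.send(JSON.stringify({ type, payload }));
    return true;
  }

  broadcast(type: string, payload: unknown): void {
    this.byDevice.forEach((conn) => { this.send(conn.deviceId, type, payload); });
  }

  broadcastToType(deviceType: string, type: string, payload: unknown): void {
    this.getByType(deviceType).forEach((conn) => { this.send(conn.deviceId, type, payload); });
  }
}
```

Keying the map by `deviceId` means a second `register` call for the same device replaces the earlier entry; whether the real registry closes the old socket in that case is not stated in this document.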
Adding Custom Handlers
To add a new message domain, create a handler file and register it:
// lib/ws/handlers/my-domain.ts
import { router } from "../router";

// One handler receives every "mydomain.*" message.
router.register("mydomain", async (message, ws) => {
  const { type, payload, id } = message;

  switch (type) {
    case "mydomain.action":
      // Handle the action, then echo the request id so the client can correlate the reply.
      ws.send(JSON.stringify({
        id,
        type: "mydomain.result",
        payload: { success: true },
      }));
      break;
  }
});
Then import it in lib/ws/handlers/index.ts:
import "./my-domain";
IoT WebSocket (/ws/iot)
The ESP32 firmware connects on boot with a clientId derived from its MAC address:
ws://<lan-ip>:3000/ws/iot?clientId=esp32-aabbccddeeff
No JWT is required. Auth is gated by:
- LAN-only binding: the server only accepts /ws/iot upgrades from the local network.
- clientId matching: the clientId must match a row in iot_devices.externalId, or IOT_AUTO_PAIR must be enabled (default: on).
If auto-pair is on and the device is unknown, the server inserts a new iot_devices row named "ESP32 ({clientId.slice(-4)})" and continues the handshake.
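This document does not say how the server decides that an upgrade comes from the local network. One plausible interpretation, sketched below purely as an assumption, is to accept only loopback, link-local, and RFC 1918 private IPv4 source addresses. `isLanIPv4` is a hypothetical function name.

```typescript
// Returns true for loopback, link-local, and RFC 1918 private IPv4 addresses.
// Also accepts the IPv4-mapped IPv6 form ("::ffff:a.b.c.d") that dual-stack sockets may report.
function isLanIPv4(address: string): boolean {
  const parts = address.replace(/^::ffff:/i, "").split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  const [a = -1, b = -1] = octets;
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // 127.0.0.0/8 loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // 169.254.0.0/16 link-local
  );
}
```

A check like this would run against the socket's remote address before the upgrade is accepted. It does not cover IPv6-only LANs, and it sees the proxy's address rather than the device's if the server sits behind a reverse proxy.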
Connection flow
ESP32 → ws://server:3000/ws/iot?clientId=aabbccddeeff
Server: look up iot_devices WHERE external_id = 'aabbccddeeff'
├─ Found → proceed
├─ Not found + IOT_AUTO_PAIR=true → INSERT row → proceed
└─ Not found + IOT_AUTO_PAIR=false → close(4401, "not paired")
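The three branches of the flow above can be expressed as a small pure function. In this sketch a `Map` stands in for the iot_devices table, and `decidePairing`, `IotDeviceRow`, and `PairingDecision` are hypothetical names. Only the close code 4401, the "not paired" reason, and the "ESP32 (…)" naming rule come from this document.

```typescript
interface IotDeviceRow { externalId: string; name: string; }

type PairingDecision =
  | { action: "proceed"; device: IotDeviceRow; created: boolean }
  | { action: "close"; code: 4401; reason: "not paired" };

function decidePairing(
  clientId: string,
  knownDevices: Map<string, IotDeviceRow>, // stand-in for the iot_devices table
  autoPair: boolean,                       // value of IOT_AUTO_PAIR
): PairingDecision {
  const existing = knownDevices.get(clientId);
  if (existing) return { action: "proceed", device: existing, created: false };

  if (!autoPair) return { action: "close", code: 4401, reason: "not paired" };

  // Auto-pair: name the new row after the last four characters of the clientId.
  const device: IotDeviceRow = {
    externalId: clientId,
    name: `ESP32 (${clientId.slice(-4)})`,
  };
  knownDevices.set(clientId, device);
  return { action: "proceed", device, created: true };
}
```

Returning a decision object instead of closing the socket directly keeps the rule easy to unit-test; the caller maps `close` to `ws.close(code, reason)`.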
Messages (ESP32 → Server)
The IoT WS uses the same binary-capable JSON envelope, but with device-specific types defined in lib/iot/types.ts:
| type | Meaning |
|---|---|
| iot.status | Heartbeat with current state (playing, BT connected, volume, …) |
| iot.button | Physical mic button pressed/released |
| iot.bt_scan | Bluetooth scan results |
| iot.audio_data | PCM chunk from the mic (for Whisper transcription) |
Messages (Server → ESP32)
| type | Meaning |
|---|---|
| iot.play | Play a song; includes stream URL |
| iot.stop | Stop playback |
| iot.volume | Set volume (0–100) |
| iot.bt_connect | Connect to a Bluetooth audio sink by MAC |
| iot.plug_state | Tuya plug state change (informational) |
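The two tables can be captured as literal types so that a message sent in the wrong direction is caught early. The sketch below lists only the type names from the tables. Payload shapes are not documented here and are left out, and the constant and function names are hypothetical rather than taken from lib/iot/types.ts.

```typescript
const DEVICE_TO_SERVER_TYPES = ["iot.status", "iot.button", "iot.bt_scan", "iot.audio_data"] as const;
const SERVER_TO_DEVICE_TYPES = ["iot.play", "iot.stop", "iot.volume", "iot.bt_connect", "iot.plug_state"] as const;

type DeviceToServerType = (typeof DEVICE_TO_SERVER_TYPES)[number];
type ServerToDeviceType = (typeof SERVER_TO_DEVICE_TYPES)[number];

// True when `type` is one of the messages an ESP32 may send to the server.
function isDeviceToServerType(type: string): type is DeviceToServerType {
  return (DEVICE_TO_SERVER_TYPES as readonly string[]).indexOf(type) !== -1;
}

// True when `type` is one of the messages the server may send to an ESP32.
function isServerToDeviceType(type: string): type is ServerToDeviceType {
  return (SERVER_TO_DEVICE_TYPES as readonly string[]).indexOf(type) !== -1;
}
```

A server-side receive path could reject any frame for which `isDeviceToServerType(message.type)` is false before the frame reaches device logic.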
Event logging
Every message in either direction is persisted to iot_events via onIotEvent(). This drives the /events page and the recent-events table on each device's detail page.