DisplaySync

Monitoring & health

A sign is healthy when the kiosk is connected, the assigned URL is reachable, and the device telemetry is within normal ranges. The dashboard surfaces all three on every sign detail page; this page is the reference for what each indicator means.

Heartbeats

Every claimed sign sends a heartbeat to the backend over the WebSocket every 5 seconds. Each heartbeat carries:

  • status — online / offline / error / maintenance
  • currentUrl — what the kiosk is currently displaying
  • deviceInfo — platform, OS, MAC, IP, Tailscale IP, hostname, screen resolution, free disk, RAM, CPU
  • appVersion — desktop sign version
  • uptime — seconds since the sign app launched
  • cacheState — whether content is cached locally for offline operation

The backend writes the latest heartbeat into Redis (for fast reads) and a rolling window into PostgreSQL (for the uptime timeline).
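
For orientation, the payload described above could be typed roughly as follows. This is only a sketch: the top-level field names come from the list, but the nested `DeviceInfo` field names, the units, and the boolean `cacheState` are assumptions, not the actual wire format.

```typescript
// Hypothetical shape of a heartbeat message. Top-level names follow the list
// above; everything inside DeviceInfo is an illustrative guess.
type HeartbeatStatus = "online" | "offline" | "error" | "maintenance";

interface DeviceInfo {
  platform: string;
  os: string;
  mac: string;
  ip: string;
  tailscaleIp: string | null; // null when Tailscale is not installed
  hostname: string;
  screenResolution: string;
  freeDiskBytes: number;
  ramBytes: number;
  cpuPercent: number;
}

interface Heartbeat {
  status: HeartbeatStatus;
  currentUrl: string;
  deviceInfo: DeviceInfo;
  appVersion: string;
  uptime: number; // seconds
  cacheState: boolean; // assumed: true when content is cached locally
}

// Example value using placeholder addresses and names.
const exampleHeartbeat: Heartbeat = {
  status: "online",
  currentUrl: "https://example.com/lobby",
  deviceInfo: {
    platform: "linux",
    os: "6.1.0",
    mac: "02:00:00:00:00:01",
    ip: "192.0.2.10",
    tailscaleIp: null,
    hostname: "LOBBY-SIGN-01",
    screenResolution: "1920x1080",
    freeDiskBytes: 20000000000,
    ramBytes: 4000000000,
    cpuPercent: 12,
  },
  appVersion: "1.0.0",
  uptime: 3600,
  cacheState: true,
};
```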

Online / offline transitions

The dashboard transitions a sign between Online and Offline based on heartbeat freshness:

State | Trigger
Online | Heartbeat received within the last 15 seconds
Offline | No heartbeat for 15+ seconds (3 missed in a row)
Online (recovered) | Was offline and has now received a heartbeat; flagged briefly as "recently reconnected"

Why 15 seconds? Heartbeats fire every 5 seconds, so 15 seconds is exactly 3 missed in a row — strong enough to filter transient packet loss, fast enough that you find out about a real disconnect within the venue's typical "is something wrong?" reaction time.
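
The freshness rule can be sketched as a small function. The names here are hypothetical, and treating exactly 15 seconds as Offline is an assumption based on the "15+ seconds" wording in the table.

```typescript
// Thresholds taken from the text: a heartbeat every 5 s, offline after 3 misses.
const HEARTBEAT_INTERVAL_MS = 5000;
const MISSED_BEFORE_OFFLINE = 3;
const OFFLINE_AFTER_MS = HEARTBEAT_INTERVAL_MS * MISSED_BEFORE_OFFLINE; // 15 000 ms

type Freshness = "online" | "offline";

// Classify a sign by the age of its most recent heartbeat.
// Both arguments are epoch milliseconds.
function classifyFreshness(lastHeartbeatAtMs: number, nowMs: number): Freshness {
  const ageMs = nowMs - lastHeartbeatAtMs;
  return ageMs < OFFLINE_AFTER_MS ? "online" : "offline";
}
```

With these values, a heartbeat that is 14 seconds old still counts as Online, and one that is 15 seconds old counts as Offline.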

The backend stores each heartbeat in Redis with a 15 s TTL and runs a background sweep every 10 s to mark expired sign records offline. A true offline transition can therefore take up to ~25 s to surface: the last heartbeat arrives at T=0, the Redis entry expires at T+15 s, and the next sweep runs by T+25 s. Notifications are deferred a further 60 seconds to give the sign a chance to reconnect; see Notifications for why.
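
The timeline arithmetic can be worked through in a few lines, taking the figures from the text (expiry at T+15 s, a sweep every 10 s, notifications deferred a further 60 s). The function names are hypothetical, and starting the notification delay at the moment the sweep marks the sign offline is one reading of "a further 60 seconds".

```typescript
const EXPIRY_S = 15; // Redis entry expires this long after the last heartbeat
const SWEEP_INTERVAL_S = 10; // background sweep period
const NOTIFY_DEFER_S = 60; // extra wait before a notification is sent

// Best case: a sweep runs at the instant the entry expires.
function bestCaseOfflineVisibleS(expiryS: number): number {
  return expiryS;
}

// Worst case: the entry expires just after a sweep, so it waits a full period.
function worstCaseOfflineVisibleS(expiryS: number, sweepS: number): number {
  return expiryS + sweepS;
}

// Notification time, assuming the deferral starts when the sign is marked offline.
function notificationAtS(offlineVisibleS: number, deferS: number): number {
  return offlineVisibleS + deferS;
}

const worstVisible = worstCaseOfflineVisibleS(EXPIRY_S, SWEEP_INTERVAL_S); // 25
const worstNotify = notificationAtS(worstVisible, NOTIFY_DEFER_S); // 85
```

Under these assumptions the dashboard shows Offline between 15 s and 25 s after the last heartbeat, and a notification follows between 75 s and 85 s.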

Sign states

A sign is in exactly one state at a time. The dashboard renders each with a consistent color:

State | Color | Meaning
Online | Green | Connected, heartbeating, content displaying
Offline | Yellow | Heartbeat is stale (no signal for 15+ s); likely a network or kiosk problem
Error | Red | Sign reported an explicit failure (e.g., couldn't load assigned URL)
Maintenance | Blue | Operator-controlled state. Ctrl+Shift+Q on the kiosk exits the sign app for maintenance (the watchdog won't relaunch while the .maintenance sentinel is present). The maintenance state on the dashboard is set by the dashboard itself, not by the kiosk's heartbeat. See Crash recovery.
Unlinked | Grey | Sign record exists but no physical device is linked yet

The color is consistent across the dashboard sign grid, the sign detail page, the mobile app, and notification badges.

State transitions are written to the Audit tab so you can answer "when did this sign go offline?" without grepping logs.
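
As an illustration of "exactly one state at a time" and of the transition log, the states can be modeled as a union type with a color lookup. The `AuditEntry` shape and `recordTransition` helper are hypothetical; the real Audit tab schema is not described here.

```typescript
type SignState = "online" | "offline" | "error" | "maintenance" | "unlinked";

// One color per state, as listed in the table.
const STATE_COLOR: Record<SignState, string> = {
  online: "green",
  offline: "yellow",
  error: "red",
  maintenance: "blue",
  unlinked: "grey",
};

// Hypothetical audit record for a state change.
interface AuditEntry {
  signId: string;
  from: SignState;
  to: SignState;
  at: string; // ISO 8601 timestamp
}

// Return a new log with the transition appended; repeats of the same state
// are not transitions and leave the log unchanged.
function recordTransition(
  log: AuditEntry[],
  signId: string,
  from: SignState,
  to: SignState,
  at: string
): AuditEntry[] {
  if (from === to) return log;
  return log.concat([{ signId, from, to, at }]);
}
```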

The "Monitoring" badge

A sign in monitoring mode shows a Monitoring badge on its dashboard card alongside the state color. The orthogonal mode field on the heartbeat carries 'monitoring' or 'active' — see Sign states → Orthogonal mode field. When the badge is showing:

  • The sign is healthy (heartbeat is current; sign is online)
  • The wall is intentionally dark — display hidden, audio muted
  • This is not a failure to escalate; the operator put the sign in this mode

Toggle off via the dashboard's Exit Monitoring button or Ctrl+Shift+M at the kiosk. See Remote control → Monitoring mode and Hotkeys.
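
Because mode is orthogonal to state, a dashboard card can be derived from the two values independently. The sketch below uses hypothetical names and assumes that only Offline and Error call for attention.

```typescript
type SignState = "online" | "offline" | "error" | "maintenance" | "unlinked";
type SignMode = "monitoring" | "active";

interface CardFlags {
  showMonitoringBadge: boolean;
  needsAttention: boolean;
}

function cardFlags(state: SignState, mode: SignMode): CardFlags {
  return {
    // The badge depends only on mode.
    showMonitoringBadge: mode === "monitoring",
    // Assumption: attention depends only on state; a dark wall in
    // monitoring mode is intentional and never counts as a failure.
    needsAttention: state === "offline" || state === "error",
  };
}
```

An Online sign in monitoring mode therefore shows the badge without being flagged for attention.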

Uptime tracking

The dashboard computes uptime two ways:

  • Per-sign uptime % for the lifetime of the event: online time / event time
  • Per-event uptime %: average across all signs in the event

You'll see both on the event detail page. A few patterns worth recognizing:

  • >99% is normal for a properly-deployed event
  • 95-99% typically reflects venue Wi-Fi flapping rather than kiosk failure — the wall is up, the dashboard just sees the connection drop briefly
  • Below 95% suggests genuine trouble — either a network problem you can fix or a sign in a flaky state

Uptime resets at the start of an event, so historical events keep their stats and a new event starts fresh.
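
The two uptime figures and the rough bands above can be worked through as follows. The function names and band labels are hypothetical, and the per-event figure is assumed to be an unweighted average.

```typescript
// Per-sign uptime: online time divided by event time, as a percentage.
function signUptimePercent(onlineSeconds: number, eventSeconds: number): number {
  if (eventSeconds <= 0) return 0; // assumption: no elapsed time reports 0
  return (onlineSeconds * 100) / eventSeconds;
}

// Per-event uptime: unweighted average of the per-sign percentages.
function eventUptimePercent(signPercents: number[]): number {
  if (signPercents.length === 0) return 0;
  let total = 0;
  for (const p of signPercents) total += p;
  return total / signPercents.length;
}

type UptimeBand = "normal" | "likely-wifi-flapping" | "investigate";

// Bands from the list above: >99 %, 95 to 99 %, below 95 %.
function uptimeBand(percent: number): UptimeBand {
  if (percent > 99) return "normal";
  if (percent >= 95) return "likely-wifi-flapping";
  return "investigate";
}

// Example: an 8-hour event (28 800 s) with two signs.
const eventSeconds = 8 * 60 * 60;
const perSign = [
  signUptimePercent(28800, eventSeconds), // online the whole time
  signUptimePercent(27360, eventSeconds), // offline for 24 minutes in total
];
const perEvent = eventUptimePercent(perSign);
```

Here the two signs come out at 100% and 95%, so the event figure is 97.5%, which falls in the band that usually points to Wi-Fi flapping.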

Device info

Every heartbeat carries device telemetry. The dashboard surfaces it on the Device info card:

Field | Source
Platform / OS version | os.platform() + os.release()
Hostname | os.hostname(); useful when you set custom names like LOBBY-SIGN-01
MAC | First non-internal NIC at first boot (stored, doesn't change)
Local IP | Current primary interface IP
Tailscale IP | tailscale ip -4 if installed, blank otherwise
Screen resolution | Per the primary display
CPU / RAM / Free disk | Snapshot at heartbeat time
Sign app version | Build version of the desktop sign
Uptime | Seconds since the sign app launched (reset by Reboot app or Reboot device)

Telemetry is for triage, not surveillance — use it to answer "is this sign stuck?" or "did somebody reboot the device an hour ago?" not for performance dashboards.
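
The "first non-internal NIC, stored at first boot" rule for the MAC field might look like the sketch below. It operates on a table shaped like the return value of Node's `os.networkInterfaces()`, mirrored here with a minimal local type so the snippet stays self-contained. The helper names and the all-zero MAC filter are assumptions, not the sign app's actual code.

```typescript
// Minimal mirror of the fields used from os.networkInterfaces() entries.
interface NicAddress {
  mac: string;
  internal: boolean;
}
type NicTable = Record<string, NicAddress[] | undefined>;

// Return the MAC of the first non-internal interface, or null if none exists.
function firstExternalMac(nics: NicTable): string | null {
  for (const name of Object.keys(nics)) {
    for (const addr of nics[name] ?? []) {
      // Skip loopback and placeholder all-zero addresses.
      if (!addr.internal && addr.mac !== "00:00:00:00:00:00") return addr.mac;
    }
  }
  return null;
}

// Once a MAC has been stored, keep it even if the interfaces change later.
function resolveMac(storedMac: string | null, nics: NicTable): string | null {
  return storedMac !== null ? storedMac : firstExternalMac(nics);
}

// Example table with a loopback interface and one Ethernet interface.
const sampleNics: NicTable = {
  lo: [{ mac: "00:00:00:00:00:00", internal: true }],
  eth0: [{ mac: "02:00:00:00:00:01", internal: false }],
};
```

On a real kiosk the table would come from `os.networkInterfaces()`, whose entries carry these two fields alongside the address details.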

Content reachability

Independent of the kiosk's connection to us, the kiosk monitors whether the assigned URL is reachable by doing an HTTP HEAD every 60 seconds. The dashboard reports this on a per-sign and per-event basis:

  • Reachable — last HEAD succeeded (2xx or 3xx)
  • Unreachable — last HEAD failed (timeout, 4xx, 5xx, DNS failure)
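
The classification rule can be written as one pure function. The name `classifyHead` is hypothetical, `null` stands in for any check that produced no HTTP response (timeout or DNS failure), and treating 1xx responses as Unreachable is an assumption, since the list does not mention them.

```typescript
type Reachability = "reachable" | "unreachable";

// status: the HTTP status code of the HEAD response,
// or null when no response arrived (timeout, DNS failure).
function classifyHead(status: number | null): Reachability {
  if (status === null) return "unreachable";
  return status >= 200 && status < 400 ? "reachable" : "unreachable";
}
```

For example, a 204 or a 301 counts as Reachable, while a 404, a 503, or a timed-out request counts as Unreachable.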

Reachability state is independent of the sign's online/offline state:

  • A sign can be Online but with Unreachable content — the kiosk reaches us, but its content origin is down
  • A sign can be Offline with Reachable content (last known) — the kiosk lost its WebSocket but the content URL was working at last check

When content goes Unreachable, the kiosk continues displaying the cached version and notifies subscribers (see Notifications). The wall doesn't blank — you have time to fix the content side without an audience seeing the failure.

Local diagnostics on the kiosk

Sometimes you want to look at health from the sign's side rather than the dashboard's. With keyboard access to the kiosk, press Ctrl + Shift + S to open the Status Dashboard overlay on the kiosk itself:

  • Connection status (WebSocket state, last heartbeat sent, last command received)
  • Sign ID, short code, MAC
  • Backend URL, WebSocket URL
  • IP addresses (LAN + Tailscale)
  • Cache status (items cached, size, last sync)
  • Recent error count

Press Esc to dismiss. This overlay is also what techs open when triaging a misbehaving sign in person: it answers "is this device even reaching the backend?" without leaving the venue.

When to escalate

A few patterns and what to do about them:

Pattern | What it means | Action
One sign offline for >2 minutes | Kiosk-specific; the others are fine | Troubleshoot offline
Multiple signs offline at once | Network or backend issue | Check venue Wi-Fi first. Multiple signs across multiple venues going offline at the same time usually means a backend incident; we'll email an alert if so
All signs online, content unreachable | Your content URL is down | Fix content side, or assign a fallback URL
Sign cycling online/offline rapidly | Wi-Fi flapping or kiosk DNS issues | Network resilience
Sign in Error state | Content failed to load | Fetch logs, look for the failed URL or HTTP error
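
Most of these patterns can be detected from a snapshot of the fleet. The sketch below is illustrative only: the `SignSnapshot` shape and finding names are hypothetical, and the rapid online/offline cycling pattern is left out because it needs transition history rather than a single snapshot.

```typescript
type SignState = "online" | "offline" | "error" | "maintenance" | "unlinked";

interface SignSnapshot {
  id: string;
  state: SignState;
  offlineForSeconds: number; // 0 unless state is "offline"
  contentReachable: boolean;
}

type Finding =
  | "single-sign-offline"
  | "multiple-signs-offline"
  | "content-unreachable"
  | "sign-in-error";

function triage(signs: SignSnapshot[]): Finding[] {
  const findings: Finding[] = [];

  const offline = signs.filter((s) => s.state === "offline");
  if (offline.length > 1) {
    findings.push("multiple-signs-offline");
  } else if (offline.length === 1 && offline.some((s) => s.offlineForSeconds > 120)) {
    // One sign offline for more than 2 minutes: kiosk-specific.
    findings.push("single-sign-offline");
  }

  // All signs online, but none can reach its content.
  const allOnline = signs.length > 0 && signs.every((s) => s.state === "online");
  if (allOnline && signs.every((s) => !s.contentReachable)) {
    findings.push("content-unreachable");
  }

  if (signs.some((s) => s.state === "error")) {
    findings.push("sign-in-error");
  }
  return findings;
}
```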

Reach for Troubleshooting for symptom-by-symptom playbooks.

What's next