docs: propose camera architecture redesign #197
Conversation
Pull request overview
Adds a spec-driven OpenSpec change proposal for integrating Matter 1.5 Camera device support into BartonCore, spanning SBMD resource mapping, new Matter cluster server infrastructure for WebRTC signaling, and optional embedded WebRTC media handling.
Changes:
- Adds a detailed implementation task breakdown for a 3-tier camera + WebRTC rollout (SBMD → signaling → media/RTP forwarding).
- Introduces requirement specs for camera SBMD mapping, WebRTC signaling (cluster server + public API), and WebRTC media (libdatachannel + RTP forwarding).
- Adds proposal/design documentation and OpenSpec metadata for the new change package.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Summary per file:
| File | Description |
|---|---|
| openspec/changes/matter-camera-support/.openspec.yaml | Registers the change package as spec-driven with creation date. |
| openspec/changes/matter-camera-support/proposal.md | High-level “Why/What/Impact” proposal for Matter camera + WebRTC support. |
| openspec/changes/matter-camera-support/design.md | Tiered architecture/design decisions, risks, and open questions. |
| openspec/changes/matter-camera-support/tasks.md | Stepwise implementation plan across build, SBMD, signaling, media, and tests. |
| openspec/changes/matter-camera-support/specs/camera-sbmd-spec/spec.md | Requirements for mapping Matter camera clusters/attributes/commands into SBMD resources. |
| openspec/changes/matter-camera-support/specs/webrtc-signaling/spec.md | Requirements for cluster server hosting + signaling/session lifecycle + GObject API exposure. |
| openspec/changes/matter-camera-support/specs/webrtc-media/spec.md | Requirements for BCORE_WEBRTC, libdatachannel integration, PeerConnection lifecycle, and RTP forwarding. |
### Requirement: BCORE_WEBRTC CMake feature flag
The build system SHALL provide a `BCORE_WEBRTC` CMake option (default OFF) in `config/cmake/options.cmake` that gates all WebRTC media code and the libdatachannel dependency. When OFF, signaling-only WebRTC support (Tier 2) SHALL still function — clients handle media externally.

#### Scenario: WebRTC media code excluded when flag is OFF
- **WHEN** `BCORE_WEBRTC=OFF` (default)
- **THEN** libdatachannel is not linked, WebRTC media source files are not compiled, and the resulting library has no libdatachannel symbols

#### Scenario: WebRTC media code included when flag is ON
- **WHEN** `BCORE_WEBRTC=ON`
- **THEN** libdatachannel is discovered via `bcore_find_package(NAME libdatachannel ... REQUIRED)`, WebRTC media source files are compiled, and PeerConnection management is available

#### Scenario: BCORE_WEBRTC requires BCORE_MATTER
- **WHEN** `BCORE_WEBRTC=ON` and `BCORE_MATTER=OFF`
- **THEN** CMake configuration fails with a clear error message indicating BCORE_MATTER is required
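As a non-authoritative sketch of the requirement above (assuming the `config/cmake/options.cmake` location and the `bcore_find_package` helper named in the spec), the flag gating and the BCORE_MATTER dependency check might look like:

```cmake
# Hypothetical sketch -- not part of this PR's diff.
option(BCORE_WEBRTC "Enable embedded WebRTC media support (libdatachannel)" OFF)

if(BCORE_WEBRTC)
    if(NOT BCORE_MATTER)
        # Scenario: BCORE_WEBRTC requires BCORE_MATTER
        message(FATAL_ERROR "BCORE_WEBRTC=ON requires BCORE_MATTER=ON")
    endif()
    # Scenario: media code included when the flag is ON
    bcore_find_package(NAME libdatachannel REQUIRED)
endif()

# When OFF, the WebRTC media sources are simply never added to the target,
# so the resulting library carries no libdatachannel symbols.
```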
The BCORE_WEBRTC requirement says signaling-only WebRTC (Tier 2) still functions when the flag is OFF and libdatachannel is not linked, but the signaling requirements/tasks elsewhere assume Barton generates an SDP offer and performs ICE candidate gathering. Offer/ICE generation typically requires a WebRTC stack; if libdatachannel is fully gated off, the spec needs to clarify how the offer and local ICE candidates are produced (e.g., client-supplied SDP/ICE via public API when BCORE_WEBRTC=OFF, or redefine the flag to gate only RTP forwarding/media plumbing while still linking a WebRTC stack for signaling).
### Requirement: Public API method to initiate WebRTC session
The public API SHALL expose a method to initiate a WebRTC live view session with a camera device, taking the device UUID and optional stream parameters (codec, resolution preferences). The method SHALL return a session ID.

#### Scenario: Initiate live view session
- **WHEN** a client calls the WebRTC session initiation method with a camera device UUID
- **THEN** the system allocates a video stream on the camera (via CameraAVStreamManagement), generates an SDP offer, sends it to the camera via WebRTCTransportProvider, and returns a session ID

#### Scenario: Initiation fails for non-camera device
- **WHEN** a client calls the session initiation method with a non-camera device UUID
- **THEN** the method returns an error indicating the device does not support WebRTC streaming
This scenario states the system allocates a video stream and generates an SDP offer before sending it via WebRTCTransportProvider. That conflicts with the BCORE_WEBRTC=OFF requirement in webrtc-media/spec.md that libdatachannel is not linked but Tier 2 still works. Please clarify whether Barton is responsible for SDP/ICE generation in all configurations (implying a WebRTC stack dependency even in signaling-only mode), or whether the public API must accept client-generated SDP/ICE when media support is disabled.
- [ ] 14.1 Create matter.js virtual camera device (`CameraDevice.js`) following the `VirtualDevice` pattern — implements CameraAVStreamManagement, WebRTCTransportProvider, and WebRTCTransportRequestor clusters with side-band operations for test control. Note: may require matter.js upgrade or custom cluster definitions (see Open Question #7)
- [ ] 14.2 Add `MatterCamera` Python fixture (`matter_camera.py`) and `matter_camera` pytest fixture following the `MatterLight`/`MatterDoorLock` pattern
- [ ] 14.3 Integration test: discover and commission virtual camera, verify device class is `camera` and resources are populated
- [ ] 14.4 Integration test: initiate WebRTC session, verify SDP offer/answer exchange completes via signaling API (works with `BCORE_WEBRTC=OFF`)
The integration test plan says SDP offer/answer exchange works with BCORE_WEBRTC=OFF, but other requirements/tasks also imply Barton generates the SDP offer and manages ICE flows. If BCORE_WEBRTC=OFF removes the libdatachannel/WebRTC stack dependency, the proposal should clarify what component generates the offer/local ICE in this mode (client-supplied signaling vs Barton-generated) and align the test plan with that choice.
Suggested change:
```diff
- - [ ] 14.4 Integration test: initiate WebRTC session, verify SDP offer/answer exchange completes via signaling API (works with `BCORE_WEBRTC=OFF`)
+ - [ ] 14.4 Integration test: initiate WebRTC session via signaling API. With `BCORE_WEBRTC=OFF`, SDP offer/answer (and any ICE candidates) are generated and handled entirely by the virtual camera client, and Barton acts only as a signaling relay (no local WebRTC stack). With `BCORE_WEBRTC=ON`, verify that Barton, via libdatachannel, can generate SDP/ICE and complete the offer/answer exchange end-to-end.
```
```
└─────────────────────────────────────────────────────┘
```

**Thread safety**: The cluster server callbacks execute on the Matter event loop (CHIP platform layer thread). Data must be marshalled to the GLib main loop via `g_idle_add` or `g_main_context_invoke` before emitting GObject signals or modifying core state. This follows the same pattern used by existing Matter subscription callbacks.
This thread-safety note claims marshalling via g_idle_add/g_main_context_invoke follows the existing Matter subscription callback pattern, but the current codebase does not use either API (no references in core/). To avoid baking in an inaccurate assumption, either (a) describe the actual marshalling mechanism used today (e.g., creating/attaching a GSource to the target GMainContext), or (b) drop the “same pattern” claim and define a new, explicit helper/pattern for cross-thread delivery that Tier 2/Tier 3 will adopt consistently.
Suggested change:
```diff
- **Thread safety**: The cluster server callbacks execute on the Matter event loop (CHIP platform layer thread). Data must be marshalled to the GLib main loop via `g_idle_add` or `g_main_context_invoke` before emitting GObject signals or modifying core state. This follows the same pattern used by existing Matter subscription callbacks.
+ **Thread safety**: The cluster server callbacks execute on the Matter event loop (CHIP platform layer thread). Implementations must marshal data to the GLib main loop (for example by posting to a `GMainContext` via `g_idle_add`, `g_main_context_invoke`, or a dedicated `GSource` helper) before emitting GObject signals or modifying core state. This document defines that cross-thread delivery requirement for new WebRTC-related callbacks, and Tier 2 / Tier 3 code MUST follow it consistently.
```
- **Matter device type ID registration**: Register Camera device type ID (0x0142) in the SBMD factory so cameras are discovered and matched to the new spec. Additional camera device types (Video Doorbell 0x0143, Audio Doorbell 0x0141, Snapshot Camera 0x0145, etc.) are future work
- **WebRTCTransportRequestor cluster server**: Barton currently only acts as a cluster client. The camera WebRTC flow requires the controller to host a WebRTCTransportRequestor cluster server endpoint so the camera can send SDP answers and ICE candidates back. This is a new architectural pattern for Barton's Matter subsystem
- **WebRTC signaling API exposure**: Expose SDP offers/answers, ICE candidates, and session state through Barton's public GObject API so client applications can participate in or observe the signaling exchange
- **Embedded libdatachannel for WebRTC media**: Integrate libdatachannel (lightweight C++ WebRTC library) behind a new `BCORE_WEBRTC` CMake feature flag to manage PeerConnection lifecycle (ICE, DTLS, SRTP) internally, forwarding decrypted RTP media over a local UDP socket for client consumption
I don't know that core/ needs libdatachannel. Barton doesn't handle RTP; it only sets a stream up on behalf of the client via the SDP/ICE dance. The example app should probably provide a way to interact with an A/V RTP stream from the sample Matter camera, though.
Also worth mentioning that libdatachannel is shipped with the Matter SDK.
- Enable Barton to discover, commission, and configure Matter camera devices through the existing resource model
- Map Matter camera clusters to Barton resources via a new camera SBMD spec (Tier 1)
- Introduce cluster server hosting capability in the Matter subsystem for WebRTC signaling (Tier 2)
- Expose WebRTC signaling state through the public GObject API (Tier 2)
One thing I'm not seeing is an explicit architecture change to SBMD to support calling out of the JavaScript runtime to get the SDP/ICE info.
**Rationale**: Signaling is a transient, session-oriented flow (offer → answer → ICE candidates → connected), not a persistent device state. GObject signals are the established pattern for event-driven data in the public API (`resource-updated`, `device-added`, etc.). Resource values are for device state that persists and can be read at any time.

**Alternatives considered**:
- Model SDP/ICE as resources: Signaling data is ephemeral and session-scoped — doesn't fit the persistent resource model.
While this is true, we have used resources for ephemeral data. One option is a webRTCSessionState resource that's an enum. When the enum state is updated, the device service event is decorated with metadata containing the relevant protocol info.
Regardless, I don't see this document specify synchronous requests to the client to provide SDP/ICE info to send to the camera, only info sent from the camera.
tleacmcsa left a comment
I didn't get too far in this review since there is a much larger and higher level discussion about how to model devices that needs to be had.
- **THEN** camera.sbmd is loaded and its device type IDs are registered for matching

### Requirement: Camera device class with camera endpoint profile
The SBMD spec SHALL define a `camera` device class (`bartonMeta.deviceClass: "camera"`) with a `camera` endpoint profile containing resources for stream management, PTZ control, privacy, zone management, and audio/speaker controls.
The "camera" device class and endpoint profile already have meaning (they are an existing contract — see Zilker's openHomeCameraDeviceDriver.c). We need to see if/how that is being used and whether we can deprecate it. I suspect the implied interface for it won't match up to what we are building here.
We need to revisit Barton's device data modeling anyway. But note that if you proceed using this class/profile, it may trigger code in clients like Zilker to expect to be able to interact with it per that existing (implicit) contract.
- **WHEN** a Matter camera device is discovered and commissioned
- **THEN** the device is assigned device class `camera` and its endpoints use the `camera` profile

### Requirement: CameraAVStreamManagement cluster resource mapping
The approach with Barton device data modeling was not to have a 1:1 attribute to resource mapping. It makes the class/profile very specific to an implementation like Matter. Remember: one of the primary architectural goals of Barton is to provide a device data model abstraction layer so that concrete device implementations can be added/removed without affecting clients. This is a tall order and certainly a challenge.
We need to step back and consider what that abstract interface looks like for a camera. Ideally that same interface could work for OpenHome, Matter, ONVIF, proprietary, etc. This might mean a resource called "capabilities" that returns a JSON structure, etc.
- `hdrModeEnabled` (read, write): attribute 0x000D HDRModeEnabled, type bool (present only when HDR feature supported)
- `softRecordingPrivacyModeEnabled` (read, write, dynamic, emitEvents): attribute 0x0013 SoftRecordingPrivacyModeEnabled, type bool — pauses recording/analysis transports
- `softLivestreamPrivacyModeEnabled` (read, write, dynamic, emitEvents): attribute 0x0014 SoftLivestreamPrivacyModeEnabled, type bool — pauses live view transports
- `nightVision` (read, write): attribute 0x0016 NightVision, type TriStateAutoEnum (Off/On/Auto)
nightVision is an example of what might be an optional resource containing all the related configuration info in JSON and might be better called "nightVisionConfig". Just an example of stuff we need to think about. Perhaps it would only be added if the "capabilities" JSON from above comment indicated it had it.
Same point for speaker, microphone, lights, etc.
Should this be draft?
Add openspec change for a protocol-agnostic camera session contract to support Matter 1.5 WebRTC cameras alongside existing OpenHome cameras. Two-layer endpoint architecture:

- Abstract camera endpoint (startSession, stopSession, sessionStatus) drives session lifecycle via events
- Protocol-specific endpoints (webrtc, openhome) carry signaling details per technology
- Session coordination reuses existing updateResource() metadata mechanism (cJSON convention, not typed)

Design decisions cover endpoint structure, event metadata schema, events-as-source-of-truth, driver-determined protocol selection, backward compatibility with existing OpenHome camera profile, and Matter 1.5 WebRTC transport direction.

Implementation is scoped as two follow-on stories, each going through its own openspec explore/propose/apply cycle:

1. Camera session contract + OpenHome adapter — define the contract and prove it e2e on existing hardware
2. Matter camera support via WebRTC — new driver, cluster server hosting, signaling, and media stack (depends on story 1)

BARTON-334, BARTON-354
Force-pushed from 531b1b8 to 4b811f7 (Compare)

probably shoulda, but just put another commit up for review
Define the protocol-agnostic camera session contract and prove it end-to-end by adapting the existing OpenHome camera driver. This is the foundational work — it lands the architecture in code using existing hardware with no new external dependencies.

Scope includes:
- Session lifecycle resources (`startSession`, `stopSession`, `sessionStatus`) and metadata constants (`sessionId`, `technology`, `nextAction`, status values) in `commonDeviceDefs.h`
Not sure stopSession is necessary. I also don't think the primary interface for a user should be startSession. I would expect to execute something like a "startVideoStream" resource, with the session info being emergent.
Sessions should be self-managed by the driver (camera parent class driver?) and have their own timeout and cleanup of session info.
Scope includes:
- `webrtc` endpoint profile with `peerSdp`, `cameraSdp`, `peerIceCandidates`, `cameraIceCandidates` resources and `WEBRTC_PROFILE_*` constants
- Matter camera driver (native C/C++) — including SBMD feasibility evaluation to determine whether SBMD can express stateful, bidirectional WebRTC signaling or whether a native driver is required
This is worded weirdly. It should be SBMD, but the wrench is that you may want a common parent class. Which means a new problem needs to be solved: how do you define a Matter driver with SBMD composition that could also have one or more additional class dependencies?
Another option is that an instantiation of the session management class is owned by the OpenHome driver but also has JS bindings and is available in the context to the SBMD.
- Cluster server hosting pattern (`WebRtcTransportRequestorServer`) — first server-side cluster in Barton's Matter subsystem
- SDP offer/answer exchange and ICE candidate negotiation
- WebRTC media stack integration (libdatachannel or alternative)
- Thread marshalling from CHIP event loop to GLib main loop
rchowdcmcsa left a comment
My bad if my comments overlap with anything that was said prior; I lost track as I read through all this.
```
  |                           |                          |
  | ◄── tunnelUrl event ──────|                          |
  |     value: "rtsp://..."   |                          |
```
So these session flows document the startSession flow, but I don't see anything similar for stopSession. What would the sessionStatus value and metadata look like during that flow(s)? What happens if stopSession is called with no active session? What happens if it's called mid-signaling? What cleanup (if any) is involved?
I see the other comment questioning whether or not stopSession is even necessary. I don't know if it is or not, but that aside, it just feels like it was glossed over in the proposal so far.
**Session flow example (WebRTC)**:
```
Client                       Driver                     Camera
  |                            |                          |
  |── executeResource          |                          |
  |   (startSession, {}) ──►   |                          |
  |                            |── (allocate stream) ───► |
  |                            |                          |
  | ◄── sessionStatus event ── |                          |
  |     value: "offering"      |                          |
  |     meta: {sessionId: "abc", technology: "webrtc",    |
  |            nextAction: "/dev/ep/webrtc/r/peerSdp"}    |
  |                            |                          |
  |── writeResource            |                          |
  |   (peerSdp, sdpOffer) ──►  |── (SDP offer) ─────────► |
  |                            |                          |
  |                            | ◄── (SDP answer) ─────── |
  | ◄── cameraSdp event ────── |                          |
  |     value: "<sdp-answer>"  |                          |
  |                            |                          |
  | ◄── sessionStatus event ── |                          |
  |     value: "ice-exchange"  |                          |
  |     meta: {nextAction: "/dev/ep/webrtc/r/peerIceCandidates"}
  |                            |                          |
  |── writeResource            |                          |
  |   (peerIceCandidates) ──►  |── (ICE candidates) ────► |
  |                            | ◄── (ICE candidates) ─── |
  | ◄── cameraIceCandidates ── |                          |
  |                            |                          |
  | ◄── sessionStatus event ── |                          |
  |     value: "connected"     |                          |
  |     meta: {sessionId: "abc"}                          |
```

**Session flow example (OpenHome)**:
```
Client                       Driver                     Camera
  |                            |                          |
  |── executeResource          |                          |
  |   (startSession, {}) ──►   |                          |
  |                            |── createMediaTunnel() ─► |
  |                            |                          |
  | ◄── sessionStatus event ── |                          |
  |     value: "connected"     |                          |
  |     meta: {sessionId: "abc", technology: "openhome",  |
  |            nextAction: "/dev/ep/openhome/r/tunnelUrl"}
  |                            |                          |
  | ◄── tunnelUrl event ────── |                          |
  |     value: "rtsp://..."    |                          |
```
What happens if a client calls startSession when a session is already active?
- **[Metadata is untyped]** → Mitigation: Well-documented JSON schema, string constants in `commonDeviceDefs.h`, and integration tests that verify metadata content. Strongly typed metadata is a future capability.
- **[SBMD may not support the session pattern]** → Mitigation: SBMD feasibility for Matter cameras is explicitly out of scope. The Matter camera driver may need to be a native C/C++ driver rather than SBMD. This is evaluated in the webrtc-endpoint-profile task.
- **[Parallel sessions deferred]** → Mitigation: The contract is designed for parallel sessions (sessionId in metadata, events as source of truth). Only the implementation is single-session MVP. No contract changes needed later.
- **[OpenHome adapter complexity]** → Mitigation: The OpenHome driver already implements `createMediaTunnel`. The adapter wraps this in the session contract — `startSession` calls `createMediaTunnel` internally and emits `sessionStatus` events. Incremental, low-risk change.
It says elsewhere that existing OpenHome camera clients call createMediaTunnel directly so that the old path continues to work unchanged. What happens if both paths are exercised? Can that even happen?
EDIT: This highlighted a lot of lines. I'm only concerned with line 177 here.
**Constants** (added to `commonDeviceDefs.h`):
```c
#define CAMERA_SESSION_METADATA_SESSION_ID "sessionId"
#define CAMERA_SESSION_METADATA_TECHNOLOGY "technology"
```
We should probably define constants for metadata values as well, e.g. a constant for the technology value "webrtc" and so on.
## What Changes

- **Abstract camera endpoint profile** (`profile: "camera"`): Define three resources on the camera endpoint — `startSession` [execute], `stopSession` [execute], and `sessionStatus` [emit events only]. `sessionStatus` is the "traffic controller" that directs clients through the session flow. Events carry metadata with `sessionId`, `technology`, and `nextAction` (full URI to the next resource the client should interact with).
> nextAction (full URI to the next resource the client should interact with).
Interact with how? For example, if nextAction points to peerSdp, a write resource, then we can surmise that the interaction is a write (and indeed it is). What about an event resource like tunnelUrl? Subscription? Or is it just simply informing the client, "hey, I'm sending the tunnel URL your way via event right after this"? Maybe this is obvious/self-evident but it's just unclear to me.
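For reference, the metadata shown in the session-flow diagrams corresponds to a JSON payload along these lines (field names are the proposal's; the interaction type implied by `nextAction` is exactly the open question raised above):

```json
{
  "sessionId": "abc",
  "technology": "webrtc",
  "nextAction": "/dev/ep/webrtc/r/peerSdp"
}
```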