Visualizer support object and binary data types #28

Open
Aircoookie wants to merge 11 commits into Sendspin:main from Aircoookie:visualizer-details

@Aircoookie

This is an initial draft of the proposed visualizer type specification, including the visualizer_support JSON object, as well as the binary message data type definitions.

Feedback is highly appreciated!

Updated the visualizer support object structure and binary type details.

@maximmaxim345 maximmaxim345 left a comment

Member

Hi @Aircoookie!

Thanks for the PR, and sorry for taking so long to respond.

I have some suggestions on the architecture:

Splitting into multiple roles instead of tag-value encoding

Instead of a single visualizer role with tag-value encoded binary messages, I think it would be cleaner to split this into separate roles: visualize_beats, visualize_loudness, visualize_highest_amplitude, visualize_spectrum.

Benefits:

  • Each role would have its own binary message type (we have 44 reserved message IDs for new roles)
  • No need for tag-value encoding - the message format would be self-describing based on the message type
  • Clients can still subscribe only to the specific visualizations they need
  • Future-proofing: with PR #42, each visualization type can be versioned independently (e.g., visualize_spectrum@v2) without affecting the others

The only drawback I can think of is a slight overhead from receiving more WebSocket messages instead of bundled data - maybe resulting in at most 100 extra bytes per second, which isn't worth worrying about IMO.
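
As a rough illustration of the per-role approach, the sketch below dispatches on a leading message-type byte instead of parsing tag-value pairs. It assumes each binary message begins with a one-byte type; the IDs and handler names are hypothetical placeholders, not values from the spec.

```python
from typing import Callable, Dict

# Hypothetical message type IDs, chosen only for illustration.
# The real values would be assigned by the spec.
MSG_BEATS = 0xA0
MSG_LOUDNESS = 0xA1
MSG_SPECTRUM = 0xA2

Handler = Callable[[bytes], str]


def dispatch(message: bytes, handlers: Dict[int, Handler]) -> str:
    """Route a binary message to the handler registered for its type byte."""
    if not message:
        raise ValueError("empty binary message")
    handler = handlers.get(message[0])
    if handler is None:
        # The client did not subscribe to this role, so skip the message.
        return "ignored"
    # The payload layout is implied by the type, so no tags are needed.
    return handler(message[1:])
```

A client that only wants spectrum data would register a single handler and ignore everything else.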

Redundant bin count in Spectrum data

For the Spectrum/FFT data, the leading byte containing the bin count n seems redundant:

  • The client already knows n_disp_bins from the stream/start message
  • With separate roles, the client can calculate it from the WebSocket message length
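
A minimal sketch of the second point, assuming a fixed-size header followed by equally sized bins. The default `header_len` of 9 (one type byte plus an 8-byte timestamp) and `bytes_per_bin` of 1 are assumptions for illustration, not figures from this PR.

```python
def spectrum_bin_count(
    message_len: int,
    header_len: int = 9,     # assumed: 1 type byte + 8-byte timestamp
    bytes_per_bin: int = 1,  # assumed: one byte per bin
) -> int:
    """Derive the number of spectrum bins from the WebSocket message length."""
    payload_len = message_len - header_len
    if payload_len < 0 or payload_len % bytes_per_bin != 0:
        raise ValueError("message length does not match the assumed layout")
    return payload_len // bytes_per_bin
```

The result can also be checked against `n_disp_bins` from `stream/start` to detect malformed messages.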

Other than that, looks good to me!

maximmaxim345 added a commit to Sendspin/aiosendspin that referenced this pull request Mar 19, 2026
…_r1`) (#163)

Implement the `visualizer@_draft_r1` role ([spec PR
#28](Sendspin/spec#28)).

Fully implements the draft visualizer spec except:
- **Beats** (`beat` type / message type 17): beat extraction and sending
are not yet implemented.
- **Frame batching**: `batch_max` is negotiated in `client/hello` and
echoed in `stream/start`, but the server always sends one frame per
binary message.
- **`stream/request-format`**: acknowledged and logged but ignored
(matches spec where request-format is still marked as TODO).

Also refactors role support spec registration out of `connection.py`
into the role registry, so each role family self-registers its support
parser.

`numpy` is now a required dependency (was optional) so that clients can
depend on the visualizer role being available.
maximmaxim345 added a commit to Sendspin/sendspin-cli that referenced this pull request Mar 25, 2026
## Summary
Adds a real-time frequency spectrum visualizer to the TUI, rendered
below the info panels.
The final `visualizer` role isn't part of the Sendspin Spec yet, so this
uses the first WIP version of the role (with role id
`visualizer@_draft_r1`) from [this
PR](Sendspin/spec#28), available in Music
Assistant 2.8.

The spectrum and loudness data are computed on the server and then sent
through the Sendspin protocol.

Toggle it by pressing the `v` key.

## Screenshot

<img width="2424" height="1046" alt="image"
src="https://github.com/user-attachments/assets/fa8ca719-1046-4e14-b57e-20d70b2025a9"
/>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
@jhollowe jhollowe mentioned this pull request May 28, 2026
kahrendt added a commit that referenced this pull request Jun 11, 2026
Adds the visualizer role, based on [the previous `visualizer@_draft_r1`
proposal](#28).

This alternative version aims to resolve a couple of gaps that were open
in the previous PR:

## One frame per binary message
Batching multiple frames of mixed types into one WebSocket message, as
in the previous proposal, forces the server to either delay frames
waiting for siblings (hurting low-latency playback) or send tiny batches
anyway. It also makes ordering awkward across batches and pushes sort
work onto the client. Per-type messages keep ordering trivial and are also more
consistent with how other roles structure their binary messages.

The only concern with this approach is the increased number of messages.
We could alternatively batch multiple message types with the same
timestamp together.

## Downbeat flag on `beat`
Lets clients drive bar-aware effects. `stream/start` advertises
`tracks_downbeats` so clients know whether to trust the bit. Accurate
beat detection is hard and often relies on offline analysis, so servers
without it omit the `beat` type entirely. Even when supported, it may be
unavailable for some content (live streams, sparse non-percussive
material).
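
A minimal client-side sketch of this rule, assuming the `beat` message carries a flags byte whose lowest bit marks a downbeat. The bit position and the `DOWNBEAT_BIT` name are hypothetical; only `tracks_downbeats` comes from the text above.

```python
DOWNBEAT_BIT = 0x01  # hypothetical bit position within an assumed flags byte


def is_downbeat(flags: int, tracks_downbeats: bool) -> bool:
    """Return True only if the server tracks downbeats and the bit is set."""
    # When the server does not advertise tracks_downbeats, the bit carries
    # no meaning, so every beat is treated as a regular beat.
    return tracks_downbeats and bool(flags & DOWNBEAT_BIT)
```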

## Top-level `rate_max`
Now bounds all periodic types, not just `spectrum`. `beat` and `peak`
are event-driven and unthrottled.
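
One way a server might apply this is sketched below. It assumes `rate_max` is a per-type limit in frames per second; the `RateLimiter` class and the `should_send` method are illustrative names, not part of the spec.

```python
from typing import Dict

# Types that are event-driven and therefore never throttled.
EVENT_DRIVEN = {"beat", "peak"}


class RateLimiter:
    """Throttle periodic frame types to at most rate_max frames per second."""

    def __init__(self, rate_max: float) -> None:
        if rate_max <= 0:
            raise ValueError("rate_max must be positive")
        self._min_interval = 1.0 / rate_max
        self._last_sent: Dict[str, float] = {}

    def should_send(self, frame_type: str, now: float) -> bool:
        if frame_type in EVENT_DRIVEN:
            return True
        last = self._last_sent.get(frame_type)
        if last is not None and now - last < self._min_interval:
            return False  # too soon after the previous frame of this type
        self._last_sent[frame_type] = now
        return True
```

Passing `now` explicitly keeps the sketch deterministic; a real server would likely use a monotonic clock.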

## Scaling
Pins down what was previously hand-waved as "perceptual weighting" so
implementations agree on the numbers.