DeepBlueDynamics/auracle
auracle

CLI + streaming service for the DJI Mic 3 wireless microphone system.

Detect receivers and transmitters, capture live audio, manage recordings, transcribe with Whisper, and stream everything over HTTP and WebSocket.

Install

cargo install --path .

Requires a Rust toolchain. Builds a single auracle binary.

Quick Start

# Initialize data directory
auracle init

# Check what's connected
auracle detect

# Start listening — captures audio and transcribes continuously
auracle serve --auto-listen

That's the main workflow: plug in the DJI receiver, run auracle serve --auto-listen, and transcriptions appear as the mic picks up speech. Everything below is detail.

Data Directory

auracle stores everything under ~/.auracle/:

~/.auracle/
├── config.toml          # service_url, http_port, ws_port
├── captures/            # live recordings from receiver
├── recordings/          # copied from transmitter storage
└── transcriptions/      # whisper output

Set AURACLE_HOME to use a different location:

AURACLE_HOME=/tmp/auracle-test auracle serve

All commands auto-create the directory structure if missing.
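The auto-creation step amounts to something like the following sketch. It is illustrative only (the real logic lives in home.rs and uses the `dirs` crate; the fallback to the `HOME` environment variable here is an assumption for self-containedness):

```rust
use std::fs;
use std::path::PathBuf;

/// Ensure the auracle data directory layout exists, honoring AURACLE_HOME.
/// Sketch of what every CLI command does on startup — not the real home.rs.
fn ensure_layout() -> std::io::Result<PathBuf> {
    let home = std::env::var("AURACLE_HOME")
        .map(PathBuf::from)
        .unwrap_or_else(|_| {
            // Fall back to ~/.auracle/ (via HOME here; the crate uses `dirs`)
            PathBuf::from(std::env::var("HOME").unwrap_or_default()).join(".auracle")
        });
    for sub in ["captures", "recordings", "transcriptions"] {
        fs::create_dir_all(home.join(sub))?; // no-op if already present
    }
    Ok(home)
}

fn main() -> std::io::Result<()> {
    let home = ensure_layout()?;
    println!("{}", home.display());
    Ok(())
}
```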

Configuration

~/.auracle/config.toml:

# Transcription service URL (Whisper HTTP API)
service_url = "http://localhost:8765"

# HTTP API port for auracle serve
http_port = 3131

# WebSocket port for auracle serve
ws_port = 3132

CLI flags and environment variables override config.toml:

Setting            Flag            Env                         Config key
Transcription URL  --service-url   TRANSCRIPTION_SERVICE_URL   service_url
Data directory     --workspace     AURACLE_HOME                (directory itself)
HTTP port          --http-port                                 http_port
WebSocket port     --ws-port                                   ws_port
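A common precedence order (flag beats environment beats config beats built-in default) can be sketched as a chain of Options. The exact flag-vs-env ordering is an assumption here, not something the table above guarantees; this is illustrative, not the actual main.rs logic:

```rust
/// Resolve a setting from highest- to lowest-priority source.
/// Assumed order: CLI flag > environment variable > config.toml > default.
fn resolve(flag: Option<String>, env: Option<String>,
           config: Option<String>, default: &str) -> String {
    flag.or(env).or(config).unwrap_or_else(|| default.to_string())
}

fn main() {
    // Flag present: wins over everything else.
    let url = resolve(
        Some("http://gpu:8765".into()),           // --service-url
        std::env::var("TRANSCRIPTION_SERVICE_URL").ok(),
        Some("http://localhost:8765".into()),     // config.toml service_url
        "http://localhost:8765",
    );
    println!("{url}");
}
```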

CLI Commands

All commands output JSON. Most commands default to ~/.auracle/ — no paths required for common operations.

Device Detection

# Scan for DJI Mic 3 (storage mounts + audio devices)
auracle detect

# List all audio input devices
auracle devices

Recording

# Record from DJI receiver (auto-detected), stop with Ctrl-C
auracle capture

# Record for 60 seconds from a specific device
auracle capture --duration 60 --device "Wireless Mic"

# Record to a specific file
auracle capture --output meeting.wav

Listing Files

All commands default to ~/.auracle/ when no path is given.

# List everything (captures + recordings)
auracle list

# Just captures
auracle list --captures

# Just recordings copied from transmitter
auracle list --recordings

# Sort by duration, filter to originals only
auracle list --sort duration --file-type original

# List from a specific directory
auracle list /path/to/wavs

# Detailed info for one file
auracle info recording.wav

Copying from Transmitter

# Copy from transmitter storage to ~/.auracle/recordings/, organized by date
auracle copy E:\

# Organize by channel (TX1, TX2, etc.)
auracle copy E:\ --organize channel

# Only original (unprocessed) files
auracle copy E:\ --file-type original

# Custom destination
auracle copy E:\ --dest /path/to/dest

Transcription

Requires a Whisper HTTP service running (see Transcription Service below).

# Transcribe the latest capture (no path needed)
auracle transcribe

# Transcribe a specific file
auracle transcribe recording.wav

# Transcribe a whole directory
auracle transcribe /path/to/wavs/

# Transcribe all captures
auracle batch

# Transcribe all captures from a specific directory
auracle batch /path/to/wavs

# Check job status
auracle check <job_id>

# Check service health
auracle health

# Copy + organize + transcribe in one shot
auracle pipeline E:\
auracle pipeline E:\ --organize date --file-type original

Streaming Service

# Start HTTP + WebSocket servers
auracle serve

# With auto-listen (the main mode)
auracle serve --auto-listen

# Custom ports
auracle serve --http-port 8080 --ws-port 8081

# Point at a different Whisper service
auracle serve --auto-listen --service-url http://my-gpu-box:8765

Auto-Listen Mode

auracle serve --auto-listen is the primary mode. It turns the DJI receiver into a continuous transcription feed:

  1. Auto-detects the DJI mic (polls every 2s if not connected yet)
  2. Records rolling 5-second segments with 1-second overlap — the last 1s of each segment is repeated as the first 1s of the next, so no words are lost at boundaries
  3. Each segment is saved to ~/.auracle/captures/ as listen_YYYYMMDD_HHMMSS_NNNN.wav
  4. Each segment is submitted to Whisper on a background thread (non-blocking — audio capture never stalls)
  5. A watcher thread polls Whisper every 2s per job, broadcasting progress and results
  6. Transcription results are broadcast to all connected WebSocket clients
  7. Live audio is also streamed to WebSocket clients as binary frames
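The overlapping segmentation in step 2 can be sketched as a windowing function: each window advances by the segment length minus the overlap, so the tail of one segment reappears at the head of the next. This is a simplified sketch of the behavior described above, not the actual capture.rs code:

```rust
/// Split a sample stream into fixed-length segments where the tail of each
/// segment is repeated at the head of the next (sketch of auto-listen step 2).
fn segment_with_overlap(samples: &[f32], seg_len: usize, overlap: usize) -> Vec<Vec<f32>> {
    assert!(overlap < seg_len);
    let step = seg_len - overlap; // e.g. advance 4s per 5s segment at 1s overlap
    let mut out = Vec::new();
    let mut start = 0;
    while start + seg_len <= samples.len() {
        out.push(samples[start..start + seg_len].to_vec());
        start += step;
    }
    out
}

fn main() {
    // Toy numbers: "5-sample" segments with a "1-sample" overlap.
    let stream: Vec<f32> = (0..13).map(|i| i as f32).collect();
    let segs = segment_with_overlap(&stream, 5, 1);
    // Boundaries are [0..5], [4..9], [8..13] — the last sample of each
    // segment reappears as the first sample of the next.
    for s in &segs {
        println!("{:?}", s);
    }
}
```

At 48 kHz stereo the real segments are far larger (5 s × 48,000 × 2 samples), but the boundary arithmetic is the same.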

Logs show the pipeline working:

Auto-listen: Microphone (Wireless Mic Rx) (2 ch, 48000 Hz) — 5s segments, 1s overlap
Auto-listen: segment 0 → listen_20260303_222717_0000.wav (5.0s)
Auto-listen: submitted listen_20260303_222717_0000.wav → job 5be91947805d41ed
Auto-listen: segment 1 → listen_20260303_222721_0001.wav (5.0s)
Auto-listen: submitted listen_20260303_222721_0001.wav → job a3f2c8e901b74d56

HTTP API

Default port: 3131

Method  Path                                 Description
GET     /                                    Service info (version, ports, status)
GET     /health                              Transcription service health check
GET     /devices                             List audio input devices
GET     /detect                              Scan for DJI connections
GET     /recordings                          List files in ~/.auracle/recordings/
GET     /captures                            List files in ~/.auracle/captures/
POST    /capture/start?duration=30           Start capture (optional duration in seconds)
POST    /capture/stop                        Stop active capture
GET     /capture/status                      Current capture state
POST    /transcribe?file=/path/to/file.wav   Submit file for transcription
GET     /transcribe/{job_id}                 Check transcription status
GET     /stream/(unknown)?offset=30          Serve 10-second WAV chunk at offset
GET     /clip/(unknown)?start=10&end=20      Download WAV segment

All endpoints return JSON, except /stream/ and /clip/, which return audio/wav. CORS is enabled.

curl localhost:3131/
curl localhost:3131/health
curl localhost:3131/captures
curl localhost:3131/devices
curl -X POST localhost:3131/capture/start?duration=10
curl localhost:3131/capture/status
curl -X POST localhost:3131/capture/stop

WebSocket Protocol

Default port: 3132

Connect to ws://localhost:3132. On connect:

{"type":"connected","version":"0.2.0","capture_active":false}

Server Messages

Text frames (JSON events):

{"type":"capture_started","file":"capture_20260303_141500.wav"}
{"type":"capture_stopped","file":"capture_20260303_141500.wav","duration":45.3}
{"type":"transcription_queued","job_id":"abc123","file":"..."}
{"type":"transcription_progress","job_id":"abc123","percent":0.45}
{"type":"transcription_complete","job_id":"abc123","text":"Hello world..."}
{"type":"transcription_failed","job_id":"abc123","error":"..."}
{"type":"playback_started","file":"capture.wav","offset":0}
{"type":"playback_complete","file":"capture.wav"}

Binary frames (live audio + playback):

Offset  Size  Type    Field
0       4     u32 LE  chunk_index
4       4     u32 LE  sample_rate
8       2     u16 LE  channels
10      2     u16 LE  reserved (0)
12      N*4   f32 LE  samples
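A client-side decoder for this layout is straightforward with little-endian reads. This is a sketch of what a consumer of the binary frames might do (the struct and function names are illustrative, not part of auracle's API):

```rust
/// Decoded form of a binary WebSocket audio frame (layout from the table above).
#[derive(Debug, PartialEq)]
struct AudioFrame {
    chunk_index: u32,
    sample_rate: u32,
    channels: u16,
    samples: Vec<f32>,
}

/// Parse the 12-byte little-endian header followed by f32 samples.
/// Returns None if the frame is truncated or misaligned.
fn parse_frame(buf: &[u8]) -> Option<AudioFrame> {
    if buf.len() < 12 || (buf.len() - 12) % 4 != 0 {
        return None;
    }
    let chunk_index = u32::from_le_bytes(buf[0..4].try_into().ok()?);
    let sample_rate = u32::from_le_bytes(buf[4..8].try_into().ok()?);
    let channels = u16::from_le_bytes(buf[8..10].try_into().ok()?);
    // bytes 10..12 are the reserved field (always 0)
    let samples = buf[12..]
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes(c.try_into().unwrap()))
        .collect();
    Some(AudioFrame { chunk_index, sample_rate, channels, samples })
}

fn main() {
    // Build a minimal frame: chunk 7, 48 kHz, 2 channels, two samples.
    let mut buf = Vec::new();
    buf.extend_from_slice(&7u32.to_le_bytes());
    buf.extend_from_slice(&48_000u32.to_le_bytes());
    buf.extend_from_slice(&2u16.to_le_bytes());
    buf.extend_from_slice(&0u16.to_le_bytes()); // reserved
    buf.extend_from_slice(&0.5f32.to_le_bytes());
    buf.extend_from_slice(&(-0.5f32).to_le_bytes());
    println!("{:?}", parse_frame(&buf).unwrap());
}
```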

Client Commands

Send JSON text frames:

{"cmd":"start_capture","duration":60,"device":"DJI"}
{"cmd":"stop_capture"}
{"cmd":"play","file":"capture_20260303.wav","offset":0}
{"cmd":"stop_playback"}

Listening to Events

# With wscat (npm install -g wscat)
wscat -c ws://localhost:3132

# With websocat
websocat ws://localhost:3132

You'll see transcription results arrive as JSON lines in real time.

Transcription Service

auracle talks to any HTTP service implementing these endpoints:

Endpoint            Method  Description
/health             GET     Returns JSON with service status
/transcribe         POST    Multipart upload: file (WAV), job_id, model
/status/{job_id}    GET     Returns {"status":"completed|processing|failed","progress":0.5}
/download/{job_id}  GET     Returns transcript text

The gnosis transcription service (Whisper large-v3 on CUDA) implements this interface. Run it with Docker:

docker run -d \
  --gpus all \
  -p 8765:8765 \
  gnosis/transcription-service:latest

Then set service_url = "http://localhost:8765" in config.toml (this is the default).

Architecture

No async runtime. Pure threads + crossbeam channels, same model as gnosis-radio.

┌─────────────┐
│  cpal audio  │──→ crossbeam channel ──→ consumer thread
│  callback    │     (lock-free)          │
└─────────────┘                           ├──→ WAV writer (hound)
                                          └──→ AudioBroadcaster
                                               │
                              ┌─────────────────┼──────────────┐
                              ▼                 ▼              ▼
                         WS client 1       WS client 2    WS client N

Auto-listen adds:

consumer thread ──→ 5s segment buffer ──→ WAV file
                                          │
                                          └──→ background thread
                                               ├──→ POST to Whisper
                                               └──→ poll status → broadcast result

Key design choices:

  • cpal callback is lock-free — samples go through a crossbeam channel, never blocks the audio thread
  • Whisper uploads are non-blocking — each segment spawns its own thread to upload and poll
  • Overlap prevents word loss — 1s of audio is shared between adjacent segments
  • Broadcaster auto-prunes — disconnected WebSocket clients are cleaned up automatically
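The auto-pruning broadcaster is a small pattern: keep a list of sender handles and drop any whose receiving end has hung up. A minimal sketch, using std::sync::mpsc in place of crossbeam so it stays self-contained (broadcast.rs itself is crossbeam-backed; type and method names here are illustrative):

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Fan-out broadcaster: clones each message to every subscriber and drops
/// subscribers whose receiver has been dropped (sketch of the auto-prune idea).
struct Broadcaster<T: Clone> {
    subs: Vec<Sender<T>>,
}

impl<T: Clone> Broadcaster<T> {
    fn new() -> Self {
        Self { subs: Vec::new() }
    }

    fn subscribe(&mut self) -> Receiver<T> {
        let (tx, rx) = channel();
        self.subs.push(tx);
        rx
    }

    /// Send to all subscribers; retain only those still connected.
    fn broadcast(&mut self, msg: T) {
        self.subs.retain(|tx| tx.send(msg.clone()).is_ok());
    }

    fn subscriber_count(&self) -> usize {
        self.subs.len()
    }
}

fn main() {
    let mut b = Broadcaster::new();
    let rx1 = b.subscribe();
    let rx2 = b.subscribe();
    drop(rx2); // client 2 disconnects
    b.broadcast("hello".to_string());
    println!("subs={} got={}", b.subscriber_count(), rx1.recv().unwrap());
}
```

The `retain` on a failed `send` is the whole pruning mechanism: a disconnected WebSocket client manifests as a closed channel, and the next broadcast removes it.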

Modules

File           Purpose
main.rs        CLI parsing (clap), command routing, serve orchestration
home.rs        ~/.auracle/ directory management, config.toml parsing
broadcast.rs   Fan-out broadcaster: crossbeam-backed, auto-prunes disconnected subscribers
capture.rs     cpal audio capture: record() for CLI, record_with_broadcast() for service, auto_listen_loop() for rolling segments
server.rs      HTTP server (tiny_http), REST endpoints, WAV chunk serving
ws.rs          WebSocket server (tungstenite), per-client threads, binary audio + JSON events
transcribe.rs  Whisper integration: submit, check, batch, submit_and_watch() background poller
detect.rs      DJI device detection (USB storage mounts + audio devices)
wav.rs         WAV metadata, file listing, duration formatting
copy.rs        Copy + organize recordings from transmitter storage

Dependencies

Crate              Purpose
cpal               Cross-platform audio capture
hound              WAV read/write
crossbeam-channel  Lock-free audio pipeline, broadcaster fan-out
tiny_http          HTTP server
tungstenite        WebSocket server
dirs               Cross-platform home directory
ctrlc              Clean shutdown
clap               CLI argument parsing
ureq               HTTP client (for Whisper API)
chrono             Timestamps

DJI Mic 3 Details

  • Receiver appears as Microphone (Wireless Mic Rx) — 2 channels, 48 kHz, F32
  • Transmitter storage: 32 GB, WAV files (24-bit or 32-bit float), one file per 30 minutes
  • File naming: _orig suffix = raw recording, others = processed (noise cancellation/gain)
  • Supports up to 4 transmitters; --organize channel sorts by transmitter number

About

auracle provides agentic control of the DJI Mic 3.
