CLI + streaming service for the DJI Mic 3 wireless microphone system.
Detect receivers and transmitters, capture live audio, manage recordings, transcribe with Whisper, and stream everything over HTTP and WebSocket.
```
cargo install --path .
```
Requires a Rust toolchain. Builds a single `auracle` binary.
```
# Initialize data directory
auracle init

# Check what's connected
auracle detect

# Start listening — captures audio and transcribes continuously
auracle serve --auto-listen
```
That's the main workflow. Plug in the DJI receiver, run `auracle serve --auto-listen`, and transcriptions appear as the mic picks up speech. Everything else below is details.
auracle stores everything under `~/.auracle/`:
```
~/.auracle/
├── config.toml       # service_url, http_port, ws_port
├── captures/         # live recordings from receiver
├── recordings/       # copied from transmitter storage
└── transcriptions/   # whisper output
```
Set `AURACLE_HOME` to use a different location:
```
AURACLE_HOME=/tmp/auracle-test auracle serve
```
All commands auto-create the directory structure if missing.
`~/.auracle/config.toml`:
```toml
# Transcription service URL (Whisper HTTP API)
service_url = "http://localhost:8765"

# HTTP API port for auracle serve
http_port = 3131

# WebSocket port for auracle serve
ws_port = 3132
```
CLI flags and environment variables override `config.toml`:
| Setting | Flag | Env | Config key |
|---|---|---|---|
| Transcription URL | `--service-url` | `TRANSCRIPTION_SERVICE_URL` | `service_url` |
| Data directory | `--workspace` | `AURACLE_HOME` | (directory itself) |
| HTTP port | `--http-port` | | `http_port` |
| WebSocket port | `--ws-port` | | `ws_port` |
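The override order (flag wins over environment, environment wins over config file) can be sketched as a simple fallback chain. This is an illustrative Python snippet, not auracle's actual resolution code:

```python
import os

def resolve(flag_value, env_var, config, config_key, default):
    """Resolve one setting: CLI flag wins, then environment, then config file, then default."""
    if flag_value is not None:
        return flag_value
    if env_var and os.environ.get(env_var):
        return os.environ[env_var]
    if config_key and config_key in config:
        return config[config_key]
    return default

# Parsed config.toml, as a dict for the sketch
config = {"service_url": "http://localhost:8765", "http_port": 3131}

# No flag, no env var set: falls back to config.toml
print(resolve(None, "TRANSCRIPTION_SERVICE_URL", config, "service_url", "http://localhost:8765"))

# A flag always wins
print(resolve("http://gpu-box:8765", "TRANSCRIPTION_SERVICE_URL", config, "service_url", None))
```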
All commands output JSON. Most commands default to ~/.auracle/ — no paths required for common operations.
```
# Scan for DJI Mic 3 (storage mounts + audio devices)
auracle detect

# List all audio input devices
auracle devices
```
```
# Record from DJI receiver (auto-detected), stop with Ctrl-C
auracle capture

# Record for 60 seconds from a specific device
auracle capture --duration 60 --device "Wireless Mic"

# Record to a specific file
auracle capture --output meeting.wav
```
All commands default to `~/.auracle/` when no path is given.
```
# List everything (captures + recordings)
auracle list

# Just captures
auracle list --captures

# Just recordings copied from transmitter
auracle list --recordings

# Sort by duration, filter to originals only
auracle list --sort duration --file-type original

# List from a specific directory
auracle list /path/to/wavs

# Detailed info for one file
auracle info recording.wav
```
```
# Copy from transmitter storage to ~/.auracle/recordings/, organized by date
auracle copy E:\

# Organize by channel (TX1, TX2, etc.)
auracle copy E:\ --organize channel

# Only original (unprocessed) files
auracle copy E:\ --file-type original

# Custom destination
auracle copy E:\ --dest /path/to/dest
```
Requires a Whisper HTTP service running (see Transcription Service below).
```
# Transcribe the latest capture (no path needed)
auracle transcribe

# Transcribe a specific file
auracle transcribe recording.wav

# Transcribe a whole directory
auracle transcribe /path/to/wavs/

# Transcribe all captures
auracle batch

# Transcribe all captures from a specific directory
auracle batch /path/to/wavs

# Check job status
auracle check <job_id>

# Check service health
auracle health

# Copy + organize + transcribe in one shot
auracle pipeline E:\
auracle pipeline E:\ --organize date --file-type original
```
```
# Start HTTP + WebSocket servers
auracle serve

# With auto-listen (the main mode)
auracle serve --auto-listen

# Custom ports
auracle serve --http-port 8080 --ws-port 8081

# Point at a different Whisper service
auracle serve --auto-listen --service-url http://my-gpu-box:8765
```
`auracle serve --auto-listen` is the primary mode. It turns the DJI receiver into a continuous transcription feed:
- Auto-detects the DJI mic (polls every 2s if not connected yet)
- Records rolling 5-second segments with 1-second overlap — the last 1s of each segment is repeated as the first 1s of the next, so no words are lost at boundaries
- Each segment is saved to `~/.auracle/captures/` as `listen_YYYYMMDD_HHMMSS_NNNN.wav`
- Each segment is submitted to Whisper on a background thread (non-blocking — audio capture never stalls)
- A watcher thread polls Whisper every 2s per job, broadcasting progress and results
- Transcription results are broadcast to all connected WebSocket clients
- Live audio is also streamed to WebSocket clients as binary frames
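With 5-second segments sharing 1 second of audio, each new segment starts 4 seconds after the previous one. The timeline arithmetic, as an illustrative sketch (not auracle's code):

```python
SEGMENT_SECONDS = 5.0
OVERLAP_SECONDS = 1.0
STRIDE = SEGMENT_SECONDS - OVERLAP_SECONDS  # 4.0s between segment starts

def segment_window(index):
    """Return (start, end) in seconds of the audio covered by segment `index`."""
    start = index * STRIDE
    return (start, start + SEGMENT_SECONDS)

for i in range(3):
    print(i, segment_window(i))
# Segment 0 covers 0–5s, segment 1 covers 4–9s, segment 2 covers 8–13s:
# the final second of each segment reappears at the start of the next.
```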
Logs show the pipeline working:
```
Auto-listen: Microphone (Wireless Mic Rx) (2 ch, 48000 Hz) — 5s segments, 1s overlap
Auto-listen: segment 0 → listen_20260303_222717_0000.wav (5.0s)
Auto-listen: submitted listen_20260303_222717_0000.wav → job 5be91947805d41ed
Auto-listen: segment 1 → listen_20260303_222721_0001.wav (5.0s)
Auto-listen: submitted listen_20260303_222721_0001.wav → job a3f2c8e901b74d56
```
Default port: 3131
| Method | Path | Description |
|---|---|---|
| GET | `/` | Service info (version, ports, status) |
| GET | `/health` | Transcription service health check |
| GET | `/devices` | List audio input devices |
| GET | `/detect` | Scan for DJI connections |
| GET | `/recordings` | List files in `~/.auracle/recordings/` |
| GET | `/captures` | List files in `~/.auracle/captures/` |
| POST | `/capture/start?duration=30` | Start capture (optional duration in seconds) |
| POST | `/capture/stop` | Stop active capture |
| GET | `/capture/status` | Current capture state |
| POST | `/transcribe?file=/path/to/file.wav` | Submit file for transcription |
| GET | `/transcribe/{job_id}` | Check transcription status |
| GET | `/stream/(unknown)?offset=30` | Serve 10-second WAV chunk at offset |
| GET | `/clip/(unknown)?start=10&end=20` | Download WAV segment |
All responses are JSON, except `/stream/` and `/clip/`, which return `audio/wav`. CORS is enabled.
```
curl localhost:3131/
curl localhost:3131/health
curl localhost:3131/captures
curl localhost:3131/devices
curl -X POST "localhost:3131/capture/start?duration=10"
curl localhost:3131/capture/status
curl -X POST localhost:3131/capture/stop
```
Default port: 3132

Connect to `ws://localhost:3132`. On connect:
```json
{"type":"connected","version":"0.2.0","capture_active":false}
```
Text frames (JSON events):
```json
{"type":"capture_started","file":"capture_20260303_141500.wav"}
{"type":"capture_stopped","file":"capture_20260303_141500.wav","duration":45.3}
{"type":"transcription_queued","job_id":"abc123","file":"..."}
{"type":"transcription_progress","job_id":"abc123","percent":0.45}
{"type":"transcription_complete","job_id":"abc123","text":"Hello world..."}
{"type":"transcription_failed","job_id":"abc123","error":"..."}
{"type":"playback_started","file":"capture.wav","offset":0}
{"type":"playback_complete","file":"capture.wav"}
```
Binary frames (live audio + playback):
```
Offset  Size  Type    Field
0       4     u32 LE  chunk_index
4       4     u32 LE  sample_rate
8       2     u16 LE  channels
10      2     u16 LE  reserved (0)
12      N*4   f32 LE  samples
```
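A client in any language can decode these frames: a fixed 12-byte little-endian header followed by f32 samples. A minimal Python sketch of the layout above (client-side example, not part of auracle):

```python
import struct

def parse_frame(frame: bytes):
    """Decode one binary audio frame: 12-byte LE header, then f32 samples."""
    chunk_index, sample_rate, channels, _reserved = struct.unpack_from("<IIHH", frame, 0)
    n_samples = (len(frame) - 12) // 4
    samples = struct.unpack_from(f"<{n_samples}f", frame, 12)
    return {"chunk_index": chunk_index,
            "sample_rate": sample_rate,
            "channels": channels,
            "samples": samples}

# Build a synthetic frame to demonstrate the round trip
frame = struct.pack("<IIHH4f", 7, 48000, 2, 0, 0.0, 0.25, -0.25, 1.0)
print(parse_frame(frame))
```

For stereo streams the samples are interleaved (left, right, left, right, ...), so `n_samples` counts individual samples, not sample frames.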
Send JSON text frames:
```json
{"cmd":"start_capture","duration":60,"device":"DJI"}
{"cmd":"stop_capture"}
{"cmd":"play","file":"capture_20260303.wav","offset":0}
{"cmd":"stop_playback"}
```
```
# With wscat (npm install -g wscat)
wscat -c ws://localhost:3132

# With websocat
websocat ws://localhost:3132
```
You'll see transcription results arrive as JSON lines in real time.
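Because the server mixes JSON text frames and binary audio frames on one socket, a custom client typically branches on frame type first. A hedged sketch of that dispatch step (event names match the JSON examples above; the WebSocket receive loop itself is omitted):

```python
import json

def dispatch(frame):
    """Route one WebSocket frame: bytes = live audio, str = JSON event."""
    if isinstance(frame, (bytes, bytearray)):
        return ("audio", len(frame))          # hand off to an audio consumer
    event = json.loads(frame)
    if event.get("type") == "transcription_complete":
        return ("transcript", event["text"])  # show the recognized text
    return ("event", event.get("type"))       # progress, capture status, etc.

print(dispatch('{"type":"transcription_complete","job_id":"abc123","text":"Hello"}'))
print(dispatch(b"\x00" * 16))
```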
auracle talks to any HTTP service implementing these endpoints:
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Returns JSON with service status |
| `/transcribe` | POST | Multipart upload: `file` (WAV), `job_id`, `model` |
| `/status/{job_id}` | GET | Returns `{"status":"completed\|processing\|failed","progress":0.5}` |
| `/download/{job_id}` | GET | Returns transcript text |
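The submit-then-poll flow against this interface reduces to a small loop. A sketch with the HTTP call injected as a callable so the control flow stands alone (the helper names here are hypothetical, not the service's client code):

```python
import time

def wait_for_job(get_status, job_id, poll_seconds=2.0, timeout=120.0):
    """Poll /status/{job_id} until the job completes or fails.

    `get_status` is any callable returning the decoded JSON body,
    e.g. {"status": "processing", "progress": 0.5}.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = get_status(job_id)
        if body["status"] == "completed":
            return body
        if body["status"] == "failed":
            raise RuntimeError(f"job {job_id} failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish in {timeout}s")

# Demo with a fake service that finishes on the second poll
states = iter([{"status": "processing", "progress": 0.5},
               {"status": "completed", "progress": 1.0}])
print(wait_for_job(lambda _job: next(states), "abc123", poll_seconds=0.0))
```

Once the status is `completed`, the transcript itself comes from `GET /download/{job_id}`.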
The gnosis transcription service (Whisper large-v3 on CUDA) implements this interface. Run it with Docker:
```
docker run -d \
  --gpus all \
  -p 8765:8765 \
  gnosis/transcription-service:latest
```
Then set `service_url = "http://localhost:8765"` in `config.toml` (this is the default).
No async runtime. Pure threads + crossbeam channels, same model as gnosis-radio.
```
┌─────────────┐
│ cpal audio  │──→ crossbeam channel ──→ consumer thread
│  callback   │      (lock-free)             │
└─────────────┘                              ├──→ WAV writer (hound)
                                             └──→ AudioBroadcaster
                                                       │
                                      ┌────────────────┼──────────────┐
                                      ▼                ▼              ▼
                                 WS client 1      WS client 2    WS client N
```
Auto-listen adds:
```
consumer thread ──→ 5s segment buffer ──→ WAV file
                          │
                          └──→ background thread
                                  ├──→ POST to Whisper
                                  └──→ poll status → broadcast result
```
Key design choices:
- cpal callback is lock-free — samples go through a crossbeam channel, never blocks the audio thread
- Whisper uploads are non-blocking — each segment spawns its own thread to upload and poll
- Overlap prevents word loss — 1s of audio is shared between adjacent segments
- Broadcaster auto-prunes — disconnected WebSocket clients are cleaned up automatically
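The auto-pruning fan-out pattern is easy to illustrate outside Rust: give each subscriber its own queue and drop any subscriber that can no longer accept messages. A Python analogue of the shape (auracle's `broadcast.rs` uses crossbeam channels; the pruning trigger here, a full queue, is an assumption of this sketch):

```python
import queue

class Broadcaster:
    """Fan-out: each subscriber gets its own queue; dead subscribers are pruned on send."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, maxsize=64):
        q = queue.Queue(maxsize=maxsize)
        self.subscribers.append(q)
        return q

    def send(self, msg):
        alive = []
        for q in self.subscribers:
            try:
                q.put_nowait(msg)     # never block the producer (the audio path)
                alive.append(q)
            except queue.Full:
                pass                  # treat a wedged client as disconnected
        self.subscribers = alive      # auto-prune

b = Broadcaster()
healthy, wedged = b.subscribe(), b.subscribe(maxsize=1)
b.send("chunk-0")
b.send("chunk-1")                     # wedged's queue is full: it gets pruned
print(len(b.subscribers))             # → 1
```

The key property matches the design choices above: the producer never blocks, and slow or gone consumers cost one failed `put` before they are forgotten.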
| File | Purpose |
|---|---|
| `main.rs` | CLI parsing (clap), command routing, serve orchestration |
| `home.rs` | `~/.auracle/` directory management, `config.toml` parsing |
| `broadcast.rs` | Fan-out broadcaster: crossbeam-backed, auto-prunes disconnected subscribers |
| `capture.rs` | cpal audio capture: `record()` for CLI, `record_with_broadcast()` for service, `auto_listen_loop()` for rolling segments |
| `server.rs` | HTTP server (tiny_http), REST endpoints, WAV chunk serving |
| `ws.rs` | WebSocket server (tungstenite), per-client threads, binary audio + JSON events |
| `transcribe.rs` | Whisper integration: submit, check, batch, `submit_and_watch()` background poller |
| `detect.rs` | DJI device detection (USB storage mounts + audio devices) |
| `wav.rs` | WAV metadata, file listing, duration formatting |
| `copy.rs` | Copy + organize recordings from transmitter storage |
| Crate | Purpose |
|---|---|
| cpal | Cross-platform audio capture |
| hound | WAV read/write |
| crossbeam-channel | Lock-free audio pipeline, broadcaster fan-out |
| tiny_http | HTTP server |
| tungstenite | WebSocket server |
| dirs | Cross-platform home directory |
| ctrlc | Clean shutdown |
| clap | CLI argument parsing |
| ureq | HTTP client (for Whisper API) |
| chrono | Timestamps |
- Receiver appears as `Microphone (Wireless Mic Rx)` — 2 channels, 48 kHz, F32
- Transmitter storage: 32 GB, WAV files (24-bit or 32-bit float), one file per 30 minutes
- File naming: `_orig` suffix = raw recording, others = processed (noise cancellation/gain)
- Supports up to 4 transmitters; `--organize channel` sorts by transmitter number