TheStageAI/TheStageAI.AndroidSDK

TheStage AI SDK

On-device AI inference SDK for Android. Runs models locally on Qualcomm Snapdragon NPU (Hexagon) via ONNX Runtime + QNN Execution Provider, with LiteRT-LM and Genie LLM backends. No server dependency for inference.

This repository contains:

  • TheStageCore.aar -- pre-built SDK binary
  • onnxruntime-android.aar -- ONNX Runtime + QNN Execution Provider
  • onnxruntime-genai-android.aar -- ORT-genai (LLM inference)
  • Flutter plugin (plugin/qlip_sdk) -- Flutter integration via method channels
  • Example apps:
    • examples/voice_transcribe -- Whisper speech-to-text with hold-to-talk mic input
    • examples/tts_app -- NeuTTS text-to-speech with on-device LLM + NeuCodec audio decoder

Prerequisites

| Requirement | Version |
| --- | --- |
| Android compileSdk | 35 |
| Android minSdk | 28 |
| JDK | 17 |
| Kotlin | 2.1+ |
| Flutter | 3.22+ |
| QAIRT SDK | 2.42.0 |

  • Physical Snapdragon device required. The SDK targets the Hexagon NPU via QNN; the emulator and non-Snapdragon devices fall back to CPU only.
  • API token -- required for SDK initialization. Generate one at app.thestage.ai.
  • Qualcomm AI Runtime SDK (QAIRT) 2.42.0 -- free download from Qualcomm. Required for inference on Snapdragon devices: it provides the QNN backend libraries (libGenie.so, libQnnCpu.so, plus the HTP/NPU stack) that the SDK loads at runtime to dispatch ops to the Hexagon NPU. The shipped AARs are deliberately Qualcomm-clean, meaning they don't redistribute QAIRT binaries; each integrator installs QAIRT once locally and accepts Qualcomm's license at install time. setup.sh then copies the two libs that aren't on Maven (libGenie.so and libQnnCpu.so) into the plugin.
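
The minimum versions listed above can be checked from a shell before the first build. The following is a minimal sketch, not part of the SDK: version_ge is a hypothetical helper, and it assumes a sort that supports -V (version sort), as GNU coreutils and recent macOS do.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper for illustration only; relies on `sort -V`.
version_ge() {
  # After a version sort, the smaller version comes first.
  # A >= B holds exactly when that first line is B.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (not run here; assumes the first line of `flutter --version`
# looks like "Flutter 3.22.0 ..."):
#   flutter_version="$(flutter --version | awk 'NR==1 {print $2}')"
#   version_ge "$flutter_version" 3.22 || echo "Flutter 3.22+ required"
```

The same helper can compare any of the dotted version strings in the table, for example a Kotlin version against 2.1.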

Installing Flutter

If you don't have Flutter installed yet:

# macOS (Homebrew)
brew install --cask flutter

# Linux (snap)
sudo snap install flutter --classic

# Verify installation
flutter doctor

Alternatively, follow the official guide: https://docs.flutter.dev/get-started/install

Make sure flutter doctor reports a working Android toolchain (Android Studio + SDK 35 + a connected device). On first run, accept the Android licenses:

flutter doctor --android-licenses

Quick Start

1. Clone and run setup

git clone https://github.com/TheStageAI/TheStageAI.AndroidSDK.git
cd TheStageAI.AndroidSDK

# Point QAIRT at your local install (adjust path).
export QAIRT=~/Qualcomm/AIStack/QAIRT/2.42.0.251225

./scripts/setup.sh

setup.sh:

  • Symlinks TheStageCore.aar, onnxruntime-android.aar, and onnxruntime-genai-android.aar into the Flutter plugin (single source of truth -- the example app and any consuming app reference them from there).
  • Copies libGenie.so and libQnnCpu.so from $QAIRT into the plugin's src/main/jniLibs/arm64-v8a/. Library-module jniLibs are auto-merged into the consuming app's APK, so this works for the example app and any app that depends on the plugin.

Everything else -- libQnnHtp.so, skels/stubs, libonnxruntime.so, etc. -- is pulled in automatically via the com.qualcomm.qti:qnn-runtime Maven dependency and the bundled ORT AARs.
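
To confirm that setup.sh left the plugin in the state described above, a quick check along these lines may help. It is a sketch under assumptions: check_setup is a hypothetical helper, and the two directory paths are pieced together from this README (the plugin's android/libs and android/src/main/jniLibs/arm64-v8a) rather than read from setup.sh itself.

```shell
# check_setup REPO_ROOT: prints every file setup.sh is expected to have placed
# but that is absent, and returns non-zero if anything is missing.
# The paths below are assumptions based on this README, not on setup.sh.
check_setup() {
  local root="$1" missing=0 f
  local libs="$root/plugin/qlip_sdk/android/libs"
  local jni="$root/plugin/qlip_sdk/android/src/main/jniLibs/arm64-v8a"
  for f in \
    "$libs/TheStageCore.aar" \
    "$libs/onnxruntime-android.aar" \
    "$libs/onnxruntime-genai-android.aar" \
    "$jni/libGenie.so" \
    "$jni/libQnnCpu.so"
  do
    # -e follows symlinks, so a dangling symlink is reported as missing too.
    if [ ! -e "$f" ]; then
      echo "missing: ${f#"$root"/}"
      missing=1
    fi
  done
  return "$missing"
}
```

Run it from the repository root as `check_setup .`; any line of output points at a symlink or copy step that did not complete, usually because $QAIRT was unset or pointed at the wrong directory.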

2. Set your API token

Export the token in your shell -- the example apps read it via --dart-define:

export TOKEN="your-thestage-api-token"

3. Build and run

cd examples/voice_transcribe

flutter build apk --release --dart-define=QLIP_API_TOKEN="$TOKEN"
flutter install --release

For Flutter UI hot reload while iterating, use flutter run with --debug instead of --release; use --release when you want realistic inference timings.

To pick a specific connected device:

flutter devices
flutter run --release --dart-define=QLIP_API_TOKEN="$TOKEN" -d <DEVICE_ID>

Launch the app, grant microphone permission, and hold the button to record. First launch downloads the Whisper engine bundle from Hugging Face (~5 min over Wi-Fi); subsequent launches hit a local cache.


API Token

An API token is required to use the SDK. Generate one at app.thestage.ai and pass it during initialization. The token is validated once on first model start; all subsequent operations run offline.

In the example apps the token is read from the Dart compile-time environment variable QLIP_API_TOKEN (set via --dart-define=QLIP_API_TOKEN=...).
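
Because the token is baked in at compile time, an unset shell variable still produces a build, just one with an empty QLIP_API_TOKEN, and the problem would most likely show up only at initialization. A small guard can catch this earlier. This is a sketch: require_token is a hypothetical helper, and TOKEN is the shell variable name used in Quick Start.

```shell
# require_token: succeeds only if the TOKEN shell variable is set and non-empty.
# Hypothetical helper; TOKEN matches the variable exported in Quick Start.
require_token() {
  if [ -z "${TOKEN:-}" ]; then
    echo "TOKEN is not set; generate one at app.thestage.ai" >&2
    return 1
  fi
}

# Example (not run here):
#   require_token && flutter build apk --release --dart-define=QLIP_API_TOKEN="$TOKEN"
```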


Using in Your Own App

1. Add the plugin dependency

# pubspec.yaml
dependencies:
  qlip_sdk:
    path: /path/to/this-repo/plugin/qlip_sdk

2. Wire the AARs

Run ./scripts/setup.sh (from this repo) to populate the plugin once -- it symlinks the AARs and copies the QAIRT runtime libs from $QAIRT into the plugin's jniLibs/.

The Android library module can't re-export local AARs to the consuming app, so your app must reference them on its runtime classpath -- but it can point straight at the plugin's libs/ directory rather than keeping a second copy:

// android/app/build.gradle.kts
dependencies {
    val pluginLibs =
        "../../../../plugin/qlip_sdk/android/libs"
    implementation(files("$pluginLibs/TheStageCore.aar"))
    implementation(
        files("$pluginLibs/onnxruntime-android.aar")
    )
    implementation(
        files("$pluginLibs/onnxruntime-genai-android.aar")
    )
    implementation("com.qualcomm.qti:qnn-runtime:2.42.0")
}

Adjust the pluginLibs relative path to match where the SDK repo sits next to your app. The QAIRT native libs ride along automatically via the plugin's jniLibs/ -- no per-app copying needed.

Other settings your app needs:

  • compileSdk = 35, minSdk = 28, targetSdk = 35
  • JDK 17 toolchain
  • Add ARM64 ABI filter:
android {
    defaultConfig {
        ndk { abiFilters += "arm64-v8a" }
    }
}

3. Use the Dart API

import 'package:qlip_sdk/qlip_sdk.dart';

// Initialize the SDK (call once at app start).
await QlipSdk.initialize(apiToken: 'YOUR_API_TOKEN');

// Start a model.
await QlipSdk.startModel(
  modelType: 'whisper',
  modelName: 'whisper',
  enginesPath: '/data/local/tmp/whisper_engines',
  device: 'npu',
);

// Run inference.
final results = await QlipSdk.infer(
  modelName: 'whisper',
  inputJson: {
    'audio': pcm16kFloatSamples,  // 16 kHz mono float
    'language': 'en',
  },
);
final transcript = results.first['transcription'] as String;

// Stop when done.
await QlipSdk.stopModel(modelName: 'whisper');

4. Build

flutter clean
flutter run --release --dart-define=QLIP_API_TOKEN="$TOKEN"

Example Apps

voice_transcribe

Hold-to-record speech-to-text. The app auto-detects the device's SoC (Build.SOC_MODEL), downloads the matching engine bundle from TheStageAI/Elastic-whisper-large-v3-turbo on Hugging Face, and runs Whisper encoder + decoder on the NPU.

Benchmarks on Samsung S25 Ultra (SM8750 / Hexagon v79):

| Stage | Time |
| --- | --- |
| Encoder (NPU) | 338 ms |
| Decoder (NPU, 21 tokens) | 239 ms |
| Total inference | ~0.6 s |
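
The total is the sum of the two stages, and the decoder figure can be turned into a rough per-token rate. The sketch below only restates the numbers above; bench_summary is a hypothetical name, and the per-token value assumes the 239 ms is spread evenly over the 21 tokens, which ignores any fixed decoder overhead.

```shell
# bench_summary: derives totals and per-token rates from the benchmark figures.
# Inputs are the values quoted in this README; nothing is measured here.
bench_summary() {
  awk 'BEGIN {
    encoder_ms = 338; decoder_ms = 239; tokens = 21
    printf "total: %d ms\n", encoder_ms + decoder_ms
    printf "decoder: %.1f ms/token, %.1f tokens/s\n",
           decoder_ms / tokens, tokens * 1000 / decoder_ms
  }'
}

bench_summary
# total: 577 ms
# decoder: 11.4 ms/token, 87.9 tokens/s
```

The 577 ms sum rounds to the ~0.6 s shown in the last row.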

See examples/voice_transcribe/README.md for overrides (local engine path, custom HF repo, ...).

tts_app

On-device text-to-speech built on NeuTTS. Uses the same SoC auto-detect + HF bundle flow and streams synthesised audio from a short text prompt. The bundle is pulled from TheStageAI/neutts. Toggle Boost CPU in the UI to raise throughput (tokens per second).


Platform Support

| SoC | Variant tag | Devices |
| --- | --- | --- |
| Snapdragon 8 Elite | qualcomm_sm8750 | Samsung S25, S25 Ultra, S25 Edge |
| Snapdragon 8 Gen 3 | qualcomm_sm8650 | Samsung S24 family, OnePlus 12 |

Other Snapdragon SKUs are detected automatically at runtime (Build.SOC_MODEL), and the SDK requests a correspondingly tagged bundle from Hugging Face. If the exact variant isn't published, the SDK falls back to a cpu bundle when available.
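
To predict which bundle a given device will ask for, the tag can be approximated from the SoC model string. This is a sketch based on an inferred rule: the mapping (lower-case the model and prefix it with qualcomm_) is read off the two rows of the table above and is not confirmed SDK behaviour, and variant_tag is a hypothetical helper.

```shell
# variant_tag SOC_MODEL: maps e.g. "SM8750" to "qualcomm_sm8750".
# The naming rule is an assumption inferred from the Platform Support table.
variant_tag() {
  printf 'qualcomm_%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Example with a connected device (not run here; assumes the ro.soc.model
# system property carries the same value as Build.SOC_MODEL):
#   variant_tag "$(adb shell getprop ro.soc.model | tr -d '\r')"
```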


Project Structure

.
├── README.md
├── TheStageCore.aar                Pre-built SDK binary
├── onnxruntime-android.aar         ORT + QNN EP
├── onnxruntime-genai-android.aar   ORT-genai (LLM inference)
│
├── plugin/qlip_sdk/                Flutter plugin (Android)
│   ├── lib/qlip_sdk.dart           Dart API
│   └── android/                    Plugin Kotlin code
│
├── examples/
│   ├── voice_transcribe/           Whisper STT (hold-to-talk)
│   └── tts_app/                    NeuTTS text-to-speech
│
└── scripts/setup.sh                One-time setup (AAR symlinks + QAIRT lib copy)

Troubleshooting

| Symptom | Fix |
| --- | --- |
| Could not find TheStageCore.aar | Run ./scripts/setup.sh to create the AAR symlinks |
| dlopen failed: library "libQnnCpu.so" not found | Set $QAIRT and re-run ./scripts/setup.sh so the QAIRT libs are copied into the plugin's src/main/jniLibs/arm64-v8a/ (Quick Start step 1) |
| dlopen failed: library "libGenie.so" not found | Same as above |
| INSTALL_FAILED_NO_MATCHING_ABIS | Add ndk { abiFilters += "arm64-v8a" } in your app's build.gradle.kts |
| flutter doctor complains about Android licenses | Run flutter doctor --android-licenses and accept |
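
For the dlopen failures in particular, it can help to check whether the native libraries made it into the APK at all. The sketch below reads a file listing on stdin; missing_native_libs is a hypothetical helper, and the four library names are the ones this README mentions, not an exhaustive list of what the SDK loads.

```shell
# missing_native_libs: reads an APK file listing on stdin (one path per line)
# and prints each expected arm64-v8a library that is absent.
# Returns non-zero if at least one library is missing.
missing_native_libs() {
  local listing lib status=0
  listing="$(cat)"
  for lib in libGenie.so libQnnCpu.so libQnnHtp.so libonnxruntime.so; do
    # -x matches the whole line, -F treats the path as a literal string.
    if ! printf '%s\n' "$listing" | grep -qxF "lib/arm64-v8a/$lib"; then
      echo "not packaged: $lib"
      status=1
    fi
  done
  return "$status"
}

# Example (not run here; the APK path is Flutter's default release output):
#   unzip -Z1 build/app/outputs/flutter-apk/app-release.apk | missing_native_libs
```

If libGenie.so or libQnnCpu.so is reported, re-running setup.sh with $QAIRT set is the likely fix; if the others are reported, the qnn-runtime Maven dependency or the ORT AARs are probably not on the app's classpath.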

License

The SDK code in TheStageCore.aar ships under TheStage AI's license; see LICENSE alongside this file (if present) or contact TheStage AI. Third-party components:

  • ONNX Runtime -- MIT (redistribution permitted).
  • ORT-genai -- MIT.
  • QAIRT runtime libraries -- Qualcomm AI Stack Software License Agreement; each integrator installs QAIRT themselves and accepts those terms at install time.

iOS counterpart

The same SDK is available on iOS as TheStageAI.AppleSDK. The Flutter plugin Dart API (qlip_sdk.dart) and the platform channel names are identical, so Dart code is portable across both platforms.
