ReadMe

A lightweight Chrome/Edge extension that records audio and transcribes it in real time.

Installation • Features • Providers • Development • License

Installation

For Users

1. Download the latest readme-extension.zip from the Releases page
2. Unzip the file
3. Open chrome://extensions (or edge://extensions)
4. Enable Developer mode (toggle in the top-right corner)
5. Click Load unpacked and select the unzipped folder
6. Click the ReadMe icon in your toolbar to get started

Initial Setup

1. Open the extension popup → Settings → Manage API key
2. Choose an STT provider and enter your API key
3. Save. You're ready to record

No API key? Select the Mock provider to test the full pipeline with placeholder transcriptions.

Features

Feature	Description
Multi-source recording	Capture from microphone, browser tab audio, or both simultaneously
Real-time transcription	Audio is chunked and transcribed on the fly as you record
Batch fallback	If live transcription misses anything, a batch pass with overlap deduplication fills the gaps
AI summary	One-click summary powered by GPT-4o-mini — key points and action items extracted automatically
Session management	Full recording history stored locally with search and playback
Export	Download transcripts as `.txt`, `.md`, or `.srt` (subtitles with timestamps)
Audio playback	Replay any recording directly in the popup with seek controls
i18n	English and Chinese (中文) UI
Themes	Light and dark mode
Privacy-first	Sessions and recordings stay in your browser; audio is sent only to your configured STT provider, and transcript text goes to OpenAI only when you request an AI summary
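
As an illustration of the `.srt` export listed above, here is a minimal sketch of an SRT formatter. The `Segment` shape and the function names are hypothetical; the extension's actual types in `utils/export.ts` are not shown in this README and may differ.

```typescript
// Hypothetical segment shape, assumed for illustration only.
interface Segment {
  startMs: number; // segment start, in milliseconds from the start of the recording
  endMs: number;   // segment end, in milliseconds
  text: string;
}

// Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`;
}

// Build an SRT document: a 1-based index, a time range, the text, then a blank line between cues.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (s, i) =>
        `${i + 1}\n${toSrtTimestamp(s.startMs)} --> ${toSrtTimestamp(s.endMs)}\n${s.text.trim()}\n`
    )
    .join("\n");
}
```

For example, `toSrtTimestamp(3723045)` yields `01:02:03,045`. Note that SRT uses a comma, not a period, before the milliseconds.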

Supported Providers

Provider	Model	Notes
OpenAI Whisper	`whisper-1`	Multilingual, widely supported
Deepgram	`nova-2`	Fast, smart formatting
SiliconFlow	SenseVoice	Good for Chinese audio
Mock	—	Offline testing, no API key needed

AI summaries require an OpenAI API key (uses gpt-4o-mini).
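
The provider table can be thought of as a small registry that the settings page validates against. The sketch below is an assumption about how that might look, not the extension's actual code: `ProviderId`, `PROVIDERS`, and `validateProviderSettings` are hypothetical names, and the `requiresApiKey` values for Deepgram and SiliconFlow are inferred from the setup steps rather than stated explicitly.

```typescript
// Hypothetical identifiers for the four providers listed in the table.
type ProviderId = "whisper" | "deepgram" | "siliconflow" | "mock";

interface ProviderInfo {
  label: string;
  model: string | null;    // null for the Mock provider, which has no model
  requiresApiKey: boolean; // assumed true for every provider except Mock
}

const PROVIDERS: Record<ProviderId, ProviderInfo> = {
  whisper: { label: "OpenAI Whisper", model: "whisper-1", requiresApiKey: true },
  deepgram: { label: "Deepgram", model: "nova-2", requiresApiKey: true },
  siliconflow: { label: "SiliconFlow", model: "SenseVoice", requiresApiKey: true },
  mock: { label: "Mock", model: null, requiresApiKey: false },
};

// Return an error message if the settings are incomplete, or null if they are usable.
function validateProviderSettings(id: ProviderId, apiKey: string | undefined): string | null {
  const info = PROVIDERS[id];
  if (info.requiresApiKey && !apiKey?.trim()) {
    return `${info.label} requires an API key`;
  }
  return null;
}
```

Keeping the provider metadata in one record means the settings UI and the validation logic read from the same source, so adding a provider is a single-entry change.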

How It Works

Microphone / Tab Audio / Mix
        │
        ▼
  MediaRecorder (30-second chunks)
        │
        ├──▶ IndexedDB (persistent storage)
        │
        ▼
  Live Transcription Queue
  (every ~60 seconds → STT API)
        │
        ▼
  Overlap Deduplication
        │
        ▼
  Transcript + AI Summary
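
The queue stage in the diagram groups 30-second chunks and sends them to the STT API roughly every 60 seconds. A minimal sketch of that grouping follows. `ChunkBatcher` is a hypothetical name, and the sketch deliberately leaves out the overlap between batches and the network call itself.

```typescript
// Collects recorded chunks and hands them off in fixed-size batches.
// With 30-second chunks and batchSize = 2, a batch is emitted about every 60 seconds.
class ChunkBatcher<T> {
  private pending: T[] = [];
  private readonly batchSize: number;
  private readonly onBatch: (batch: T[]) => void;

  constructor(batchSize: number, onBatch: (batch: T[]) => void) {
    this.batchSize = batchSize;
    this.onBatch = onBatch;
  }

  // Called once per recorded chunk, e.g. from a MediaRecorder dataavailable handler.
  push(chunk: T): void {
    this.pending.push(chunk);
    if (this.pending.length >= this.batchSize) {
      this.flush();
    }
  }

  // Emit whatever is pending; call this when recording stops so a trailing chunk is not lost.
  flush(): void {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.onBatch(batch);
  }
}
```

In use, `onBatch` would wrap the call to the configured STT provider, and `flush()` would run when the user stops recording.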

Audio is recorded in 30-second WebM chunks and persisted to IndexedDB immediately
Every 2 chunks (~60s), a batch is sent to your STT provider for transcription
Adjacent batches overlap slightly; a deduplication algorithm removes repeated text
If any batch fails during recording, it is retried automatically when you stop
Recordings up to 4 hours / 500 MB are supported
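
To make the deduplication step concrete, here is a minimal sketch that merges two adjacent transcripts by finding the longest suffix of the first that exactly matches a prefix of the second. This is an illustrative simplification, not the algorithm in `utils/dedup.ts`: the function name and both thresholds are assumptions, and real STT output often differs slightly in the overlap region, so a production version would likely need fuzzy matching.

```typescript
// Merge two adjacent transcripts, dropping text repeated at the boundary.
// Comparison is character-based, so it also works for text without spaces (e.g. Chinese).
function mergeWithOverlap(
  previous: string,
  next: string,
  minOverlap = 3,   // assumed threshold: ignore very short, probably coincidental matches
  maxOverlap = 200  // assumed cap on how far back to search
): string {
  const limit = Math.min(previous.length, next.length, maxOverlap);
  // Try the longest candidate first so the largest repeated span is removed.
  for (let len = limit; len >= minOverlap; len--) {
    if (previous.endsWith(next.slice(0, len))) {
      return previous + next.slice(len);
    }
  }
  // No overlap found: concatenate as-is. No separator is inserted here.
  return previous + next;
}
```

For example, merging `"we will ship on Friday"` with `"on Friday if tests pass"` gives `"we will ship on Friday if tests pass"`.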

Development

Prerequisites

Node.js (v18+)
npm

Setup

cd extension
npm install
npm run build

Load in Browser

1. Open chrome://extensions (or edge://extensions)
2. Enable Developer mode
3. Click Load unpacked → select the extension folder
4. After each rebuild, click Reload on the extension card

Scripts

Command	Description
`npm run dev`	Vite dev server with hot reload
`npm run build`	Type-check (`tsc`) + production build
`npm test`	Run unit tests (Vitest)
`npm run test:watch`	Run tests in watch mode
`npm run lint`	Lint with ESLint
`npm run format`	Format with Prettier

Tech Stack

UI: React 18 + TypeScript
Build: Vite 5
Extension: Chrome Manifest V3
Storage: IndexedDB (sessions & audio) + chrome.storage.local (settings)
Audio: Offscreen document with MediaRecorder API
Testing: Vitest + Testing Library

Project Structure

extension/
├── src/
│   ├── App.tsx                    # Popup UI (3-tab layout)
│   ├── options.tsx                # Options page (API key management)
│   ├── service_worker.ts          # MV3 background service worker
│   ├── offscreen/
│   │   ├── recording.ts           # Audio capture pipeline
│   │   ├── live-transcribe.ts     # Real-time transcription queue
│   │   ├── segmentation.ts        # WebM segmentation for batch mode
│   │   ├── transcription.ts       # Batch transcription processor
│   │   └── state.ts               # Offscreen state management
│   ├── stt/
│   │   ├── whisper.ts             # OpenAI Whisper client
│   │   ├── deepgram.ts            # Deepgram client
│   │   └── llm.ts                 # GPT-4o-mini summary generation
│   ├── components/
│   │   ├── TranscriptionView.tsx  # Recording controls & live transcript
│   │   ├── NotesView.tsx          # Session list, export, AI summary
│   │   ├── SettingsView.tsx       # Theme, language, provider settings
│   │   └── AudioPlayer.tsx        # Playback with seek controls
│   ├── db/
│   │   └── indexeddb.ts           # IndexedDB schema & CRUD
│   ├── utils/
│   │   ├── dedup.ts               # Overlap deduplication (CJK-aware)
│   │   ├── export.ts              # TXT / Markdown / SRT formatters
│   │   └── webm.ts                # WebM binary parsing
│   └── i18n.ts                    # Translations (en / zh)
├── public/
│   ├── manifest.json              # Extension manifest
│   └── icons/                     # Extension icons
└── dist/                          # Build output
