Local AI coding assistant for macOS, powered by Apple's MLX framework.
MLX Code runs language models directly on your Mac using Apple Silicon. No cloud, no API keys, no subscriptions. Your code stays on your machine.
MLX Code is a chat-based coding assistant with tool calling. You describe what you need, and the model reads files, searches code, runs commands, and builds your project — all locally.
14 built-in tools:
| Tool | What it does |
|---|---|
| File Operations | Read, write, edit, list, delete files |
| Bash | Run shell commands |
| Grep | Search file contents with regex |
| Glob | Find files by pattern |
| Xcode | Build, test, clean, archive, full deploy pipeline |
| Git | Status, diff, commit, branch, log, push, pull |
| GitHub | Issues, PRs (Pull Requests), branches, credential scanning |
| Code Navigation | Jump to definitions, find symbols |
| Code Analysis | Metrics, dependencies, lint, symbols, full analysis |
| Error Diagnosis | Analyze and explain build errors |
| Test Generation | Create unit tests from source files |
| Diff Preview | Show before/after file changes |
| Context Analysis | Analyze project structure and dependencies |
| Help | List available commands and usage |
Slash commands: /commit, /review, /test, /docs, /refactor, /explain, /optimize, /fix, /search, /plan, /help, /clear
- You type a message (e.g., "Find all TODO comments in the project")
- The model generates a tool call:
  `<tool>{"name": "grep", "args": {"pattern": "TODO", "path": "."}}</tool>`
- MLX Code executes the tool and feeds the results back to the model
- The model responds with findings or takes the next action
Read-only tools (grep, glob, file read, code navigation) auto-approve. Write/execute tools (bash, file write, xcode build) ask for confirmation.
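The loop above can be sketched in Python. This is an illustrative parser for the `<tool>` format shown, not MLX Code's actual Swift implementation; the helper names are hypothetical.

```python
import json
import re

# Matches the <tool>{...}</tool> format the model emits (as shown above).
TOOL_CALL_RE = re.compile(r"<tool>(\{.*?\})</tool>", re.DOTALL)

def parse_tool_call(model_output: str):
    """Extract the first tool call from model output, or None if there is none."""
    match = TOOL_CALL_RE.search(model_output)
    if match is None:
        return None
    call = json.loads(match.group(1))
    return call["name"], call.get("args", {})

reply = '<tool>{"name": "grep", "args": {"pattern": "TODO", "path": "."}}</tool>'
print(parse_tool_call(reply))  # ('grep', {'pattern': 'TODO', 'path': '.'})
```

If parsing succeeds, the tool runs (after approval, for write/execute tools) and its output is appended to the conversation as the next model input.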
31 findings resolved across CRITICAL, HIGH, MEDIUM, LOW, and INFO severities:
Critical Fixes:
- API Keys to Keychain: All AI backend API keys (OpenAI, Anthropic, Google, AWS, Azure, IBM) migrated from UserDefaults to macOS Keychain with automatic migration on first launch
High Fixes:
- Command Validator Hardened: Replaced naive `String.contains()` with `NSRegularExpression` word-boundary matching to prevent bypass via substrings
- Python Import Validator: Regex-based import validation with comment filtering prevents bypass via inline comments
- Model Hash Verification: SHA256 verification of downloaded models using CryptoKit
- Buffered I/O: 4096-byte chunk reading replaces byte-by-byte daemon communication for significant performance improvement
- Task Cancellation: All infinite `while true` loops replaced with `while !Task.isCancelled` for clean shutdown
- Portable Paths: Bundle-relative paths replace hardcoded file paths
- Secure Logging: All `print()` statements replaced with `SecureLogger` calls
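The model hash check above uses CryptoKit on the Swift side; the equivalent logic can be sketched in Python with `hashlib` (the file contents and expected digest here are hypothetical):

```python
import hashlib
import tempfile

def sha256_of_file(path: str, chunk_size: int = 4096) -> str:
    """Hash a file in chunks so large model weights never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model(path: str, expected_hex: str) -> bool:
    # The hash is not a secret, so a plain string comparison is fine here.
    return sha256_of_file(path) == expected_hex

# Hypothetical usage: verify a "downloaded" file against a known digest.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"model weights")
    tmp_path = f.name

expected = hashlib.sha256(b"model weights").hexdigest()
print(verify_model(tmp_path, expected))  # True
```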
Medium Fixes:
- Proper Unicode search with `localizedCaseInsensitiveContains()`
- O(n) context management replacing O(n^2) insert-at-zero pattern
- 1MB file content cap for memory management
- Multi-version Python path lookup (3.13 down to 3.9)
- Serial queues for thread-safe MLX service operations
- Async logging via serial queue in CommandValidator
- Permission check for script execution
- Regex validation improvements
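The insert-at-zero fix above is a classic pattern. A minimal Python illustration (the real code is Swift; these helper functions are hypothetical):

```python
from collections import deque

# O(n^2) overall: list.insert(0, x) shifts every element on each call.
def build_context_slow(messages):
    ordered = []
    for msg in reversed(messages):
        ordered.insert(0, msg)
    return ordered

# O(n) overall: appendleft on a deque is O(1) per call.
def build_context_fast(messages):
    ordered = deque()
    for msg in reversed(messages):
        ordered.appendleft(msg)
    return list(ordered)

msgs = ["system", "user: hi", "assistant: hello"]
assert build_context_slow(msgs) == build_context_fast(msgs) == msgs
```

The two produce identical output; only the cost of building it differs, which matters once conversations reach hundreds of messages.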
- Build, test, clean, archive from chat
- Full deploy pipeline: version bump, build, archive, DMG (Disk Image), install
- Error diagnosis with context-aware analysis
- GitHub integration: issues, PRs, branches, credential scanning
- Code analysis: metrics, dependencies, linting, symbol inspection
- Persistent preferences that shape assistant behavior
- 50+ built-in coding standards across 8 categories
- Custom memories stored locally (~/.mlxcode/memories.json)
- Categories: personality, code quality, security, Xcode, git, testing, docs, deployment
- User-specific settings (name, paths) injected at runtime — never hardcoded
- Token budgeting with automatic message compaction
- Project context auto-include when workspace is open
- Two tool tiers: core (always available) and development (when project is open)
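The budgeting and compaction ideas above can be sketched as follows. The heuristic and numbers are illustrative; the real estimator and ratios live in MLX Code's `ContextManager` on the Swift side.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text and code.
    return max(1, len(text) // 4)

def compact(messages, budget: int):
    """Drop oldest messages until the conversation fits the token budget.

    `messages` is oldest-first; the most recent turns are kept.
    """
    kept = list(messages)
    while len(kept) > 1 and sum(estimate_tokens(m) for m in kept) > budget:
        kept.pop(0)  # drop the oldest turn first
    return kept

history = ["old question " * 50, "old answer " * 50, "latest question"]
print(compact(history, budget=50))  # only the most recent turn survives
```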
MLX Code uses mlx-community models from Hugging Face, quantized for Apple Silicon.
Recommended:
| Model | Size | Context | Best for |
|---|---|---|---|
| Qwen 2.5 7B (default) | ~4 GB | 32K | General coding, tool calling |
| Mistral 7B v0.3 | ~4 GB | 32K | Versatile, good at instructions |
| DeepSeek Coder 6.7B | ~4 GB | 16K | Code-specific tasks |
| Qwen 2.5 14B | ~8 GB | 32K | Best quality (needs 16GB+ RAM) |
Models download automatically on first use. You can also add custom models from any mlx-community repo.
- macOS 14.0 (Sonoma) or later
- Apple Silicon (M1, M2, M3, M4)
- 8 GB RAM minimum (16 GB recommended for 7B models)
- Python 3.9+ with `mlx-lm` installed
Download the latest release from Releases, open the DMG, and drag to Applications.
```bash
git clone https://github.com/kochj23/MLXCode.git
cd MLXCode
open "MLX Code.xcodeproj"
# Build and run (Cmd+R)
```

Install the Python dependency:

```bash
pip install mlx-lm
```

MLX Code uses a Python daemon (`mlx_daemon.py`) for model inference. It applies the model's native chat template automatically (ChatML for Qwen, Llama format for Llama, etc.).
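In the real daemon, the tokenizer's own chat template handles this formatting. Purely for illustration, ChatML (the format Qwen-family models expect) looks like:

```python
def to_chatml(messages):
    """Render messages in ChatML, the template Qwen-family models expect.

    Illustration only: mlx_daemon.py relies on the tokenizer's built-in
    chat template rather than hand-rolling strings like this.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "Find all TODO comments."},
])
print(prompt)
```

Letting the tokenizer apply the template keeps the Swift side format-agnostic: swapping Qwen for a Llama-format model requires no Swift changes.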
```text
MLX Code (SwiftUI)
|
|-- ChatViewModel     # Conversation management, tool execution loop
|-- MLXService        # Talks to Python daemon via stdin/stdout JSON
|-- ContextManager    # Token budgeting, message compaction
|-- ToolRegistry      # 14 registered tools (2 tiers)
|-- SystemPrompts     # Compact prompt with few-shot examples + user memories
|-- UserMemories      # Persistent coding standards and preferences
|
|-- Services/
|   |-- GitHubService     # GitHub API: issues, PRs, branches, credentials scan
|   |-- ContextAnalysis   # Project structure and dependency analysis
|   `-- UserMemories      # Configurable standards, custom memory persistence
|
|-- ViewModels/
|   |-- ProjectViewModel  # Build operations and project management
|   |-- GitHubViewModel   # GitHub panel state
|   `-- CodeAnalysisVM    # Code metrics and analysis state
|
`-- Python/mlx_daemon.py  # mlx-lm model loading, chat_generate with templates
```
Key design decisions:
- Chat templates applied by the Python tokenizer (not hand-rolled in Swift)
- Tool prompt is ~500 tokens (not 4000) — leaves room for actual conversation
- Context budget system allocates tokens: system prompt, messages, project context, output reservation
- Two tool tiers: core (always available) and development (when project is open)
- User memories injected at runtime from AppSettings — no personal data in source code
- Command Validation: All bash commands pass through `CommandValidator` with regex word-boundary matching before execution, blocking dangerous patterns (`rm -rf /`, fork bombs, etc.)
- Python Import Validation (v6.1.0): Regex-based validation with comment filtering prevents bypass via inline comments
- No Shell Interpolation: Git and build tools use `process.currentDirectoryURL` instead of `cd` string interpolation, preventing directory traversal and injection attacks
- Tool Approval Flow: Write and execute tools (bash, file write, xcode build) require user confirmation before running
- Read-Only Auto-Approve: Only safe, read-only tools (grep, glob, file read) auto-approve without user interaction
- Permission Checks (v6.1.0): File permission validation before script execution in CommandValidator
- macOS Keychain Storage: All API keys (OpenAI, Anthropic, Google, AWS, Azure, IBM) stored in macOS Keychain using `SecItemAdd`/`SecItemCopyMatching`
- Automatic Migration: Existing UserDefaults-stored keys automatically migrated to Keychain on first launch
- No Plaintext Secrets: Only non-secret configuration (region, model names) is stored in UserDefaults
- SHA256 Hash Verification: Downloaded models verified against expected hashes using CryptoKit
- Secure Logging: All debug output routed through `SecureLogger` instead of `print()` — no sensitive data in console
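The word-boundary point above matters because naive substring checks flag harmless commands (e.g. `rm` inside `confirm`). A minimal Python analogue of the regex approach — these patterns are illustrative, not MLX Code's actual blocklist:

```python
import re

# Illustrative dangerous patterns; \b word boundaries prevent substring bypasses.
BLOCKED = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),          # recursive delete of root
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\}\s*;"),   # classic bash fork bomb
]

def is_blocked(command: str) -> bool:
    return any(p.search(command) for p in BLOCKED)

print(is_blocked("rm -rf /"))         # True
print(is_blocked("confirm -rfiles"))  # False: word boundary rejects the substring
```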
- 100% Local: All model inference runs on-device via Apple MLX — no data leaves your machine
- No Telemetry: No analytics, crash reporting, or usage tracking
- No API Keys Required: No cloud services, no subscriptions, no accounts
- Local Memory Storage: User memories stored in `~/.mlxcode/memories.json`, never transmitted
- Serial Queues: MLX service I/O operations serialized to prevent race conditions
- Buffered I/O: 4096-byte chunk reading replaces byte-by-byte daemon communication
- Task Cancellation: All infinite loops replaced with `while !Task.isCancelled` for clean shutdown
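The buffered-read change can be illustrated in Python (the real reader lives on the Swift side of the stdin/stdout pipe; the function name here is hypothetical):

```python
import io

def read_lines_buffered(stream, chunk_size=4096):
    """Yield newline-delimited messages from a byte stream.

    Reading 4096-byte chunks amortizes the per-read cost versus fetching
    one byte at a time and checking for a newline on every call.
    """
    buffer = b""
    while chunk := stream.read(chunk_size):
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode("utf-8")
    if buffer:
        yield buffer.decode("utf-8")  # trailing data without a final newline

stream = io.BytesIO(b'{"token": "hello"}\n{"token": "world"}\n')
print(list(read_lines_buffered(stream)))
```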
Being honest about limitations:
- No internet access — can't browse, fetch URLs, or call APIs
- No image/video/audio generation — this is a code assistant, not a media tool
- Small model constraints — 3-8B parameter models make mistakes, especially with complex multi-step reasoning
- No IDE integration — standalone app, not an Xcode plugin (yet)
- Tool calling is imperfect — local models sometimes format tool calls incorrectly
- Comprehensive security audit: 31 findings resolved (2 CRITICAL, 8 HIGH, 10 MEDIUM, 9 LOW, 1 INFO)
- API keys migrated from UserDefaults to macOS Keychain with automatic migration
- Command validator hardened with NSRegularExpression word-boundary matching
- Python import validator hardened with regex matching and comment filtering
- SHA256 model hash verification using CryptoKit
- Buffered 4096-byte I/O replacing byte-by-byte daemon communication
- Task cancellation (`while !Task.isCancelled`) replacing infinite loops
- Bundle-relative paths replacing hardcoded file paths
- Multi-version Python path lookup (3.13 down to 3.9)
- Serial queues for thread-safe MLX service operations
- SecureLogger replacing all `print()` statements
- Async logging via serial queue in CommandValidator
- `localizedCaseInsensitiveContains()` for proper Unicode search
- O(n) context management replacing O(n^2) insert-at-zero pattern
- 1MB file content cap for memory management in codebase indexer
- Implemented Clear Conversations confirmation dialog in Settings
- Force unwrap elimination in MLXService
- NSString cast chains replaced with URL API across 3 files
- Named constants for context budget ratios
- Deprecated unused ContentView with `@available` attribute
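Swift's `localizedCaseInsensitiveContains()` handles Unicode case folding correctly; the closest Python analogue uses `str.casefold()` rather than `lower()`:

```python
def contains_case_insensitive(haystack: str, needle: str) -> bool:
    """Unicode-aware case-insensitive substring test.

    casefold() is more aggressive than lower(): it maps the German
    sharp s to "ss", which lower() leaves unchanged.
    """
    return needle.casefold() in haystack.casefold()

print(contains_case_insensitive("Straße 42", "STRASSE"))  # True
print("strasse" in "Straße 42".lower())                   # False: lower() misses it
```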
- GitHub integration: issues, PRs, branches, credential scanning
- Code analysis: metrics, dependencies, lint, symbols
- Xcode full deploy pipeline: build, archive, DMG, install
- User memories system — persistent coding standards and preferences
- Context analysis service for project structure inspection
- Project dashboard, GitHub panel, code analysis panel, build panel views
- 14 tools (up from 11)
- Major simplification: deleted 41 files (~16,000 lines) of unused features
- Rewrote system prompt to be honest and compact
- Default model: Qwen 2.5 7B
- 11 focused tools
- Phase 1: Chat template support, structured message passing, tool tier system
- Phase 2: Context budget system, smart token estimation, project context auto-include
- Tool approval flow with auto-approve for read-only operations
- Initial release with MLX backend
- Desktop widget extension
- Basic chat interface
MIT License - Copyright 2026 Jordan Koch
See LICENSE for details.
Disclaimer: This is a personal project created on my own time. It is not affiliated with, endorsed by, or representative of my employer.