Security Audit Report - RAG Red Team Assessment (Score: 70/100) #609

## Security Audit Report: OpenLess

**Auditor:** DeepSeek TUI - RAG Red Team  
**Date:** 2026-06-06  
**Repository:** Open-Less/openless  
**Overall Score: 70 / 100**

---

### Executive Summary

A comprehensive security audit was conducted on the OpenLess Rust backend (~35,600 lines), focusing on the LLM integration pipeline, credential management, network security, and prompt injection attack surfaces. The project demonstrates a mature security posture overall. The audit found one high-severity issue (SSRF), three medium-severity issues, and two low-severity issues, most of which call for defense-in-depth hardening.

---

### Vulnerability Findings

#### [F-01] HIGH - SSRF via User-Configurable LLM Endpoint (CWE-918)

> File: `coordinator.rs`, `resolve_ark_endpoint_with_policy()`

The custom LLM endpoint URL is accepted without any IP range validation. An attacker who can set the endpoint can point it at internal addresses (e.g., `169.254.169.254` for cloud metadata, or `10.0.0.0/8` for internal services), so requests carrying the API key are sent to hosts on the internal network.

**Fix:** Add IP range validation that rejects RFC 1918 (private), RFC 6598 (shared address space, `100.64.0.0/10`), link-local, and loopback addresses. Enforce HTTPS for all non-localhost endpoints.
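
A minimal sketch of the range check described in the fix, using only `std::net`. The function names `is_blocked_ip` and `is_blocked_v4` are hypothetical and do not come from the OpenLess source, and the IPv6 ranges (unique local `fc00::/7`, link-local `fe80::/10`) are an assumption added as the IPv6 counterparts of the IPv4 ranges listed above.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Returns true when `ip` falls in a range the endpoint policy should refuse.
/// Hypothetical helper, not the actual OpenLess implementation.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => {
            // IPv4-mapped addresses (::ffff:a.b.c.d) are judged as IPv4,
            // otherwise they would bypass the IPv4 rules.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_blocked_v4(v4);
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7 (assumption)
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10 (assumption)
        }
    }
}

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    ip.is_private()            // RFC 1918: 10/8, 172.16/12, 192.168/16
        || ip.is_loopback()    // 127.0.0.0/8
        || ip.is_link_local()  // 169.254.0.0/16, includes cloud metadata
        || ip.is_unspecified() // 0.0.0.0
        || (o[0] == 100 && (o[1] & 0xc0) == 64) // RFC 6598: 100.64.0.0/10
}
```

For example, `is_blocked_ip("169.254.169.254".parse().unwrap())` returns `true`, while a public address such as `8.8.8.8` passes. A hostname check alone is not enough: the check would need to run on every address the hostname resolves to. Whether loopback stays blocked when local model servers are supported is a policy decision outside this sketch.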

---

#### [F-02] MEDIUM - Prompt Injection via ASR Transcript (CWE-77)

> File: `polish.rs`, `prompts::user_prompt()`

User input is wrapped in an XML envelope with basic escaping of `</raw_transcript>`, but this is insufficient against semantic prompt injection (e.g., "Ignore all previous instructions..."). LLMs are not security boundaries.

**Fix:** Add multi-layer escaping (filter delimiter patterns, max length cap), adversarial defense phrasing in system prompt, and output validation.
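
One way the escaping, length cap, and output validation layers could look, as a sketch only. `MAX_TRANSCRIPT_CHARS`, `sanitize_transcript`, and `output_looks_valid` are hypothetical names, and the cap and the growth threshold are placeholder values chosen for illustration.

```rust
/// Hypothetical cap; a real limit would depend on the model's context budget.
const MAX_TRANSCRIPT_CHARS: usize = 4_000;

/// Escapes markup characters, drops stray control characters, and caps the
/// length before the transcript is placed inside the envelope.
fn sanitize_transcript(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars().take(MAX_TRANSCRIPT_CHARS) {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            // Keep newlines and tabs, drop every other control character.
            c if c.is_control() && c != '\n' && c != '\t' => {}
            c => out.push(c),
        }
    }
    out
}

/// Output validation heuristic (assumption): reject a reply that echoes the
/// envelope tag name or is far longer than the text it was asked to polish.
fn output_looks_valid(input: &str, output: &str) -> bool {
    let in_len = input.chars().count();
    let out_len = output.chars().count();
    !output.contains("raw_transcript") && out_len <= in_len * 3 + 200
}
```

Escaping of this kind only protects the envelope structure. It does not stop semantic injection, which is why the system prompt phrasing and the output check remain separate layers.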

---

#### [F-03] MEDIUM - Conversation History Poisoning (CWE-349)

> File: `polish.rs`, `build_polish_history_messages()`

`history.json` is stored in plaintext without integrity checks. An attacker with local filesystem access can inject crafted assistant messages into the chat context, influencing LLM behavior.

**Fix:** Add an HMAC integrity check for `history.json`, or sign assistant messages.

---

#### [F-04] LOW - Plaintext History Storage (CWE-312)

All dictation sessions are stored in plaintext in `history.json`, with no encryption at rest beyond what the OS filesystem provides.

**Fix:** Encrypt `history.json` at rest using a key derived from the OS keychain, or clearly document this behavior.

---

#### [F-05] MEDIUM - Missing Integration Tests (CWE-1076)

About 300 unit tests exist, but no integration test covers the full dictation pipeline (ASR -> Polish -> Insert). `coordinator/qa.rs` and `coordinator/resources.rs` have zero test coverage.

**Fix:** Add integration tests with mock LLM/ASR servers. Add snapshot/golden tests for prompt assembly.
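
A sketch of how a pipeline-level test could be structured with in-process fakes instead of real ASR and LLM services. Every trait and type here (`Asr`, `Polisher`, `Inserter`, `run_dictation`, and the fakes) is hypothetical; the real OpenLess seams may look different, and a mock HTTP server would replace the fakes in a fuller integration test.

```rust
use std::cell::RefCell;

// Hypothetical seams for the three pipeline stages.
trait Asr {
    fn transcribe(&self, audio: &[u8]) -> Result<String, String>;
}
trait Polisher {
    fn polish(&self, transcript: &str) -> Result<String, String>;
}
trait Inserter {
    fn insert(&self, text: &str) -> Result<(), String>;
}

/// Runs ASR -> Polish -> Insert in order and stops at the first failure.
fn run_dictation(
    asr: &dyn Asr,
    llm: &dyn Polisher,
    out: &dyn Inserter,
    audio: &[u8],
) -> Result<(), String> {
    let transcript = asr.transcribe(audio)?;
    let polished = llm.polish(&transcript)?;
    out.insert(&polished)
}

/// Fake ASR that always returns a canned transcript.
struct FakeAsr(&'static str);
impl Asr for FakeAsr {
    fn transcribe(&self, _audio: &[u8]) -> Result<String, String> {
        Ok(self.0.to_string())
    }
}

/// Fake LLM that either tags the transcript or simulates an outage.
struct FakePolisher {
    fail: bool,
}
impl Polisher for FakePolisher {
    fn polish(&self, transcript: &str) -> Result<String, String> {
        if self.fail {
            Err("llm unavailable".to_string())
        } else {
            Ok(format!("[polished] {}", transcript))
        }
    }
}

/// Records every inserted string so a test can assert on the final output.
#[derive(Default)]
struct RecordingInserter {
    inserted: RefCell<Vec<String>>,
}
impl Inserter for RecordingInserter {
    fn insert(&self, text: &str) -> Result<(), String> {
        self.inserted.borrow_mut().push(text.to_string());
        Ok(())
    }
}
```

A test would call `run_dictation` with the fakes and then assert on `RecordingInserter::inserted`, including the failure path where the LLM errors and nothing may be inserted.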

---

#### [F-06] LOW - QA Prompt Injection (CWE-77)

QA mode feeds user-selected text directly into the LLM without XML envelope isolation (unlike the polish pipeline).

**Fix:** Wrap selected text in an XML envelope similar to the polish pipeline.
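
A minimal sketch of envelope isolation for QA mode. The tag names `<selected_text>` and `<question>` and the function names are hypothetical; the polish pipeline's actual envelope format is not reproduced here.

```rust
/// Escapes the characters that could open or close a tag inside the envelope.
fn escape_markup(text: &str) -> String {
    // '&' is replaced first so the entities added below are not re-escaped.
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Builds the QA user message with the selection isolated in its own element.
/// Tag names are assumptions made for this example.
fn build_qa_prompt(selected_text: &str, question: &str) -> String {
    format!(
        "<selected_text>\n{}\n</selected_text>\n<question>\n{}\n</question>",
        escape_markup(selected_text),
        escape_markup(question)
    )
}
```

Because the selection is escaped before it is embedded, a closing tag typed into the selected text cannot terminate the envelope early. As with the polish pipeline, this protects structure only and does not address semantic injection.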

---

### Remediation Priority Matrix

| Finding | Priority | Effort |
|---------|----------|--------|
| F-01: SSRF | P0 - Critical | ~2 hours |
| F-02: Prompt injection | P1 - High | ~3 hours |
| F-03: History poisoning | P2 - Medium | ~2 hours |
| F-05: Integration tests | P2 - Medium | ~4 hours |
| F-04: Plaintext history | P3 - Low | ~3 hours |
| F-06: QA injection | P3 - Low | ~1 hour |

---

### Scoring Breakdown (70/100)

| Dimension | Score | Key Observation |
|-----------|-------|-----------------|
| Architecture | 19/25 | Clean dependency graph, but God module at 5.5K lines |
| Code Quality | 15/20 | Readable, well-documented, some unwrap() concerns |
| Security | 13/20 | SSRF gap; prompt injection depth insufficient |
| Testing | 9/15 | Unit tests okay, zero integration tests |
| Documentation | 7/10 | Good module docs, CLAUDE.md referenced but missing |
| Maintainability | 7/10 | coordinator.rs at 5,542 lines is a bottleneck |

---

### Notable Strengths

- Credential vault backed by the OS-native keychain, with chunking to stay within Windows Credential Manager size limits
- Network retry strategy deliberately excludes timeouts (prevents LLM double-billing)
- IPC window-level isolation via `ensure_main_window()`
- Clean architectural invariant: all modules depend only on `types.rs`
- Zero FIXME/TODO/HACK markers in codebase
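
The retry rule noted in the strengths can be illustrated with a small classifier. This is a sketch under assumptions: `NetError` and `should_retry` are hypothetical names, and the choice of retryable HTTP statuses is illustrative rather than taken from the OpenLess source. Only the exclusion of timeouts comes from the audit.

```rust
/// Hypothetical error classification for an outbound LLM request.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NetError {
    ConnectRefused,
    DnsFailure,
    Timeout,
    Http(u16),
}

/// Decides whether a failed request may be sent again.
fn should_retry(err: NetError) -> bool {
    match err {
        // The request never reached the provider, so a retry cannot double-bill.
        NetError::ConnectRefused | NetError::DnsFailure => true,
        // A timeout is ambiguous: the provider may still be processing
        // (and billing) the original request, so it is not retried.
        NetError::Timeout => false,
        // Assumption: retry only on throttling and temporary unavailability.
        NetError::Http(code) => code == 429 || code == 503,
    }
}
```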

### Key Weaknesses

- SSRF: No IP range validation on user-configurable endpoints
- God module: coordinator.rs at 5,542 lines
- No integration/E2E testing pipeline
- CLAUDE.md referenced in README but missing from repository

---

A detailed PDF report (~63KB) is available. With the top 3 fixes completed, this project would score ~85/100.
