Security Audit Report - RAG Red Team Assessment (Score: 70/100) #609

@appergb

Security Audit Report: OpenLess

Auditor: DeepSeek TUI - RAG Red Team
Date: 2026-06-06
Repository: Open-Less/openless
Overall Score: 70 / 100


Executive Summary

A comprehensive security audit was conducted on the OpenLess Rust backend (~35,600 lines), focusing on the LLM integration pipeline, credential management, network security, and prompt injection attack surfaces. The project demonstrates a mature security posture overall, with one high-severity finding (SSRF) and several defense-in-depth recommendations.


Vulnerability Findings

[F-01] HIGH - SSRF via User-Configurable LLM Endpoint (CWE-918)

File: coordinator.rs, resolve_ark_endpoint_with_policy()

The custom LLM endpoint URL is accepted without any IP range validation. An attacker could set the endpoint to internal addresses (e.g., 169.254.169.254 for cloud metadata, 10.0.0.0/8 for internal services), leaking API keys to internal networks.

Fix: Add IP range validation rejecting RFC 1918, RFC 6598, link-local, and loopback addresses. Enforce HTTPS-only for non-localhost endpoints.
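The range check proposed in this fix could look roughly like the sketch below. It uses only the Rust standard library. The helper names (`is_blocked_ip`, `is_blocked_v4`, `is_blocked_v6`) are hypothetical and do not come from coordinator.rs. The IPv6 unique-local range is an added assumption, included as the IPv6 analogue of RFC 1918. The sketch covers IP literals only: a hostname would first need to be resolved, with every resolved address checked at connection time, to avoid DNS-based bypasses.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns true when `ip` falls in a range the fix proposes to reject.
/// Hypothetical helper, not part of the audited code.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => {
            // IPv4-mapped addresses (::ffff:a.b.c.d) are judged as IPv4,
            // otherwise they would slip past the IPv4 rules.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_blocked_v4(v4);
            }
            is_blocked_v6(v6)
        }
    }
}

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    ip.is_loopback()                            // 127.0.0.0/8
        || ip.is_private()                      // RFC 1918
        || ip.is_link_local()                   // 169.254.0.0/16 (cloud metadata)
        || ip.is_unspecified()                  // 0.0.0.0
        || (o[0] == 100 && (o[1] & 0xC0) == 64) // RFC 6598: 100.64.0.0/10
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    ip.is_loopback()                      // ::1
        || ip.is_unspecified()            // ::
        || (first & 0xffc0) == 0xfe80     // link-local fe80::/10
        || (first & 0xfe00) == 0xfc00     // unique local fc00::/7 (assumption)
}
```

With this sketch, `is_blocked_ip("169.254.169.254".parse().unwrap())` is true and a public address such as 8.8.8.8 passes. Because the fix text both rejects loopback and permits plain HTTP for localhost, a deployment that supports local models would need an explicit, opt-in exception for loopback rather than a blanket allow.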


[F-02] MEDIUM - Prompt Injection via ASR Transcript (CWE-77)

File: polish.rs, prompts::user_prompt()

User input is wrapped in an XML envelope with basic escaping of </raw_transcript>, but this is insufficient against semantic prompt injection (e.g., "Ignore all previous instructions..."). LLMs are not security boundaries.

Fix: Add multi-layer escaping (filter delimiter patterns, max length cap), adversarial defense phrasing in system prompt, and output validation.
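As an illustration of the delimiter-filtering and length-cap layers only, a minimal sketch follows. `sanitize_transcript`, `wrap_transcript`, and `MAX_TRANSCRIPT_CHARS` are hypothetical names, and the cap value is an assumption; the actual prompts::user_prompt() implementation is not shown in this report. Escaping every angle bracket, rather than only the closing tag, means no tag-like delimiter in the transcript survives.

```rust
/// Hypothetical cap; the report does not specify a value.
const MAX_TRANSCRIPT_CHARS: usize = 8_000;

/// Escapes characters that could open or close an XML-style tag,
/// drops control characters, and truncates to the cap.
fn sanitize_transcript(raw: &str) -> String {
    let mut out = String::new();
    for ch in raw.chars().take(MAX_TRANSCRIPT_CHARS) {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            // Keep newlines and tabs, drop other control characters.
            c if c.is_control() && c != '\n' && c != '\t' => {}
            c => out.push(c),
        }
    }
    out
}

/// Wraps the sanitized transcript in the envelope named in the finding.
fn wrap_transcript(raw: &str) -> String {
    format!(
        "<raw_transcript>\n{}\n</raw_transcript>",
        sanitize_transcript(raw)
    )
}
```

This layer stops an input from closing the envelope early, but it does nothing against semantic injection such as "Ignore all previous instructions...", which is why the fix also calls for defensive system-prompt phrasing and output validation.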


[F-03] MEDIUM - Conversation History Poisoning (CWE-349)

File: polish.rs, build_polish_history_messages()

history.json is stored in plaintext without integrity checks. An attacker with local filesystem access can inject crafted assistant messages into the chat context, influencing LLM behavior.

Fix: Add HMAC integrity check for history.json or sign assistant messages.
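The verification step of such an integrity check might be structured as in the sketch below. It deliberately leaves the MAC itself abstract: `mac` is a placeholder closure standing in for a keyed MAC such as HMAC-SHA256 from a vetted crate (for example the RustCrypto `hmac` and `sha2` crates), and `load_verified_history` is a hypothetical name. Where the key lives is also an assumption; the OS keychain already used for credentials would be one candidate.

```rust
/// Compares two byte slices without returning early on the first mismatch,
/// so timing does not reveal how many leading bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Returns the history bytes only if the stored tag matches a freshly
/// computed one. `mac` is a stand-in for a real keyed MAC (assumption).
fn load_verified_history<'a, F>(
    contents: &'a [u8],
    stored_tag: &[u8],
    mac: F,
) -> Option<&'a [u8]>
where
    F: Fn(&[u8]) -> Vec<u8>,
{
    let expected = mac(contents);
    if constant_time_eq(&expected, stored_tag) {
        Some(contents)
    } else {
        None
    }
}
```

A caller would treat `None` as a tampered or corrupted file and start from an empty history rather than feed the contents to the LLM. This only helps if the attacker with filesystem access cannot also read the MAC key.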


[F-04] LOW - Plaintext History Storage (CWE-312)

All dictation sessions are stored in plaintext in history.json. There is no encryption at rest beyond whatever the OS filesystem provides.

Fix: Encrypt history.json at rest using a key derived from the OS keychain, or clearly document this behavior.


[F-05] MEDIUM - Missing Integration Tests (CWE-1076)

~300 unit tests exist but zero integration tests cover the full dictation pipeline (ASR -> Polish -> Insert). coordinator/qa.rs and coordinator/resources.rs have zero test coverage.

Fix: Add integration tests with mock LLM/ASR servers. Add snapshot/golden tests for prompt assembly.
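One way to stand up a mock LLM server for such tests, using only the standard library, is sketched below. `spawn_mock_llm` and `post_json` are hypothetical helpers, and the request path used in the usage note is made up for illustration. In a real test, the project's own client code would replace `post_json`, pointed at the mock's address.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Starts a one-shot mock HTTP server on an ephemeral loopback port.
/// Returns its address and a handle that yields the raw request it saw.
fn spawn_mock_llm(body: &'static str) -> (String, thread::JoinHandle<String>) {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind");
    let addr = listener.local_addr().expect("local_addr").to_string();
    let handle = thread::spawn(move || {
        let (mut stream, _) = listener.accept().expect("accept");
        // A single read is enough for the small requests in this sketch.
        let mut buf = [0u8; 4096];
        let n = stream.read(&mut buf).expect("read");
        let response = format!(
            "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\
             Content-Length: {}\r\nConnection: close\r\n\r\n{}",
            body.len(),
            body
        );
        stream.write_all(response.as_bytes()).expect("write");
        String::from_utf8_lossy(&buf[..n]).into_owned()
    });
    (addr, handle)
}

/// Minimal stand-in client: sends one JSON POST and returns the raw response.
fn post_json(addr: &str, path: &str, json: &str) -> String {
    let mut stream = TcpStream::connect(addr).expect("connect");
    let request = format!(
        "POST {} HTTP/1.1\r\nHost: {}\r\nContent-Type: application/json\r\n\
         Content-Length: {}\r\nConnection: close\r\n\r\n{}",
        path,
        addr,
        json.len(),
        json
    );
    stream.write_all(request.as_bytes()).expect("write");
    let mut response = String::new();
    stream.read_to_string(&mut response).expect("read");
    response
}
```

A test would call `spawn_mock_llm` with a canned JSON body, drive the pipeline against the returned address (for example with a hypothetical path such as `/v1/chat`), then join the handle and assert on both the response and the captured request. Capturing the request is what makes golden tests of prompt assembly possible: the exact bytes sent to the LLM can be compared against a stored snapshot.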


[F-06] LOW - QA Prompt Injection (CWE-77)

QA mode feeds user-selected text directly into the LLM without XML envelope isolation (unlike the polish pipeline).

Fix: Wrap selected text in an XML envelope similar to the polish pipeline.


Remediation Priority Matrix

| Finding | Priority | Effort |
|---|---|---|
| F-01: SSRF | P0 - Critical | ~2 hours |
| F-02: Prompt injection | P1 - High | ~3 hours |
| F-03: History poisoning | P2 - Medium | ~2 hours |
| F-05: Integration tests | P2 - Medium | ~4 hours |
| F-04: Plaintext history | P3 - Low | ~3 hours |
| F-06: QA injection | P3 - Low | ~1 hour |

Scoring Breakdown (70/100)

| Dimension | Score | Key Observation |
|---|---|---|
| Architecture | 19/25 | Clean dependency graph, but God module at 5.5K lines |
| Code Quality | 15/20 | Readable, well-documented, some unwrap() concerns |
| Security | 13/20 | SSRF gap; prompt injection depth insufficient |
| Testing | 9/15 | Unit tests okay, zero integration tests |
| Documentation | 7/10 | Good module docs, CLAUDE.md referenced but missing |
| Maintainability | 7/10 | coordinator.rs at 5,542 lines is a bottleneck |

Notable Strengths

  • Credential vault backed by the OS-native keychain, with chunking to stay within Windows Credential Manager (CM) size limits
  • Network retry strategy deliberately excludes timeouts (prevents LLM double-billing)
  • IPC window-level isolation via ensure_main_window()
  • Clean architectural invariant: all modules depend only on types.rs
  • Zero FIXME/TODO/HACK markers in codebase

Key Weaknesses

  • SSRF: No IP range validation on user-configurable endpoints
  • God module: coordinator.rs at 5,542 lines
  • No integration/E2E testing pipeline
  • CLAUDE.md referenced in README but missing from repository

A detailed PDF report (~63KB) is available. With the top 3 fixes completed, this project would score ~85/100.

Labels: P0, area:security, bug, documentation, needs-triage, security
