13 changes: 13 additions & 0 deletions README.md
@@ -39,6 +39,7 @@
- [OAuth Integration](#github-and-google-oauth-integration)
- [Docker Image Configuration](#docker-image-configuration)
- [Development](#development)
- [Pentesting Prompt Methodology](#pentesting-prompt-methodology)
- [Testing LLM Agents](#testing-llm-agents)
Comment on lines +42 to 43
Copilot AI Apr 15, 2026

The Table of Contents nests “Pentesting Prompt Methodology” under “Development”, but the actual “### Pentesting Prompt Methodology” section appears under “Testing LLM Agents” later in the README. This makes the TOC structure misleading even though the anchor works; please either move the section under the Development chapter or relocate the TOC entry under the correct parent heading (Testing LLM Agents).

Suggested change
- [Pentesting Prompt Methodology](#pentesting-prompt-methodology)
- [Testing LLM Agents](#testing-llm-agents)
- [Testing LLM Agents](#testing-llm-agents)
- [Pentesting Prompt Methodology](#pentesting-prompt-methodology)
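
As a quick check that the TOC anchor still resolves after the entry is moved, the heading-to-anchor mapping can be approximated in a few lines of Go. This is a minimal sketch, assuming the common GitHub rule (lowercase, drop punctuation, spaces become hyphens); `slug` is a hypothetical helper, not part of the repository, and it ignores edge cases such as duplicate headings.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// slug approximates how a markdown heading becomes an anchor:
// lowercase, keep letters/digits/hyphens/underscores, turn spaces into hyphens.
// Assumption: this mirrors GitHub's behavior only for simple headings.
func slug(heading string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(strings.TrimSpace(heading)) {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r) || r == '-' || r == '_':
			b.WriteRune(r)
		case r == ' ':
			b.WriteRune('-')
		}
	}
	return b.String()
}

func main() {
	for _, h := range []string{"Testing LLM Agents", "Pentesting Prompt Methodology"} {
		fmt.Printf("- [%s](#%s)\n", h, slug(h))
	}
}
```

Because the anchor depends only on the heading text, moving the TOC entry under a different parent does not change the link target.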

- [Embedding Configuration and Testing](#embedding-configuration-and-testing)
- [Function Testing with ftester](#function-testing-with-ftester)
@@ -3192,6 +3193,18 @@ When developing new prompt templates or agent behaviors:
3. Observe responses and adjust prompts accordingly
4. Check Langfuse for detailed traces of all function calls

### Pentesting Prompt Methodology

When refining prompts for offensive security work, give the agent a clear methodology instead of a flat list of payloads:

1. Start with explicit scope, authorization, and success criteria
2. Map the application first: roles, routes, parameters, uploads, integrations, and trust boundaries
3. Prioritize attack surfaces systematically instead of testing everything at once
4. Validate findings with reproducible evidence before escalating to deeper exploitation
5. Finish with report-ready notes that capture impact, prerequisites, and next steps

For PentAGI-specific prompt guidance, see [`backend/docs/prompt_engineering_pentagi.md`](backend/docs/prompt_engineering_pentagi.md). For a practical starting point, reuse and adapt [`examples/prompts/base_web_pentest.md`](examples/prompts/base_web_pentest.md) to match the target application, technology stack, and engagement scope.

### Verifying Docker Container Setup

Ensure containers are properly configured:
14 changes: 14 additions & 0 deletions backend/docs/prompt_engineering_pentagi.md
@@ -367,6 +367,20 @@ A comprehensive framework for designing high-performance prompts within the Pent
- **Key Sections**: `KNOWLEDGE MANAGEMENT` (Memory Protocol), `OPERATIONAL ENVIRONMENT` (Container Constraints), `COMMAND EXECUTION RULES` (Terminal Protocol), `PENETRATION TESTING TOOLS` (list available), `TEAM COLLABORATION`, `DELEGATION PROTOCOL`, `SUMMARIZATION AWARENESS PROTOCOL`, `COMPLETION REQUIREMENTS` (using `{{.HackResultToolName}}`).
- **Critical Instructions**: Check memory first, strictly adhere to terminal rules & container constraints, use only listed available tools, delegate appropriately (e.g., exploit development to Coder), provide detailed, evidence-backed exploitation reports using `{{.HackResultToolName}}`.

#### Pentesting Methodology Checklist for Prompt Authors
- Encode authorization boundaries explicitly. Prompts should remind the agent to test only approved targets, respect engagement scope, and avoid destructive actions unless the task requires them.
- Start with coverage before exploitation. Instruct the agent to map routes, roles, inputs, file handling, integrations, and trust boundaries before choosing attack paths.
- Organize testing by attack surface. Good prompts group checks around authentication, access control, injection, cross-site scripting, server-side request forgery, file processing, and business logic instead of presenting a random payload dump.
- Prefer low-risk validation first. Reflection markers, controlled payloads, timing checks, and out-of-band verification should be used deliberately to confirm hypotheses before deeper exploitation.
- Require evidence at every stage. Prompts should ask for captured requests, responses, tool output, prerequisites, and impact notes so confirmed findings can move directly into a report.
- Use memory and iteration intentionally. The agent should record confirmed dead ends, revisit promising leads with new context, and avoid repeating the same failed checks.
- End with actionable reporting. A strong pentesting prompt tells the agent to summarize what was confirmed, what remains unverified, how the issue can be reproduced, and which follow-up actions are justified.
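
One way to make the "require evidence at every stage" item concrete is to model a finding as a record and check which evidence fields are still empty before it moves into a report. The sketch below is hypothetical: the `Finding` type and its field names are illustrative and do not correspond to any PentAGI data structure.

```go
package main

import (
	"fmt"
	"strings"
)

// Finding is a hypothetical evidence record for a single confirmed issue.
type Finding struct {
	Title         string
	Request       string // captured request
	Response      string // captured response
	ToolOutput    string // raw tool output supporting the finding
	Prerequisites string // access or state needed to reproduce
	Impact        string // impact notes
}

// MissingEvidence lists the evidence fields that are empty or whitespace-only.
func (f Finding) MissingEvidence() []string {
	checks := []struct{ name, value string }{
		{"request", f.Request},
		{"response", f.Response},
		{"tool output", f.ToolOutput},
		{"prerequisites", f.Prerequisites},
		{"impact", f.Impact},
	}
	var missing []string
	for _, c := range checks {
		if strings.TrimSpace(c.value) == "" {
			missing = append(missing, c.name)
		}
	}
	return missing
}

func main() {
	f := Finding{Title: "Reflected marker in search", Request: "GET /search?q=marker", Response: "200 OK"}
	if m := f.MissingEvidence(); len(m) > 0 {
		fmt.Printf("%q is not report-ready, missing: %s\n", f.Title, strings.Join(m, ", "))
	}
}
```

A prompt can ask the agent for the same fields in its final message, so that unverified items are visibly separated from confirmed ones.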

#### Recommended Reference Material
- Use public methodology resources such as [HackTricks](https://book.hacktricks.wiki/en/index.html) and [Pentest Book](https://pentestbook.six2dez.com/) as inspiration for attack-surface coverage and testing depth.
- Translate those references into concise phases, priorities, and verification rules for the agent instead of copying long checklists into the system prompt verbatim.
- Keep prompt examples aligned with live PentAGI assets such as `backend/pkg/templates/prompts/pentester.tmpl` and `../../examples/prompts/base_web_pentest.md`.
Copilot AI Apr 15, 2026

This bullet mixes a repo-root path (backend/pkg/...) with a relative-from-this-file path (../../examples/...). For clarity and consistency, prefer a single convention (e.g., repo-root paths like examples/prompts/base_web_pentest.md) and ideally make them markdown links so readers can click through.

Suggested change
- Keep prompt examples aligned with live PentAGI assets such as `backend/pkg/templates/prompts/pentester.tmpl` and `../../examples/prompts/base_web_pentest.md`.
- Keep prompt examples aligned with live PentAGI assets such as [`backend/pkg/templates/prompts/pentester.tmpl`](../pkg/templates/prompts/pentester.tmpl) and [`examples/prompts/base_web_pentest.md`](../../examples/prompts/base_web_pentest.md).
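
The link targets in the suggestion can be derived mechanically: show the repo-root path as link text and compute the target relative to the directory that contains the document (`backend/docs`). A minimal Go sketch using the standard `path/filepath` package follows; `relLink` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// relLink returns the markdown link target for repoPath as seen from a file in docDir.
// Both arguments are repo-root-relative paths.
func relLink(docDir, repoPath string) (string, error) {
	rel, err := filepath.Rel(docDir, repoPath)
	if err != nil {
		return "", err
	}
	// Markdown links use forward slashes regardless of the host OS.
	return filepath.ToSlash(rel), nil
}

func main() {
	docDir := "backend/docs"
	for _, p := range []string{
		"backend/pkg/templates/prompts/pentester.tmpl",
		"examples/prompts/base_web_pentest.md",
	} {
		target, err := relLink(docDir, p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("[`%s`](%s)\n", p, target)
	}
}
```

For these two paths the computed targets are `../pkg/templates/prompts/pentester.tmpl` and `../../examples/prompts/base_web_pentest.md`, which matches the suggested change.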


### Searcher Agent
- **Focus**: Highly efficient information retrieval (internal memory & external sources), source evaluation and prioritization, synthesis of findings.
- **Key Sections**: `CORE CAPABILITIES` (Action Economy, Search Optimization), `SEARCH TOOL DEPLOYMENT MATRIX`, `OPERATIONAL PROTOCOLS` (Search Efficiency, Query Engineering), `SUMMARIZATION AWARENESS PROTOCOL`, `SEARCH RESULT DELIVERY` (using `{{.SearchResultToolName}}`).