[Project] Autonomous Recon Agent with LLMs for Hack The Box

Introduction

Reconnaissance is the backbone of any successful penetration test or red team engagement. Yet, it’s often a tedious and repetitive process: run a bunch of tools, parse messy output, figure out what’s important, and decide the next steps. What if you could automate all of that, and make it smart?

That’s what this project is about: a self-triaging recon agent that uses Large Language Models (LLMs) to analyze tool output, summarize findings, recommend follow-ups, and even suggest possible CVEs — all in a fully automated workflow.

Why This Matters

There are plenty of recon scripts and tools out there, but very few do intelligent triage. This project:

Automates noisy, repetitive recon workflows
Adds logic and insight using LLMs (Groq / OpenAI / Ollama)
Structures and stores output cleanly for review
Suggests relevant next steps and possible exploits

This isn’t just a tool — it’s an assistant.

High-Level Architecture

[Host System]
└── start.sh
    ├── Validates OVPN + target IP
    ├── Builds Docker image
    └── Runs Docker container

[Inside Container]
└── agent.py
    ├── Establishes VPN
    ├── Runs nmap (-sC -sV -p-)
    ├── Captures and summarizes output
    ├── Calls LLM for triage suggestions
    ├── Runs follow-up tools (e.g., gobuster, ffuf)
    ├── Maps services to CVEs using searchsploit
    └── Generates markdown executive summary
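
To make the first step inside the container concrete, here is a minimal sketch of how the nmap scan could be launched and captured from Python. The function names `build_nmap_command` and `run_tool`, the one-hour timeout, and the output path are assumptions for illustration, not the actual contents of agent.py. Only the nmap flags come from the diagram above.

```python
import subprocess
from pathlib import Path
from typing import List


def build_nmap_command(target_ip: str) -> List[str]:
    """Return the argv for the scan in the diagram: default scripts, version detection, all ports."""
    return ["nmap", "-sC", "-sV", "-p-", target_ip]


def run_tool(argv: List[str], out_file: Path, timeout: int = 3600) -> str:
    """Run a tool without a shell, save its stdout to out_file, and return the text.

    The timeout value is an assumption; a full -p- scan can take a long time.
    """
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(result.stdout)
    return result.stdout
```

A call such as `run_tool(build_nmap_command("10.10.11.123"), Path("triage/10.10.11.123/nmap.txt"))` would then hand the captured text to the triage step. Passing an argument list rather than a shell string keeps the target value from being interpreted by a shell.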

LLM-Driven Intelligence

The real magic comes from tight LLM integration:

Input: Raw output from nmap, gobuster, nikto, etc.
Prompt Engineering: Strong constraints enforce JSON output: summary, recommended commands, and discovered services.
Repair Logic: Malformed responses are automatically fixed via a secondary LLM call.
CVE Mapping: Services found are piped into searchsploit, then wrapped into a final executive summary.

Example output from a post-nmap step:

{
  "summary": "- Apache 2.4.41 found on port 80.\n- Potential directory listing enabled.",
  "recommended_steps": ["gobuster dir -u http://10.10.10.10 -w wordlist.txt"],
  "services_found": ["apache 2.4.41"]
}
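
The validate-then-repair idea can be sketched in a few lines. This is an illustration under stated assumptions, not the project's real code: `parse_triage`, `triage_with_repair`, and the `repair` callback (standing in for the secondary LLM call) are hypothetical names. The three required keys are taken from the example output above.

```python
import json
from typing import Callable, Optional

# Keys and types taken from the example response shown above.
REQUIRED_KEYS = {"summary": str, "recommended_steps": list, "services_found": list}


def parse_triage(raw: str) -> Optional[dict]:
    """Return the triage dict if raw contains JSON with the expected shape, else None."""
    # Models often wrap JSON in prose or code fences, so keep only the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key, expected_type in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data


def triage_with_repair(raw: str, repair: Callable[[str], str], max_attempts: int = 1) -> dict:
    """Parse raw; while it is invalid, ask `repair` (a second LLM call) for a fixed version."""
    parsed = parse_triage(raw)
    attempts = 0
    while parsed is None and attempts < max_attempts:
        raw = repair(raw)
        parsed = parse_triage(raw)
        attempts += 1
    if parsed is None:
        raise ValueError("LLM response could not be repaired into valid triage JSON")
    return parsed
```

Capping the number of repair attempts matters in practice: without a limit, a model that keeps returning malformed output would stall the whole pipeline.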

Getting Started

Prerequisites

Docker
Python 3.8+
A Hack The Box VPN (.ovpn file)
An API key for Groq or OpenAI, or a running Ollama instance

Setup Steps

git clone https://github.com/jackhax/htb_recon_agent
cd htb_recon_agent

# Create and fill in your .env file
cp .env.example .env

# Example .env (Groq)
LLM_API_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXX
LLM_PROVIDER=groq
MODEL=meta-llama/llama-4-scout-17b-16e-instruct
# Only needed when using Ollama
OLLAMA_HOST=http://host.docker.internal:11434

# Run recon against a box
./start.sh --force-build 10.10.11.123 path/to/htb.ovpn machinename
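
For a sense of how the agent might consume the .env values above, here is a small sketch that reads and validates them. The variable names (`LLM_API_KEY`, `LLM_PROVIDER`, `MODEL`, `OLLAMA_HOST`) come from the example .env. Everything else is an assumption: the `LLMConfig` class, the `load_llm_config` function, the accepted provider strings other than `groq`, and the rule that Ollama needs a host while the hosted providers need a key.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# "groq" appears in the example .env; "openai" and "ollama" are assumed spellings.
SUPPORTED_PROVIDERS = {"groq", "openai", "ollama"}


@dataclass(frozen=True)
class LLMConfig:
    provider: str
    model: str
    api_key: Optional[str]
    ollama_host: Optional[str]


def load_llm_config(env: Mapping[str, str] = os.environ) -> LLMConfig:
    """Build an LLMConfig from environment variables, failing early on missing values."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"LLM_PROVIDER must be one of {sorted(SUPPORTED_PROVIDERS)}")
    model = env.get("MODEL", "").strip()
    if not model:
        raise ValueError("MODEL must be set")
    api_key = env.get("LLM_API_KEY", "").strip() or None
    ollama_host = env.get("OLLAMA_HOST", "").strip() or None
    if provider == "ollama":
        # Assumption: a local Ollama instance needs a host URL but no API key.
        if ollama_host is None:
            raise ValueError("OLLAMA_HOST must be set when LLM_PROVIDER=ollama")
    elif api_key is None:
        raise ValueError("LLM_API_KEY must be set for hosted providers")
    return LLMConfig(provider, model, api_key, ollama_host)
```

Failing at startup on a missing key is cheaper than discovering it after a long nmap scan has already finished.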

Output Directory

triage/10.10.11.123/
├── nmap.txt
├── gobuster.txt
├── summary.md
├── exploits.txt
└── summary_exec.md
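
Assuming exploits.txt holds the searchsploit results for the services the LLM reported, the lookup step could be sketched like this. The helper names `searchsploit_command` and `exploit_queries` are hypothetical. The only thing taken as given is that searchsploit accepts search terms as positional arguments.

```python
import re
from typing import Iterable, List


def searchsploit_command(service: str) -> List[str]:
    """Turn a service string such as 'apache 2.4.41' into a searchsploit argv.

    Only tokens that start with a letter or digit are kept, so text coming back
    from the LLM cannot smuggle in shell syntax or extra command-line options.
    """
    terms = re.findall(r"[A-Za-z0-9][A-Za-z0-9._-]*", service)
    return ["searchsploit", *terms]


def exploit_queries(services: Iterable[str]) -> List[List[str]]:
    """Build one command per distinct service, skipping empty and duplicate entries."""
    seen = set()
    commands: List[List[str]] = []
    for service in services:
        command = searchsploit_command(service)
        key = tuple(term.lower() for term in command)
        if len(command) > 1 and key not in seen:
            seen.add(key)
            commands.append(command)
    return commands
```

Each returned list can be passed straight to `subprocess.run`, and the combined output written to exploits.txt. Sanitizing here is worthwhile because the service names originate from model output rather than from a trusted source.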

Challenges Faced

Docker disk space limitations (solved via phased apt installs)
Handling unstructured tool output (e.g., gobuster flooding)
Forcing LLMs to behave predictably (prompt design is key!)
Connection edge cases for Ollama in Docker (solved with host.docker.internal)
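
One way to tame flooding output before it reaches the LLM is to deduplicate and cap it. The sketch below shows the general idea. The function name `condense_output` and the 200-line default are assumptions, and the project may handle this differently.

```python
from typing import List


def condense_output(text: str, max_lines: int = 200) -> str:
    """Drop blank and duplicate lines, then cap the result, keeping the head and the tail."""
    if max_lines < 2:
        raise ValueError("max_lines must be at least 2")
    seen = set()
    lines: List[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped in seen:
            continue
        seen.add(stripped)
        lines.append(stripped)
    if len(lines) <= max_lines:
        return "\n".join(lines)
    head = max_lines // 2
    tail = max_lines - head
    omitted = len(lines) - max_lines
    # Tell the model explicitly that lines were cut so it does not assume completeness.
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:head] + [marker] + lines[-tail:])
```

Keeping both ends of the output preserves the tool banner at the top and any final summary at the bottom, which tend to matter more for triage than the repetitive middle.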

What’s Next?

Add nuclei or jaeles for more automated vuln scanning
Automatically detect CMSes and invoke wpscan, joomscan
Export report to PDF or HTML with styling
Build a simple dashboard to review multiple boxes
Allow attack simulation or flag enumeration as a future module

Conclusion

This agent saves hours of manual work, reduces human error, and makes recon actually fun again. Whether you’re grinding away on Hack The Box or working through a red team engagement, this approach can drastically improve your workflow.

You can find the full code and setup instructions here: https://github.com/jackhax/htb_recon_agent