Strix Introduction: Using AI Agents for Automated Penetration Testing and Vulnerability Remediation

Strix is an open-source AI penetration testing tool. Rather than a traditional static scanner, it is a set of AI pentesting agents that dynamically run code, explore the attack surface, and attempt to exploit and verify vulnerabilities. The project README puts it plainly: discover and fix application vulnerabilities the way real hackers would.

This type of tool suits development teams, security teams, and DevSecOps workflows. It can run against local code repositories, GitHub repositories, and web applications, or inside CI/CD, to surface high-risk issues early, and it ties together vulnerability reproduction, fix recommendations, and even patch generation.

One boundary comes first: use Strix only on applications, repositories, and domains that you own or are explicitly authorized to test. Do not point it at unauthorized targets. Penetration testing tools exist to support defense and remediation, not to bypass authorization.

What problem does Strix solve?

Traditional security testing has two common pain points: static scanning produces many false positives, and manual penetration testing takes a long time. Strix combines AI agents, a dynamic execution environment, and a penetration testing toolchain so that security checks follow real attack paths more closely.

Its core capabilities include:

Built-in penetration testing tool chain: Reconnaissance, exploitation, verification and other steps are available out of the box.
Multi-Agent orchestration: Multiple AI penetration testing Agents can work separately and collaborate.
Real vulnerability verification: Emphasis on runnable PoC rather than just static warnings.
CLI for developers: Outputs actionable findings, reproduction steps, and fix recommendations.
Automated remediation and reporting: Generate patches and penetration testing reports suitable for compliance scenarios.

In other words, Strix doesn’t just tell you “there may be a problem here”, but tries to answer three more critical questions: whether the problem can be exploited, how to reproduce it, and how to fix it.

Applicable scenarios

Typical scenarios given in Strix’s README include:

Application Security Testing: Detect and verify critical vulnerabilities in applications.
Rapid Penetration Testing: Shorten penetration testing cycles that would otherwise take weeks, and generate reports.
Bug Bounty Automation: Assist bug bounty research, generate PoC and reproduction materials.
CI/CD Integration: Run security tests in pull requests or deployment pipelines to prevent high-risk code from entering production.

If the team already has SAST, dependency scanning, and container scanning, Strix can complement them as a dynamic validation layer. It is better suited to finding paths that are actually exploitable, such as access control bypasses, business logic flaws, authentication issues, XSS, SSRF, SQL injection, and API abuse.

Preparation before installation

Before running Strix, you need two things:

Docker, installed and running.
An API key for a supported LLM provider, such as OpenAI, Anthropic, or Google.

When running for the first time, Strix will automatically pull the sandbox Docker image. Scan results will be saved to:

strix_runs/<run-name>

In other words, Strix does not just read files and emit a verdict; it performs dynamic testing and verification inside a sandbox. Before using it on production projects, run it against a test repository or staging environment first to confirm the scope, cost, run time, and output format.
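
To inspect results from a script, a small helper can locate the newest run directory. This is a minimal sketch: it assumes only that each run gets its own subdirectory under strix_runs, as described above, and it assumes nothing about the files inside. The name latest_run_dir is hypothetical and not part of Strix.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(base: str = "strix_runs") -> Optional[Path]:
    """Return the most recently modified run directory under `base`, or None.

    Assumption: each scan writes to its own subdirectory of `base`.
    The layout inside a run directory is not assumed here.
    """
    root = Path(base)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    if not runs:
        return None
    # Pick the directory with the newest modification time.
    return max(runs, key=lambda p: p.stat().st_mtime)
```

Calling latest_run_dir() from the directory where Strix was run returns a Path to the newest run, or None if no scan has been run there yet.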

Installation and first scan

The installation method given in the README is to directly execute the official installation script:

curl -sSL https://strix.ai/install | bash

Configure the AI Provider after installation. Example using OpenAI:

export STRIX_LLM="openai/gpt-5.4"
export LLM_API_KEY="your-api-key"

Then run the first security assessment on the local application directory:

strix --target ./app-directory

To scan a remote repository instead, set the target to its GitHub URL:

strix --target https://github.com/org/repo

For black-box testing of a web application, pass the URL directly:

strix --target https://your-app.com

These three entry points correspond to a local codebase, a remote code repository, and a live application. In practice, do not widen the scope too far at once. Starting with a single service, a single repository, or a staging domain makes it easier to control test noise and cost.
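
One way to keep the scope narrow is to check each target against an allowlist before launching a scan. The sketch below is an illustration, not a Strix feature: AUTHORIZED_HOSTS, target_host, and is_in_scope are hypothetical names, the hosts are placeholders, and local paths are recognized only when they start with ".", "/", or "~". Repository URLs would need their own check, since a host match on github.com says nothing about which repository is authorized.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allowlist: replace with hosts you own or are authorized to test.
AUTHORIZED_HOSTS = frozenset({"staging.your-app.com", "api.your-app.com"})


def target_host(target: str) -> Optional[str]:
    """Return the hostname of a URL-like target, or None for a local path."""
    if target.startswith((".", "/", "~")):
        return None  # local directory such as ./app-directory
    # Bare hosts like "api.your-app.com" need "//" so urlparse sees a netloc.
    parsed = urlparse(target if "://" in target else "//" + target)
    return parsed.hostname


def is_in_scope(target: str, allowed=AUTHORIZED_HOSTS) -> bool:
    """Local paths pass; remote targets must match the allowlist exactly."""
    host = target_host(target)
    return host is None or host in allowed
```

Exact matching is deliberate here: a subdomain or sibling domain is out of scope until someone adds it to the allowlist explicitly.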

Advanced scanning methods

Strix lets you pass additional instructions to the agent, which is useful for gray-box testing, authenticated testing, business logic testing, and tightly scoped tests.

For example, gray-box testing with credentials:

strix --target https://your-app.com --instruction "Perform authenticated testing using credentials: user:pass"

Test the source code and deployed application simultaneously:

strix -t https://github.com/org/app -t https://your-app.com

Run a source-aware scan of a local repository:

strix --target ./app-directory --scan-mode standard

Focus on business logic defects and IDOR:

strix --target api.your-app.com --instruction "Focus on business logic flaws and IDOR vulnerabilities"

If the test scope, rules, and exclusions are complex, they can be placed in a file:

strix --target api.your-app.com --instruction-file ./instruction.md

In a pull request scenario, you can restrict the scan to the diff against a given base branch:

strix -n --target ./ --scan-mode quick --scope-mode diff --diff-base origin/main

These parameters matter: the more powerful the security tool, the more clearly its scope must be defined. In instruction.md, spell out the domains, paths, and accounts that may be tested, along with prohibited actions, rate limits, test windows, and contacts.
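
The checklist above can be kept as structured data and rendered into instruction.md, so that every scan starts from a complete scope. This is only a sketch: the Scope class, the render_instruction function, and the Markdown headings are illustrative choices, not a documented Strix format, and the accounts field is meant for account names rather than passwords.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scope:
    """Hypothetical scope record; the fields mirror the checklist above."""
    domains: List[str]
    paths: List[str] = field(default_factory=list)
    accounts: List[str] = field(default_factory=list)
    prohibited: List[str] = field(default_factory=list)
    rate_limit: str = ""
    test_window: str = ""
    contact: str = ""


def render_instruction(scope: Scope) -> str:
    """Render the scope as plain Markdown text for an instruction file."""
    def section(title: str, items: List[str]) -> str:
        # An empty section is stated explicitly instead of being left out.
        lines = [f"- {item}" for item in items] or ["- (none specified)"]
        return "\n".join([f"## {title}", *lines, ""])

    parts = [
        "# Test scope",
        "",
        section("Allowed domains", scope.domains),
        section("Allowed paths", scope.paths),
        section("Test accounts", scope.accounts),
        section("Prohibited actions", scope.prohibited),
        section("Rate limit", [scope.rate_limit] if scope.rate_limit else []),
        section("Test window", [scope.test_window] if scope.test_window else []),
        section("Contact", [scope.contact] if scope.contact else []),
    ]
    return "\n".join(parts)
```

Writing the result to instruction.md and passing it with --instruction-file keeps the scope under version control and makes gaps visible, because unset fields show up as "(none specified)".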

Headless mode

Servers, CI/CD pipelines, and automation jobs typically don’t need an interactive UI. Use -n/--non-interactive to run Strix in headless mode:

strix -n --target https://your-app.com

In this mode, the CLI prints vulnerability findings in real time and outputs a final report before exiting. If a vulnerability is found, it will end with a non-zero exit code. This is useful for CI/CD because the pipeline can block merges or releases accordingly.
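
A pipeline script can turn that exit code into a pass/fail decision. The following is a minimal sketch, assuming only the behavior described above (a non-zero exit code when vulnerabilities are found). The helper names are hypothetical, and the runner parameter exists so the logic can be exercised without Strix installed.

```python
import subprocess
from typing import Callable, List


def headless_command(target: str) -> List[str]:
    """Build the headless invocation shown above."""
    return ["strix", "-n", "--target", target]


def scan_passed(target: str, runner: Callable = subprocess.run) -> bool:
    """Run a headless scan and return True only when the exit code is 0.

    Design assumption (fail closed): every non-zero code counts as a failure,
    so a crash or misconfiguration blocks the pipeline just like a finding.
    """
    result = runner(headless_command(target))
    return result.returncode == 0
```

A CI step could call scan_passed("./") and exit with status 1 when it returns False, which blocks the merge or release.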

GitHub Actions integration

Strix can be put into GitHub Actions to run lightweight security tests on pull requests. The README example is roughly as follows:

name: strix-penetration-test

on:
  pull_request:

jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0

      - name: Install Strix
        run: curl -sSL https://strix.ai/install | bash

      - name: Run Strix
        env:
          STRIX_LLM: ${{ secrets.STRIX_LLM }}
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
        run: strix -n -t ./ --scan-mode quick

Two details are worth noting:

fetch-depth: 0 matters because PR diff range analysis requires the full history.
The API key belongs in GitHub Secrets and should never be written to the repository.

The README also notes that when running on a pull request in CI, Strix automatically limits the quick review scope to the changed files. If the diff scope cannot be resolved, either make sure the checkout fetches the full history or pass --diff-base explicitly.
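
A quick pre-flight check can catch a shallow checkout before the scan starts. This sketch relies on a Git detail rather than anything Strix-specific: Git records shallow-clone boundaries in a file named shallow inside the git directory. The function names are hypothetical, and the check assumes .git is a regular directory (not a worktree or submodule link).

```python
from pathlib import Path


def is_shallow_checkout(repo_root: str = ".") -> bool:
    """Return True if the repository at `repo_root` looks like a shallow clone.

    Assumption: `.git` is a regular directory under `repo_root`.
    """
    return (Path(repo_root) / ".git" / "shallow").is_file()


def preflight_message(repo_root: str = ".") -> str:
    """Return a short hint for the CI log based on the checkout depth."""
    if is_shallow_checkout(repo_root):
        return "shallow checkout: set fetch-depth: 0 or pass --diff-base"
    return "ok"
```

Printing preflight_message() before the Strix step makes a missing fetch-depth: 0 easy to spot in the job log.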

Configuration items

Commonly used environment variables are as follows:

export STRIX_LLM="openai/gpt-5.4"
export LLM_API_KEY="your-api-key"

# Optional
export LLM_API_BASE="your-api-base-url"  # if using a local model, e.g. Ollama, LMStudio
export PERPLEXITY_API_KEY="your-api-key"  # for search capabilities
export STRIX_REASONING_EFFORT="high"  # control thinking effort (default: high, quick scan: medium)

Strix will save the configuration to:

~/.strix/cli-config.json

This way, you don’t need to re-enter the settings on every run. Models recommended in the README include:

openai/gpt-5.4
anthropic/claude-sonnet-4-6
vertex_ai/gemini-3-pro-preview

Choose a model based on the task: quick scans prioritize speed and cost, while full penetration tests prioritize reasoning ability, context handling, and reliable tool calling.
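
In CI, where no saved configuration exists, it helps to verify the environment before starting a scan. The sketch below checks the two required variables from the list above and splits the model string. The "provider/model" shape is inferred from the examples in this article rather than from a specification, and the function names are hypothetical.

```python
import os
from typing import List, Mapping, Tuple

REQUIRED_VARS = ("STRIX_LLM", "LLM_API_KEY")


def split_model(strix_llm: str) -> Tuple[str, str]:
    """Split a 'provider/model' string such as 'openai/gpt-5.4'.

    Assumption: the provider is everything before the first slash.
    """
    provider, sep, model = strix_llm.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {strix_llm!r}")
    return provider, model


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Failing early on a non-empty missing_vars() result avoids pulling the sandbox image only to stop at the first LLM call.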

What vulnerabilities can be detected

Strix covers the OWASP Top 10 as well as broader application security issues. Types listed in the README include:

Broken Access Control: IDOR, privilege escalation, authentication bypass.
Injection Attacks: SQL injection, NoSQL injection, OS command injection, SSTI.
Server-Side Vulnerabilities: SSRF, XXE, unsafe deserialization, RCE.
Client-Side Attacks: Stored/reflected/DOM XSS, prototype pollution, CSRF.
Business Logic Flaws: Race conditions, payment manipulation, workflow bypass.
Authentication & Session: JWT attacks, session fixation, credential stuffing.
Infrastructure & Cloud: Misconfigurations, exposed services, cloud security issues.
API Security: Broken authentication, mass assignment, rate limit bypass.

These categories show that Strix aims well beyond code-style checks, covering security testing from source code to runtime behavior and from APIs to business logic.

Agentic Pentesting Tools

Strix Agent comes with a set of offensive security tools, similar to the tool chain a professional penetration tester would use:

HTTP Interception Proxy: Request/response interception, modification and analysis through Caido.
Browser Exploitation: Automated browser for testing XSS, CSRF, clickjacking, authentication bypass and other processes.
Shell & Command Execution: Interactive terminal used for exploit development and post-exploitation phases.
Custom Exploit Runtime: Python sandbox for writing and verifying PoCs.
Reconnaissance & OSINT: Automated attack surface mapping, subdomain enumeration and fingerprinting.
Static & Dynamic Code Analysis: Combining SAST and DAST.
Vulnerability Knowledge Base: Structured vulnerability findings, including CVSS and OWASP classifications.

This is also what sets it apart from ordinary scanners: the agent does not just match rules, it also chains tools, tests hypotheses, and produces reproduction paths.

Strix Platform

In addition to the open-source CLI, Strix also offers the Strix Platform. The README mentions that the platform version can connect repositories and domains, start a pentest within minutes, and provide:

Verified vulnerability findings with PoC.
One-click autofix: turns AI-generated security patches into mergeable PRs.
Continuous pentesting: ongoing scanning that follows deployments.
DevSecOps integrations: GitHub, GitLab, Bitbucket, Slack, Jira, Linear, CI/CD.
Continuous learning: adapts to the codebase based on historical findings to gradually reduce false positives.

If you just want to validate the tool locally, the CLI is sufficient; if your team needs ongoing scanning, collaboration, reporting, and enterprise integration, the platform version is better suited.

Enterprise version capabilities

The README also mentions enterprise-level penetration testing capabilities, including:

SSO: SAML/OIDC.
Compliance reporting: SOC 2, ISO 27001, PCI DSS, and more.
Dedicated support and SLA.
Custom deployment: VPC/self-hosted.
BYOK model support.
AI pentesting agents customized for enterprise environments.

These capabilities suit teams with compliance, audit, internal security process, and data boundary requirements.

Usage suggestions

First, use Strix in an authorized and isolated environment. Start with a local repository or staging environment, and do not run high-intensity tests directly against production systems.

Second, write a clear scope for the test. Maintain an instruction.md that records the paths and accounts that may be tested, the excluded interfaces, the prohibited destructive operations, and the test windows.

Third, use quick scan first when connecting it to CI/CD. Once the team understands the output, false positive rate, and cost, then gradually expand the scope of the test.

Fourth, do not treat AI output as the final security verdict. Even though Strix emphasizes real PoCs, a security engineer or development lead should still review each finding to confirm the risk, the impact, and the fix.

Fifth, handle secrets carefully. LLM_API_KEY, PERPLEXITY_API_KEY, and test account passwords belong in a secure secret management system and should not end up in command history, logs, or repositories.
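
One way to keep test credentials out of shell history is to read them from the environment and hand them to Strix through --instruction-file instead of typing them into --instruction. The sketch below assumes hypothetical TEST_USER and TEST_PASS variables injected by a secret manager; the helper names are also hypothetical. It relies on tempfile.mkstemp, which creates a file readable and writable only by the current user on POSIX systems.

```python
import os
import tempfile
from typing import Mapping


def credential_instruction(env: Mapping[str, str] = os.environ) -> str:
    """Build the instruction text from hypothetical TEST_USER / TEST_PASS."""
    user, password = env["TEST_USER"], env["TEST_PASS"]
    return f"Perform authenticated testing using credentials: {user}:{password}"


def write_private_instruction(text: str) -> str:
    """Write `text` to a new temporary file and return its path.

    mkstemp creates the file with owner-only permissions on POSIX.
    The caller is responsible for deleting the file after the scan.
    """
    fd, path = tempfile.mkstemp(prefix="strix-instruction-", suffix=".md")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path
```

The returned path can then be passed as strix --target https://your-app.com --instruction-file with that path as the argument, and the file removed once the run finishes.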

Summary

Strix combines AI agents, a penetration testing toolchain, PoC verification, and developer workflows. It fills blind spots left by traditional scanners, particularly in dynamic verification, business logic testing, and fast security feedback in CI/CD.

It is not a tool that “automatically replaces security teams.” A more reasonable approach is to treat Strix as an efficient AI security testing assistant: it helps you find verifiable problems faster and generates reproduction materials and fix suggestions, while the team still makes the risk judgment, reviews the code, and handles the official release.