Challenge · Feb 12, 2026 · 4 min read

The Clean Code Challenge: 1,000 developers, 30 days, $2,000 in prizes

Most developers have never fixed a real security vulnerability. Not because they can't — because they've never had the chance. Production bugs get triaged to security teams, CTFs feel disconnected from real codebases, and sanitized training exercises don't teach you what a messy authorization flaw actually looks like in a 50,000-line monolith.

We're changing that. Starting March 1, BrokenApp is running the Clean Code Challenge: 30 days, 10 intentionally vulnerable applications, and a leaderboard that rewards both speed and code quality. The top three participants split $2,000 in cash prizes. Everyone who finishes gets a verified credential they can add to their resume.

Why we're running this

BrokenApp exists to make security testing accessible. Our CLI scans APIs for IDORs, leaked secrets, and broken authorization — but scanning is only half the equation. The other half is remediation. Knowing that endpoint /api/users/:id has a BOLA flaw doesn't help if the developer staring at the finding doesn't know how to fix it.

The Clean Code Challenge is a structured environment where developers practice finding and fixing real vulnerability classes: IDOR, mass assignment, SSRF, JWT misconfiguration, and more. Each challenge app ships with a BrokenApp scan report so participants can see exactly what's broken before they start patching.

How AI tools fit in

Here's where it gets interesting: participants can use any tools they want. Claude Code, ChatGPT Codex, Cursor, Copilot — all fair game. We're not testing whether humans can beat AI at pattern-matching. We're testing how effective the human-plus-AI pairing is at security remediation.

Every submission records which tools were used and how. This gives us something the industry desperately needs: real data on how AI coding assistants perform on security-critical tasks. Can Claude Code reliably fix an IDOR when given a scan report? Does Copilot introduce new vulnerabilities while patching existing ones? We don't have good answers to these questions yet. After 1,000 developers work through 10 challenge apps, we will.

# Clone a challenge app and scan it
$ brokenapp challenge pull challenge-03
Pulling challenge-03 (jwt-misconfiguration)...

$ brokenapp scan --url http://localhost:4000
Found 6 findings across 14 endpoints

# Fix the code, then verify
$ brokenapp challenge verify challenge-03
✓ 6/6 vulnerabilities patched — no regressions

What participants will learn

Each of the 10 challenge apps isolates a specific vulnerability class. They're not toy examples — they're stripped-down versions of patterns we see in real production codebases. You get a Node/Express or Python/FastAPI application with a database, user sessions, and enough business logic that the fix isn't obvious.

Challenges 1–3: Authorization

IDOR via numeric ID, IDOR via UUID enumeration, and horizontal privilege escalation across role boundaries. You'll learn to implement proper ownership checks and middleware-level authorization.
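
To make that concrete, here is a minimal sketch of an ownership check on a hypothetical FastAPI route. The in-memory documents and session table are stand-ins for illustration, not code from the challenge apps.

from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

# Illustrative stand-ins for the challenge app's database and session store.
DOCUMENTS = {
    1: {"id": 1, "owner_id": 10, "title": "Alice's notes"},
    2: {"id": 2, "owner_id": 20, "title": "Bob's notes"},
}
SESSIONS = {"token-alice": {"id": 10}, "token-bob": {"id": 20}}

def current_user(authorization: str = Header(...)):
    user = SESSIONS.get(authorization.removeprefix("Bearer "))
    if user is None:
        raise HTTPException(status_code=401, detail="Not authenticated")
    return user

@app.get("/api/documents/{doc_id}")
def read_document(doc_id: int, user: dict = Depends(current_user)):
    doc = DOCUMENTS.get(doc_id)
    # Ownership check: a valid session is not enough; the record must
    # belong to the requester. Returning 404 for "missing" and "not yours"
    # alike avoids confirming which IDs exist.
    if doc is None or doc["owner_id"] != user["id"]:
        raise HTTPException(status_code=404, detail="Not found")
    return doc

In a real codebase the same rule usually lives in a shared dependency or middleware layer so individual handlers can't skip it, which is the middleware-level authorization these challenges build toward.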

Challenges 4–5: Authentication

JWT none-algorithm attack and session fixation. You'll learn why allowlisting algorithms matters and how to implement secure session rotation.
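
As an illustration of the allowlist fix, here is a sketch using PyJWT. The secret is a placeholder and the error handling is simplified; the challenge apps define their own.

import jwt  # PyJWT
from jwt import InvalidTokenError

SECRET = "load-this-from-config"  # placeholder: never hardcode a real secret

def verify_session_token(token: str) -> dict:
    # The fix for the none-algorithm attack: pass an explicit allowlist.
    # PyJWT then rejects any token whose header claims "none" (or any other
    # algorithm we did not opt into) instead of trusting the header.
    try:
        return jwt.decode(token, SECRET, algorithms=["HS256"])
    except InvalidTokenError as exc:
        raise PermissionError("invalid or tampered token") from exc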

Challenges 6–8: Injection

SQL injection in a search endpoint, SSRF via URL parameter, and mass assignment through unvalidated request bodies. Each requires understanding how data flows from input to storage.
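
For the SQL injection case, the fix is to keep user input out of the query text and bind it as a parameter. Here is a sketch using sqlite3 and an illustrative products table:

import sqlite3

def search_products(conn: sqlite3.Connection, query: str):
    # Vulnerable pattern the challenge starts from (never do this):
    #   conn.execute(f"SELECT * FROM products WHERE name LIKE '%{query}%'")
    # The fix: bind the input as a parameter so the driver treats it as
    # data, not as SQL syntax.
    cur = conn.execute(
        "SELECT id, name, price FROM products WHERE name LIKE ?",
        (f"%{query}%",),
    )
    return cur.fetchall()

The mass-assignment challenge has the same shape: rather than copying the raw request body into the model, you validate it against an explicit schema of writable fields.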

Challenges 9–10: Secrets & Config

Leaked API keys in client bundles and debug endpoints exposed in production. These are the configuration mistakes that BrokenApp's exposure scanner catches daily.
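
A sketch of the config-side pattern, with hypothetical variable names: secrets load from the environment rather than from source or client bundles, and debug routes stay disabled unless explicitly switched on.

import os
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Secrets come from the environment (or a secret manager), never from
# source code or a bundle shipped to the browser.
PAYMENT_API_KEY = os.environ.get("PAYMENT_API_KEY")
DEBUG = os.environ.get("APP_DEBUG", "false").lower() == "true"

@app.get("/debug/config")
def debug_config():
    # Debug endpoints are off unless explicitly enabled, and even then they
    # report whether a secret is set rather than echoing its value.
    if not DEBUG:
        raise HTTPException(status_code=404, detail="Not found")
    return {"debug": True, "payment_api_key_set": PAYMENT_API_KEY is not None}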

Judging criteria

We're not just grading on "did you fix the bug." The challenge evaluates three dimensions, equally weighted:

  • Completeness — All identified vulnerabilities are patched, with no regressions introduced. BrokenApp re-scans your submission automatically.
  • Code quality — Fixes follow secure coding best practices. No hardcoded secrets, no commented-out auth checks, no security-through-obscurity.
  • Speed — Time from challenge pull to verified submission. Faster solutions score higher, incentivizing efficient workflows (including AI-assisted ones).

The leaderboard updates in real time. Each participant's profile shows their completed challenges, average fix time, and tool usage — creating a portfolio of demonstrated security engineering skill.

The research angle

Beyond the competition, the Clean Code Challenge will produce what we expect to be the largest real-world dataset on AI-assisted security remediation. We're partnering with two university research groups to analyze the anonymized results. The questions we want to answer:

  • Do developers using AI assistants fix vulnerabilities faster, slower, or at the same speed as those working manually?
  • Which vulnerability classes are AI tools best and worst at remediating?
  • Do AI-assisted fixes introduce new security issues at a higher rate than manual fixes?
  • Is there a measurable difference between AI tools (Claude Code vs. Copilot vs. Codex) on security-specific tasks?

The full dataset and analysis will be published as an open-access report after the challenge ends. If you're a researcher interested in collaborating, reach out — we're actively looking for more institutional partners.

How to enter

Registration is open now. You need a free BrokenApp account and the CLI installed. On March 1 at 00:00 UTC, the first three challenges unlock. New challenges release every three days after that, giving everyone time to work through them without cramming.

$ brokenapp challenge register
Registered for Clean Code Challenge 2026
Challenges unlock: March 1, 2026 00:00 UTC

# Check your standing anytime
$ brokenapp challenge leaderboard