Is AI Killing the CTF Scene? An In‑Depth Look at the Decline
The article examines how rapid advances in large language models—from GPT‑4 to Mythos—have automated most CTF challenges, reshaping leaderboards, prompting top teams to quit, and forcing the security community to rethink competition formats, talent assessment, and education.
When AI can clear almost every challenge in a CTF within 48 hours, the leaderboard shifts from measuring human skill to measuring AI orchestration ability, putting the once‑vibrant competitive ecosystem in a survival crisis.
1. A Veteran’s “Death Declaration”
In May 2026, security researcher frays posted a bluntly titled article “The CTF scene is dead”. Frays, a former member of Australia’s top team Blitzkrieg and later TheHackersCrew, had ranked in the global top‑10 on CTFTime until leaving at the end of 2025. He writes, “Seeing people pretend the format still works is disheartening because the old game is gone.”
2. How AI Is “Killing” CTFs
2.1 The GPT‑4 Era: Time‑saving but still human‑dependent
After GPT‑4’s 2023 release, some medium‑difficulty challenges could be solved with a single prompt; a cryptography problem, for example, might yield an answer within ten minutes. The community initially treated the model as a harmless time‑saver, arguing that harder problems still required manual analysis.
2.2 Claude Opus 4.5 Era: Scoreboard distortion
Anthropic’s Claude Opus 4.5 arrived in late 2025 and automated almost all medium‑difficulty and some high‑difficulty tasks. Claude Code packaged the workflow into a CLI that, via the CTFd API, launches a Claude instance per challenge. Teams that run the tool in the first hour only need to focus on the remaining hard problems.
Refusing AI now means competing on a slower version of the contest. The leaderboard increasingly reflects “orchestration ability” and willingness to use cutting‑edge models rather than pure security skill.
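The per‑challenge fan‑out described above can be sketched in a few lines. This is a minimal illustration, not the tool frays describes: the response shape passed to `parse_challenges` is an assumption modelled on a CTFd‑style challenge listing, and `placeholder_solver` is a hypothetical stand‑in for launching a model session. No network or model calls are made.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Challenge:
    id: int
    name: str
    category: str
    value: int


def parse_challenges(payload: dict) -> list[Challenge]:
    # Assumed response shape: {"success": true, "data": [{...}, ...]}.
    return [
        Challenge(c["id"], c["name"], c["category"], c["value"])
        for c in payload.get("data", [])
    ]


def dispatch(
    challenges: list[Challenge],
    solver: Callable[[Challenge], Optional[str]],
    max_workers: int = 4,
) -> dict[str, Optional[str]]:
    # One solver call per challenge, run concurrently; map() keeps input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        flags = list(pool.map(solver, challenges))
    return {c.name: flag for c, flag in zip(challenges, flags)}


def remaining_for_humans(results: dict[str, Optional[str]]) -> list[str]:
    # Challenges with no flag are the ones the team still has to work on.
    return sorted(name for name, flag in results.items() if flag is None)


def placeholder_solver(challenge: Challenge) -> Optional[str]:
    # Hypothetical stand-in for an agent session: pretend that anything
    # worth 300 points or less gets solved automatically.
    if challenge.value <= 300:
        return f"flag{{{challenge.name}}}"
    return None
```

The point of the sketch is the shape of the workflow: the automated pass runs across every challenge at once, and human attention goes only to whatever `remaining_for_humans` returns.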
2.3 GPT‑5.5/Mythos Era: 48‑hour sweep
From 2025 to 2026, GPT‑5.5 and Anthropic’s Mythos series could clear almost every CTF challenge within 48 hours. Security Boulevard’s analysis of 423 HackTheBox machines showed “first‑blood” times shrinking dramatically after LLMs appeared—hard‑difficulty times fell by 27 % and insane‑difficulty by 67 %.
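For readers who want to reproduce this kind of figure on their own data, a percentage reduction is simple to compute. The sketch below assumes the comparison is between median first‑blood times before and after LLMs became available, which the summary above does not specify; the sample times are hypothetical and are not taken from the Security Boulevard analysis.

```python
from statistics import median


def percent_reduction(before: float, after: float) -> float:
    """Return how much `after` is below `before`, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Hypothetical first-blood times in hours, for illustration only.
pre_llm_hours = [30.0, 40.0, 50.0]
post_llm_hours = [10.0, 13.2, 20.0]

drop = percent_reduction(median(pre_llm_hours), median(post_llm_hours))
print(f"median first-blood time fell by {drop:.0f}%")
```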
Frays reports using GPT‑5.5 Pro to orchestrate Insane‑difficulty heap‑exploitation challenges on HackTheBox, stating that a 48‑hour CTF could be won before the competition ends if enough tokens are spent.
This turns public online CTFs into a “pay‑to‑win” race where token budget directly determines scoreboard dominance.
3. Mythos: The First AI to Complete a 32‑Step Enterprise Attack Chain
The UK’s AI Security Institute (AISI) evaluated Mythos in 2026. Key findings:
Mythos is the first model to complete a target‑oriented attack (TLO) test from scratch, achieving a 73 % success rate on expert‑level CTF tasks.
In a 32‑step enterprise attack chain test, Mythos averaged 22 steps per run, far surpassing Claude 4.6’s 16‑step average.
In simulated enterprise network takeover tests, Mythos completed the full attack with a 30 % success rate; the same task would take a human expert roughly 20 hours.
AISI’s report concludes, “Mythos can autonomously attack small, poorly defended enterprise systems.” This is an official government assessment, not a laboratory prototype.
4. Community Reaction: Team Withdrawals and Event Cancellations
Other respected figures echo frays’ warning. Prominent competitors note that many top players have stopped participating, and leading teams such as TheHackersCrew either no longer compete or field drastically reduced rosters, rarely breaking into the top ten.
Several premier events, e.g., Plaid CTF, have been discontinued. Organizers who spend weeks crafting challenges see them solved in minutes by AI, leading to widespread demotivation.
CTFTime’s 2026 leaderboard shows almost no continuity with previous years, lacking any trace of human‑skill progression.
5. Why “CTF Is Dead” Is Not Mere Hyperbole
5.1 What the Scoreboard Measures Now
Historically, the scoreboard served as a growth feedback loop: solving more challenges raised rankings, opened doors to elite teams, and fostered skill development. That ladder is now broken.
With AI‑driven orchestration dominating, newcomers see a ranking driven by token expenditure rather than effort, creating an “anti‑learning” mode where they are pushed to use AI before they can develop genuine intuition.
5.2 Chess Analogy: Engines Aren’t Allowed on the Board
In competitive chess, engine assistance during play is prohibited; allowing every player to consult the strongest engine would destroy fairness and spectator appeal. The same logic applies to CTFs.
5.3 Organizers’ Dilemma
Attempts to thwart LLMs, such as semantic obfuscation, prompt‑injection traps, or reliance on post‑cutoff technologies, provide only temporary friction. Claude Code is no longer tripped up by old “reject‑string” tricks, LLMs now recognize prompt injection reliably, and rules banning LLM use are largely ignored in public online events.
Designing truly AI‑resistant challenges forces creators into “guess‑the‑answer” or over‑engineered problems that are unfriendly to human participants, degrading the overall experience.
6. “Adaptation” Is Not a Sufficient Answer
Common rebuttals claim that “CTF is just AI‑enhanced, we just need to adapt.” Frays responds that this is meaningless without defining the target state.
If adaptation means building better tools, contestants have already done that.
If it means writing harder problems, organizers have already tried.
If it means accepting an AI‑orchestrated scoreboard, we should state that openly rather than pretending the old competition still exists.
Even when organizers craft problems that current LLMs cannot solve—requiring interactive debugging, side‑channel analysis, or physical hardware—players lack a viable path to stay competitive while still learning essential skills. Future model improvements will likely render those workarounds obsolete.
7. Industry Impact: Rethinking Talent Selection and Skill Definition
7.1 Recruiting Logic Needs Overhaul
CTF rankings have long been a proxy for hiring security talent. Frays argues that this metric is rapidly losing relevance: AI now does much of the work, and the orchestration around it is largely open‑source or can be scripted.
Companies should shift toward evaluating bug‑bounty records, real penetration‑testing experience, and independent research ability.
7.2 Educational Value Persists, Format Must Evolve
CTF’s educational merit remains, but beginners should gravitate toward platforms like picoGym or HackTheBox that prioritize learning over competition and have lower incentives for cheating.
Community activities—SecTalks, student conferences, local meetups, and Discord learning groups—continue to provide valuable networking and knowledge‑sharing opportunities.
8. Future Directions: The Transformation Is Underway
8.1 Possible Paths Forward
Dual‑track competitions: Separate lanes for AI‑assisted and pure‑human participation, measuring AI orchestration efficiency on one side and human skill on the other.
Transparent token‑budget scoring: Publish the token or compute cost per flag, making AI assistance’s expense visible.
Human‑only challenges: Design tasks requiring interactive debugging, side‑channel analysis, or physical hardware interaction, areas where current LLM pipelines struggle.
Shift toward educational platforms: Direct novices to learning‑centric environments rather than ranking‑driven CTFs.
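The first two proposals, dual tracks and published token cost per flag, can be combined in a single scoreboard model. The sketch below is one possible design under stated assumptions, not a format any event is known to use: the `Entry` fields, the track names, and the tie‑break rule (fewer tokens ranks higher) are all hypothetical, and token counts are assumed to be self‑reported.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    team: str
    track: str        # hypothetical track names: "human" or "ai-assisted"
    flags: int
    tokens_used: int  # assumed self-reported; 0 on the human track


def tokens_per_flag(entry: Entry) -> Optional[float]:
    # Cost per flag is undefined when nothing was solved.
    if entry.flags == 0:
        return None
    return entry.tokens_used / entry.flags


def build_boards(entries: list[Entry]) -> dict[str, list[Entry]]:
    # Group entries by track so each lane is ranked separately.
    boards: dict[str, list[Entry]] = {}
    for entry in entries:
        boards.setdefault(entry.track, []).append(entry)
    for rows in boards.values():
        # More flags ranks higher; ties go to the team that spent fewer tokens.
        rows.sort(key=lambda e: (-e.flags, e.tokens_used, e.team))
    return boards
```

Publishing `tokens_per_flag` next to each rank would make the "pay‑to‑win" component visible without banning AI outright, though it depends on teams reporting usage honestly.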
8.2 Community Remains, Needs New Vehicles
Frays concludes, “Even though AI and CTFs are becoming commercialized beyond our control, CTFs have had a massive positive impact on the industry. I met many kind, smart, passionate people through CTFs and solved some beautifully crafted challenges.” The community’s cohesion is more important than ever, and new formats are required to preserve that spirit.
9. Conclusion: Not “Dead”, but Needing Rebirth
AI is undeniably reshaping CTFs. The phrase “CTF is dead” should be understood as the old public online competition model losing its original meaning. Leaderboards no longer reflect human skill, challenges no longer test genuine vulnerability understanding, and beginners lose a visible path of progress.
However, the educational, social, and knowledge‑dissemination value of CTFs persists. The industry must either devise new competitive frameworks for public online events or pivot toward platforms and communities that keep learning at the core.
The old CTF format has ended. A new form is waiting to be built.
Black & White Path
