When AI Writes the Code — Who Reasons About the Risk?
Recent independent research from Sonar’s State of Code Report (2025) reveals something counterintuitive about modern AI-generated code: as LLMs become more capable, the security risks they introduce don’t disappear — they shift.
The report analyzed thousands of identical programming tasks across major models and found a consistent pattern. Let’s dive in:
1. All major LLMs still produce serious security flaws
Even the best-performing models introduce high rates of BLOCKER-severity vulnerabilities — including injections, hard-coded secrets, and broken security controls. This isn’t a glitch. It’s structural. LLMs understand patterns, not security boundaries.
2. “More advanced” models simply make different mistakes
Reasoning-focused models reduce some obvious issues but introduce new classes of subtle bugs, particularly concurrency bugs, error-handling gaps, and complex logic failures.
This aligns with a trend many engineering leaders are quietly feeling: AI is helping ship more code, faster — but also creating a larger, more complex attack surface.
3. The real problem emerging: complexity
One of the most striking findings: the highest-performing models generate the most code — in some cases 3–4× more than smaller models. With that comes:
- proportionally larger review surfaces
- more deeply buried flaws
- higher cognitive load for security teams
- new categories of defects that traditional scanners rarely see
We’re entering a phase where risk is not just “more issues,” but harder-to-detect issues. And let’s not kid ourselves — teams are already at cognitive overload.
4. Security teams are already at a breaking point
Across the industry, AppSec organizations are dealing with:
- unprecedented alert volume
- 30–45 minutes of manual triage per finding
- lack of context across multiple codebases
- regulatory frameworks demanding continuous evidence
The Sonar report underscores what everyone in security feels but hasn’t yet articulated publicly: LLM-driven development is accelerating, but our ability to validate and reason about risk is not.
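To put the 30–45 minute triage figure in perspective, here is a small back-of-the-envelope sketch. The backlog of 1,000 findings is a hypothetical number chosen for illustration; only the per-finding range comes from the list above.

```python
def triage_hours(findings: int, minutes_per_finding: float) -> float:
    """Total analyst hours needed to manually triage a backlog of findings."""
    return findings * minutes_per_finding / 60


# Hypothetical backlog size (an assumption, not a figure from the report).
backlog = 1_000

low = triage_hours(backlog, 30)   # lower bound: 30 minutes per finding
high = triage_hours(backlog, 45)  # upper bound: 45 minutes per finding

print(f"{backlog} findings: {low:.0f} to {high:.0f} analyst-hours of manual triage")
```

Under these assumptions, the backlog alone works out to several months of full-time work for a single analyst, before any new code is written.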
5. A new layer is clearly missing in the ecosystem
What the report really highlights — indirectly — is the gap that now exists:
- LLMs can generate code
- Scanners can flag patterns
- Humans still must interpret meaning, context, and real-world risk
None of today’s tools truly understand:
- how code connects across repositories
- which parts of a system matter most
- how vulnerabilities align with business logic
- how risks map to compliance obligations
- whether an issue is actually exploitable
Everything the industry relies on today still operates at the surface level — patterns in, alerts out.
As AI-generated code becomes normal, this gap only widens.
What’s missing isn’t just better detection — it’s a continuous understanding of risk exposure over time. Not just what issues exist, but which ones are reachable, exploitable, and impactful as systems evolve and code changes accumulate.
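As a rough illustration of what "reachable and impactful" could mean in practice, the sketch below filters findings by call-graph reachability and ranks the rest by impact. Everything in it is an assumption made for illustration: the `Finding` fields, the 1–5 impact score, and the toy call graph are hypothetical, and reachability is only a crude stand-in for real exploitability analysis.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule: str      # hypothetical scanner rule name
    function: str  # function that contains the flagged code
    impact: int    # hypothetical 1-5 business impact score


def reachable_from(call_graph, entry_points):
    """Return every function reachable from the given entry points."""
    seen = set(entry_points)
    stack = list(entry_points)
    while stack:
        caller = stack.pop()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


def prioritize(findings, call_graph, entry_points):
    """Keep findings in reachable code, highest impact first."""
    live = reachable_from(call_graph, entry_points)
    reachable = [f for f in findings if f.function in live]
    return sorted(reachable, key=lambda f: f.impact, reverse=True)


# Toy call graph: caller -> callees. "legacy_export" is never called
# from the exposed entry point, so findings under it are filtered out.
call_graph = {
    "handle_request": ["parse_input", "load_user"],
    "load_user": ["run_query"],
    "legacy_export": ["build_csv"],
}
findings = [
    Finding("sql-injection", "run_query", impact=5),
    Finding("hard-coded-secret", "build_csv", impact=4),
    Finding("weak-hash", "parse_input", impact=2),
]

ranked = prioritize(findings, call_graph, entry_points=["handle_request"])
for f in ranked:
    print(f.rule, f.function, f.impact)
```

Because code changes continually alter the call graph, a ranking like this would have to be recomputed as the system evolves, which is the "over time" part of the argument.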
6. The direction of the future is becoming obvious
The next evolution in software assurance won’t be another scanner, another rule set, or another LLM.
It will be a new intelligence layer that can:
- understand systems holistically
- reason about intent and risk
- validate what actually matters
- continuously track real risk exposure as software changes
This is where the industry will inevitably move — toward systems that reason, not just scan.
The details of how this happens are still emerging, but the message from the latest research is clear:
AI is rewriting how software is built. Now we need AI to help verify how software stays secure.
7. What this means for AppSec leaders today
If you’re responsible for product security, engineering quality, or compliance, the takeaway isn’t fear — it’s readiness:
- expect more code
- expect more hidden flaws
- expect more pressure from regulators
- expect traditional tooling to plateau
- expect validation and context to become the next bottleneck
And most importantly:
expect new architectures to emerge that treat assurance as a reasoning and context problem, not a scanning problem.
This is where AI shifts from being a code generator to acting as a set of AI security engineers — maintaining understanding, surfacing exposure, and enabling humans to make better decisions under constant change.
The groundwork is being laid now.
At Neuralsec.io, we’re building the intelligence layer that understands systems holistically and reasons about real-world risk — not just patterns.