Tool	Accuracy of Findings	Detects Non-Pattern-Based Issues?	Coverage of SAST Findings	Speed of Scanning	Usability & Dev Experience
DryRun Security	Very high – caught multiple critical issues missed by others	Yes – context-based analysis, logic flaws & SSRF	Broad coverage of standard vulns, logic flaws, and extendable	Near real-time PR feedback	Clear PR comments, expandable policies with no scripting or coding (NLCP)
Snyk Code	High on well-known patterns (SQLi, XSS), but misses other categories	Limited – AI-based, focuses on recognized vulnerabilities	Good coverage of standard vulns; may miss SSRF or advanced auth logic issues	Fast, often near PR speed	Decent GitHub integration, but rules are a black box
GitHub Advanced Security (CodeQL)	Very high precision for known queries, low false positives	Partial – strong dataflow for known issues, needs custom queries	Good for SQLi and XSS but logic flaws require advanced CodeQL experience.	Moderate to slow (GitHub Action based)	Requires CodeQL expertise for custom logic
Semgrep	Medium, but there is a good community for adding rules	Primarily pattern-based with limited dataflow	Decent coverage with the right rules, can still miss advanced logic or SSRF	Fast scans	Has custom rules, but dev teams must maintain them
SonarQube	Low – misses serious issues in our testing	Limited – mostly pattern-based, code quality oriented	Basic coverage for standard vulns, many hotspots require manual review	Moderate, usually in CI	Dashboard-based approach, can pass “quality gate” despite real vulns

Tool

Accuracy of Findings

Detects Non-Pattern-Based Issues?

Coverage of SAST Findings

Speed of Scanning

Usability & Dev Experience

DryRun Security

Very high – caught multiple critical issues missed by others

Yes – context-based analysis, logic flaws & SSRF

Broad coverage of standard vulns, logic flaws, and extendable

Near real-time PR feedback

Clear PR comments, expandable policies with no scripting or coding (NLCP)

Snyk Code

High on well-known patterns (SQLi, XSS), but misses other categories

Limited – AI-based, focuses on recognized vulnerabilities

Good coverage of standard vulns; may miss SSRF or advanced auth logic issues

Fast, often near PR speed

Decent GitHub integration, but rules are a black box

GitHub Advanced Security (CodeQL)

Very high precision for known queries, low false positives

Partial – strong dataflow for known issues, needs custom queries

Good for SQLi and XSS but logic flaws require advanced CodeQL experience.

Moderate to slow (GitHub Action based)

Requires CodeQL expertise for custom logic

Semgrep

Medium, but there is a good community for adding rules

Primarily pattern-based with limited dataflow

Decent coverage with the right rules, can still miss advanced logic or SSRF

Fast scans

Has custom rules, but dev teams must maintain them

SonarQube

Low – misses serious issues in our testing

Limited – mostly pattern-based, code quality oriented

Basic coverage for standard vulns, many hotspots require manual review

Moderate, usually in CI

Dashboard-based approach, can pass “quality gate” despite real vulns

Vulnerability Class

Snyk (partial)

GitHub (CodeQL) (partial)

Semgrep

SonarQube

DryRun Security

SQL Injection

Cross-Site Scripting (XSS)

SSRF

Auth Flaw / IDOR

User Enumeration

Hardcoded Token

Tool	Accuracy of Findings	Detects Non-Pattern-Based Issues?	Coverage of C# Vulnerabilities	Scan Speed	Developer Experience
DryRun Security	Very high – caught all critical flaws missed by others	Yes – context-based analysis finds logic errors, auth flaws, etc.	Broad coverage of OWASP Top 10 vulns plus business logic issues	Near real-time (PR comment within seconds)	Clear single PR comment with detailed insights; no config or custom scripts needed
Snyk Code	High on known patterns (SQLi, XSS), but misses logic/flow bugs	Limited – focuses on recognizable vulnerability patterns	Good for standard vulns; may miss SSRF or auth logic issues	Fast (integrates into PR checks)	Decent GitHub integration, but rules are a black box (no easy customization)
GitHub Advanced Security (CodeQL)	Low - missed everything except SQL Injection	Mostly pattern-based	Low – only discovered SQL Injection	Slowest of all but finished in 1 minute	Concise annotation with a suggested fix and optional auto-remedation
Semgrep	Medium – finds common issues with community rules, some misses	Primarily pattern-based, limited data flow analysis	Decent coverage with the right rules; misses advanced logic flaws	Very fast (runs as lightweight CI)	Custom rules possible, but require maintenance and security expertise
SonarQube	Low – missed serious issues in our testing	Mostly pattern-based (code quality focus)	Basic coverage for known vulns; many issues flagged as “hotspots” require manual review	Moderate (runs in CI/CD pipeline)	Results in dashboard; risk of false sense of security if quality gate passes despite vulnerabilities

Tool

Accuracy of Findings

Detects Non-Pattern-Based Issues?

Coverage of C# Vulnerabilities

Scan Speed

Developer Experience

DryRun Security

Very high – caught all critical flaws missed by others

Yes – context-based analysis finds logic errors, auth flaws, etc.

Broad coverage of OWASP Top 10 vulns plus business logic issues

Near real-time (PR comment within seconds)

Clear single PR comment with detailed insights; no config or custom scripts needed

Snyk Code

High on known patterns (SQLi, XSS), but misses logic/flow bugs

Limited – focuses on recognizable vulnerability patterns

Good for standard vulns; may miss SSRF or auth logic issues

Fast (integrates into PR checks)

Decent GitHub integration, but rules are a black box (no easy customization)

GitHub Advanced Security (CodeQL)

Low - missed everything except SQL Injection

Mostly pattern-based

Low – only discovered SQL Injection

Slowest of all but finished in 1 minute

Concise annotation with a suggested fix and optional auto-remedation

Semgrep

Medium – finds common issues with community rules, some misses

Primarily pattern-based, limited data flow analysis

Decent coverage with the right rules; misses advanced logic flaws

Very fast (runs as lightweight CI)

Custom rules possible, but require maintenance and security expertise

SonarQube

Low – missed serious issues in our testing

Mostly pattern-based (code quality focus)

Basic coverage for known vulns; many issues flagged as “hotspots” require manual review

Moderate (runs in CI/CD pipeline)

Results in dashboard; risk of false sense of security if quality gate passes despite vulnerabilities

Vulnerability Class

Snyk Code

GitHub Advanced Security (CodeQL)

Semgrep

SonarQube

DryRun Security

SQL Injection (SQLi)

Cross-Site Scripting (XSS)

Server-Side Request Forgery (SSRF)

Auth Logic/IDOR

User Enumeration

Hardcoded Credentials

Vulnerability

DryRun Security

Semgrep

GitHub CodeQL

SonarQube

Snyk Code

1. Remote Code Execution via Unsafe Deserialization

2. Code Injection via eval() Usage

3. SQL Injection in a Raw Database Query

4. Weak Encryption (AES ECB Mode)

5. Broken Access Control / Logic Flaw in Authentication

Total Found

5/5

3/5

1/5

0/5

Vulnerability

DryRun Security

Snyk

CodeQL

SonarQube

Semgrep

Server-Side Request Forgery (SSRF)

(Hotspot)

Cross-Site Scripting (XSS)

SQL Injection (SQLi)

IDOR / Broken Access Control

Broken Authentication Logic

Invalid Token Validation Logic

Broken Email Verification Logic

Dimension	Why It Matters
Surface	Entry points & data sources highlight tainted flows early.
Language	Code idioms reveal hidden sinks and framework quirks.
Intent	What is the purpose of the code being changed/added?
Design	Robustness and resilience of changing code.
Environment	Libraries, build flags, and infra metadata flag, infrastructure (IaC) all give clues around the risks in changing code.

Dimension

Why It Matters

Surface

Entry points & data sources highlight tainted flows early.

Language

Code idioms reveal hidden sinks and framework quirks.

Intent

What is the purpose of the code being changed/added?

Design

Robustness and resilience of changing code.

Environment

Libraries, build flags, and infra metadata flag, infrastructure (IaC) all give clues around the risks in changing code.

KPI	Pattern-Based SAST	DryRun CSA
Mean Time to Regex	3–8 hrs per noisy finding set	Not required
Mean Time to Context	N/A	< 1 min
False-Positive Rate	50–85 %	< 5 %
Logic-Flaw Detection	< 5 %	90%+

KPI

Pattern-Based SAST

DryRun CSA

Mean Time to Regex

3–8 hrs per noisy finding set

Not required

Mean Time to Context

N/A

< 1 min

False-Positive Rate

50–85 %

< 5 %

Logic-Flaw Detection

< 5 %

90%+

	Severity
Location	utils/authorization.py :L118	utils/authorization.py :L49 & L82 & L164
Issue	JWT Algorithm Confusion Attack: jwt.decode() selects the algorithm from unverified JWT headers.	Insecure OIDC Endpoint Communication: ‍urllib.request.urlopen called without explicit TLS/CA handling.
Impact	Complete auth bypass (switch RS256→HS256, forge tokens with public key as HMAC secret).	Susceptible to MITM if default SSL behavior is weakened or cert store compromised.
Remediation	Replace the dynamic algorithm selection with a fixed, expected algorithm list. Change line 118 from algorithms=[unverified_header.get('alg', 'RS256')] to algorithms=['RS256'] to only accept RS256 tokens. Add algorithm validation before token verification to ensure the header algorithm matches expected values.	Create a secure SSL context using ssl.create_default_context() with proper certificate verification. Configure explicit timeout values for all HTTP requests to prevent hanging connections. Add explicit SSL/TLS configuration by creating an HTTPSHandler with the secure SSL context. Implement proper error handling specifically for SSL certificate validation failures.
Key Insight	This vulnerability arises from trusting an unverified portion of the JWT to determine the verification method itself	This vulnerability stems from a lack of explicit secure communication practices, leaving the application reliant on potentially weak default behaviors.

Severity

Critical

High

Location

utils/authorization.py :L118

utils/authorization.py :L49 & L82 & L164

Issue

JWT Algorithm Confusion Attack:
jwt.decode() selects the algorithm from unverified JWT headers.

Insecure OIDC Endpoint Communication:
‍urllib.request.urlopen called without explicit TLS/CA handling.

Impact

Complete auth bypass (switch RS256→HS256, forge tokens with public key as HMAC secret).

Susceptible to MITM if default SSL behavior is weakened or cert store compromised.

Remediation

Replace the dynamic algorithm selection with a fixed, expected algorithm list. Change line 118 from algorithms=[unverified_header.get('alg', 'RS256')] to algorithms=['RS256'] to only accept RS256 tokens. Add algorithm validation before token verification to ensure the header algorithm matches expected values.

Create a secure SSL context using ssl.create_default_context() with proper certificate verification. Configure explicit timeout values for all HTTP requests to prevent hanging connections. Add explicit SSL/TLS configuration by creating an HTTPSHandler with the secure SSL context. Implement proper error handling specifically for SSL certificate validation failures.

Key Insight

This vulnerability arises from trusting an unverified portion of the JWT to determine the verification method itself

This vulnerability stems from a lack of explicit secure communication practices, leaving the application reliant on potentially weak default behaviors.

AI in AppSec

•

December 8, 2025

Introducing the LLM App Risk White Paper

Why we wrote “Building Secure AI Applications” and how it helps teams build secure LLM features without slowing down

‍

Everywhere I go, I meet teams who are building with LLMs even if they do not describe it that way. They might say they are adding a new search feature, or experimenting with an internal copilot, or stitching a model into an existing workflow. The label changes, but the truth is the same. Modern software now has an LLM somewhere inside it. This shift is happening fast, often faster than the security thinking behind it.

As we spent the last year talking with customers and prospects, a pattern kept showing up. Teams were treating LLM features the way they used to treat web apps or APIs. They were assuming the threat model was similar. It is not.

When you add a model to your product, you are no longer only dealing with input validation and classic injection flaws. You are dealing with unpredictable behavior, semantic vulnerabilities, retrieval-based mistakes, and agents that can take actions in ways traditional code scanners were never designed to understand.

The OWASP Top 10 for LLM Applications captured this problem clearly. Prompt injection. Data and model poisoning. System prompt leakage. Misinformation. Unbounded consumption. Anyone who has spent time with real production LLM systems knows these are not theoretical risks. They appear the moment a model starts touching real code, real data, and real users.

We wrote this white paper to give engineering and security leaders a practical guide to understanding this new class of risk.

We wanted to show where these failures actually happen in real systems, not in abstract research. We also wanted to give teams a sense of what good looks like. Not perfection but a path. Something they can point to when they are building or reviewing design decisions.

What to expect

Inside the white paper, we walk through every OWASP LLM risk. We explain what the risk is, why it matters in real environments, and how it has shown up in actual incidents. We talk about what attackers are doing today and what controls help most. The paper also includes a reference architecture that shows where guardrails belong.

*Reference architecture with OWASP Top 10 for LLMs mapping*

‍

For example, how a policy layer outside the model can mediate prompts, how a tool proxy lets you enforce least privilege for agents, how vector stores become a new data perimeter, and why you need structured outputs before anything reaches a downstream system.

Teams kept telling us they needed something actionable. So we also included a control checklist. It maps each OWASP risk directly to the part of the system where the mitigation belongs. API gateway, orchestrator, policy layer, retrieval pipeline, vector store, SDLC, and monitoring. If you are wondering who should own what or where the work should sit, this helps clarify it.

One of the highlights of the project came from a conversation with Adam Dyche at Commerce. They are deep into AI driven shopping experiences and they are doing it with intent. Adam said that OWASP LLM risks are really about context and that they wanted to build security in from the beginning.

Adam said DryRun Security outperformed every other tool they tested because it understood the code the way their engineers did.

It was a reminder to me that teams do not want magic. They want clarity. They want tools that see the real environment they are building, not a generic pattern matcher that was designed for yesterday’s applications.

The truth is simple. Most teams are shipping LLM features faster than their security practices are evolving. It’s not because they do not care. It’s because the old mental models no longer fit. Traditional code scanners miss the majority of LLM specific vulnerabilities because they were never designed to inspect model planners, tool interfaces, RAG pipelines, embeddings, or agent frameworks. They cannot reason about behavior that emerges at runtime. They can only look for the patterns they already know.

Our work at DryRun sits directly in that gap. We focus on context and code intent. We help teams catch missing guardrails, unsafe prompt handling, weak tool usage, insecure RAG pipelines, over privileged agents, and code paths that can lead to runaway token use. These are problems that only show up when the model meets the application. And that is exactly where teams are struggling today.

But this white paper is not about us. It is about the shift happening across the industry. Developers are now responsible for part of the AI attack surface. Security teams are learning a new language.

Everyone is trying to figure out what good looks like.

Our goal is to give you a clear starting point and a map you can use no matter what tools you choose.

If your team is building with LLMs today or planning to, this paper will give you a deep understanding of the top ten risks, a proven reference architecture, and a checklist you can use during design, review, and implementation. It will help you build faster with fewer surprises and with the guardrails needed for systems that behave differently from anything we have built before.

You can read it here: Building Secure AI Applications

My hope is that it helps teams ship great AI features without losing sight of security in the process. This new era of software is full of opportunity. We just need to build it with clear thinking, strong controls, and a willingness to evolve our practices alongside the technology.

James Wickett

CEO & Co-Founder

DryRun Security

No items found.

Introducing the LLM App Risk White Paper

What to expect

Related Blogs

The Half Life (and Decay) of Static Rules in a Modern Codebase

7 Mistakes Teams Make When Building AI Applications