HIGH | CWE-770 | AI Security

Rate Limiting Not Enforced on LLM Endpoint

Description

LLM-powered endpoints without rate limiting let attackers send unlimited requests, exhausting your API quota and running up large bills.

How Vezraa Detects It

We send a rapid burst of requests to your LLM endpoints and check whether rate limiting is enforced, as indicated by HTTP 429 (Too Many Requests) responses.
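A check of this kind can be approximated locally. The sketch below is not Vezraa's actual scanner; the function names, the burst size of 30, and the placeholder JSON payload are all assumptions, and it presumes Node 18+ for the global `fetch`.

```javascript
// Hypothetical probe: fire `count` requests at once and collect the HTTP status codes.
// Assumes Node 18+ (global fetch) and an endpoint that accepts a JSON POST body.
async function probeEndpoint(url, count = 30) {
  const requests = Array.from({ length: count }, () =>
    fetch(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message: 'ping' }) // placeholder payload
    }).then((res) => res.status)
  );
  return Promise.all(requests);
}

// Treat rate limiting as enforced if at least one response is HTTP 429.
function summarizeProbe(statusCodes) {
  const throttled = statusCodes.filter((code) => code === 429).length;
  return {
    total: statusCodes.length,
    throttled,
    rateLimited: throttled > 0
  };
}
```

Against an endpoint you own, something like `probeEndpoint('http://localhost:3000/api/chat').then(summarizeProbe).then(console.log)` would report how many of the burst requests were throttled.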

Real-World Impact

Attackers can drain your OpenAI or Anthropic budget in minutes. A single unprotected endpoint can cost thousands of dollars per hour.
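To see how costs reach that scale, here is a rough back-of-the-envelope estimate. Every number in it is a hypothetical placeholder (request rate, tokens per request, and price per million tokens), not a real provider price, and `estimateHourlyCost` is an illustrative helper.

```javascript
// All inputs are hypothetical placeholders, not real provider prices.
function estimateHourlyCost({ requestsPerSecond, tokensPerRequest, usdPerMillionTokens }) {
  const tokensPerHour = requestsPerSecond * 3600 * tokensPerRequest;
  return (tokensPerHour / 1_000_000) * usdPerMillionTokens;
}

// An attacker sustaining 50 requests/second at 2,000 tokens each,
// with an assumed price of $10 per million tokens.
const hourlyCost = estimateHourlyCost({
  requestsPerSecond: 50,
  tokensPerRequest: 2000,
  usdPerMillionTokens: 10
});
console.log(hourlyCost); // 3600
```

Under the same assumptions, a client capped at 10 requests per minute could make 600 requests in an hour, or roughly $12.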

Fix Example

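// Rate limiting with express-rate-limit
const express = require('express');
const { rateLimit } = require('express-rate-limit'); // named export (v6+)

const app = express();

// Allow each client at most 10 requests per minute
const limiter = rateLimit({
  windowMs: 60 * 1000, // 1-minute window
  max: 10, // requests allowed per window
  message: 'Too many requests' // body of the 429 response
});
// Apply the limiter to the chat route only
app.use('/api/chat', limiter);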
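To show what a limiter like this does conceptually, here is a minimal, dependency-free fixed-window counter. It is an illustrative sketch, not how express-rate-limit is implemented. The names `createFixedWindowLimiter` and `fixedWindowMiddleware` are hypothetical, and the injectable `now` clock is there only to make the logic easy to test.

```javascript
// Minimal in-memory fixed-window limiter (illustrative only).
function createFixedWindowLimiter({ windowMs, max, now = Date.now }) {
  const hits = new Map(); // key -> { count, windowStart }

  return function isAllowed(key) {
    const t = now();
    const entry = hits.get(key);
    // Start a fresh window if none exists or the previous one has expired.
    if (!entry || t - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: t });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

// Hypothetical Express-style middleware keyed by client IP.
function fixedWindowMiddleware(options) {
  const isAllowed = createFixedWindowLimiter(options);
  return (req, res, next) => {
    if (isAllowed(req.ip)) return next();
    res.status(429).send('Too many requests');
  };
}
```

Because the counters live in process memory, this approach only works for a single long-running server. Deployments with multiple instances or serverless functions would likely need a shared store so that all instances see the same counts.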

Affected Stacks

OpenAI, Anthropic, LLM APIs, Next.js

References

OWASP API Security Project: https://owasp.org/www-project-api-security/
CWE-770: Allocation of Resources Without Limits or Throttling
