
MCP Prompt Hijacking: The New AI Protocol Threat CIOs and CISOs Can’t Ignore

  • 30 October, 2025

Overview: What is MCP prompt hijacking?

MCP prompt hijacking is a sneaky, practical attack that targets the communication layer between AI models and the services that feed them — specifically the Model Context Protocol (MCP). From what I've seen in real-world incident reviews, attackers rarely need to break a model's weights. Instead they mess with the session and message plumbing so the assistant starts doing things it shouldn't. Security researchers at JFrog demonstrated this with an MCP implementation flaw that lets an attacker impersonate sessions and steer assistant behavior — not by changing the model, but by changing the conversation around it.

Why this matters to business leaders and security teams

Connecting AI assistants to internal systems, code repos, and developer tools is sensible and increasingly common. It makes AI actually useful day-to-day. But it's also expanded the attack surface in ways many teams don't appreciate. In my experience, organizations obsess over model safety and forget the middleware — the protocol layer that hands the model context. That's where predictable identifiers, weak session handling, and sloppy event semantics bite you. Those flaws let attackers manipulate outputs, quietly exfiltrate data, or even poison supply chains — all without ever touching the model itself.

How MCP was meant to help — and where it goes wrong

MCP was conceived to give agents structured, controlled access to live files, developer tools, and external services — basically to point the assistant at the right context so it can make responsible, accurate suggestions. Great idea. It just depends on the implementation.

JFrog's research highlighted a glaring implementation problem (oatpp-mcp, tracked as CVE-2025-6515). Instead of secure, random session IDs, that library used memory addresses. Memory addresses are not random — they're predictable and sometimes recycled. That makes session impersonation disturbingly easy. In short: a perfectly reasonable protocol, a careless implementation, and suddenly you have an attack surface nobody expected.

How MCP prompt hijacking works — a simplified step-by-step

  • Session assignment: Clients connect, typically using Server-Sent Events (SSE), and the server hands back a session identifier.
  • Predictable IDs: The flawed implementation used pointer/memory addresses as IDs — which can be reused or guessed.
  • Pattern harvesting: An attacker spins up and tears down sessions fast to learn which addresses get recycled and to record those IDs.
  • Impersonation: Later the attacker replays a recorded ID and sends malicious MCP events that the server accepts as belonging to a legitimate session.
  • Impact: The assistant receives the injected prompts/responses and acts on them: bad package suggestions, malicious code snippets, leaking internal data, you name it.
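The "pattern harvesting" step above can be sketched in a few lines. This is an illustration, not the oatpp-mcp code: CPython's `id()` happens to return an object's memory address, so it stands in for a server that mints session IDs from heap pointers. Each temporary object is freed immediately, the allocator hands the same addresses back out, and that recycling is exactly what an attacker harvests.

```python
def pointer_style_session_id() -> int:
    # A new "session" object whose memory address becomes the session ID --
    # the same class of mistake as using a pointer value as an identifier.
    return id(object())

# Harvest 1,000 session IDs the way an attacker would: open, record, close.
harvested = [pointer_style_session_id() for _ in range(1_000)]
unique = set(harvested)

print(f"1,000 sessions issued, only {len(unique)} distinct IDs")
```

The tiny number of distinct IDs shows how quickly an attacker builds a replayable pool: once addresses repeat, impersonating a later legitimate session is a matter of replaying a recorded value.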

Concrete example: supply chain manipulation

Picture a developer asking the assistant, “Which Python package should I use to process images?” A safe assistant points to Pillow. But if MCP is hijacked, the attacker can make the assistant recommend a fake package — theBestImageProcessingPackage — that pulls malicious code during install. That’s a classic supply-chain compromise, and it’s shockingly plausible. I’ve seen similar patterns in dependency confusion incidents — same playbook, different layer.
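One cheap mitigation for this exact scenario is to never act on a package suggestion without checking it against a vetted list. A hypothetical guard (the names and list contents here are illustrative, not a real tool):

```python
# Curated by your org -- suggestions from the assistant are advisory,
# the allowlist is authoritative.
APPROVED_PACKAGES = {"pillow", "numpy", "requests"}

def vet_suggestion(package_name: str) -> bool:
    """Return True only if the suggested package is explicitly approved."""
    return package_name.lower() in APPROVED_PACKAGES

assert vet_suggestion("Pillow")                              # legitimate suggestion passes
assert not vet_suggestion("theBestImageProcessingPackage")   # hijacked suggestion blocked
```

The point of the design is that a hijacked recommendation channel can only choose among packages you already trust.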

Why protocol-layer attacks are especially dangerous

  • They don’t touch the model — they manipulate inputs and outputs at runtime.
  • They’re stealthy: injected prompts can look perfectly normal in user-facing logs.
  • They scale: a single bug in a widely used library or middleware can hit many deployments at once.

Technical root cause in the reported case

The oatpp-mcp issue boiled down to two classic mistakes: using a memory address as a session identifier and relying on overly predictable SSE event numbers. When session IDs aren't cryptographically random, they can be discovered or reused. When event IDs are simple incrementing counters, an attacker who observes one value can predict the next, or simply spray guesses until one sticks. Combined, the two flaws allow session impersonation and event forgery.
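The contrast between the two event-ID designs is easy to show. This is a sketch of the failure mode, not the library's actual code: a sequential counter lets an attacker predict the next valid ID outright, while a random token makes every guess a shot in a 2^128 space.

```python
import secrets

class SequentialEvents:
    """Predictable design: an incrementing counter as the event ID."""
    def __init__(self) -> None:
        self.counter = 0
    def next_id(self) -> str:
        self.counter += 1
        return str(self.counter)

class RandomEvents:
    """Hardened design: 128 bits of CSPRNG output per event."""
    @staticmethod
    def next_id() -> str:
        return secrets.token_hex(16)

seq = SequentialEvents()
observed = seq.next_id()          # attacker sees one event ID...
forged = str(int(observed) + 1)   # ...and trivially computes the next one
assert forged == seq.next_id()    # forgery succeeds against the counter

# With random IDs, observing past values gives no help predicting future ones.
assert RandomEvents.next_id() != RandomEvents.next_id()
```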

What security leaders should do now

Below are practical steps CIOs, CISOs, and engineering teams should adopt immediately. These aren’t academic recommendations — they’re the same defensive moves that have stopped session hijacking and API attacks for a decade, applied to the AI layer.

  • Enforce secure session IDs: Generate session identifiers with cryptographically secure randomness and make them long enough to resist brute force. No pointers, no guessable sequences. Period.
  • Harden event design: Avoid sequential event IDs. Use unpredictable identifiers and put message authentication on critical events (HMACs are your friend here).
  • Adopt zero-trust for AI channels: Treat MCP endpoints like any other external API—mTLS where feasible, strict auth, and least-privilege access to sensitive resources.
  • Client-side validation: Have clients reject out-of-band events that don't match expected formats, IDs, or cryptographic checksums. Clients need to be guardians too.
  • Session lifecycle policies: Short-lived sessions, token rotation, and strict expirations dramatically reduce the value of a stolen ID.
  • Supply-chain hygiene: Monitor dependency usage, require signed packages, and consider allowlists for package names in developer tools. You don't want an assistant suggesting anything from the wild west.
  • Patch and inventory: Discover components that implement MCP or related middleware across your estate and prioritize patching known CVEs such as CVE-2025-6515.
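The first two recommendations are a few lines of code each. A minimal sketch (not any specific MCP library's API — the key handling and message framing here are assumptions): CSPRNG-generated session IDs, plus an HMAC tag on every event so a forged event, or one replayed against another session, fails verification.

```python
import hashlib
import hmac
import secrets

# Per-deployment secret, kept server-side. In production this would come
# from a secrets manager, not module scope.
SERVER_KEY = secrets.token_bytes(32)

def new_session_id() -> str:
    # 256 bits of cryptographic randomness -- no pointers, no counters.
    return secrets.token_urlsafe(32)

def sign_event(session_id: str, payload: bytes) -> str:
    # Bind the tag to both the session and the payload so an event can't
    # be replayed into a different session.
    msg = session_id.encode() + b"\x00" + payload
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()

def verify_event(session_id: str, payload: bytes, tag: str) -> bool:
    expected = sign_event(session_id, payload)
    return hmac.compare_digest(expected, tag)  # constant-time comparison

sid = new_session_id()
tag = sign_event(sid, b'{"event":"tool_result"}')
assert verify_event(sid, b'{"event":"tool_result"}', tag)               # legitimate event
assert not verify_event("guessed-id", b'{"event":"tool_result"}', tag)  # impersonation fails
```

Note the `compare_digest` call: a naive `==` comparison can leak timing information, which is the same class of "classic mistake" this whole incident turns on.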

Operational checklist for rapid response

  • Inventory MCP-enabled services and libraries across your estate. You can’t secure what you don’t know you have.
  • Apply vendor patches and mitigations urgently for known CVEs.
  • Enable strict logging and anomaly detection on MCP endpoints — watch for weird event patterns and ID reuse.
  • Simulate hijacking in a safe test environment to validate defenses, then remediate gaps uncovered in the exercise.
  • Train developers about secure session generation and the hazards of predictable identifiers — make it part of dev onboarding and code reviews.
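For the logging and anomaly-detection item, even a toy detector catches the most telltale signal of this attack: the same session ID appearing from more than one client. A sketch (the log field names are assumptions, not a real schema):

```python
from collections import defaultdict

def find_reused_sessions(events: list[dict]) -> set[str]:
    """Flag session IDs observed from more than one client address --
    a hallmark of replayed or recycled identifiers."""
    clients_per_session: defaultdict[str, set[str]] = defaultdict(set)
    for ev in events:
        clients_per_session[ev["session_id"]].add(ev["client_ip"])
    return {sid for sid, clients in clients_per_session.items() if len(clients) > 1}

log = [
    {"session_id": "abc123", "client_ip": "10.0.0.5"},
    {"session_id": "abc123", "client_ip": "10.0.0.5"},    # same client: fine
    {"session_id": "def456", "client_ip": "10.0.0.7"},
    {"session_id": "def456", "client_ip": "203.0.113.9"}, # second client: suspicious
]
assert find_reused_sessions(log) == {"def456"}
```

In practice you'd run this over streaming MCP endpoint logs and alert rather than assert, but the shape of the check is the same.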

Broader lessons: apply classic security to new AI layers

This is basically a web session-hijack story retold for AI protocols. The cure is not exotic. Strong session management, robust authentication, and zero-trust principles applied to protocol and middleware layers go a long way. As assistants move from isolated models to integrated systems, the connectors — the things that feed context to models — become as critical to secure as the models themselves. Kind of obvious in hindsight, but easy to overlook in the rush to ship features.

Further reading and sources

If you want the gritty technical detail, read the JFrog research disclosure and the NVD entry for CVE-2025-6515. Also check the MCP spec and vendor advisories as they appear. These sources give the attack mechanics and recommended mitigations. [Source: JFrog research] [Source: NVD/CVE-2025-6515]

Final takeaway

Prompt hijacking via MCP is a fresh-sounding vector grounded in old mistakes: insecure session handling and predictable identifiers. Leaders need to assume the attack surface now includes protocol middleware, not just models. Fix the fundamentals — secure session generation, message validation, and zero-trust access — and you cut the risk dramatically. Bottom line: secure the plumbing or risk the house flooding.

Personal note: In projects I’ve overseen, rolling out HMAC-signed event envelopes and rotating session tokens cut down suspicious event patterns in weeks. It’s not glamorous work. But it keeps your AI assistants honest and your devs safer. Worth every hour.