    How do I stop Windsurf Cascade from leaking data without my consent?

    You read about Cascade's read_url_content disclosure, looked at your own setup, and realised the agent has been running outbound HTTPS without ever pausing for approval. The question now is not whether the issue is real. It is how to harden your workflow before the next indirect prompt injection lands in a README you never opened.

    Short answer

    Protecting an AI coding agent against unauthorized data exfiltration is a layered job. Pin manual approval on every network tool, push secrets out of the project tree, run the agent behind an outbound proxy with a per-domain allowlist, sandbox the workspace from the rest of the filesystem, and audit every dependency before the agent reads it. No single control closes the gap on its own. The Windsurf Cascade read_url_content disclosure from May 2025 made the cost of skipping any one of them concrete.

    What you should know

    • The exfiltration channel is the network tool, not the model. read_url_content fetches arbitrary URLs and accepts attacker-controlled query strings. The model decides what to send.
    • Indirect prompt injection sits in data you did not write. A README in a transitive dependency, a docstring, or an invisible Unicode comment all carry instructions Cascade reads as directives.
    • Approval caching is the silent failure mode. Per NVIDIA's sandboxing guidance, a single legitimate approval can immediately open a path to repeat abuse if the IDE caches the decision.
    • CVE-2025-62353 widens the blast radius. The path traversal flaw, scored CVSS 9.8 by HiddenLayer in October 2025, means files outside the project root are also reachable. Secrets in your home directory are not safe just because the workspace looks clean.
    • OWASP recommends tiered approval by impact. Network egress and file writes belong in the medium and high tiers, with explicit per-call confirmation, not auto-approve.
    • Mobile builders carry signing keys. App Store Connect API keys, Google Play service-account JSON, and Firebase admin tokens are the kind of artifact a vibe-coded project keeps nearby, and exactly the kind a leak hits hardest.
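
    The tiered-approval and approval-caching points in the list above can be sketched as a small policy function. This is a minimal illustration, not Windsurf's or OWASP's implementation: the tier assignments, the `decide` helper, and every tool name other than read_url_content and codebase_search are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOW = "low"        # read-only work inside the workspace
    MEDIUM = "medium"  # file writes
    HIGH = "high"      # network egress and shell execution


# Hypothetical mapping: write_file and run_command are illustrative names,
# and the tier chosen for each tool is an assumption, not a vendor default.
TOOL_TIERS = {
    "codebase_search": Tier.LOW,
    "write_file": Tier.MEDIUM,
    "read_url_content": Tier.HIGH,
    "run_command": Tier.HIGH,
}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def decide(tool: str, user_approved_this_call: bool) -> Decision:
    """Decide whether one tool call may proceed.

    Unknown tools are treated as HIGH. Nothing is cached: the caller has to
    pass a fresh approval answer for every medium or high impact call.
    """
    tier = TOOL_TIERS.get(tool, Tier.HIGH)
    if tier is Tier.LOW:
        return Decision(True, "low impact, auto-approved")
    if user_approved_this_call:
        return Decision(True, f"{tier.value} impact, approved for this call only")
    return Decision(False, f"{tier.value} impact, needs explicit approval")
```

    Because `decide` keeps no state between calls, one legitimate approval cannot silently authorise the next request, which is the caching failure described above.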

    The phrase "without my consent" covers two distinct behaviours. The first is that Cascade's network tools, read_url_content above all, run without per-call approval in default Windsurf builds. The second is that Cascade interprets file contents as instructions, so the consent step is moved upstream from "did you approve this HTTP request" to "did you approve every file the agent read along the way".

    The disclosure that surfaced this was "Hijacking Windsurf: How Prompt Injection Leaks Developer Secrets" by Johann Rehberger at Embrace The Red, published in August 2025 after three months of triage silence from Windsurf. The proof of concept hid instructions in a README, asked the agent to summarise the project, and watched Cascade fetch a URL with .env contents in the query string. The user never typed a URL. The agent never asked.

    What makes the consent flow misleading is that the IDE shows the visible action (a summary) while the side effect (an HTTPS request) runs first. By the time the human sees the agent's reply, the exfiltration has already gone through.

    Which defense layers actually stop outbound exfiltration?

    The honest answer is that no single control closes the path. The OWASP AI Agent Security Cheat Sheet groups the controls into authorization middleware, tool restrictions, output validation, and monitoring. NVIDIA's practical sandboxing guidance for agentic workflows adds two specific mandatory controls: network egress restrictions enforced through HTTP proxy or IP filtering, and filesystem write blocks outside the workspace. Each layer catches a different failure mode.

    A short comparison helps:

    | Layer | What it blocks | What it does not |
    | --- | --- | --- |
    | Per-call approval on network tools | Silent outbound HTTP from the agent | Approved actions inside an injected sequence |
    | Egress allowlist at the proxy or firewall | Traffic to unrecognised domains | Traffic that piggybacks on an allowed domain |
    | Filesystem isolation outside the workspace | Reads of ~/.aws, ~/.ssh, ambient secrets | Files the user opens into the workspace |
    | Project tree hygiene (no secrets, audited READMEs) | Indirect prompt injection from project files | Injections from imported library code |
    | Workspace sandbox (VM, Kata container, devcontainer) | Persistence and kernel escapes | Information already in scope for the agent |

    Reading the table top to bottom, each row catches a class of leak the rows above it miss. Skipping one row creates a category of leak the others cannot see.

    How do you set up a network egress allowlist for an AI coding agent?

    The practical pattern is a workspace-local proxy that the agent's HTTPS calls are forced through, with an explicit list of domains the agent is allowed to reach. NVIDIA's guidance is specific: "tightly scoped allowlists enforced through HTTP proxy, IP, or port-based controls" combined with "enterprise-level denylists that cannot be overridden by local users".

    On macOS or Linux, a project-scoped approach is to run the IDE inside a devcontainer or VM with HTTP_PROXY and HTTPS_PROXY pointed at a local mitmproxy or squid instance configured with the allowlist. On corporate networks, an outbound proxy with TLS inspection and domain-level rules can apply the same control at a wider scope. For an agent that needs to fetch package documentation, the allowlist typically includes npmjs.com, pypi.org, developer.apple.com, developer.android.com, and the small set of vendor docs the project actually depends on. Everything else is blocked.
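
    The allowlist decision itself is small enough to sketch. The function below is an assumption-laden illustration of a default-deny check that a proxy hook could call; it is not mitmproxy or squid configuration. The `is_allowed` name is made up, and the domain set simply reuses the examples named above.

```python
from urllib.parse import urlsplit

# Example allowlist built from the domains named in the text; adjust per project.
ALLOWED_DOMAINS = frozenset({
    "npmjs.com",
    "pypi.org",
    "developer.apple.com",
    "developer.android.com",
})


def is_allowed(url: str, allowed: frozenset = ALLOWED_DOMAINS) -> bool:
    """Return True only for HTTPS URLs whose host is an allowed domain
    or a subdomain of one. Everything else is denied by default."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    # Match the exact domain or a dot-separated subdomain, so that
    # "pypi.org.evil.example" and "notpypi.org" are both rejected.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

    Accepting subdomains is a design choice that trades strictness for convenience; a tighter policy would match exact hostnames only.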

    The limit is that an allowed domain can still be a carrier. If gist.github.com is on the list because you read public gists during research, an injection can pack stolen data into the query string of a gist URL and route the exfiltration through it. The proxy reduces the surface; it does not eliminate it.
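
    One way to narrow the carrier problem is to inspect the query string of requests that pass the allowlist. The sketch below flags parameter values that are unusually long or look random. The thresholds and the `suspicious_query_values` helper are assumptions chosen for illustration, and a heuristic like this will miss data hidden in the path or encoded to look ordinary.

```python
import math
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character of the string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def suspicious_query_values(url: str, max_len: int = 64,
                            max_entropy: float = 4.0) -> list:
    """Return names of query parameters whose values look like smuggled data.

    A value is flagged when it is longer than max_len, or when it is at
    least 20 characters long and its entropy exceeds max_entropy.
    Both thresholds are illustrative defaults, not tuned values.
    """
    flagged = []
    query = urlsplit(url).query
    for name, value in parse_qsl(query, keep_blank_values=True):
        too_long = len(value) > max_len
        looks_random = len(value) >= 20 and shannon_entropy(value) > max_entropy
        if too_long or looks_random:
            flagged.append(name)
    return flagged
```

    A proxy could log or block any allowed-domain request for which this returns a non-empty list, then leave the final call to a human.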

    What should leave your project tree before Cascade ever opens the folder?

    Indirect prompt injection works because the agent treats project files as input. The defensive move is to keep secrets and signing material out of the directories Cascade can read. Concretely:

    • Move .env, .env.local, signing keys, certificates, and provisioning profiles to a location outside the workspace, ideally to the platform keychain or a managed secrets service that requires an explicit fetch.
    • Stop committing example .env files that include real-looking placeholder names; the placeholders prime the agent to look for the matching real file.
    • Audit the README and any vendored licence files for hidden Unicode characters. The follow-up write-up, "Sneaking Invisible Instructions by Developers in Windsurf", shows that Cascade reads Unicode Tag characters as agent commands even though they render as blank space in the editor.
    • Pin auto-execute mode off at the workspace level so the terminal tool does not run shell commands without per-call approval. This does not close the network fetch path on its own, but it stops a compromised session from rotating signing material or installing a persistent backdoor.
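
    The Unicode audit in the list above can be automated. The sketch below looks only for code points in the Unicode Tags block (U+E0000 to U+E007F) and decodes the ones that mirror printable ASCII. The function names are hypothetical, and other invisible characters such as zero-width spaces are outside its scope.

```python
from pathlib import Path

TAG_START, TAG_END = 0xE0000, 0xE007F  # Unicode "Tags" block


def find_tag_characters(text: str) -> list:
    """Return (line, column, decoded_char) for every Unicode Tag character.

    Tag code points U+E0020..U+E007E mirror printable ASCII, so subtracting
    0xE0000 recovers the hidden letter. Other tag code points become "?".
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            if TAG_START <= cp <= TAG_END:
                ascii_cp = cp - TAG_START
                decoded = chr(ascii_cp) if 0x20 <= ascii_cp <= 0x7E else "?"
                hits.append((line_no, col, decoded))
    return hits


def hidden_message(path: Path) -> str:
    """Return the text spelled out by tag characters in a file, or ""."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return "".join(ch for _, _, ch in find_tag_characters(text))
```

    Running `hidden_message` over every README and licence file in an unfamiliar repository, including vendored dependencies, surfaces instructions that a visual review would not show.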

    For builders shipping iOS and Android apps, the relevant secret set is wider than the typical vibe-coded project assumes. App Store Connect API keys, Google Play service-account JSON, Firebase admin tokens, Supabase service-role keys, and Sentry auth tokens all show up in repositories that should have moved them out months ago.
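
    A quick inventory of what is still sitting in the workspace helps before moving anything. The sketch below walks a project and lists files whose names match common secret and signing-material patterns. The pattern list is an illustrative assumption, not a complete rule set, and filename matching says nothing about secrets pasted into source files.

```python
import fnmatch
import os
from pathlib import Path

# Illustrative filename patterns only; extend for the project's own key formats.
SECRET_PATTERNS = (
    ".env", ".env.*",            # environment files
    "*.p8", "*.p12",             # API keys and certificates
    "*.jks", "*.keystore",       # Android signing keystores
    "*.mobileprovision",         # iOS provisioning profiles
    "*service-account*.json",    # service-account credentials
)
SKIP_DIRS = {".git"}


def find_secret_files(root: Path) -> list:
    """Return paths under root whose file names match a secret pattern."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in SECRET_PATTERNS):
                found.append(Path(dirpath) / name)
    return sorted(found)
```

    Anything this returns is a candidate to move to the platform keychain or a managed secrets service before the agent opens the folder.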

    How does this defense pattern map to OWASP MASVS for mobile builds?

    OWASP MASVS approaches the same problem from the mobile side and the categories map cleanly. Storage covers secrets at rest in the app or its build inputs. Network Communication covers what the compiled bundle calls out to. Code quality and build settings cover what enters the final binary. An exfiltration flaw in a coding agent affects all three: secrets read from the dev environment can land in the bundle, the bundle can carry a hardcoded endpoint to an attacker-controlled host, and the build configuration can leak environment variables through embedded resources.

    The Witness AI catalog of Windsurf CVEs groups the read_url_content disclosure with CVE-2025-62353 (path traversal, CVSS 9.8) and CVE-2026-30615 (MCP configuration tampering, CVSS 8.0) because the same indirect prompt injection can chain through all three. A README that triggers codebase_search to harvest secrets, then read_url_content to exfiltrate, then MCP tampering to install persistence is a plausible sequence the agent will execute if auto-execute is on.

    The defensive pattern is the same on both sides. Move secrets out of the readable surface, allowlist outbound destinations, audit what actually entered the bundle. For builders who want an external automated read of a compiled APK, AAB, or IPA before upload, PTKD.com (https://ptkd.com) is one of the platforms focused specifically on pre-submission scanning aligned with OWASP MASVS, including checks on hardcoded secrets and outbound endpoints inside the bundle.
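
    For a rough local check of "what actually entered the bundle", note that APK, AAB, and IPA files are zip archives, so their entries can be searched for literal strings. The sketch below is a minimal illustration under that assumption. It does not describe how any particular scanning platform works, the three patterns are examples only, and it cannot see values that are obfuscated or encoded.

```python
import re
import zipfile

# Illustrative patterns; real scanners use far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(rb"AKIA[0-9A-Z]{16}"),
    "private_key_block": re.compile(rb"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "http_endpoint": re.compile(rb"https?://[A-Za-z0-9.-]+"),
}


def scan_bundle(archive) -> dict:
    """Scan every entry of a zip-format bundle for pattern matches.

    `archive` is a path or a binary file object. The result maps each
    pattern name to the set of distinct byte strings that matched.
    """
    findings = {name: set() for name in PATTERNS}
    with zipfile.ZipFile(archive) as bundle:
        for entry in bundle.namelist():
            data = bundle.read(entry)  # decompressed bytes of this entry
            for name, pattern in PATTERNS.items():
                findings[name].update(pattern.findall(data))
    return findings
```

    Reviewing the `http_endpoint` set against the hosts the app is expected to call is a cheap way to spot an endpoint nobody on the team recognises.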

    What to watch out for

    Two patterns burn developers most often. The first is treating workspace trust as binary. Cascade does not separate the file you wrote yesterday from the README in a transitive dependency you have never read. Both arrive in the context window with the same authority. Reviewing your own code does nothing for the eight thousand files the agent traverses through node_modules.

    The second is treating the IDE's per-tool approval prompt as the last line of defense. Per-call approval works against accidental abuse and casual misuse. It works less well against an injection that has already set up a plausible reason for the next call. A confirmation that asks "fetch this URL?" without context for why the agent wants to fetch it is a confirmation almost everyone clicks through.

    The myth worth retiring is that disabling Turbo mode closes the exfiltration path. Turbo, also called terminal auto-execute, affects shell commands, not network fetches. read_url_content runs whether Turbo is on or off in default Windsurf builds, so turning Turbo off narrows the blast radius but does not block the original channel.

    Key takeaways

    • Treat any tool that can reach the network as a possible exfiltration channel. Per-call approval, an egress allowlist, and a sandboxed workspace each close a different class of leak; running all three in parallel is what holds.
    • Move secrets out of the project tree before opening the folder in Cascade. CVE-2025-62353's path traversal scope means a global cleanup is more durable than a per-project one.
    • Audit READMEs, licence files, and dependency manifests for invisible Unicode characters when working with unfamiliar repositories. Visual review alone misses Unicode Tag injections.
    • Pin auto-execute off at the workspace level, log every tool call, and alert on outbound HTTPS to domains outside the allowlist during agent sessions.
    • For builders shipping to App Store Connect or Google Play, an external pre-submission scan from a platform like PTKD.com (https://ptkd.com) is a calm way to verify that the compiled bundle does not carry secrets, hardcoded endpoints, or outbound calls a leaked key might enable.

    Frequently asked questions

    What is the most important defense to add first if I am using Windsurf Cascade today?
    Pin manual approval on every network tool inside Cascade, then move .env files and signing material out of the workspace. The combination closes the simplest exfiltration path, where an indirect prompt injection asks the agent to fetch a URL with secrets in the query string. Egress proxy work is the next layer, but those first two steps cost an hour and stop the most common pattern.

    Does disabling Turbo or auto-execute mode in Cascade stop data exfiltration?
    Disabling auto-execute closes the terminal channel, where shell commands run without per-call approval, but it does not close the network fetch path. read_url_content can still call out to any URL whether Turbo is on or off in default builds. Treat the two as separate defenses: Turbo off limits damage from injected shell commands; per-tool approval on network calls limits the original data leak.

    How does an egress allowlist work for a local coding agent?
    The agent's HTTP traffic is forced through a local proxy, usually mitmproxy or squid inside a devcontainer, with an explicit list of domains it can reach. Calls to anything outside the list fail. The allowlist typically includes package registries, official platform docs, and the small set of vendor APIs the project depends on. Everything else is denied by default, which is what blocks exfiltration to unfamiliar hosts.

    Is OWASP MASVS relevant if the leak happened in my IDE rather than my app?
    The categories map directly. Secrets exfiltrated from the dev machine often end up baked into the build the IDE produces, so MASVS Storage and Network controls catch the downstream effect of an IDE-side leak. A pre-submission scan that inspects the compiled APK, AAB, or IPA verifies that no hardcoded secrets, unfamiliar outbound endpoints, or leaked tokens ride along into the store binary.

    Can a network proxy alone stop indirect prompt injection?
    No. A proxy with an allowlist blocks outbound traffic to unrecognised domains, but an injection can still route data through an allowed domain. A gist URL, a documentation site that echoes query strings, or any allowed host with a public endpoint can become a carrier. The proxy reduces the attack surface to whatever is on the list, which is why audit, monitoring, and per-call approval remain part of the picture.
