I am not affiliated with Codeium or Windsurf. I am not responsible for any damage or loss of data that may occur from using this extension. It may stop working at any time, as it's based on decompiled protos and guessed logic. If you are a Windsurf developer, I dare you: please consider adding an API to your wonderful IDE. That would benefit a lot of your users - just imagine being able to send prompts with context directly from Figma or Chrome DevTools without needing to copy-paste large texts.
Windsurf uses a random port for its gRPC server. The port is managed by (I think) a modified part of the VSCode core. But the chat panel is actually a built-in extension, so how the fuck does it know where to connect?
Simple: after server initialization and user account retrieval, Windsurf executes a command (windsurf.initializeCascade) with parameters including the port and nonce.
The command execution mechanism is the only public API of any extension. Every extension in VSCode has two isolated parts: host process and renderer. Like in Electron, but we don't actually have control over the renderer process - we can only ask VSCode to spawn an iframe for us. The problem is that iframes are also isolated, and you cannot just go to some other iframe and execute scripts in its context.
The solution: VSCode docs won't tell you this, but you can override handlers for any other extension. The only thing that matters is the order of loading, and thanks god Windsurf extension loads first.
const windsurfHandler = vscode.commands.registerCommand(
"windsurf.initializeCascade",
async (...args) => {
const params = JSON.parse(atob(args[0]));
console.log(params);
}
);This simple trick gives you the port, server API key, and metadata. But it breaks Windsurf's extension. So how do we call the original handler?
There's no direct way to get a callback for a registered command, but if we dispose our handler and re-execute the command, it gets passed to the original handler. For some reason VSCode doesn't replace it completely - it just pushes our handler to the top of the stack.
const windsurfHandler = vscode.commands.registerCommand(
"windsurf.initializeCascade",
async (...args) => {
const params = JSON.parse(atob(args[0]));
client = new Client(params);
console.log("Windsurf initialized");
windsurfHandler.dispose();
setTimeout(() => {
vscode.commands.executeCommand("windsurf.initializeCascade", ...args);
});
}
);This allows both our extension and Windsurf to work.
Windsurf uses gRPC for communication with its server, and it's not a public API. We need to get the protos. Thankfully JS is fucking stupid and straightforward - having a compiled gRPC client, we can do some magic (described in ./decompile/DECOMPILE.md) and get the source protos back, just to compile them back to TypeScript. Fucking cycle of nonsense.
Update: Improved the decompiler to recover full gRPC service definitions with proper streaming annotations. This makes adding new API features way easier - just look at the proto, call the RPC method through the generated client.
With ports and protos, we can finally send messages to the server. Go to the network tab, observe what workbench.desktop.main.js sends, decode protobufs to get an idea of what to send and where. Write a client and that's it.
Wrap that shit in a REST server with proper queueing (Windsurf does this UI-side instead of at gRPC level), and now you can:
- Start new conversations
- Continue existing conversations by cascadeId
- List available models and select which one to use
- Send messages without blocking - queue handles cascade status automatically
The queue system is per-cascade, so multiple conversations can process concurrently. If a cascade is idle, messages send immediately. If busy, they queue and wait.
- Install the extension in Windsurf
- Configure settings:
windsurfapi.port- HTTP server port (default: 47923)windsurfapi.autoStart- Auto-start server on Windsurf init (default: false)
- Start server manually via command palette:
Windsurf API: Start Server(or enable autoStart)
Send a message to Windsurf. Returns immediately with status and cascadeId.
# New conversation
curl -X POST http://localhost:47923/prompt \
-H "Content-Type: application/json" \
-d '{"text": "Hello from API"}'
# Continue existing conversation
curl -X POST http://localhost:47923/prompt \
-H "Content-Type: application/json" \
-d '{"text": "Follow-up question", "cascadeId": "cascade-id-here"}'
# With images
curl -X POST http://localhost:47923/prompt \
-H "Content-Type: application/json" \
-d '{
"text": "What is in this image?",
"images": [{"base64": "iVBORw0KGgo...", "mime": "image/png"}]
}'
# With model selection
curl -X POST http://localhost:47923/prompt \
-H "Content-Type: application/json" \
-d '{"text": "Use GPT-5", "model": "GPT-5 (low reasoning)"}'Response:
{
"status": "sent", // or "queued" if cascade is busy
"messageId": "message-uuid",
"cascadeId": "cascade-uuid",
"queuePosition": 1 // only present if queued
}Get list of available models.
curl http://localhost:47923/modelsReturns: ["Claude Sonnet 4.5 (promo)", "SWE-1", "GPT-5 (low reasoning)", ...]
List all conversations.
curl http://localhost:47923/trajectoriesReturns array of conversations with cascadeId, name, status, timestamps, etc.
View queued messages. Optional ?cascadeId=xxx to filter by cascade.
curl http://localhost:47923/queue
curl http://localhost:47923/queue?cascadeId=cascade-idCheck status of a specific message.
curl http://localhost:47923/queue/message-idCheck if a cascade is idle or busy.
curl http://localhost:47923/status?cascadeId=cascade-idLocated in tests/:
continue.sh- Interactive script to list and continue conversationsqueue_test.sh- Send 5 messages, monitor queue until emptytest_text.sh- Basic text message testtest_image.sh- Image message testtest_models.sh- List available modelstest_model_selection.sh- Test sending messages with different modelstest_trajectories.sh- List all conversations
Windsurf API: Start Server- Start HTTP serverWindsurf API: Stop Server- Stop HTTP server
To build a .vsix file:
pnpm run packageThe GitHub Actions workflow automatically builds and releases the extension on git tags (e.g., v0.0.2).
If you're building with this code or learned from it, let us know! We'd love to hear about your project.
I've kept an eye out for other implementations and references to the core APIs, I typically find them by searching for the protobuf method names in GitHub/elsewhere like this:
- Github Search: /exa.cortex_pb.SendChatMessage/ - Github search for SendChatMessage method
- Github Search: /exa.cortex_pb.ArenaModeInfo/ - Github search for recently added APIs
- mobb-dev/bugsy: .../CortexTrajectory.ts - Contains ArenaModeInfo, recent additions to the API
- IronBit-0/cpwn:.../antigravity_auto/universal_proxy_docs/patch_extension.py - A fun reminder that google anti-gravity is a windsurf fork which itself is a vscode fork 🤣
- rsvedant/opencode-windsurf-auth - Also reverse engineering the Winsdurf API, building opencode.ai plugin
- vishnu09bharath/ai-usage-limit-monitor - MacOS Menu Bar app that monitors AI usage for various services including Antigravity
- yuxinle1996/windsurf-grpc - Proto reverse engineering scripts, and various forks ( like windsurf-account-manager)
If you've built something with these protos or figured out more of the API, feel free to open a PR.
Made with ❤️ by:
- @AlexStrNik - Original author and creator
- @dfallon - Maintainer and ongoing development
Special thanks to all contributors who help keep this project alive!