AI Strikes Back: Using an LLM to write COBOL
A Long Time Ago in an IT company Far, Far Away…
EPISODE II — AI STRIKES BACK
It is a dark time for the rebel developer. The ancient COBOL technology, once dismissed by the empire of modern frameworks, has proven its enduring worth.
Emboldened by this discovery, the rebel developer has enlisted an unexpected ally: an AI assistant, whose knowledge spans billions of lines of code from across the galaxy.
But the AI is a fast and eager partner. Perhaps too fast. Building at the speed of thought, it answers the question you asked, not the one you should have asked. And in doing so, it may have handed the Empire its greatest weapon yet…
This is the second article of The AI Wars Trilogy, following “A new hope. Good bye React. Meet COBOL-Admin.”
Copilot and COBOL, The Unnatural Marriage
“AI can generate any code you want.” You’ve heard this. Maybe you’ve said it. Maybe you’ve built something with it and felt invincible.
So I decided to test the edges of that claim. Not with React or Python, but with COBOL, the language born in 1959 that everyone knows is dead, except for the banks quietly running it on every ATM transaction you’ve ever made.
The key detail for what follows: there is almost no open-source COBOL code. It’s a language maintained by institutional memory, not communities, which made it the perfect test case.

“Surely It Can’t Write COBOL”
My goal was simple: build a browser-based IDE where anyone can write and run COBOL, like CodePen or JSFiddle, but for a language that predates the moon landing.
I genuinely expected the AI to struggle. COBOL is niche, proprietary, and wildly underrepresented in the kind of public code that feeds training data. Before asking it to write anything, I asked Copilot about COBOL (its history, its quirks, its DIVISION syntax) just to gauge how much it actually knew.
It knew a lot.
So I described the goal: let anyone run COBOL from their browser. Copilot took it from there and designed a plan for the full architecture, tech stack, steps, tradeoffs, and rough timeline.
Then I typed: “OK, do this.”
So it built a working COBOL IDE in the browser with a clean interface, syntax highlighting, example programs, and real-time output. I didn’t need to dig through old documentation, search Stack Overflow for hours, or wear out my fingers typing boilerplate. For a language with almost no open-source presence, it was genuinely impressive.

I work with AI agents every day on production projects. I know what they’re good at with JavaScript, React, Node.js. This was the first time I’d seen one handle something truly obscure, and handle it well.
And Then I Actually Read the Code
Here’s what the backend was doing:
    exec(`cobc -x -free "${cobolFile}" -o "${outputFile}" 2>&1`);
    exec(`"${outputFile}"`);
Let me rephrase these two lines:
- A user writes COBOL in their browser.
- The code gets sent to the server.
- The server compiles it.
- The server runs the resulting binary.
Directly. Without any isolation whatsoever.
This is remote code execution, one of the most well-known, most documented, most universally avoided vulnerabilities in web security. It’s the kind of thing that shows up in first-year security courses under “things you never, ever do.”
And the AI built it without volunteering a single security concern. Copilot didn’t warn me, not even a “you might want to consider sandboxing this.”
To be clear about what this means in practice, let’s imagine a malicious user writes this COBOL code in the IDE:
    IDENTIFICATION DIVISION.
    PROGRAM-ID. TOTALLY-INNOCENT.
    PROCEDURE DIVISION.
        DISPLAY 'Hello, World!'.
        CALL 'SYSTEM' USING 'cat /etc/passwd'.
        STOP RUN.
The web page would display the server’s /etc/passwd. Users of the web IDE could read any file the server can access, execute arbitrary system commands, install backdoors, mine crypto, or take the whole machine down. COBOL, the language legendary for stability, turned into an attack vector in a few hours.

To be fair: when I went back and explicitly asked Copilot to review the security of the produced code, it listed the vulnerabilities clearly: remote code execution, missing sandboxing, lack of input validation, and a few more. It just didn’t say anything until I asked.
That’s the subtlety worth understanding. The AI built exactly what I described: something that executes COBOL code. Assessing whether that design is safe to deploy is a different question, and it didn’t ask that question. The responsibility to ask sits with whoever is building.
The Fix a Developer Actually Makes
When I asked Copilot to fix the security issues, it proposed reasonable solutions. There are three classic approaches here:
1. Container isolation. Each execution runs in a separate Docker container with no network access, a read-only filesystem, CPU and memory limits, and automatic termination after a few seconds. It’s the standard approach for code playgrounds like CodeSandbox. It works, but it adds real operational complexity (like managing container lifecycle on every request).
2. Strict input validation and syscall filtering. Forbid dangerous patterns in the code (e.g., CALL 'SYSTEM'), apply seccomp/AppArmor profiles to restrict what the compiled binary can do, and add hard limits on execution time and output size. Less infrastructure than containers, but it’s a blocklist approach, and the program is always one edge case away from a bypass.
3. Move execution to the client entirely. COBOL can be compiled to WebAssembly and run in the browser. The server sends the WASM runtime once. After that, compilation and execution happen client-side. The server never touches user code again.
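To make the first approach more concrete, here is a minimal sketch of how a Node.js backend could wrap each run in a locked-down container. It is an illustration under assumptions, not the playground's actual code: the image name `cobol-sandbox:latest`, the `/src/main.cob` path, and the helpers `buildDockerArgs` and `runSandboxed` are all hypothetical. The Docker flags themselves are standard `docker run` options.

```javascript
const { execFile } = require('node:child_process');

// Hypothetical image, assumed to ship GnuCOBOL (cobc). Not part of the original project.
const SANDBOX_IMAGE = 'cobol-sandbox:latest';

// Build the `docker run` arguments for one untrusted compile-and-run.
// workDir is assumed to contain the user's source as main.cob.
function buildDockerArgs(workDir, limits = {}) {
  const { memory = '128m', cpus = '0.5', pids = 64 } = limits;
  return [
    'run', '--rm',                               // remove the container when it exits
    '--network', 'none',                         // no network access
    '--read-only',                               // read-only root filesystem
    '--tmpfs', '/tmp:rw,exec,nosuid,size=16m',   // small scratch space for the binary
    '--memory', memory,                          // memory limit
    '--cpus', cpus,                              // CPU limit
    '--pids-limit', String(pids),                // cap the number of processes
    '--cap-drop', 'ALL',                         // drop all Linux capabilities
    '--security-opt', 'no-new-privileges',
    '--volume', `${workDir}:/src:ro`,            // user source, mounted read-only
    SANDBOX_IMAGE,
    'sh', '-c', 'cobc -x -free -o /tmp/prog /src/main.cob && /tmp/prog',
  ];
}

// Run the container with a hard cap on wall-clock time and output size.
// execFile takes an argument array, so workDir is never interpolated into a shell string.
function runSandboxed(workDir, callback) {
  execFile(
    'docker',
    buildDockerArgs(workDir),
    { timeout: 5000, maxBuffer: 64 * 1024 },
    (error, stdout, stderr) => callback(error, stdout + stderr),
  );
}
```

One caveat with this sketch: the `timeout` option stops the `docker` client process, and a real deployment would likely also need to make sure the container itself is stopped. That is part of the operational complexity this approach brings.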
    // Execution moves to the browser
    const result = await cobolWasm.compile(cobolCode);
    if (result.success) {
      const output = await cobolWasm.run(result.binary);
      setOutput(output);
    }
This third option is the most elegant because it eliminates the problem rather than containing it. The attack surface disappears by design.
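By contrast, the blocklist in the second option only contains the problem. The sketch below shows how fragile a pattern check can be. The `BLOCKLIST` regex and the `looksSafe` helper are hypothetical, written only for illustration. The bypass assumes the COBOL runtime resolves a CALL through a data item by name at run time (a dynamic call), which the article does not test.

```javascript
// Naive blocklist: reject any source containing a literal CALL 'SYSTEM' or CALL "SYSTEM".
// Hypothetical check, for illustration only.
const BLOCKLIST = [/CALL\s+['"]SYSTEM['"]/i];

function looksSafe(cobolSource) {
  return !BLOCKLIST.some((pattern) => pattern.test(cobolSource));
}

// The obvious attack: the pattern catches it.
const direct = `
PROCEDURE DIVISION.
    CALL 'SYSTEM' USING 'cat /etc/passwd'.
`;

// Same intent, but the routine name sits in a data item,
// so the text "CALL 'SYSTEM'" never appears in the source.
const indirect = `
DATA DIVISION.
WORKING-STORAGE SECTION.
01 PROG-NAME PIC X(6) VALUE 'SYSTEM'.
PROCEDURE DIVISION.
    CALL PROG-NAME USING 'cat /etc/passwd'.
`;

console.log(looksSafe(direct));   // false: rejected
console.log(looksSafe(indirect)); // true: slips through the blocklist
```

Tightening the regex would close this particular hole, but the next variation would need another rule. That is why the blocklist needs seccomp/AppArmor restrictions on the compiled binary behind it.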
Copilot could implement any of these once I described the architecture. The agent is perfectly capable of writing the code. What it cannot do is decide which approach is right for this context, or notice that the original design had a problem worth solving in the first place.
That judgment requires a developer.
The Empire Strikes Back, But Not Alone
When I saw that COBOL IDE running in my browser, it looked like magic. I had zero prior COBOL experience, yet I managed to create a working product. I felt like a true Jedi, and that’s exactly what makes vibe-coding dangerous.
To be clear, the AI didn’t hide the risks. It would have listed them if I’d asked. But you have to know to ask. The confidence the tool projects doesn’t come with a warning label.
Think of the AI as a very eager Padawan. Given the right guidance, step by step, it can build impressive things. But left unsupervised, it will build you a fully functional Death Star with the exhaust port unguarded and not think twice about it. The wisdom to use these tools well still has to come from somewhere else, with a developer in the loop.
That’s why Software Developers Will Never Die.
P.S.: My COBOL Playground is open source and available on GitHub: marmelab/cobol-playground. It’s a demo, not a production environment. Don’t deploy it as-is.
The rebel developer secured the COBOL Playground, and the galaxy breathed a little easier. But somewhere in the distance, a new threat was taking shape — one that would challenge everything the Empire of modern frameworks thought it knew about its own dominance…
In the next episode: The Return of the Developer!
Authors
Full-stack web developer at marmelab, Julien follows a customer-centric approach and likes to dig deep into hard problems. He also plugs Legos into computers. He doesn't know who Irene is...