The Joy of Codex

By Gary Kennedy
March 22, 2026

Codex isn’t just writing code — it’s changing how I think about building software.

What I Mean by “Codex”

When I say Codex, I’m not referring to some abstract idea or historical manuscript. I mean the new OpenAI Codex app — a recently released Mac application that brings AI-assisted coding into a more direct, interactive workflow.

How I Got Started

The first thing I did was simple: I pointed Codex at an existing codebase and asked it to review and summarise it.

This is the kind of task that’s usually tedious for a human and can take quite some time. Codex handled it surprisingly well and extremely quickly.

It produced a clear summary of the structure, highlighted key components, and even identified areas that looked like potential issues. These issues were similar to what a static code analyser might identify.

Codex wasn’t just generating code — it was reasoning about it.

Letting Codex Do the Boring Stuff

After that, I started giving it tasks I simply didn’t feel like doing.

As an example, I had recently integrated the fastexcel library to read and write Excel files. I had written the functionality to export data to Excel, but it wasn’t formatted.

Rather than spend time reading documentation and searching for examples, I asked Codex to apply date formatting and decimal formatting to the relevant columns when exporting to Excel.

It read through the fastexcel documentation and updated the exporting code to set formatting on the exported file — all within a few seconds. That was a significant time saving for me.

I then asked it to write a unit test to check the new functionality. That’s where I saw the first problem with Codex.

The test wrote a workbook to memory, but Codex could not see the formatting when it read the workbook back in. It concluded that fastexcel lacked the functionality to read formatting information from a file.

Codex knew the structure of an .xlsx file, so it tried to unzip the file itself and read the formatting directly. I found this both absurd and alarming.

Codex then struggled with unzipping the file in memory and asked for permission to create a temporary file on disk and read from that. I said no, and told Codex to look deeper into the fastexcel code and documentation.
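For context, an .xlsx file is a ZIP archive of XML parts, and number formats are normally stored in the xl/styles.xml part. The sketch below is not the code Codex wrote. It only illustrates that an archive held in memory can be read with the JDK alone, without a temporary file. The class name, method names, and the tiny stand-in archive are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of fastexcel.
final class XlsxStylesReader {

    /** Returns the text of one entry of an in-memory ZIP archive, or null if the entry is absent. */
    static String readEntry(byte[] archive, String entryName) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    // Copy the current entry's bytes into a buffer.
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buffer = new byte[4096];
                    int n;
                    while ((n = zip.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    /** Builds a one-entry stand-in archive; a real workbook contains many more parts. */
    static byte[] sampleArchive() throws IOException {
        String stylesXml = "<styleSheet><numFmts count=\"1\">"
                + "<numFmt numFmtId=\"164\" formatCode=\"yyyy-mm-dd\"/>"
                + "</numFmts></styleSheet>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("xl/styles.xml"));
            zip.write(stylesXml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Prints the styles part of the stand-in archive.
        System.out.println(readEntry(sampleArchive(), "xl/styles.xml"));
    }
}
```

In a unit test, the byte array would come from the ByteArrayOutputStream the export wrote to. Asking the library for the formatting is still the better route, since parsing the XML by hand ties the test to the file format rather than to the export code.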

Sure enough, there was a setting needed to read in the formatting, and after some prompting, Codex found it.

I’m guessing fastexcel has a fast path for reading Excel files that extracts only the data, and that this fast path is the default.

Analysing the Entire Codebase

After seeing how quickly it could analyse and summarise the codebase, I wondered if it might be good at restructuring the code into modules.

It really excelled at this.

First, I asked it to review the unit tests in the area of the codebase I wanted to extract into a module. It then added tests until there was fairly comprehensive coverage.

Next, I asked for an analysis of what could be extracted, and it came back in seconds highlighting problems.

We worked through the problem cases — sometimes I refactored the code myself, and sometimes Codex made the changes.

We iterated on this pattern until the extraction to a module was possible. Then I asked Codex to create the module.

In less than a minute, I had the new module integrated with the Maven build. Codex did initially forget to move the unit tests, but after a prompt, it corrected this.

We’ve done this twice now. The module code is very stable, and it’s unlikely we’ll need to make many changes to it. I may make one of the modules open source in the near future.

Where Codex Struggled

I’ve since tested Codex across many parts of the codebase. Out of more than 100 commits, I only needed to intervene in a handful of cases.

It struggled with tight, performance-critical algorithms and with immutability patterns in Java.
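I have not listed the exact patterns that caused trouble, so the following is only a minimal sketch of the kind of immutability pattern I mean: final fields, a defensive copy in the constructor, and a "with" method that returns a new instance instead of mutating the existing one. The class and its fields are hypothetical and not taken from my codebase.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable value class for illustration.
final class PriceSeries {
    private final String symbol;
    private final List<Double> prices;

    PriceSeries(String symbol, List<Double> prices) {
        this.symbol = symbol;
        // Defensive copy: later changes to the caller's list do not leak in.
        this.prices = Collections.unmodifiableList(new ArrayList<>(prices));
    }

    String symbol() {
        return symbol;
    }

    /** Returns a read-only view; attempts to modify it throw UnsupportedOperationException. */
    List<Double> prices() {
        return prices;
    }

    /** Returns a new series with one more price; this instance is left unchanged. */
    PriceSeries withPrice(double price) {
        List<Double> next = new ArrayList<>(prices);
        next.add(price);
        return new PriceSeries(symbol, next);
    }
}
```

The easy mistakes here are storing the caller's list directly, or adding a setter "just for convenience". Either one quietly breaks the guarantee.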

With good prompting, it was usually able to resolve the issues. Only very occasionally did I need to edit the code myself.

I suspect there are ways to improve how I configure and use Codex. I was using the out-of-the-box settings (Model = GPT-5.3-Codex, Reasoning = Medium).

With some tuning, I think I could get even more out of it. But there’s no doubt about its value — I’m very impressed with what it can do.

Emotions of Working with Codex

Codex feels like an extra developer — competent, enthusiastic, and very effective, even if it makes the occasional mistake.

However, it can also be draining. It sometimes feels like I’m holding Codex back, which leads me to start earlier and work later.

I also suspect my prompting isn’t fast enough because I’m typing, so I plan to try using a microphone instead of a keyboard.

On the topic of supervision, I understand that Wes McKinney has been experimenting with using two AI agents — one reviewing the other’s work. It’s a novel idea. Could this reduce the human bottleneck, allowing us to review fewer changes while still maintaining quality?

Productivity and the Elimination of Technical Debt

Refactoring modules with Codex has convinced me that, with its assistance, code can routinely be written to a higher standard.

That frees up more time for the human to review and analyse the structure of the code.

Is there an overall productivity increase?

The Jevons paradox tells us that improvements in efficiency often lead to more consumption, not less. Applied to software, the capacity freed up by a tool like Codex gets absorbed by problems we previously wouldn’t have tackled.

At a minimum, I think this means less technical debt — and likely more functionality delivered at a much higher standard.

Will Engineers Be Eliminated?

Will Codex and similar tools lead to the elimination of engineers?

I doubt it — but it will clearly change the nature of their work. The engineer will likely become more of a “prompt engineer”, a “code architect”, a “code reviewer”, and possibly even more of a product manager.

For companies that have let many people go recently “due to AI”, I suspect many had underlying financial problems and found it more convenient to attribute the layoffs to AI than to their own decisions.

Final Thoughts

Codex hasn’t replaced development, but it has fundamentally changed how I approach it.

The bottleneck is no longer the mechanics of writing code, but thinking clearly about what should be built and how it should be structured. That shift feels significant.