Codex Built My Startup In A Weekend
Building is no longer the bottleneck
The last two months have felt like a step change. With the newest coding models, building is no longer the bottleneck. Clarity and architecture are.
I’ve been using Claude Code for most of my development. Last weekend I decided to try Codex after the gpt-5.3-codex release. I pointed it at an old startup codebase of mine, Small Hours, which used LLMs to automate DevOps workflows (root cause analysis, PRs with code fixes, triage, etc.).
“Codex, here is the core library powering Small Hours. Turn it into a CLI for cloud ops and root cause analysis.”
It definitely wasn’t a one-shot prompt to a working CLI. It took many back-and-forth iterations. But after about two days of guided prompting, I had a fully functional CLI that essentially replaced what took me months to build two years ago.
Codex needs direction (Claude will infer more)
Codex (gpt-5.3-codex) is a bit different from Claude (opus-4.6).
Opus will take ambiguous direction and still try to reason its way into an “optimal” implementation. It feels more like a senior engineer who fills in gaps and makes assumptions (usually good ones).
Codex follows direction extremely well, but it doesn’t fill in as much on its own. If you don’t give it structure, it will take the shortest path to implementation. It’ll still produce something working, but it will often optimize for “done” instead of “designed.”
The best mental model I found is:
Opus is great when you want the agent to help decide how to implement.
Codex is great when you already know what you want and need it executed quickly.
Both are insanely good. It’s mostly an engineering style preference. You can still enforce an initial plan and iterate.
The process
Getting Codex to build a real end-to-end CLI wasn’t magic. It was just tight direction, clear checkpoints, and a lot of iteration.
What worked for me:
Start with the target experience
What commands exist? What does good output look like? What does a user actually do with it?
Force an architecture early
Where does core logic live? What’s the boundary between analysis and integration? How do we add workflows without rewriting everything?
Define done so it can’t be faked
Tests, security, example runs, docs, and a few realistic scenarios. If you don’t set acceptance criteria, the model will stop at “seems good.”
Have it review its own work and surface issues
Ask it: what’s wrong with this implementation, what’s missing, what will break in production, and what are the top risks. Then iterate on that list.
Ship in phases
Scaffold, core workflows, polish, error handling, packaging.
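To make the architecture step concrete, here is a minimal sketch of the kind of boundary I mean. It is not the actual Small Hours code. Every name in it (opsctl, Finding, WORKFLOWS, triage) is hypothetical and invented for illustration, and the "analysis" is a toy that just flags log lines containing ERROR. The point is the shape: analysis functions are pure and registered by name, and the CLI layer owns all the I/O.

```python
import argparse
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Finding:
    """One result produced by an analysis workflow (hypothetical type)."""
    summary: str
    severity: str


# Registry mapping a command name to a pure analysis function.
WORKFLOWS: Dict[str, Callable[[List[str]], List[Finding]]] = {}


def workflow(name: str):
    """Decorator that registers an analysis function under a command name."""
    def register(fn):
        WORKFLOWS[name] = fn
        return fn
    return register


@workflow("triage")
def triage(log_lines: List[str]) -> List[Finding]:
    """Toy analysis: flag every line containing ERROR. No file or network I/O."""
    return [Finding(line.strip(), "high") for line in log_lines if "ERROR" in line]


def build_parser() -> argparse.ArgumentParser:
    """CLI surface: one positional command plus the log file to analyze."""
    parser = argparse.ArgumentParser(prog="opsctl")
    parser.add_argument("command", choices=sorted(WORKFLOWS))
    parser.add_argument("logfile")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Integration layer: read input, call the workflow, print the findings."""
    args = build_parser().parse_args(argv)
    with open(args.logfile, encoding="utf-8") as fh:
        lines = fh.readlines()
    findings = WORKFLOWS[args.command](lines)
    for finding in findings:
        print(f"[{finding.severity}] {finding.summary}")
    # Nonzero exit code when anything was flagged, so scripts can react.
    return 1 if findings else 0
```

Calling main(["triage", "app.log"]) prints one line per finding and returns a nonzero exit code when anything is flagged. Adding a workflow means writing one more decorated function, and the CLI layer doesn't change. Because the analysis functions take plain lists and return plain objects, they are also easy to pin down with tests, which helps with defining done.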
Coding is cheap
Coding is mostly solved. I wouldn’t be surprised if it’s completely solved in the next 12 months.
I don’t mean every complex system can be generated perfectly with one prompt. I mean the part where engineering progress is gated on typing code is disappearing fast.
Agents can build working solutions. And with a little direction and planning, they can build fully functional products in days.
What’s left for engineers?
Be architects.
The engineers who succeed in this new era will probably share a few traits:
They understand their service end-to-end
Not just features. The actual system.
They don’t offload all thinking to agents
Agents make implementation cheap. Thinking is still expensive.
They can plan infrastructure and failure modes
Scale, cost, security, observability, deployment. That stuff isn’t “extra”. It is the job.
They understand what users need
Because if you can build anything fast, the only thing that matters is building the right thing.
“I think coding is going away first… and then the broader task of software engineering will take longer.” - Dario Amodei
Coding is getting commoditized. Engineering judgment is not.
Let coding agents do implementation so you can focus on architecture, product judgment, and scalability. Learn how systems break. Make tradeoffs on purpose.



