The Vibes

The profusion of hype on the Internet has led me to take a lot of things with a grain of salt, and if you’re a regular reader, you’ll know that generative AI has already added more than a few teaspoons to the broth of LLM-driven coding.

I’ve shared my thoughts on this more than once, and considering it’s been a while since LLMs became mainstream and we’re still trying to sort out UI interaction and tool integration approaches (don’t worry, I won’t go on about MCP again), I thought it was time to share my current workflow and how I think about things.

But I must concede that things have progressed tremendously–any of the “reasoning” models I can run locally can now handily beat the original ChatGPT on its own, tool integrations have become commonplace, and developer tooling has become a billion-dollar industry… in valuations, if not in actual value.

The Editor Bonanza

I’ve tried pretty much every “AI code editor” out there (Cursor, Windsurf, and others that came before them, as well as many of the Claude Code variants and offshoots), and I am reminded of the ancient chestnut about Steve Jobs commenting that Dropbox was “a feature, not a product” and passing on acquiring them.

Which, to be honest, seems to have been lost on modern investors. But I digress.

Sure, there are some nice UI features in Cursor, and Windsurf’s “flows” are a nice approach to steering the LLM, but behind the scenes it’s still all about providing the right tools for the model to explore and manipulate the codebase and, above all, good contextual prompting–and you can do that in any editor.

I spend most of my coding time in VS Code and vim (sometimes even running vim in Code’s terminal window…), and with the recent addition of Agent mode to the GitHub Copilot extension, the notion of using a third-party fork of what is essentially the same experience became pointless–leaving just one alternative GUI editor I still find interesting, and aider as the go-to CLI tool when I am coding inside a VM.

It’s not the editor that makes the difference, and I find the notion that you can “tab tab tab” your way to production inside a second-hand fork of a mainstream editor completely ridiculous, just as I do all the hype around vibe coding.

Like anyone with a music hobby will readily tell you, Gear Acquisition Syndrome (commonly abbreviated as GAS) does not make you a better musician–it only gives you a dopamine hit and the kind of headache that comes with spending all night twiddling the knobs on your fancy new synth instead of actually composing a track.

It’s All About Being Organized

If you’ve done any research on coding with LLMs, you probably came across Harper Reed’s post or Simon Willison’s. Both of them are great reads, and cover different parts of the workflow (brainstorming, documenting as context refinement, and iterating).

My current workflow is quite similar, and I focus a lot on context refinement.

Starting New Projects

For greenfield (new) projects, I usually start with the same few steps:

  • I will write an initial SPEC.md that covers:
    • What the project is about
    • What kind of tools and libraries I want to use (aiohttp, sqlalchemy, etc.)
    • What kind of code style I prefer (for Python, that is usually functional, minimal OOP, and with explicit imports)
    • How the code and tests should be laid out
  • I iterate on it with the LLM, 20-questions style, so that it refines the SPEC.md. This usually yields a list of features that I typically break out into a TODO.md to avoid having it constantly update the SPEC.md and inevitably break something.

I then ask the LLM to generate a basic project structure (usually a poetry or pipenv project) and a few files to get started, like main.py, __init__.py, and a few test files.

Then I give it both files as context and ask it to implement the first few items on the TODO.md and check them off. Claude, Gemini and o3 (which I use on aider) can usually do this without any significant issue as long as you keep the current context focused on only one or two items.
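
In practice, keeping the context focused just means being deliberate about what goes into each request. Here is a minimal sketch of the idea, assuming the SPEC.md and TODO.md layout above (the helper itself is hypothetical; inside an editor, the equivalent is simply attaching both files and naming one or two items):

```python
from pathlib import Path

def build_prompt(max_items: int = 2) -> str:
    """Assemble a focused request: the spec, plus only the first couple
    of unchecked TODO items, so the model isn't tempted to wander."""
    spec = Path("SPEC.md").read_text()
    todos = [
        line
        for line in Path("TODO.md").read_text().splitlines()
        if line.lstrip().startswith("- [ ]")
    ][:max_items]
    return (
        spec
        + "\n\nImplement ONLY the following items, then check them off in TODO.md:\n"
        + "\n".join(todos)
    )

print(build_prompt())
```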

After a few iterations of this, I already know what I need to fix or provide better instructions for, so I usually update the SPEC.md or create separate NOTES.md files to clarify the things that the model will inevitably “forget” about as we progress:

  • What are the key parts of the database schemas (so that it doesn’t need to go digging around in the ORM or SQL schemas I usually draft as reference)
  • What are the key threads/processes/coroutines and what is the division of labor among them
  • How it should handle specific data structures
  • How it should write specific sections of the code (say, event handlers or other things that I want it to follow a strict pattern for; see the sketch after this list)
  • How it should run tests
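
The “strict pattern” notes usually boil down to one canonical example the model is told to mimic. A hypothetical NOTES.md entry for event handlers might pin the shape down like this (all names are illustrative):

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)

@dataclass
class Event:
    kind: str
    payload: dict

def handle_user_created(event: Event) -> None:
    """Canonical handler shape: validate first, do the work, log once,
    and never let an exception escape the handler."""
    if "user_id" not in event.payload:
        logger.warning("user_created missing user_id: %r", event.payload)
        return
    try:
        # ...actual work goes here...
        logger.info("handled user_created for %s", event.payload["user_id"])
    except Exception:
        logger.exception("user_created handler failed")
```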

Handling Existing Projects

For existing projects I am picking up or contributing to, the process is a bit different, but the same principles apply. The first thing I do is to go on a fact-finding mission, which usually involves:

  • Asking the model to assess the codebase and answer these basic questions:
    • How is the code structured
    • What are the key data structures
    • What are the key APIs/interfaces it provides and consumes
    • How are errors handled
    • How are tests run
  • Consolidating the findings into a NOTES.md with its (typically Claude’s, because I like its summary style) understanding of the code and a TODO.md with any salient improvements, which I then review and add my own goals to (I’ve found Gemini a bit erratic here, and o3 to be a bit too enthusiastic or verbose, but that’s just personal taste)

I then go into the same loop as for greenfield projects, but with a lot more emphasis on adding logging, error checking, and more NOTES.md files for each module or process.

Note: I’ve also started writing little MCP servers to either perform routine tasks or to actually retrieve part of the information I need from the codebase (like the database schema or the test structure) and feed it to the model as more refined context, but it’s early days yet and I don’t really know how useful they will be in the long run (aside from being a fun exercise).
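
For a flavor of what these look like, here is a minimal sketch of a server that exposes a SQLite schema as a tool, assuming the official mcp Python SDK (the database path and tool name are illustrative, and the real servers are a bit more involved):

```python
import sqlite3

from mcp.server.fastmcp import FastMCP  # official MCP Python SDK

mcp = FastMCP("codebase-context")

@mcp.tool()
def database_schema() -> str:
    """Return the CREATE TABLE statements for the app database, so the
    model doesn't have to dig through the ORM code to find them."""
    conn = sqlite3.connect("app.db")  # hypothetical path
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    conn.close()
    return "\n\n".join(sql for (sql,) in rows)

if __name__ == "__main__":
    mcp.run()
```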

This approach works very well inside VS Code/GitHub Copilot or with aider, and none of the fancy new AI editors bring anything substantially useful to the table–the ability to do direct code edits and tool use is fast becoming a commodity feature, and the only thing that really matters is how well the model can follow instructions and how skilled and methodical you are at providing them.

The Models

Right now there is no cloud-hosted reasoning model that does this amazingly better than any other, although at the very low end Qwen3 on my RTX 3060 via ollama or directly on mlx can be surprisingly good at it when compared to cloud-hosted ones (the qwen3:8b variant runs very well on my MacBook, although aider sometimes has a tough time getting decent output from it).
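
If you want to kick the tires on the same setup, a quick smoke test with the ollama Python client looks something like this (assuming you have ollama running and the model pulled):

```python
import ollama  # assumes the ollama Python client is installed

# Quick smoke test against a local model (run `ollama pull qwen3:8b` first).
response = ollama.chat(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
)
print(response["message"]["content"])
```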

I do prefer Claude at the moment, but that is mostly because Gemini tends to just ignore instructions and/or tools and o3 is very long-winded, so since the first two are included in my Copilot plan I tend to spend more time inside VS Code these days.

Blind Coding vs Planning and Experience

But the real challenge goes beyond presentation or the ability to follow immediate instructions.

In particular, I’ve found that there are certain key parts of structural analysis that the model will completely forget about once we get down to code generation (which is due both to running out of context window as we move along and to the inability to retrace steps beyond the current task at hand).

This sort of explains why LLMs are pretty bad at architecting code, but the biggest flaw, in my view, is that they don’t “discuss” outcomes and just plow through implementation without any checkpoints (which is not something I’ve seen anyone implement yet).
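
To make that concrete, what I mean by a checkpoint is a plan/confirm gate between analysis and code generation. A minimal sketch of the idea, with llm() as a stand-in for whatever model call you actually use:

```python
def llm(prompt: str) -> str:
    """Stand-in for an actual model call (API client, aider, etc.)."""
    raise NotImplementedError

def implement_with_checkpoint(task: str) -> str:
    # First pass: make the model commit to a plan in writing, without code.
    plan = llm(f"Outline a step-by-step plan for: {task}. Do not write code yet.")
    print(plan)
    # Checkpoint: a human (or a critic model) signs off before implementation.
    if input("Proceed with this plan? [y/N] ").strip().lower() != "y":
        return ""
    return llm(f"Implement the following plan, one step at a time:\n{plan}")
```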

The key thing is that this is a process. It requires planning, effort, and a lot of writing that is deliberately meant to be re-used and revised throughout the entire project, and it is pretty far off the beaten track of the “tab tab tab” autocomplete approach or the blind one-shot prompting I’ve seen vibe coding advocates go on about.

More importantly, it requires a good understanding of the problem domain and the libraries and patterns you want to use, instead of just blindly accepting what the model puts forward.

In a word, it requires taste.

In two, it requires both taste and experience, which, aside from the discipline that comes with age, is something that eludes all the folk who believe they can replace programmers…