It was a strange month. This update is late since I have had far too much on my mind, and that has also impacted a few projects, but there are a few things I want to note:
Vibe Coding
I have been doing a little experiment with my feed summarizer over the past couple of months, and I’m very sorry to say that the latest version was pretty much all written using AI.
VS Code and Claude have evolved a lot in terms of integration and my staple approach is still valid, but there have been quite a few annoyances along the way.
Here are some of the main issues I’ve noticed:
- Over-Eagerness to process TODO items: Most models now tend to churn through TODO lists unprompted, often without validating whether the resulting code actually works as intended.
- Zero Short-Term Memory: Even with SPEC, TODO, and the new .github/copilot-instructions.md grounding, coding agents still:
  - Ignore coding standards.
  - Fail to re-use or call previously written code.
  - Deviate from established code structures.
  - Write overly complex, verbose, and often redundant functions.
- Untidiness: Claude, in particular, is prone to:
  - Producing emoji-laden checklists and proclaiming success before tasks are truly complete.
  - Littering the repository with ad hoc test_ files that are frequently re-written and rarely re-used.
And I think the key word here is “eager”. The chat interactions have become grating and annoying, so I now just say “Implement item #3 in the TODO.md and write appropriate tests”.
The upshots are that the current codebase has a lot more logging than I would have ever bothered with and an (arguably) better database schema, but I am not overly impressed. Overall, using AI saved me only a little more time than I spent working the problem and explaining to it the issues it had created. It can also be more mentally burdensome to type out what I want fixed (because I have to write an all-inclusive prompt to avoid deviations) than to fix it myself.
As to the changes I’ve noticed over time, I suspect they are mostly due to tweaks in the scaffolding (updated system prompts inside the editor, etc.) rather than anything I can really ascribe to model improvements.
I don’t really like the code, but it is stuff I didn’t really want to write myself. Time will tell how correctly it works, but I am a bit worried about overall quality–not just of my stuff, but of the generation of software that is being written today.
Media and Entertainment
We finally finished watching the second season of Andor, which is, together with Rogue One, undoubtedly the best Star Wars I’ve ever seen. In comparison, all the other Star Wars spinoffs feel like kitschy messes and self-serving director playgrounds that might as well be binned, but I don’t suppose Disney will really get this–we were just lucky.
On a lower note, for the first time in a couple of years I am behind on my regularly scheduled reading to the point where I really need to step things up a tad. In my defense, I wasn’t expecting the Blue Ant Series (which I had actually never read in order) to be this long-winded.