The Evolution Of LLM Based Development
On May 6th, I wrote these notes to myself:

They are the result of exploring AI coding tools over the past couple of years. Essentially, they are a reminder that I need to go back to writing code by hand and change how I use LLMs so that they help me develop better software, not produce more code.
My workflow now is that all code that goes into master has to be written by hand. I use AI before actually writing the code for exploring, searching, comparing, demos and so forth. But as soon as I arrive at a design I'm satisfied with, I put on my headphones and start writing the code.
I'm not alone in my conclusion of going back to writing code. Multiple articles have surfaced on Hacker News around this topic:
So, how did I arrive back where I started?
Magic code
Back in spring last year, Anthropic released their subscription plans where they gave you as much usage as you wanted for, I think, 200 bucks a month.
(Which of course didn't last long as somebody chose to burn through approximately $100k in API equivalent spend. We can't have nice things.)
At this point, I had mainly been using GitHub's Copilot and ChatGPT+Claude in their chat formats. I actually quite liked the Copilot workflow but eventually fell into the "Copilot pause" trap: writing a few lines of code and then waiting for the LLM to suggest the rest.
I'd often just tab my way through a complete function that all looked right and seemed to work, but I started to lose touch with the code I wrote. I didn't really know what it did. I of course did code reviews and, as everyone who has done professional software development knows, actually understanding code you didn't write takes time.
And the temptation to simply power through was often too much to handle. I was being productive and the Thing was working after all!
At this point, I was somewhat AI-pilled, not completely, but I definitely had one foot firmly planted on the AI side of software development. The switch really happened when I got the small $20 plan that Anthropic provided.
I remember my mind being absolutely blown. I burned through my tokens fast, but the results were so good that I quickly upgraded to the $200 plan and continued generating code.
I was raving about it to my friends, created a tutorial of running multiple agents in the same codebase, and wrote an article about it as well.
At this point, their subscription plans weren't as popular as they are today, so the agents still weren't kneecapped to ease the pressure on their data centers. The time period from May to August 2025 was an absolutely magical time to use LLMs.
But then, the period of Hype, Doom and Degradation started.
Your Job Is No More
Around August, something changed. The models weren't as capable as they used to be; they'd make mistakes and produce code that I had to spend a lot of time re-prompting to make work.
That was okay-ish since the token limits still weren't as aggressive then as they are now.
But two things started to emerge:
- Overly complex code with a ton of indirections
- The same issue I had with Copilot came around again: I was losing touch with the code
The agent would continuously take shortcuts, not read the relevant files, and create logic that was overly convoluted. I predominantly write Go, which is very easy to read and reason about. But whenever I had to dig into the code for something that should be simple, it was always full of little helper functions, even in places where I originally had written the code myself.
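As a hypothetical illustration of that kind of indirection (every name here is invented for the example and nothing is taken from a real codebase), compare two ways of summing order totals in Go. Both functions return the same result; the first spreads a three-line loop across several tiny helpers, while the second keeps the logic in one place.

```go
package main

import "fmt"

// Order is a made-up type used only for this illustration.
type Order struct {
	PriceCents int
	Quantity   int
}

// --- Over-indirected style: every trivial step gets its own helper. ---

func lineTotal(o Order) int { return o.PriceCents * o.Quantity }

func addToSum(sum, v int) int { return sum + v }

func forEachOrder(orders []Order, fn func(Order)) {
	for _, o := range orders {
		fn(o)
	}
}

// TotalIndirect computes the total through three layers of helpers.
func TotalIndirect(orders []Order) int {
	sum := 0
	forEachOrder(orders, func(o Order) {
		sum = addToSum(sum, lineTotal(o))
	})
	return sum
}

// --- Direct style: the same logic, readable top to bottom. ---

// TotalDirect computes the total with a plain loop.
func TotalDirect(orders []Order) int {
	sum := 0
	for _, o := range orders {
		sum += o.PriceCents * o.Quantity
	}
	return sum
}

func main() {
	orders := []Order{{PriceCents: 250, Quantity: 2}, {PriceCents: 100, Quantity: 3}}
	fmt.Println(TotalIndirect(orders), TotalDirect(orders))
}
```

Neither version is wrong, but to follow the first one you have to jump between four functions to confirm what a single loop would have told you at a glance.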

I was starting to get skeptical that I actually gained anything from using the agent compared to writing it myself. It didn't feel faster or seem to produce better code.
I didn't have the same understanding of the codebase as I used to have when I was writing and exploring it myself. At this point, both Sam Altman and Dario Amodei had been continuously pushing the narrative that Software Development Is Dead.
I didn't fully buy into that narrative but was convinced that things had changed. This was likely the new way of writing code going forward. Software is a discipline that requires you to continuously learn new things, so this wasn't really super concerning.
New Model -> Hype -> Degrade -> Repeat
It turned out my localized frustrations were part of a broader trend. Over the latter half of 2025, a cynical pattern emerged: the big model providers would release a new model, hype it up as much as possible, and then slowly make it less capable, seemingly in an attempt to lower the cost of running the model cluster.
Many suspected this, but it has never been confirmed. Model quality degradation is always blamed on:
a combination of infrastructure mis-routing and configuration change
It just always seemed to happen around a model release.
Token-based billing
The final form of this strategy isn't just silent degradation; it's aggressive gatekeeping. To get the quality we originally fell in love with, we are now being forced into token-based API billing. This just happened with Anthropic's new model, Fable (ironically a good name for what they are promising).
It's available on subscription plans for a limited time only and will move to only being available through their API around the end of June 2026, where they will charge you through the nose to have access to the best available model.
For many, this will never be feasible given what you get out of it. Multiple people took to Twitter/X to say that it would use up all of their included usage, on a 20x plan, without even finishing the job. So it now takes two usage windows to finish a task and determine whether the result was even correct.
I have a hard time buying into this paradigm.
Back to manual labor
I still find LLMs useful. But not for automating the production of code. They are incredibly useful for research and rubber ducking/talking through a problem.
We seem to have forgotten that the real work happens by doing the work. You need to explore a problem area to really understand it and gain domain knowledge about what you're working on. I highly doubt that can be offloaded to an LLM, unless you have already done the same kind of work before and gained the domain knowledge that way.
By my guesstimate, I once again write around 90% of all the code that goes into production myself. I need to work with the code and see how it evolves to truly understand what I'm working on.
I build MVPs and high-level diagrams using LLMs all the time to evaluate the different directions I can take. But once I'm happy with the direction, I sit down, put on my headphones, and get back to writing.