AI coding is like making a burger with sugar

Jan 17, 2026

Have you ever had a burger and noticed something was kind of off? You hunt for the reason, and then it’s immediately obvious: they used sugar instead of salt. Ok, this has probably never happened. However, it’s the way I’d sum up using an LLM to help you code. It kind of works, but there are still those random quirky things you notice in the code that you want to fix. I recently read an article by Joe Weisenthal on Bloomberg (you can also read it on LinkedIn here) discussing his experience vibe coding with Claude Code from the perspective of a non-coder, so I thought I’d try something along the same lines, but from a coder’s viewpoint.

Different LLMs I’ve used

I’ve been using LLMs to code for a while. At the beginning I used ChatGPT, asking it how to implement small problems in Python. In a sense it replaced the more traditional workflow of thinking of something to code and then Googling the problem, before arriving at some library on GitHub, API documentation, StackOverflow etc.

I then switched to GitHub Copilot with PyCharm, asking it for help writing code. The advantage of this is that it can look directly at my own code base to give more informed answers, ones which more closely fit with my code. On GitHub Copilot you get access to many different LLMs from OpenAI, Anthropic etc.: some are built-in models, whilst others are labelled as premium with various usage levels. It even includes access to Claude Opus 4.5, currently the best-performing Anthropic model. I mostly ended up using Claude Sonnet 4.5, which seemed to work quite well and also had higher usage limits than Claude Opus 4.5.

More recently, over Christmas, I started to experiment with agent mode on GitHub Copilot, which can modify your code directly, again with Claude Sonnet 4.5 (my idea of relaxing over Christmas, I suppose…).

Creating a cache for economic and market data

My objective in this experiment was to write a cache for the data we download for the inflation forecasting models we use at Turnleaf Analytics. We collect data from many different sources. For a lot of macroeconomic and market data, we use Macrobond, which is an extremely comprehensive source. Importantly, it also has vintages for macroeconomic data. Hence, if you download economic data, you’ll get a different vintage of every time series for each month. Basically, whenever macroeconomic data is released, it can be revised. Take inflation, for example: depending on the country, you can sometimes get preliminary and final releases, and you can also get revisions later on (for example with seasonally adjusted data). When doing calculations such as computing year-on-year metrics, you need to do them in a point-in-time way. However, if you’re downloading data, you don’t want to fetch every single vintage each day when updating your dataset. For daily market data this is less of an issue (although you do occasionally find that vendors revise this too).
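To make the point-in-time idea concrete, here’s a rough sketch in Pandas (the numbers, tickers and column names are all made up for illustration) of how a vintage-aware lookup differs from simply taking the latest revised values:

    import pandas as pd

    # Hypothetical CPI vintages: each row is one published estimate of one
    # observation period (the numbers are invented for illustration)
    vintages = pd.DataFrame({
        "obs_period":   ["2024-12",    "2024-12",    "2025-01"],
        "publish_date": ["2025-01-15", "2025-02-20", "2025-02-18"],
        "cpi_index":    [120.0,        120.4,        120.9],  # Dec 2024 revised up
    })
    vintages["publish_date"] = pd.to_datetime(vintages["publish_date"])

    def point_in_time(df, as_of):
        """Latest published value per observation period, using only the
        vintages that were available on or before the as_of date."""
        known = df[df["publish_date"] <= pd.Timestamp(as_of)]
        return (known.sort_values("publish_date")
                     .groupby("obs_period")["cpi_index"]
                     .last())

    print(point_in_time(vintages, "2025-01-31"))  # sees the original Dec 2024 print (120.0)
    print(point_in_time(vintages, "2025-03-31"))  # sees the revised Dec 2024 print (120.4)

Any year-on-year calculation built on the second view would not match what a model run at the end of January would actually have seen, which is why the vintages matter.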

To avoid downloading every single economic data vintage, you need a cache: then you only download the new vintages you don’t already have. I set about using agent mode with Claude Sonnet 4.5 on GitHub Copilot to create a cache with a SQLite backend that could be a “golden store” for the JSONs fetched from the data vendor (thanks Jayshan Raghunandan for the suggestion of using a “golden store” here). The idea being that if in future we wanted to change how we parsed them, we would have the original format ready. Of course, there are many other ways to cache the data: I could have saved the JSONs directly to disk, used MongoDB to store them etc.
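As a rough illustration (this is my own minimal sketch, not the actual code, and the table schema is invented), a SQLite-backed golden store might look something like this:

    import json
    import sqlite3

    class VintageCache:
        """Minimal sketch of a 'golden store': keep the raw vendor JSON keyed
        by series and vintage date, so it can be re-parsed later if needed."""

        def __init__(self, path="vintage_cache.db"):
            self.conn = sqlite3.connect(path)
            self.conn.execute(
                """CREATE TABLE IF NOT EXISTS vintages (
                       series_id    TEXT NOT NULL,
                       vintage_date TEXT NOT NULL,
                       raw_json     TEXT NOT NULL,
                       PRIMARY KEY (series_id, vintage_date)
                   )"""
            )

        def existing_vintages(self, series_id):
            """Vintage dates already cached, so we can skip re-downloading them."""
            rows = self.conn.execute(
                "SELECT vintage_date FROM vintages WHERE series_id = ?", (series_id,)
            )
            return {r[0] for r in rows}

        def store(self, series_id, vintage_date, payload):
            """Store the raw JSON exactly as fetched from the vendor."""
            self.conn.execute(
                "INSERT OR IGNORE INTO vintages VALUES (?, ?, ?)",
                (series_id, vintage_date, json.dumps(payload)),
            )
            self.conn.commit()

When updating, you would compare the vintages the vendor reports against existing_vintages() and only fetch the difference.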

I then started to ask follow-up questions to the agent to improve the code and get it closer to what I wanted. When creating a database connection, there wasn’t enough redundancy: if the connection failed, it should retry with some sort of backoff, which I asked to be added. Also, for performance and disk usage, there was no compression of the JSON. And when parsing some of the output, there was a heavy reliance on for loops, which are slow in Python, rather than vectorising the computation using Pandas or Polars, which results in much faster code.
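For a flavour of the first two requests, a retry-with-backoff connection and gzip compression of the JSON might look roughly like this (again, my own sketch rather than the generated code):

    import gzip
    import json
    import sqlite3
    import time

    def connect_with_backoff(path, retries=5, base_delay=0.5):
        """Try to open the database, retrying with exponential backoff on failure."""
        for attempt in range(retries):
            try:
                return sqlite3.connect(path, timeout=5.0)
            except sqlite3.OperationalError:
                if attempt == retries - 1:
                    raise
                time.sleep(base_delay * 2 ** attempt)

    def compress_payload(payload):
        """Gzip the JSON before writing it (stored as a BLOB) to cut disk usage."""
        return gzip.compress(json.dumps(payload).encode("utf-8"))

    def decompress_payload(blob):
        """Reverse of compress_payload when reading back from the cache."""
        return json.loads(gzip.decompress(blob).decode("utf-8"))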

I then asked the agent to create something similar with a DuckDB backend. Whilst it did attempt to do this, rather than recognising that the code was going to be similar to the SQLite implementation, it instead wrote a new script, basically copying and pasting code (which is most definitely not code reuse). I then had to rewrite the prompt to make sure it factored out common code and used inheritance. The DuckDB code ended up being a total mess after repeated prompting, and I had to delete it. I have noticed this quite often, where an LLM writes unnecessarily verbose code.
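What I had in mind was something more along these lines: a small base class holding the shared logic, with thin backend-specific subclasses for SQLite and DuckDB (a sketch only, with an invented schema):

    import sqlite3
    import duckdb

    class BaseVintageStore:
        """Shared caching logic; subclasses only provide the connection."""

        CREATE_SQL = """CREATE TABLE IF NOT EXISTS vintages (
                            series_id TEXT, vintage_date TEXT, raw_json TEXT)"""

        def __init__(self):
            self.conn = self._connect()
            self.conn.execute(self.CREATE_SQL)

        def _connect(self):
            raise NotImplementedError

        def store(self, series_id, vintage_date, raw_json):
            # Both SQLite and DuckDB accept '?' placeholders here
            self.conn.execute(
                "INSERT INTO vintages VALUES (?, ?, ?)",
                (series_id, vintage_date, raw_json),
            )

    class SQLiteVintageStore(BaseVintageStore):
        def __init__(self, path="vintages.db"):
            self._path = path
            super().__init__()

        def _connect(self):
            return sqlite3.connect(self._path)

    class DuckDBVintageStore(BaseVintageStore):
        def __init__(self, path="vintages.duckdb"):
            self._path = path
            super().__init__()

        def _connect(self):
            return duckdb.connect(self._path)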

What about Claude Code?

I’ve also been dabbling more recently with Claude Code (which by default uses the more powerful Claude Opus 4.5 model) to write a module for working with CPI data, and I have been impressed. There were similar issues with copying code unnecessarily, though, when for example I asked it to load different variants of the CPI. The only difference between the methods was the ticker, but somehow it ended up copying and pasting many methods to do this, rather than having the type of CPI as a parameter. However, Claude Code ran the code, and by running and testing it, it was able to at least make some adjustments. It also insisted on littering the code with random constants everywhere, rather than collecting them together in a constants file. In the end, I still needed to iterate through quite a few prompts (and will still require more).
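For example, rather than one copied-and-pasted method per CPI variant, what I wanted looks more like this (the tickers and the fetch helper here are invented placeholders):

    # A constants module keeps the tickers in one place (these are placeholder
    # tickers, not the real ones)
    CPI_TICKERS = {
        "headline": "cpi_headline_ticker",
        "core": "cpi_core_ticker",
        "seasonally_adjusted": "cpi_sa_ticker",
    }

    def load_cpi(variant, fetch=lambda ticker: f"data for {ticker}"):
        """One method for every CPI variant: the variant just selects the ticker.
        `fetch` stands in for whatever actually downloads the series."""
        return fetch(CPI_TICKERS[variant])

    print(load_cpi("core"))  # -> data for cpi_core_ticker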

Should you use LLMs for coding? Yes, but you still need to look at the code!

However, if I had not used these LLMs to code, and done it the traditional way, would it have taken a lot longer? Of course, in particular with these newer and more powerful models! I admit that with a better prompt I might have needed to iterate less. Maybe I could have told it up front to factor out common code, to put constants in a separate file to make them easier to manage etc. These are things developers should know, but maybe I shouldn’t assume that an LLM will just do them automatically. Whilst developers might not be coding as much, understanding software engineering, designing a system etc. are just as important as they have always been. Maybe this will change over time, but we are definitely not there yet.

The great thing about human language is that it’s very expressive, but this is also why your prompt won’t always be interpreted exactly as you intended by an LLM (or indeed by other humans).

Do we still need to look at the code generated by an LLM? Of course! If you are committing code, you should understand it, otherwise it will become incredibly difficult to maintain. Writing code is always the easy (or at least the fun) bit; the time is spent maintaining that code, particularly if we want the code to be in production, as opposed to a one-off prototype.

Unlike human language, programming languages are very precise, so you can at least see if what you wanted is actually what was implemented by the LLM. This is what’s great about code generation: the final output is something developers can understand, test etc. This contrasts with other output from an LLM, which is more difficult to verify. So next time you have a burger, you’ve got to ask yourself: have they used sugar or salt…