It’s been my experience that the quality of the code is greatly influenced by the quality of your project instructions file and your prompt, and of course by which model you’re using.
I am not necessarily a proponent of AI; I just found myself reassigned to a team that manages AI for developer use. Part of my responsibilities has been researching how to use the tech successfully and productively.
But at a certain point, it seems like you spend more time babysitting and spoon-feeding the LLM than you do writing productive code.
There’s a lot of busywork I could see it being good for, like being asked to generate 100 test cases for an API with a bunch of tiny variations, but that kind of work is inherently low-value. And in most cases you’re probably better off using a tool designed for the job, like a fuzzer.
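To make that concrete, here’s a minimal sketch of the “tool designed for the job” angle, assuming pytest plus the hypothesis property-based testing library; the endpoint and payload shape are made up for illustration, not anything from my actual codebase.

```python
# A minimal sketch: property-based testing instead of asking an LLM for
# 100 hand-rolled variations. The endpoint and payload are hypothetical.
from hypothesis import given, strategies as st
import requests


@given(
    name=st.text(min_size=1, max_size=64),
    quantity=st.integers(min_value=0, max_value=10_000),
)
def test_create_item_accepts_valid_payloads(name, quantity):
    # hypothesis generates the tiny variations (and the edge cases you
    # wouldn't think to ask for) automatically on every run.
    resp = requests.post(
        "https://api.example.com/items",  # hypothetical endpoint
        json={"name": name, "quantity": quantity},
        timeout=5,
    )
    assert resp.status_code in (200, 201)
```

A property-based test like this produces the variations for free, with no prompting and no review of 100 near-identical cases.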
I’ve found it pretty effective not to babysit, but instead to have the model iterate on its instructions file. If it did something wrong or unexpected, I explain what I wanted it to do and ask it to update its project instructions to avoid the pitfall in the future. It’s more akin to calm and positive reinforcement.
Obviously YMMV. I am in charge of a large codebase of Python cron automations that interact with a handful of services and APIs. I’ve rolled a ~600-line instructions file that has allowed me to pretty successfully use Claude to stand up full object-oriented clients from scratch, complete with dependency injection, schema and contract data models, unit tests, etc.
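For context, here’s a rough sketch of the client shape I mean; the service, model fields, and method names are hypothetical, not the actual codebase.

```python
# A rough sketch of an object-oriented client with an injected HTTP session
# and a contract data model. All names here are made up for illustration.
from dataclasses import dataclass

import requests


@dataclass(frozen=True)
class Widget:
    """Contract data model for a hypothetical widget API."""
    id: str
    name: str


class WidgetClient:
    """Client for the widget service, with the session injected for testability."""

    def __init__(self, base_url: str, session: requests.Session) -> None:
        self._base_url = base_url.rstrip("/")
        self._session = session  # dependency injection: swap in a stub in tests

    def get_widget(self, widget_id: str) -> Widget:
        resp = self._session.get(
            f"{self._base_url}/widgets/{widget_id}", timeout=10
        )
        resp.raise_for_status()
        data = resp.json()
        return Widget(id=data["id"], name=data["name"])
```

Injecting the `requests.Session` is what lets the generated unit tests substitute a stub instead of hitting the real API.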
I do end up having to make stylistic tweaks, and sometimes reinforce things like DRY, but I actually enjoy that part.
EDIT: Whenever I begin to feel like I’m babysitting, it’s usually due to context pollution and the best course is to start a fresh agent session.