My take on engineering with LLMs

The big AI revolution is underway. Everybody seems to be using "A.I." to make everything better, and suddenly a PhD in Data Science, Machine Learning, or Software Engineering seems a little less necessary if a chatbot can give you an answer in seconds.

Even more surprising, some people now talk as if you do not need to know how to write code anymore because ChatGPT can do that for you.

I do not buy that version of the story.

What I do buy is that LLMs are changing the shape of engineering work quite a lot. They are not replacing the craft. They are changing where the craft shows up.

What LLMs are genuinely good at

There is no point pretending that these tools are toys. They are useful.

In day-to-day work, I find them especially good at:

  • drafting code that follows an obvious pattern
  • producing first versions of tests, scripts, docs, and configuration
  • translating between frameworks, languages, and APIs
  • summarizing a code area after enough context has been provided
  • helping with small refactorings that are tedious but not conceptually difficult

That matters because a lot of engineering work contains exactly that sort of friction. Not every task is deep design. Not every task is beautiful architecture. Sometimes you really do just want a decent first pass for a migration script, a Terraform snippet, or a unit test.

For that kind of work, LLMs are often very effective.

Where they still fail

The trouble starts when people confuse local fluency with real understanding.

LLMs are very good at producing code that looks plausible. They are far less reliable when the task depends on hidden constraints, incomplete context, or trade-offs that only become visible at the system level.

That is exactly where engineering usually becomes interesting.

In my experience, assistants fail most often when:

  • the code base is large and the relevant context is spread across many modules
  • the naming in the repository is inconsistent or misleading
  • the task depends on business rules that are not written down
  • performance, security, deployment, or operability matter as much as correctness
  • the "best" solution depends on what should not be changed

This is why generated code can feel impressive in isolation and still be wrong in the only place that matters: the actual system it is supposed to fit into.

The skill that becomes more important

Because of that, I do not think LLMs eliminate engineering skill. They move the center of gravity.

If a model can draft the syntax, the human contribution shifts more strongly toward:

  • defining the scope precisely
  • providing the right context
  • spotting incorrect assumptions
  • reviewing whether the change fits the architecture
  • deciding what is worth simplifying and what is too risky to automate

That may sound less glamorous than "the AI writes the code", but it is much closer to what I actually see.

The bottleneck is often not typing speed anymore. The bottleneck is whether someone can describe the problem clearly enough and judge the result well enough.

In other words: the value of good engineering judgment increases.

Specification becomes a first-class activity

One thing I like about working with LLMs is that they punish vague thinking very quickly.

If your prompt is fuzzy, the result is fuzzy. If your boundaries are unclear, the assistant will happily cross them. If the repository structure is chaotic, the model will wander.

This makes specification much more important than many teams are used to.

Good prompts alone are not enough, of course. The repository itself has to become easier to navigate. Naming has to be more deliberate. Architecture boundaries have to be more visible. The intended workflows should be documented somewhere.

That is one reason I am interested in things like code maps and repository guides. They help humans, but they also make coding assistants behave more sensibly.
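To make the idea of a code map a little more concrete, here is a minimal sketch in Python. It assumes a Python-only repository and simply lists the top-level classes and functions of each module; the function names `build_code_map` and `render_code_map` are made up for this illustration, and a real repository guide would also carry hand-written notes about boundaries and workflows.

```python
import ast
from pathlib import Path


def build_code_map(root: str) -> dict[str, list[str]]:
    """Map each Python file under `root` to its top-level classes and functions.

    Hypothetical helper for illustration; not taken from any specific tool.
    """
    code_map: dict[str, list[str]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
        names: list[str] = []
        for node in tree.body:  # top-level statements only
            if isinstance(node, ast.ClassDef):
                names.append(f"class {node.name}")
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                names.append(f"def {node.name}")
        code_map[path.relative_to(root).as_posix()] = names
    return code_map


def render_code_map(code_map: dict[str, list[str]]) -> str:
    """Render the map as indented plain text, e.g. to paste into a prompt."""
    lines: list[str] = []
    for module, names in code_map.items():
        lines.append(module)
        lines.extend(f"  {name}" for name in names)
    return "\n".join(lines)
```

Calling `render_code_map(build_code_map("."))` from the repository root gives a short outline that can be checked into the repository or handed to an assistant as context. It is deliberately shallow: the point is orientation, not completeness.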

Speed is real, but so is review cost

Another thing I have noticed is that LLMs often make roughly the first 80 percent of a task dramatically faster while making the last 20 percent more review-heavy.

The first version appears quickly. But then you still need to answer questions like:

  • Does this actually solve the right problem?
  • Is the code introducing accidental complexity?
  • Is the naming now worse than before?
  • Did the assistant duplicate an existing helper instead of reusing it?
  • Will this behave correctly in production rather than in a toy example?

That means the benefit is real, but it is not "free code". It is accelerated drafting plus concentrated review.

If the reviewer is weak, that is dangerous. If the reviewer is strong, that is powerful.

Juniors, seniors, and the learning problem

I also think LLMs create an awkward tension for less experienced engineers.

On the one hand, they are fantastic learning tools. You can ask for explanations, examples, comparisons, and alternative implementations immediately. That is genuinely useful.

On the other hand, they make it easier to skip the uncomfortable parts of learning that are necessary to build intuition.

If somebody constantly accepts generated code without understanding why it works, they may become faster at shipping fragments while becoming slower at developing judgment.

That is a problem, because judgment is exactly the thing the tool does not reliably replace.

So I would not tell juniors to avoid LLMs. I would tell them to use them actively rather than passively:

  • ask why something is written that way
  • compare two alternative implementations
  • rewrite the result yourself
  • trace the actual runtime behavior
  • treat the answer as a proposal, not as truth

That turns the assistant into a learning partner instead of a crutch.

My practical rule of thumb

The more local, repetitive, and well-bounded the task is, the more I trust an assistant to draft it.

The more cross-cutting, architectural, irreversible, or operationally sensitive the task is, the more I want the model to support my thinking rather than replace it.

That usually means:

  • use LLMs aggressively for scaffolding, drafting, summarization, and small transformations
  • use them carefully for refactoring inside a known boundary
  • use them skeptically for architecture, security-sensitive work, migrations, and hidden business logic

I find that this keeps the upside while reducing the delusion.
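The rule of thumb above can also be written down as a tiny decision function. This is only a sketch of my own heuristic in Python: the `Task` fields and the `Trust` levels are labels invented for this example, and in practice the judgment is rarely this binary.

```python
from dataclasses import dataclass
from enum import Enum


class Trust(Enum):
    AGGRESSIVE = "let the assistant draft, review as usual"
    CAREFUL = "let it work inside a known boundary, review closely"
    SKEPTICAL = "use it to support your thinking, not to replace it"


@dataclass(frozen=True)
class Task:
    # Hypothetical flags, chosen to mirror the wording of the rule of thumb.
    cross_cutting: bool = False
    architectural: bool = False
    irreversible: bool = False
    operationally_sensitive: bool = False
    refactoring: bool = False


def trust_level(task: Task) -> Trust:
    """Return how much drafting to delegate for a given task."""
    risky = (
        task.cross_cutting
        or task.architectural
        or task.irreversible
        or task.operationally_sensitive
    )
    if risky:
        return Trust.SKEPTICAL  # any risk flag outweighs everything else
    if task.refactoring:
        return Trust.CAREFUL
    return Trust.AGGRESSIVE  # local, repetitive, well-bounded work
```

For example, `trust_level(Task(refactoring=True))` gives `Trust.CAREFUL`, while adding `irreversible=True` to the same task moves it to `Trust.SKEPTICAL`. The ordering of the checks is the design choice that matters here: a single risk flag is enough to override an otherwise routine task.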

So what is engineering with LLMs?

For me, engineering with LLMs is not "letting the machine do the work".

It is a style of working in which:

  • more work starts from specification than from syntax
  • more value comes from review than from raw typing
  • repository clarity matters more because both humans and models depend on it
  • good engineers become amplifiers rather than mere producers of text

That is why I am optimistic about these tools without becoming mystical about them.

They are useful. They are sometimes impressive. They are often worth the effort.

But they do not remove the need for taste, judgment, structure, or responsibility. If anything, they make those qualities easier to notice.