AI Coding Assistants at the End of 2025: What I Actually Use, What Changed, and What’s Coming in 2026
by Gary Worthington, More Than Monkeys
Earlier this year I compared the usual headline acts: GPT-5, Claude, Gemini, Copilot, and a handful of others. It was a useful snapshot at the time.
By December, the landscape has shifted again. Not because one model is suddenly “the best”, but because the tools around the models have changed how people actually work.
So here’s the end-of-year update, written by someone who spends more time in CI logs than on stage at a conference.
I build and ship software for real teams. That means living in existing codebases, keeping diffs reviewable, making CI happy, and trying not to break production while moving at a decent pace. If an assistant helps with that, it stays. If it turns every ticket into a science project, it’s out.
And for what it’s worth, my day-to-day setup is GitHub Copilot plus Claude Sonnet 4.5.
The thing that mattered most this year: the assistant stopped being “a helper” and started being “a workflow”
A year ago, most assistants were basically two features:
- autocomplete that guesses what you meant
- chat that writes code when you ask politely
Now, the serious tools are all pushing the same direction:
- multi-file edits
- “agent” modes that try to complete tasks
- PR and code review integration
- model switching inside the same tool
That’s a big deal, because it changes the job the assistant is doing.
Autocomplete is about speed. Agent workflows are about delegation. Delegation is only valuable when it stays controlled. Otherwise you get a confident flurry of changes, a broken test suite, and a steaming pile of regret.
The best assistants in late 2025 are not the ones that can produce the cleverest snippet. They are the ones that can operate inside constraints.
What I use and why
Copilot: best for staying in flow
Copilot is still the smoothest “always on” experience inside the editor for most people. It does the small things well:
- completing the next few lines
- suggesting the shape of a function you were already writing
- quick transformations that you would do anyway, just slower
It is also increasingly useful for multi-file work, but I treat that like a power tool. Very handy. Not something you leave running while you make tea.
Claude Sonnet 4.5: best for quality and coherence
When the task is bigger than “finish this function”, I want consistency. I want the assistant to keep the thread across a refactor, tests, and edge cases without wandering off into a different architectural universe.
That’s where Sonnet 4.5 has been strong for me. It tends to:
- keep changes coherent across several files
- write tests that look like the tests already in the repo
- explain trade-offs without turning it into a lecture
If I had to summarise the combo:
- Copilot helps me move quickly minute-to-minute.
- Sonnet helps me avoid shipping something daft.
The practical reality nobody advertises: quotas, tokens, and getting cut off mid-task
This is the bit that matters when you are using these tools professionally.
Lots of assistants now have limits: quotas, “premium” requests, token budgets, throttles. Whatever they call it, the outcome is the same. You can be halfway through a task and suddenly the tool stops being helpful.
I spent some time using Junie from JetBrains and ran into exactly this failure mode: it starts really strong, but the token allowance can run out quickly and leave you mid-task.
That is not just annoying. It actively harms the workflow because you end up with:
- partial edits
- no clean conclusion
- and no continuity to finish the job properly
It’s like hiring someone to tidy the house and they leave halfway through, taking the hoover with them.
How I work around this
I assume the assistant might disappear at any moment, so I structure tasks to survive that:
- Work in checkpoints that produce a complete artefact. A plan, a small patch, updated tests, a migration script. Something that stands on its own.
- Keep diffs small enough that I can take over. If the tool hits a limit, I can still finish in a human amount of time.
- Force end-of-step summaries. “List changed files. Explain decisions. What’s left to do. What should I verify.”
- Avoid huge “do everything” prompts. They burn budget and encourage broad, messy changes.
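To make that concrete, here’s a rough sketch of the end-of-step check I have in mind. The script, its name, and the thresholds are my own illustration rather than anything Copilot, Claude, or Junie ships; it just asks git what changed, refuses to call the checkpoint done if the diff has sprawled, and then runs the tests (assuming a pytest suite).

```python
# checkpoint.py: illustrative end-of-step check. The name and thresholds are my own
# example, not part of any assistant's tooling.
import subprocess
import sys

MAX_CHANGED_FILES = 10   # beyond this, the diff is no longer "finish by hand" sized
MAX_CHANGED_LINES = 400

def git(*args: str) -> str:
    return subprocess.run(["git", *args], capture_output=True, text=True, check=True).stdout

def main() -> int:
    # Summarise what was actually touched (tracked changes against HEAD).
    changed_files = [line for line in git("diff", "--name-only", "HEAD").splitlines() if line]
    print("Changed files:")
    for path in changed_files:
        print(f"  {path}")

    # Count added + deleted lines from --numstat (binary files show "-" and are skipped).
    changed_lines = 0
    for line in git("diff", "--numstat", "HEAD").splitlines():
        added, deleted, _path = line.split("\t", 2)
        if added.isdigit():
            changed_lines += int(added)
        if deleted.isdigit():
            changed_lines += int(deleted)
    print(f"Lines touched: {changed_lines}")

    # Refuse to call the checkpoint complete if it has sprawled.
    if len(changed_files) > MAX_CHANGED_FILES or changed_lines > MAX_CHANGED_LINES:
        print("Checkpoint too big: split it before carrying on.")
        return 1

    # A checkpoint only counts if the tests pass right now (assumes a pytest suite).
    tests = subprocess.run(["python", "-m", "pytest", "-q"])
    return tests.returncode

if __name__ == "__main__":
    sys.exit(main())
```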
If JetBrains want Junie to become a daily driver for serious work, continuity needs to be predictable. Bigger budgets help, but so does tooling that nudges you into smaller, complete steps rather than long, fragile sequences.
What got better since the earlier comparison
1) Assistants are improving at “work”, not just “answers”
The best tools are now trying to behave like a loop:
- look at the repo
- propose approach
- make changes
- run tests (or tell you exactly what to run)
- iterate until it’s green
This is where you actually save time. Not because the assistant is brilliant, but because it is willing to do the repetitive bits that drain your will to live.
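To be clear about the shape of that loop (and not any particular product’s internals), here’s a minimal sketch. propose_patch and apply_patch are hypothetical stand-ins for whatever generates and applies a change; the rest is just “run the tests, feed the failure back, stop when green or when the budget runs out”.

```python
# Illustrative only: the shape of the "change, test, iterate" loop that agent modes
# automate. propose_patch and apply_patch are hypothetical stand-ins, not a real API.
import subprocess

MAX_ITERATIONS = 5  # a budget, so a stuck loop fails loudly instead of burning tokens

def run_tests() -> tuple[bool, str]:
    """Run the suite and return (passed, output). Assumes a pytest project."""
    result = subprocess.run(["python", "-m", "pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def agent_loop(task: str, propose_patch, apply_patch) -> bool:
    feedback = ""
    for attempt in range(1, MAX_ITERATIONS + 1):
        patch = propose_patch(task, feedback)   # hypothetical: ask the model for a small diff
        apply_patch(patch)                      # hypothetical: write the diff to the working tree
        passed, output = run_tests()
        print(f"Attempt {attempt}: {'green' if passed else 'red'}")
        if passed:
            return True
        feedback = output                       # feed the failure back instead of starting over
    return False                                # out of budget: hand back to a human
```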
2) Integration is now the deciding factor
Model capability matters, but the integration matters more:
- Does it understand your repo structure?
- Can it make changes safely without turning it into a refactor festival?
- Does it fit your review workflow?
- Can it work with your test setup and tooling?
A great model inside a clumsy tool is still a clumsy tool.
3) “Demo good” is still not “production good”
Lots of models look fantastic building a toy app from scratch.
Real work looks like:
- “change behaviour without breaking backwards compatibility”
- “fix the bug and write the regression test”
- “update the dependency without creating a security hole”
- “make this Terraform change without nuking prod”
This is where assistants either feel like a genuine boost, or like an intern who has discovered ripgrep and confidence.
The criteria I actually care about
If you are choosing assistants going into 2026, I would focus on these:
Reviewability
If it cannot produce a clean, reviewable diff, it is not helping. Big sweeping changes are rarely a sign of intelligence. They are usually a sign the assistant is bored.
Test discipline
I want an assistant that naturally works in a loop: change, test, fix, test again. If it avoids tests, it is just generating code and hoping.
Consistency with existing patterns
The fastest way to make a team hate AI is letting it introduce a new style every time someone uses it. Good assistants blend in.
Context handling
Not marketing context windows. Actual context. Can it keep hold of what the system is doing and stay consistent across multiple steps?
Control and safety
Agent features need guardrails:
- what can it touch?
- what can it run?
- what can it read?
- how do you scope it to a small, safe area?
Treat agent mode like elevated permissions. Useful when you need it. Otherwise, keep it contained.
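As a toy example of what “scope it to a small, safe area” can look like in practice, here’s a filter over an agent’s proposed file paths. The allowlist, the blocked paths, and the whole approach are my own illustration, not a setting any particular assistant exposes.

```python
# Illustrative guardrail: only let an agent touch paths inside an explicit allowlist.
# The paths and the reject behaviour are invented for this example.
from pathlib import Path

ALLOWED_ROOTS = [Path("src/billing"), Path("tests/billing")]   # the area this task is scoped to
BLOCKED_PATHS = [Path(".env"), Path("terraform/prod")]          # never, regardless of scope

def is_in_scope(candidate: str) -> bool:
    path = Path(candidate)
    if any(path == blocked or blocked in path.parents for blocked in BLOCKED_PATHS):
        return False
    return any(path == root or root in path.parents for root in ALLOWED_ROOTS)

# Before applying an agent's proposed edits, drop anything outside the scope.
proposed = ["src/billing/invoice.py", "tests/billing/test_invoice.py", "terraform/prod/main.tf"]
safe = [p for p in proposed if is_in_scope(p)]
rejected = [p for p in proposed if p not in safe]
print("applying:", safe)
print("rejected:", rejected)
```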
The workflow I use, and why it works
This is the part you can steal without buying my course (because I do not have a course).
For a normal ticket
Ask for a plan before code
- “What files will you touch, and what tests should prove this works?”
Do the smallest change that passes
- Fix behaviour first, then tidy.
Make tests part of the definition of done
- “Add or update tests that fail before and pass after.” (There’s a small example of what I mean just after this list.)
End with a review summary
- Changed files, key decisions, risks, what I should verify.
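On the “tests that fail before and pass after” point, this is the kind of thing I mean. The function, the bug, and the test names are invented for illustration: the test is written so it fails against the old behaviour and passes once the fix is in.

```python
# Hypothetical regression test: the function and the bug are invented for illustration.
import pytest

def parse_discount_code(code: str) -> str:
    """Normalise a discount code. The fix: reject blank input instead of silently returning ''."""
    cleaned = code.strip().upper()
    if not cleaned:
        raise ValueError("discount code must not be empty")
    return cleaned

def test_rejects_blank_discount_code():
    # Before the fix this returned "" silently; this pins the new behaviour.
    with pytest.raises(ValueError):
        parse_discount_code("   ")

def test_normalises_valid_code():
    assert parse_discount_code(" summer25 ") == "SUMMER25"
```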
For refactors
- “Keep behaviour identical.”
- “One module at a time.”
- “No renames without my go-ahead.”
- “Update typing and tests as you go.”
This keeps the assistant from doing the classic move: rewriting half the codebase because it spotted a slightly unusual function name.
For debugging
- “Explain the failure in plain English.”
- “Give me two plausible causes.”
- “Tell me what evidence would confirm each.”
- “Propose the smallest fix.”
This stops you from getting a grand redesign when you needed a missing null check.
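And for the avoidance of doubt, the smallest fix really can be this small. The function and the guest-checkout scenario below are made up, but it’s the shape of fix I want instead of a redesign:

```python
# Invented example of a "smallest fix": guard the missing value instead of
# redesigning the lookup. Names are illustrative, not from a real codebase.
from typing import Optional

def get_customer_region(customer: Optional[dict]) -> str:
    # The bug: customer can legitimately be None for guest checkouts,
    # and the old code assumed it never was.
    if customer is None:
        return "unknown"
    return customer.get("region", "unknown")
```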
What I’m watching for 2026
1) The editor becomes a task manager
Assistants are drifting towards “work orchestration”: multiple threads, task queues, suggested diffs, review workflows. That could be genuinely helpful, if it stays grounded.
2) Teams start formalising how they use AI
Not because they hate AI, but because they hate chaos.
I expect to see lightweight team policies like:
- which tasks are safe for agent mode
- what must be reviewed manually
- rules around secrets and credentials
- when premium models are allowed
That is just normal engineering governance, applied to a new tool.
3) Cost becomes an engineering constraint
As token budgets and premium quotas become real, people will stop “trying three models until one gives a good answer”.
They will pick a default for the everyday work, and reserve the expensive options for the tasks where it genuinely pays off.
4) Security will catch up with the hype
The more autonomy you give an assistant, the more you need to think about:
- prompt injection via code comments or docs
- accidental leakage of secrets
- assistants running commands in environments they should not
- supply chain risks through automated dependency updates
This is not a reason to avoid the tools. It is a reason to use them like a grown-up.
The honest conclusion
At the end of 2025, I’m not looking for an assistant that impresses me. I’m looking for one that helps me ship:
- keeps diffs reviewable
- respects existing patterns
- supports a test loop
- doesn’t wander off mid-task because a token meter hit zero
Copilot plus Claude Sonnet 4.5 has been the best balance I’ve found for that.
And yes, Junie is interesting. But if it runs out of tokens halfway through the work, it becomes a liability. I want tools that finish jobs, not tools that leave me with a half-painted kitchen.
Gary Worthington is a software engineer, delivery consultant, and fractional CTO who helps teams move fast, learn faster, and scale when it matters. He writes about modern engineering, product thinking, and helping teams ship things that matter.
Through his consultancy, More Than Monkeys, Gary helps startups and scaleups improve how they build software — from tech strategy and agile delivery to product validation and team development.
Visit morethanmonkeys.co.uk to learn how we can help you build better, faster.
Follow Gary on LinkedIn for practical insights into engineering leadership, agile delivery, and team performance.