AI Coding Assistants at the End of 2025: What I Actually Use, What Changed, and What’s Coming in 2026
by Gary Worthington, More Than Monkeys
Earlier this year I compared the usual headline acts: GPT-5, Claude, Gemini, Copilot, and a handful of others. It was a useful snapshot at the time.
By December, the landscape has shifted again. Not because one model is suddenly “the best”, but because the tools around the models have changed how people actually work.
So here’s the end-of-year update, written by someone who spends more time in CI logs than on stage at a conference.
I build and ship software for real teams. That means living in existing codebases, keeping diffs reviewable, making CI happy, and trying not to break production while moving at a decent pace. If an assistant helps with that, it stays. If it turns every ticket into a science project, it’s out.
And for what it’s worth, my day-to-day setup is GitHub Copilot plus Claude Sonnet 4.5.
The thing that mattered most this year: the assistant stopped being “a helper” and started being “a workflow”
A year ago, most assistants were basically two features:
- autocomplete that guesses what you meant
- chat that writes code when you ask politely
Now, the serious tools are all pushing the same direction:
- multi-file edits
- “agent” modes that try to complete tasks
- PR and code review integration
- model switching inside the same tool
That’s a big deal, because it changes the job the assistant is doing.
Autocomplete is about speed. Agent workflows are about delegation. Delegation is only valuable when it stays controlled. Otherwise you get a confident flurry of changes, a broken test suite, and a steaming pile of regret.
The best assistants in late 2025 are not the ones that can produce the cleverest snippet. They are the ones that can operate inside constraints.
What I use and why
Copilot: best for staying in flow
Copilot is still the smoothest “always on” experience inside the editor for most people. It does the small things well:
- completing the next few lines
- suggesting the shape of a function you were already writing
- quick transformations that you would do anyway, just slower
It is also increasingly useful for multi-file work, but I treat that like a power tool. Very handy. Not something you leave running while you make tea.
Claude Sonnet 4.5: best for quality and coherence
When the task is bigger than “finish this function”, I want consistency. I want the assistant to keep the thread across a refactor, tests, and edge cases without wandering off into a different architectural universe.
That’s where Sonnet 4.5 has been strong for me. It tends to:
- keep changes coherent across several files
- write tests that look like the tests already in the repo
- explain trade-offs without turning it into a lecture
If I had to summarise the combo:
- Copilot helps me move quickly minute-to-minute.
- Sonnet helps me avoid shipping something daft.
The practical reality nobody advertises: quotas, tokens, and getting cut off mid-task
This is the bit that matters when you are using these tools professionally.
Lots of assistants now have limits: quotas, “premium” requests, token budgets, throttles. Whatever they call it, the outcome is the same. You can be halfway through a task and suddenly the tool stops being helpful.
I spent some time using Junie from JetBrains and ran into exactly this failure mode: it starts really strong, but the token allowance can run out quickly and leave you mid-task.
That is not just annoying. It actively harms the workflow because you end up with:
- partial edits
- no clean conclusion
- and no continuity to finish the job properly
It’s like hiring someone to tidy the house and they leave halfway through, taking the hoover with them.
How I work around this
I assume the assistant might disappear at any moment, so I structure tasks to survive that:
- Work in checkpoints that produce a complete artefact. A plan, a small patch, updated tests, a migration script. Something that stands on its own.
- Keep diffs small enough that I can take over. If the tool hits a limit, I can still finish in a human amount of time.
- Force end-of-step summaries. “List changed files. Explain decisions. What’s left to do. What should I verify.”
- Avoid huge “do everything” prompts. They burn budget and encourage broad, messy changes.
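To make that concrete, here’s a rough sketch of the end-of-step check I have in mind. The script, its name, and the thresholds are my own illustration rather than anything Copilot, Claude, or Junie ships; it just asks git what changed, refuses to call the checkpoint done if the diff has sprawled, and then runs the tests (assuming a pytest suite).

```python
# checkpoint.py: illustrative end-of-step check. The name and thresholds are my own
# example, not part of any assistant's tooling.
import subprocess
import sys

MAX_CHANGED_FILES = 10   # beyond this, the diff is no longer "finish by hand" sized
MAX_CHANGED_LINES = 400

def git(*args: str) -> str:
    return subprocess.run(["git", *args], capture_output=True, text=True, check=True).stdout

def main() -> int:
    # Summarise what was actually touched (tracked changes against HEAD).
    changed_files = [line for line in git("diff", "--name-only", "HEAD").splitlines() if line]
    print("Changed files:")
    for path in changed_files:
        print(f"  {path}")

    # Count added + deleted lines from --numstat (binary files show "-" and are skipped).
    changed_lines = 0
    for line in git("diff", "--numstat", "HEAD").splitlines():
        added, deleted, _path = line.split("\t", 2)
        if added.isdigit():
            changed_lines += int(added)
        if deleted.isdigit():
            changed_lines += int(deleted)
    print(f"Lines touched: {changed_lines}")

    # Refuse to call the checkpoint complete if it has sprawled.
    if len(changed_files) > MAX_CHANGED_FILES or changed_lines > MAX_CHANGED_LINES:
        print("Checkpoint too big: split it before carrying on.")
        return 1

    # A checkpoint only counts if the tests pass right now (assumes a pytest suite).
    tests = subprocess.run(["python", "-m", "pytest", "-q"])
    return tests.returncode

if __name__ == "__main__":
    sys.exit(main())
```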
If JetBrains want Junie to become a daily driver for serious work, continuity needs to be predictable. Bigger budgets help, but so does tooling that nudges you into smaller, complete steps rather than long, fragile sequences.
What got better since the earlier comparison
1) Assistants are improving at “work”, not just “answers”
The best tools are now trying to behave like a loop:
- look at the repo
- propose approach
- make changes
- run tests (or tell you exactly what to run)
- iterate until it’s green
This is where you actually save time. Not because the assistant is brilliant, but because it is willing to do the repetitive bits that drain your will to live.
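To be clear about the shape of that loop (and not any particular product’s internals), here’s a minimal sketch. propose_patch and apply_patch are hypothetical stand-ins for whatever generates and applies a change; the rest is just “run the tests, feed the failure back, stop when green or when the budget runs out”.

```python
# Illustrative only: the shape of the "change, test, iterate" loop that agent modes
# automate. propose_patch and apply_patch are hypothetical stand-ins, not a real API.
import subprocess

MAX_ITERATIONS = 5  # a budget, so a stuck loop fails loudly instead of burning tokens

def run_tests() -> tuple[bool, str]:
    """Run the suite and return (passed, output). Assumes a pytest project."""
    result = subprocess.run(["python", "-m", "pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def agent_loop(task: str, propose_patch, apply_patch) -> bool:
    feedback = ""
    for attempt in range(1, MAX_ITERATIONS + 1):
        patch = propose_patch(task, feedback)   # hypothetical: ask the model for a small diff
        apply_patch(patch)                      # hypothetical: write the diff to the working tree
        passed, output = run_tests()
        print(f"Attempt {attempt}: {'green' if passed else 'red'}")
        if passed:
            return True
        feedback = output                       # feed the failure back instead of starting over
    return False                                # out of budget: hand back to a human
```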
2) Integration is now the deciding factor
Model capability matters, but the integration matters more:
- Does it understand your repo structure?
- Can it make changes safely without turning it into a refactor festival?
- Does it fit your review workflow?
- Can it work with your test setup and tooling?
A great model inside a clumsy tool is still a clumsy tool.
3) “Demo good” is still not “production good”
Lots of models look fantastic building a toy app from scratch.
Real work looks like:
- “change behaviour without breaking backwards compatibility”
- “fix the bug and write the regression test”
- “update the dependency without creating a security hole”
- “make this Terraform change without nuking prod”
This is where assistants either feel like a genuine boost, or like an intern who has discovered ripgrep and confidence.
The criteria I actually care about
If you are choosing assistants going into 2026, I would focus on these:
Reviewability
If it cannot produce a clean, reviewable diff, it is not helping. Big sweeping changes are rarely a sign of intelligence. They are usually a sign the assistant is bored.
Test discipline
I want an assistant that naturally works in a loop: change, test, fix, test again. If it avoids tests, it is just generating code and hoping.
Consistency with existing patterns
The fastest way to make a team hate AI is letting it introduce a new style every time someone uses it. Good assistants blend in.
Context handling
Not marketing context windows. Actual context. Can it keep hold of what the system is doing and stay consistent across multiple steps?
Control and safety
Agent features need guardrails:
- what can it touch?
- what can it run?
- what can it read?
- how do you scope it to a small, safe area?
Treat agent mode like elevated permissions. Useful when you need it. Otherwise, keep it contained.
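As a toy example of what “scope it to a small, safe area” can look like in practice, here’s a filter over an agent’s proposed file paths. The allowlist, the blocked paths, and the whole approach are my own illustration, not a setting any particular assistant exposes.

```python
# Illustrative guardrail: only let an agent touch paths inside an explicit allowlist.
# The paths and the reject behaviour are invented for this example.
from pathlib import Path

ALLOWED_ROOTS = [Path("src/billing"), Path("tests/billing")]   # the area this task is scoped to
BLOCKED_PATHS = [Path(".env"), Path("terraform/prod")]          # never, regardless of scope

def is_in_scope(candidate: str) -> bool:
    path = Path(candidate)
    if any(path == blocked or blocked in path.parents for blocked in BLOCKED_PATHS):
        return False
    return any(path == root or root in path.parents for root in ALLOWED_ROOTS)

# Before applying an agent's proposed edits, drop anything outside the scope.
proposed = ["src/billing/invoice.py", "tests/billing/test_invoice.py", "terraform/prod/main.tf"]
safe = [p for p in proposed if is_in_scope(p)]
rejected = [p for p in proposed if p not in safe]
print("applying:", safe)
print("rejected:", rejected)
```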
The workflow I use, and why it works
This is the part you can steal without buying my course (because I do not have a course).
For a normal ticket
Ask for a plan before code
- “What files will you touch, and what tests should prove this works?”
Do the smallest change that passes
- Fix behaviour first, then tidy.
Make tests part of the definition of done
- “Add or update tests that fail before and pass after.” (There’s a small example of what I mean just after this list.)
End with a review summary
- Changed files, key decisions, risks, what I should verify.
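On the “tests that fail before and pass after” point, this is the kind of thing I mean. The function, the bug, and the test names are invented for illustration: the test is written so it fails against the old behaviour and passes once the fix is in.

```python
# Hypothetical regression test: the function and the bug are invented for illustration.
import pytest

def parse_discount_code(code: str) -> str:
    """Normalise a discount code. The fix: reject blank input instead of silently returning ''."""
    cleaned = code.strip().upper()
    if not cleaned:
        raise ValueError("discount code must not be empty")
    return cleaned

def test_rejects_blank_discount_code():
    # Before the fix this returned "" silently; this pins the new behaviour.
    with pytest.raises(ValueError):
        parse_discount_code("   ")

def test_normalises_valid_code():
    assert parse_discount_code(" summer25 ") == "SUMMER25"
```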
For refactors
- “Keep behaviour identical.”
- “One module at a time.”
- “No renames without my go-ahead.”
- “Update typing and tests as you go.”
This keeps the assistant from doing the classic move: rewriting half the codebase because it spotted a slightly unusual function name.
For debugging
- “Explain the failure in plain English.”
- “Give me two plausible causes.”
- “Tell me what evidence would confirm each.”
- “Propose the smallest fix.”
This stops you from getting a grand redesign when you needed a missing null check.
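And for the avoidance of doubt, the smallest fix really can be this small. The function and the guest-checkout scenario below are made up, but it’s the shape of fix I want instead of a redesign:

```python
# Invented example of a "smallest fix": guard the missing value instead of
# redesigning the lookup. Names are illustrative, not from a real codebase.
from typing import Optional

def get_customer_region(customer: Optional[dict]) -> str:
    # The bug: customer can legitimately be None for guest checkouts,
    # and the old code assumed it never was.
    if customer is None:
        return "unknown"
    return customer.get("region", "unknown")
```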
What I’m watching for 2026
1) The editor becomes a task manager
Assistants are drifting towards “work orchestration”: multiple threads, task queues, suggested diffs, review workflows. That could be genuinely helpful, if it stays grounded.
2) Teams start formalising how they use AI
Not because they hate AI, but because they hate chaos.
I expect to see lightweight team policies like:
- which tasks are safe for agent mode
- what must be reviewed manually
- rules around secrets and credentials
- when premium models are allowed
That is just normal engineering governance, applied to a new tool.
3) Cost becomes an engineering constraint
As token budgets and premium quotas become real, people will stop “trying three models until one gives a good answer”.
They will pick a default for the everyday work, and reserve the expensive options for the tasks where it genuinely pays off.
4) Security will catch up with the hype
The more autonomy you give an assistant, the more you need to think about:
- prompt injection via code comments or docs
- accidental leakage of secrets
- assistants running commands in environments they should not
- supply chain risks through automated dependency updates
This is not a reason to avoid the tools. It is a reason to use them like a grown-up.
The honest conclusion
At the end of 2025, I’m not looking for an assistant that impresses me. I’m looking for one that helps me ship:
- keeps diffs reviewable
- respects existing patterns
- supports a test loop
- doesn’t wander off mid-task because a token meter hit zero
Copilot plus Claude Sonnet 4.5 has been the best balance I’ve found for that.
And yes, Junie is interesting. But if it runs out of tokens halfway through the work, it becomes a liability. I want tools that finish jobs, not tools that leave me with a half-painted kitchen.
Gary Worthington is a software engineer, delivery consultant, and fractional CTO who helps teams move fast, learn faster, and scale when it matters. He writes about modern engineering, product thinking, and helping teams ship things that matter.
Through his consultancy, More Than Monkeys, Gary helps startups and scaleups improve how they build software — from tech strategy and agile delivery to product validation and team development.
Visit morethanmonkeys.co.uk to learn how we can help you build better, faster.
Follow Gary on LinkedIn for practical insights into engineering leadership, agile delivery, and team performance.