In this post I'm sharing my notes on the webinar "Agentic Engineering in Action with Mitchell Hashimoto". It's part of a series of webinars on agentic engineering hosted by zed.dev. Mitchell Hashimoto is the co-founder of HashiCorp, and currently works on Ghostty.
It was an interesting conversation with lots of practical advice on how to use coding agents, what they are good at and what the limitations are. Overall, the gist I was getting from the webinar maps very well to the state of coding agents as summarized by Simon Willison.
According to Mitchell, coding agents such as Claude Code are already good enough to implement smaller, contained features and fix bugs at a junior/mid level. The prompt he uses contains a concise description of the issue and how to address it, similar to a briefing one would give to a junior developer (with lots of "guard rails").
While the agent is sometimes able to solve the issue on the first shot, in Mitchell's experience the changes always need a thorough review, follow-up prompts, and sometimes manual tweaks to bring them up to "senior"-level quality and make them maintainable.
So far, I've been mostly using GitHub Copilot Chat and Edits, as well as aider. I haven't used any of the more agentic coding assistants such as Claude Code, but after this webinar I'm very intrigued to give it a try.
Below are my raw notes in bullet points.
- task: fix a bug in Ghostty on macOS: fix undo/redo for closing windows with multiple tabs
- explain the task as if to a junior engineer, quite specific guardrails
- first attempt worked, but not good quality
- most of the work is getting code to more senior quality, like maintainability
- uses Claude Code with Opus, Sonnet
- using different models in "competition", e.g. Gemini 2.5
- current favourite is Claude
- commit message is a description of the problem, and then a log of prompts and assessment of results, including notes on tweaks etc.
- prompts in the commit message are quoted with ">"
- good practice to commit often, like save points (jujutsu snapshots), so if the agent goes in the wrong direction, you can manually reset to the previous commit and start over
- Neovim
- Learn your tools! -> same with AI/agents
- get a feeling for which prompts/problems work and which don't
- new tools ALWAYS take time to learn
- llms not very good at architecture and folder structure (not meaning your standard React app, but for novel/complex systems)
- llms not very good on complicated data structures and high-performance/low-level stuff
- agentic systems mostly only good for junior/mid-level problems, not senior level
- screenshot to SwiftUI works really well
- doing QA testing manually while the agent goes along
- once the commit is ready, do one more prompt: "Any other suggestions or improvements you would do?"
- llms are good at finding inconsistencies, such as outdated comments, and correcting them
- "I never regretted having too many comments in the code"
- have the agent write the commit message, and then tweak by hand
- TDD: prefers to write the test by hand, then have the AI make it pass
- Q&A
- message to juniors
- challenge yourself to understand the code and learn from it
- managing people means context-switching all the time
- in contrast, AI agents can wait, return to the session once you're ready
- using Claude Code in Ghostty sounds interesting, with OS notifications when the agent is finished with a task
- structure code to make it easier for the llm?
- more aware of keeping context close
- good practice also for humans, improves performance of llms
- in complicated refactors, keep the old code, so the llm can reference it, then towards the end, remove the old code
- at the end, ask: "Did I forget anything?"
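Two of the practices from the notes above (frequent save-point commits, and commit messages that log the prompts given to the agent) can be sketched with plain git. Mitchell himself uses jujutsu snapshots rather than git commits, and all file names, prompts, and messages below are purely illustrative:

```shell
# Hedged sketch: save-point commits while an agent works, plus a
# commit message in the described style (problem description first,
# then the prompt log quoted with ">", then an assessment of results).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev

# Save point before letting the agent loose.
git commit -q --allow-empty -m "save point: before agent run"

# Suppose the agent's first attempt goes in the wrong direction...
echo "misguided change" > feature.txt
git add feature.txt
git commit -q -m "agent attempt 1 (wrong direction)"

# ...then reset to the previous save point and start over.
git reset --hard -q HEAD~1

# When a change is good, commit it with the problem description
# followed by the quoted prompt log and notes on the outcome.
git commit -q --allow-empty -F - <<'EOF'
macos: fix undo/redo for closing windows with multiple tabs

> Prompt: closing a window with multiple tabs cannot currently be
> undone; wire the close action into the undo manager.

Result: first attempt worked, but needed manual tweaks for
maintainability.
EOF
git log --oneline
```

The save points make agent missteps cheap: instead of untangling a half-wrong diff, you throw the attempt away and re-prompt from a known-good state.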