After quick intros, the conversation covered:
Production agentic system for order processing — One participant walked through a system they’re building for a large parts distributor. Inbound emails (structured, PDF, or freeform text) are parsed by an agent that extracts part numbers, searches the catalog and order history, fuzzy-matches against the results, and surfaces a recommendation for human review. The system is in a controlled rollout across a handful of locations, with current focus on evals in LangSmith, including LLM-as-judge checks that flag and queue suspicious outputs. Built on LangChain. The superpowers set of skills was also mentioned.
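The fuzzy-matching step can be sketched with nothing but the standard library. This is a hypothetical illustration, not the participant's implementation: the catalog entries and threshold are invented, and a real system would match against search results from the catalog and order history rather than a hardcoded list.

```python
from difflib import SequenceMatcher

# Hypothetical catalog entries; the real system searches a parts
# catalog and order history first, then fuzzy-matches the results.
CATALOG = ["BRG-6204-2RS", "BRG-6205-2RS", "SEAL-TC-25x40x7", "BELT-A42"]

def fuzzy_match(extracted: str, catalog: list[str], threshold: float = 0.8):
    """Score an extracted part number against each catalog entry and
    return candidates above the threshold, best first, so a human
    reviewer sees a ranked recommendation rather than a hard pick."""
    scored = [
        (entry, SequenceMatcher(None, extracted.upper(), entry.upper()).ratio())
        for entry in catalog
    ]
    candidates = [(e, round(s, 2)) for e, s in scored if s >= threshold]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# An email might garble separators and case, e.g. "brg 6204 2rs":
recommendation = fuzzy_match("brg 6204 2rs", CATALOG)
```

The point of the threshold is that near-misses (here, the adjacent `BRG-6205` part) drop out instead of being silently auto-selected, which is what makes the human-review step tractable.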
Claude Code in production — Another participant described how their first month at a new healthcare company has been largely powered by Claude Code. Use cases included automated PR review, ticket-quality scoring for product managers, and a custom command that queries the Sentry MCP server alongside the codebase to proactively surface performance issues. No code written by hand in 30 days. Project management via Plane.
MCP vs. CLI + skills — A side thread on when to use an MCP server versus an LLM-friendly CLI with a skill harness. Context bloat was the main argument for CLIs, though Claude Code is addressing that by lazy-loading MCP tools. Google’s CLI (which ships with 100+ built-in skills) came up as a notable recent release. No strong consensus.
Multi-agent coordination — The longest thread. One participant has been experimenting with hierarchical multi-agent debate: an orchestrator spins up subagents, each of which responds to a shared document over successive rounds. This opened into a broader discussion of communication architectures for multi-agent systems: bulletin boards, direct messaging, peer-to-peer. The general view was that individual agent memory is mostly solved, while multi-agent communication remains an open problem. Google’s Agent2Agent (A2A) protocol came up as worth investigating.
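The shared-document (bulletin-board) pattern described above can be sketched in a few lines. This is a minimal toy, not the participant's setup: the agents are stubbed lambdas where a real system would call an LLM, and the names `Board` and `run_debate` are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Board:
    """Shared document every agent can read; posts are append-only."""
    entries: list[tuple[str, str]] = field(default_factory=list)  # (agent, text)

    def post(self, agent: str, text: str) -> None:
        self.entries.append((agent, text))

    def transcript(self) -> str:
        return "\n".join(f"{name}: {text}" for name, text in self.entries)

def run_debate(agents: dict[str, Callable[[str], str]], rounds: int) -> Board:
    """Orchestrator: each round, every subagent reads the full board and
    posts a response. Agents never message each other directly; the
    board is the only communication channel."""
    board = Board()
    for _ in range(rounds):
        for name, respond in agents.items():
            board.post(name, respond(board.transcript()))
    return board

# Stub agents that just report how much shared context they received.
agents = {
    "critic": lambda doc: f"critic saw {len(doc)} chars",
    "planner": lambda doc: f"planner saw {len(doc)} chars",
}
board = run_debate(agents, rounds=2)
```

The contrast with direct messaging or peer-to-peer is visible in the orchestrator loop: agents share one growing transcript rather than maintaining pairwise channels, which trades message routing complexity for context growth.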