Agent Tool Watch
Weekly survey of new AI-agent tools, LLM tooling, and agentic frameworks. Scan window: ~7 days. Output buckets: try this week / watching / not worth it now.
Reports
| Date | Report | Try This Week Highlights |
|---|---|---|
| 2026-05-31 | (consolidated below) | codegraph, agentmemory, openhuman, dograh |
| 2026-06-06 | (consolidated below) | claude-code-grep, agent-sql-reasoner, open-webui-tools |
Archive note (2026-06-27): Dated reports 2026-05-31.md and 2026-06-06.md were consolidated into this index. Original files remain in raw/agent-tool-watch/ for audit trail.
Running list of “try this week” items (May 24–31)
| Tool | What it does | Replaces / augments in Hermes | Integration effort | Risk |
|---|---|---|---|---|
| codegraph | Pre-indexed code knowledge graph. Cuts token cost by ~35% and tool calls by 70%. | Augments search_files / file reads in code-heavy cron tasks. Cuts context bloat on large wikis or codebases. | Low: runs locally, query via CLI/API | Low; new project but clean deps |
| agentmemory | Persistent agent memory (BM25 + vector + graph). Solves cross-session amnesia. | Direct upgrade path for session_search / memory tool. Adds recall beyond SQLite FTS5. | Med: needs a vector backend (e.g. Qdrant, already in infra) | Low-Med; pairs well with codegraph above |
| openhuman | Personal AI agent with 118+ integrations (Gmail, Notion, GitHub). Local-first memory tree. | Could replace ad-hoc email/calendar/issue cron jobs with a single unified agent loop. | High: full agent runtime, not a library | Med; early-stage, active development |
| dograh | Self-hostable voice agent platform (Vapi/Retell alternative). Docker-first. | If we ever build voice-forward interfaces for Hermes, this is the infra. Not urgent. | Low to test in a container | Low; niche relevance |
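
To make the "BM25 + vector + graph" idea in the agentmemory row concrete, here is a minimal sketch of one common way to merge several ranked result lists: reciprocal rank fusion. It is a generic illustration, not agentmemory's actual API or algorithm, which this index does not describe. The function name, the note ids, and the three hit lists are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists for one query, best match first.
bm25_hits = ["note-3", "note-1", "note-7"]    # keyword match
vector_hits = ["note-1", "note-9", "note-3"]  # embedding similarity
graph_hits = ["note-1", "note-7"]             # linked-entity neighbours

fused = reciprocal_rank_fusion([bm25_hits, vector_hits, graph_hits])
print(fused)
```

An id that ranks well in several lists rises to the top even if no single retriever put it first everywhere. That is the kind of recall a keyword-only index such as SQLite FTS5 cannot provide on its own.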
Watching
| Tool | Why it’s interesting | Hermes relevance |
|---|---|---|
| claude-context | Semantic codebase search via Milvus vectors. 10k+ stars. | Relevant for large-repo tasks; overlaps with codegraph above but heavier infra. |
| Understand-Anything | Codebase → interactive knowledge graph via tree-sitter + LLMs. | Could power wiki auto-summarisation of code projects. Too early. |
| ml-intern | Autonomous ML engineer from HuggingFace (research → ship). 8k stars. | Useful reference for future ML-agent cron jobs. Heavy-weight to run locally. |
| AWS MCP Server GA | AWS MCP now GA — exposes S3, EC2, Lambda as tools via MCP. | Only if we add AWS infra tasks to Hermes. Otherwise irrelevant. |
| supertonic | Local TTS, 99M params, 31 languages, no GPU needed. | Possible voice-pipeline replacement for F5-TTS; lower quality trade-off. Worth benchmarking. |
Not worth it now (May 24–31)
- TradingAgents (62k stars) — multi-agent trading firm sim. Interesting architecture but zero relevance to infra/agent ops work.
- cli-anything — CLI script generator via LLM. We already have terminal + execute_code; overlap without clear win.
- oh-my-pi / pi-mono forks — unified agent runtimes. Too opinionated, no incremental value over current hermes-agent setup.
- LangChain Interrupt 2026 announcements (Deployment, Studio redesign) — enterprise SaaS plays. Hermes runs on-prem; not applicable.
Key trends (May 24–31)
- Memory as a differentiator: agentmemory, openhuman memory trees — persistent cross-session recall is becoming the #1 selling point over tool count.
- Code knowledge graphs replacing grep/glob: codegraph + claude-context both tackle “agents waste tokens scanning files”. Pre-indexing is the new pattern.
- Voice agent platforms maturing: dograh, Vapi alternatives going Docker-first. Voice pipeline ops may get simpler.
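
The pre-indexing pattern in the second trend can be sketched with a toy symbol index: parse the sources once, then answer "where is X defined?" from a dictionary instead of rescanning files. This is an illustration only, assuming Python sources and using the standard-library ast module. It does not reflect how codegraph or claude-context are implemented, and the file paths and function names are hypothetical.

```python
import ast
from collections import defaultdict


def build_symbol_index(sources):
    """Map each function/class name to the (path, line) pairs defining it.

    `sources` is a dict of {path: python_source_text}.
    """
    index = defaultdict(list)
    for path, text in sources.items():
        tree = ast.parse(text)
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                index[node.name].append((path, node.lineno))
    return dict(index)


# Hypothetical in-memory "repo" so the sketch needs no filesystem.
sources = {
    "jobs/digest.py": "def build_digest(items):\n    return sorted(items)\n",
    "jobs/mail.py": "class Mailer:\n    def send(self, msg):\n        return msg\n",
}

index = build_symbol_index(sources)  # built once, up front
print(index["send"])                 # answered without rescanning any file
```

The index is built once and each lookup is a dictionary access, so an agent does not have to spend tokens and tool calls on grep-style file scans for every question.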
Archive (consolidated dated reports)
2026-06-06 report summary
Weekly survey of new AI-agent tools and agentic frameworks in the wild over the last ~7 days (May 31–Jun 6).
Try this week: claude-code-grep (token-efficient repo search), agent-sql-reasoner (structured data via agent), open-webui-tools plugin system.
Watching: MCP Marketplace expansion, Anthropic’s tool-use v2 improvements.
Not worth it now: Most “AI agents” that are just wrapper apps around existing ChatGPT workflows.