Agent Harness Field Guide: 50 Loops, Tool Systems, and Lessons for LingTai

techdevlog

Living field guide

This post condenses a source-grounded study of 50 current agent harnesses. It is intentionally a living blog entry: the ecosystem moves fast, and this page should be updated as harnesses change, disappear, or teach LingTai new lessons.

Most agent discussions talk about “the model.” This guide is about the thing around the model: the harness.

A harness decides how context is assembled, how tools are declared, how tool calls are approved, how side effects are committed, how traces are recorded, how work resumes after interruption, and how a human can tell whether the agent is thinking, stuck, or acting. The model matters, but the harness decides whether the model can do reliable work.

For LingTai, the important comparison is not “which project has the cleverest ReAct loop.” LingTai is already a different shape: an always-on agent network with durable memory, mail/chat wakeup, avatars, daemons, MCP/addon ownership, and lifecycle control. The right question is: what should such a network borrow from the best single-agent harnesses, framework harnesses, and sandbox substrates?

Bottom line

The ecosystem clusters around five dominant ideas:

  1. Coding workbenches make tool use visible: shell, file edits, patches, approvals, and resumable sessions.
  2. IDE agents win by living next to code and keeping context/approval friction low.
  3. Graph and workflow frameworks make long plans deterministic through typed state, checkpoints, and edges.
  4. SDK/framework harnesses are converging on strict tools, typed outputs, tracing, evals, and handoffs.
  5. Sandbox substrates remind us that execution policy is not an implementation detail; it is part of the harness.

LingTai’s differentiation is still strong: it is not just a loop. It is a network runtime. But the study suggests several concrete improvements.

P0 — Tool-result commit ledger

Make each tool call explicitly move through states: proposed → approved → executing → side-effect committed → model-visible → durable-log-visible. This would make LingTai stronger than typical SDKs and reduce ambiguity around orphaned, retried, or healed tool calls.

P0 — Daemon/process reattachment

Adopt a run-artifact contract for every daemon/backend: parent PID, child PID, workspace, transcript, report path, last heartbeat, and recovery action. On restart, LingTai should be able to reattach, finalize, or explain instead of leaving a task in an unknown state.

P1 — Span-style observability

Borrow the tracing shape now common in modern agent SDKs: turn → model call → tool calls → MCP calls → daemon tasks. Render it in the portal/TUI so humans can see why an agent is slow or stuck.

P1 — Graph/checkpoint option

Keep LingTai’s always-on loop, but offer a graph/checkpoint primitive for workflows that need atomic multi-step state. LangGraph-style checkpointing is not a replacement for LingTai; it is a useful mode inside it.

P1 — Stricter tool schema ergonomics

Expose typed tool metadata: argument schema, side-effect class, timeout, approval policy, retry policy, and error formatter. The more tools LingTai owns, the more tool contracts should be visible as data.

P1 — Sandbox policy objects

Make sandbox/approval policy first-class per tool and backend. Claude Code, Codex, SWE-agent, and E2B/Daytona all show that filesystem, shell, network, and approval policy shape the agent’s behavior.

P1 — Cheaper handoff primitive

LingTai avatars are durable and powerful. Sometimes we also need a cheap in-process handoff/router primitive for specialist routing when persistence is unnecessary.

Taxonomy: how to read the field

50-harness matrix

#HarnessShapeEvidenceLesson for LingTai
1Claude CodeCoding CLI / closed agentClosed/public evidenceTreat the agent loop as a product surface: approvals, compaction, resume, and tool semantics are visible, not hidden.
2OpenAI Codex CLICoding CLIPublic/source-groundedSandbox and approval modes should be first-class runtime policy, not prompt folklore.
3OpenCodeCoding CLIPublic/source-groundedProvider-agnostic terminal agents need strict session state and model/tool abstraction boundaries.
4OpenHandsAutonomous SWE platformPublic/source-groundedA durable event stream plus workspace sandbox makes long-running SWE work inspectable and recoverable.
5AiderCoding CLIPublic/source-groundedGit-native editing keeps coding agents honest: every change is a diff with context.
6ContinueIDE/code assistant platformPublic/source-groundedIDE-native agents win when context assembly is explicit and user-editable.
7ClineIDE coding agentPublic/source-groundedA simple plan-act-observe loop becomes powerful when every tool call is user-visible.
8Roo CodeIDE coding agentPublic/source-groundedModes are a cheap way to express specialist behavior without spawning durable agents.
9GooseLocal agent runtimePublic/source-groundedExtension-based local runtimes make tools composable while keeping execution near the user.
10OpenClawAutomation/agent-loop frameworkPublic/source-groundedExplicit loop documentation is itself a product feature; users need to know what repeats.
11OpenHarnessLong-running autonomous harnessPublic/source-groundedLong-running autonomy needs a run artifact, not only a transcript.
12Hermes AgentSelf-improving agentPublic/source-groundedSelf-improvement requires memory and skill boundaries that prevent accidental drift.
13PiMinimal coding harnessPublic/source-groundedMinimal harnesses reveal the irreducible loop: assemble context, call model, apply tools, repeat.
14Oh My PiTerminal coding harnessPublic/source-groundedPersistent execution kernels are useful, but must be fenced by clear turn/tool budgets.
15harness-agentSmall/uncertain harness packagePublic/source uncertainSmall packages are useful negative space: naming a harness is not the same as owning a loop.
16LangGraphGraph agent frameworkPublic/source-groundedCheckpointed graphs are the strongest pattern for deterministic multi-step agent workflows.
17LangChain AgentsAgent frameworkPublic/source-groundedTool schemas, callbacks, and intermediate steps should be inspectable from the framework boundary.
18CrewAIMulti-agent frameworkPublic/source-groundedRole-based teams make delegation legible, but they need durable accountability to avoid theater.
19AutoGenMulti-agent frameworkPublic/source-groundedConversation-as-orchestration is flexible; termination and handoff rules are the hard part.
20Semantic Kernel AgentsEnterprise agent frameworkPublic/source-groundedEnterprise harnesses need typed functions, planners, and policy surfaces that non-research users can trust.
21LlamaIndex AgentsRAG/tool agent frameworkPublic/source-groundedRAG-centric agents prove that retrieval and tool use should share one traceable context contract.
22PydanticAITyped agent frameworkPublic/source-groundedTyped outputs and dependencies reduce ambiguity at the model/framework boundary.
23AgnoAgent/team frameworkPublic/source-groundedTeams, memory, and tools should be configured as data, then traced as execution.
24smolagentsLightweight code/tool agentsPublic/source-groundedCode-as-action is powerful when the sandbox and imports are constrained by design.
25DSPy agentsPrompt/programming frameworkPublic/source-groundedAgent behavior can be optimized as a program, not only hand-written as a prompt.
26AutoGPT ForgeAutonomous agent platformPublic/source-groundedAutonomy platforms need capability registries and budgets before they need more prompts.
27MetaGPTSoftware-company multi-agentPublic/source-groundedStructured artifacts can make multi-agent collaboration less chatty and more reviewable.
28CAMEL-AICommunicative multi-agent frameworkPublic/source-groundedSociety-style simulation is useful for research, but production needs ownership and state boundaries.
29Letta / MemGPTStateful memory agent serverPublic/source-groundedMemory must be an explicit runtime object with edit, recall, and persistence semantics.
30MastraTypeScript agent frameworkPublic/source-groundedModern app-agent frameworks treat agents, workflows, evals, and observability as one developer stack.
31VoltAgentTypeScript agent frameworkPublic/source-groundedDeveloper-friendly dashboards matter because agent failure is usually a trace-reading problem.
32MotiaEvent-driven workflow frameworkPublic/source-groundedEvent-driven workflows are a good substrate for agent steps that must outlive one request.
33Haystack AgentsPipeline/RAG agent frameworkPublic/source-groundedPipelines and agents should converge when retrieval, routing, and tool use interact.
34SWE-agentSWE-bench coding harnessPublic/source-groundedBench harnesses show the value of reproducible run directories and environment specs.
35mini-SWE-agentLightweight SWE harnessPublic/source-groundedA small, explicit loop is easier to benchmark than a giant framework.
36DevinCommercial SWE agentClosed/public evidenceClosed agents still teach product lessons: persistent workspace, async work, and human handoff.
37Factory DroidCommercial SWE agentClosed/public evidenceCommercial SWE agents emphasize end-to-end job ownership rather than framework APIs.
38Qodo PR-AgentCode review/change agentPublic/source-groundedNarrow review agents win by constraining context, outputs, and repository side effects.
39Sweep AIIssue-to-PR agentPublic/source-groundedIssue-to-PR agents need clear escalation when repository reality diverges from the issue text.
40MentatCommand-line coding agentPublic/source-groundedConversation plus patching remains a durable baseline for local coding agents.
41Cursor AgentIDE-native commercial agentClosed/public evidenceIDE-native commercial agents win through frictionless context and editor-integrated approval.
42Windsurf / CascadeIDE-native commercial agentClosed/public evidenceCascade-style products show the value of continuous project context, not one-off prompts.
43GitHub Copilot AgentIDE/GitHub coding agentClosed/public evidenceGitHub-native agents benefit from living where issues, branches, and PRs already live.
44OpenAI Agents SDKSDK / AgentKitPublic/source-groundedTracing, handoffs, and typed tools are becoming the expected SDK contract.
45BeeAI FrameworkAgent frameworkPublic/source-groundedFrameworks increasingly bundle memory, tools, and observability instead of treating them as add-ons.
46ControlFlowWorkflow/agent frameworkPublic/source-groundedTask graphs with typed results make agent work composable in ordinary software systems.
47PocketFlowMinimal workflow frameworkPublic/source-groundedMinimal node/action abstractions are useful when the goal is teachability and portability.
48E2B / DaytonaSandbox substratePublic/source-groundedThe sandbox is part of the harness: file system, network, process, and snapshot policy shape behavior.
49SuperAGIAutonomous agent platformPublic/source-groundedOlder autonomous platforms remind us that more tools without tighter state semantics becomes chaos.
50BabyAGI / functionzTask-loop lineagePublic/source-groundedThe original task loop is still visible under modern agents: create tasks, execute, reprioritize, remember.

What this means for LingTai

LingTai should not copy a single harness wholesale. The interesting direction is synthesis:

The field is moving toward stricter contracts around tools and traces. LingTai already has the rarer piece: agents that can live, sleep, wake, remember, spawn durable peers, and coordinate through channels. The next step is to make every part of that life cycle as inspectable and replayable as the best coding harnesses make a single patch.

Method note

The underlying study inspected 50 systems with source-first evidence where available. Open-source projects were checked against public repositories. Closed commercial systems such as Claude Code, Devin, Cursor, Windsurf/Cascade, and GitHub Copilot Agent are marked lower-confidence because their internal loops are not fully public; they are included for product and interface lessons, not as source-level claims.