Walden Yan of Cognition, the company behind Devin, also published an article about multi-agents. He warned people not to build them, pointing out common approaches that fail in practice, and in particular the drawbacks of multi-agent architectures. This gives us a chance to compare how the two companies frame and address the problems of multi-agent systems.

歸藏 (guizang.ai)
@op7418
06-14
This Anthropic article is worth reading.
It details the engineering challenges of building multiple agents to explore complex topics more efficiently, including agent coordination, evaluation, and reliability.
I had NotebookLM summarize the main content x.com/AnthropicAI/st…


The article is here:

Walden
@walden_yan
06-13
I see a lot of people make the same mistakes building agents. So we shared a few of the principles we use
https://cognition.ai/blog/dont-build-multi-agents…
Context Sharing and Conflict Resolution
Cognition's Perspective:
Sub-agents may work at cross purposes even when they share the original task context, because they cannot see what the other sub-agents are doing and end up acting on conflicting, unstated assumptions. Cognition distills this into principle 1, "share context, and share full agent traces, not just individual messages", and principle 2, "actions carry implicit decisions, and conflicting decisions lead to bad results".
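Cognition's two principles can be sketched in code. This is an illustrative toy, not from either article: every class and name here is hypothetical. The idea is that each actor appends its actions to one shared trace, and every later actor reads the whole trace rather than a single message, so decisions implied by earlier actions stay visible.

```python
# Hypothetical sketch of Cognition's principles: share the full agent
# trace (principle 1), because actions carry implicit decisions
# (principle 2). All names are illustrative.
from dataclasses import dataclass, field


@dataclass
class Trace:
    events: list = field(default_factory=list)

    def record(self, actor: str, action: str) -> None:
        self.events.append((actor, action))

    def full_context(self) -> list:
        # Principle 1: hand downstream actors the complete trace,
        # not just the one message addressed to them.
        return list(self.events)


trace = Trace()
trace.record("planner", "decided: use a dark fantasy art style")
trace.record("subagent_a", "drew the background in dark fantasy style")

# A second sub-agent receives the whole trace, so it can see the style
# decision implied by subagent_a's action instead of guessing.
context_for_b = trace.full_context()
```

With only per-message context, `subagent_b` would never learn the style decision and could produce conflicting output, which is exactly the failure mode Cognition describes.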
Anthropic First Acknowledges These Limitations:
"Some domains require all agents to share the same context, or involve many dependencies between agents, and are currently unsuitable for multi-agent systems". They specifically note that "most coding tasks have fewer truly parallelizable subtasks than research tasks, and LLM agents are not yet good at real-time coordination and delegation to other agents". This echoes earlier observations about Claude Code's sub-agents not parallelizing code writing, and about small models misunderstanding instructions in the "edit apply model".
Then Let's Look at How Anthropic Overcomes These Limitations:
Coordination Pattern: Anthropic's system adopts a "coordinator-worker" model, with a main agent coordinating the entire process and delegating tasks to parallel specialized sub-agents. The main agent analyzes the query, develops a strategy, and generates sub-agents to simultaneously explore different aspects. Sub-agents return results to the main agent for synthesis.
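The coordinator-worker flow described above can be sketched with `asyncio`. This is a minimal stand-in, not Anthropic's actual code: `plan` and `sub_agent` are hypothetical placeholders for the lead agent's strategy step and the sub-agents' LLM-driven exploration.

```python
import asyncio


async def sub_agent(aspect: str) -> str:
    # Stand-in for an LLM-backed sub-agent exploring one aspect.
    await asyncio.sleep(0)
    return f"findings on {aspect}"


def plan(query: str) -> list[str]:
    # Trivial planner stand-in; the real lead agent analyzes the
    # query and develops a strategy before spawning workers.
    return [f"{query}: angle {i}" for i in range(3)]


async def coordinator(query: str) -> str:
    aspects = plan(query)
    # Sub-agents explore different aspects of the query concurrently.
    results = await asyncio.gather(*(sub_agent(a) for a in aspects))
    # Synthesis step: the lead agent combines the workers' results.
    return " | ".join(results)


report = asyncio.run(coordinator("quantum computing"))
```

The key structural point is that synthesis happens only in the coordinator; workers never talk to each other directly.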
Explicit Delegation: They emphasize "teaching the coordinator how to delegate", meaning the main agent needs to provide detailed task descriptions for sub-agents, including objectives, output formats, tools and source guidelines, and clear task boundaries to avoid work duplication, omission, or task misunderstandings. For example, without detailed description, sub-agents might repeat the same search or interpret tasks differently.
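A delegation like the one described can be made concrete as a small task-description structure. The field names below are hypothetical, chosen to mirror the four elements the article lists (objective, output format, tool/source guidance, task boundaries); Anthropic's actual prompt format is not public here.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    """Hypothetical delegation spec a lead agent hands a sub-agent."""
    objective: str
    output_format: str
    tool_guidance: str
    boundaries: str

    def to_prompt(self) -> str:
        # Render the spec as an explicit instruction block, so the
        # sub-agent cannot misread its scope or duplicate others' work.
        return (
            f"Objective: {self.objective}\n"
            f"Output format: {self.output_format}\n"
            f"Tools/sources: {self.tool_guidance}\n"
            f"Boundaries: {self.boundaries}"
        )


task = SubTask(
    objective="Survey recent funding rounds for AI chip startups",
    output_format="Bulleted list with one source per claim",
    tool_guidance="Prefer primary sources over news aggregators",
    boundaries="Cover hardware startups only; another agent handles software",
)
prompt = task.to_prompt()
```

Without the `boundaries` field, two sub-agents given the same vague objective could easily run the same searches, which is the duplication failure the article warns about.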
Context Management: For long-running tasks and context window overflow issues, Anthropic's solution is for the main agent to save plans to "memory" to persist context and prevent truncation when the context window becomes too large. They also implement agents summarizing key information to external memory after completing work stages and generating new sub-agents when the context approaches limits, maintaining continuity through careful handover.
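The handover tactic can be sketched as follows. This is illustrative, not Anthropic's implementation: the token limit, the `Memory` class, and `run_stage` are all assumptions standing in for real context-window accounting and external storage.

```python
CONTEXT_LIMIT = 200_000  # assumed token budget, for illustration only


class Memory:
    """Stand-in for external memory the agent persists plans into."""
    def __init__(self):
        self.store = {}

    def save(self, key: str, value: str) -> None:
        self.store[key] = value

    def load(self, key: str) -> str:
        return self.store[key]


def run_stage(memory: Memory, history_tokens: int, summary: str) -> dict:
    # After each work stage, key findings are summarized to memory.
    memory.save("latest_summary", summary)
    if history_tokens > CONTEXT_LIMIT:
        # Handover: spawn a fresh agent seeded with the summary,
        # not the raw (truncation-prone) conversation history.
        return {"context": memory.load("latest_summary"), "fresh": True}
    return {"context": summary, "fresh": False}


mem = Memory()
mem.save("plan", "1) search 2) compare 3) synthesize")
state = run_stage(mem, history_tokens=250_000,
                  summary="stage 1 done: found 12 sources")
```

The point of the sketch is the branch: once history exceeds the budget, continuity comes from the saved summary rather than from the overflowing context window.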
Minimizing the "Game of Telephone": They reduce lossy relay by having sub-agents save outputs directly to the file system, rather than routing all information through the lead coordinator. This improves fidelity and performance, and cuts the token overhead of copying large outputs through conversation history, avoiding information loss along the way.
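The filesystem hand-off can be sketched in a few lines. This is an illustrative pattern, not Anthropic's code: the sub-agent writes its full artifact to disk and returns only a lightweight path reference, so the coordinator never re-transmits the payload through its own context.

```python
import pathlib
import tempfile


def sub_agent_write(workdir: pathlib.Path, name: str, output: str) -> str:
    # Persist the full-fidelity artifact on disk...
    path = workdir / f"{name}.md"
    path.write_text(output)
    # ...and hand the coordinator only a cheap reference to it.
    return str(path)


workdir = pathlib.Path(tempfile.mkdtemp())
ref = sub_agent_write(workdir, "market_research",
                      "## Findings\n" + "x" * 10_000)
```

The coordinator passes `ref` around in prompts; any agent that actually needs the content reads it from disk, so the large output crosses the context window at most once.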
On single-threaded linear agents vs. multi-agent parallelism
Cognition’s view:
The simplest way to follow their principles is a "single-threaded linear agent" whose context is continuous. They argue that current agents are not yet as reliable as humans at long-context, proactive communication, so multi-agent collaboration only yields fragile systems.
Anthropic’s view:
Anthropic actively embraces multi-agent parallelism, believing it is a “key way to scale performance.”
They believe that for open-ended, unpredictable problems like research, multi-agent systems are particularly well suited: they provide the flexibility to adjust approach as findings emerge, and they let sub-agents operate in parallel, achieving both "compression" and "separation of concerns." In internal evaluations, their multi-agent research system outperformed a single-agent system by 90.2% on breadth-first queries.
Speed improvement: Anthropic dramatically reduced research time by introducing two levels of parallelization: the lead agent launches 3-5 sub-agents in parallel, and each sub-agent uses 3 or more tools in parallel, cutting research time for complex queries by up to 90%.
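A back-of-envelope calculation shows why two levels of parallelism cut wall-clock time so sharply. The numbers below are illustrative, not Anthropic's measurements; the point is only the shape of the arithmetic.

```python
import math


def sequential_time(n_agents: int, tools_per_agent: int,
                    secs_per_tool: float) -> float:
    # One agent, one tool at a time: every call adds to wall-clock time.
    return n_agents * tools_per_agent * secs_per_tool


def parallel_time(tools_per_agent: int, secs_per_tool: float,
                  tool_fanout: int = 3) -> float:
    # All sub-agents run concurrently, and each fires tools in
    # batches of `tool_fanout`, so only the batches are serialized.
    batches = math.ceil(tools_per_agent / tool_fanout)
    return batches * secs_per_tool


seq = sequential_time(n_agents=5, tools_per_agent=6, secs_per_tool=10)
par = parallel_time(tools_per_agent=6, secs_per_tool=10)
speedup = 1 - par / seq  # fraction of wall-clock time saved
```

With these made-up numbers (5 agents, 6 tool calls each, 10 s per call), the sequential path takes 300 s while the doubly parallel path takes 20 s, a ~93% reduction, in the same ballpark as the reported 90%.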
Token consumption: However, Anthropic also concedes the "disadvantage": "these architectures burn through tokens fast in practice", with multi-agent systems typically using about 15 times more tokens than chat interactions. Multi-agent systems are therefore only suitable for tasks "valuable enough to pay for the increased performance."
Coordination bottleneck: Anthropic's current lead agent executes sub-agents synchronously, waiting for each group of sub-agents to finish before continuing. This simplifies coordination but creates a bottleneck in information flow. They note that asynchronous execution would allow greater parallelism, at the cost of harder result coordination, state consistency, and error propagation, and they expect that as models handle longer and more complex research tasks, the performance gains will justify that added complexity.
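The wave-at-a-time coordination described above can be sketched as follows. This is an illustrative model, not Anthropic's code: each `gather` call is a barrier, so every wave runs only as fast as its slowest sub-agent, which is precisely the information-flow bottleneck the article mentions.

```python
import asyncio


async def sub_agent(task: str, secs: float) -> str:
    # Stand-in for a sub-agent whose running time varies by task.
    await asyncio.sleep(secs)
    return f"done: {task}"


async def lead_agent() -> list[str]:
    results = []
    # Hypothetical two-wave plan; the second wave cannot start until
    # the entire first wave has completed.
    for wave in [[("a", 0.01), ("b", 0.03)], [("c", 0.01)]]:
        # Barrier: gather() blocks on the slowest member of the wave.
        done = await asyncio.gather(*(sub_agent(t, s) for t, s in wave))
        results.extend(done)
    return results


results = asyncio.run(lead_agent())
```

An asynchronous variant would start `c` as soon as either `a` or `b` freed up capacity, trading this simple barrier for the coordination and error-propagation challenges the article lists.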
From Twitter