Research work is highly uncertain: the plan has to be designed case by case and sometimes adjusted step by step. This unpredictability is what makes it a good fit for agents.
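An individual's intelligence has a ceiling, whereas collective intelligence scales better. Human society is the analogy: people who come together and divide the work progress much faster than each person working alone. In short, a multi-agent system (MAS) beats a single agent.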
The essence of search is compression: distilling insights from a vast
corpus.
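Sub-agents can each perform this compression in advance from a different angle and then hand the most important tokens to the lead research agent. In practice, MAS is especially good at breadth-first search: each agent covers a different direction, which is more efficient than a single-agent system. Spreading resources across multiple agents also scales the system's reasoning capacity. The drawback is just as clear: tokens burn quickly, with MAS using about 15 times the tokens of an ordinary chat interaction. MAS therefore suits tasks whose value is high enough to cover the cost. Tasks that require all agents to share information, or that have complex dependencies, are also a poor fit for MAS (this matches "Don’t Build Multi-Agents"), for example most coding tasks, where the context is tightly interdependent.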
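The breadth-first pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual system: `run_subagent`, `lead_agent`, `Finding`, and `estimated_tokens` are hypothetical names, and the sub-agent work is simulated rather than backed by a real LLM call.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    direction: str
    summary: str  # condensed output handed back to the lead agent


async def run_subagent(direction: str) -> Finding:
    """Hypothetical sub-agent: explores one direction, returns a short summary.

    A real sub-agent would call an LLM with its own context window and tools;
    the work is simulated here so the control flow runs as written.
    """
    await asyncio.sleep(0)  # stand-in for searching and reading
    return Finding(direction, f"key points about {direction}")


async def lead_agent(question: str, directions: list[str]) -> str:
    # Breadth-first: every direction is explored concurrently.
    findings = await asyncio.gather(*(run_subagent(d) for d in directions))
    # The lead agent only sees the condensed summaries, not the raw material.
    lines = [f"- {f.direction}: {f.summary}" for f in findings]
    return f"Question: {question}\n" + "\n".join(lines)


def estimated_tokens(chat_tokens: int, multiplier: int = 15) -> int:
    """Rough cost estimate using the roughly 15x figure quoted in the text."""
    return chat_tokens * multiplier


if __name__ == "__main__":
    print(asyncio.run(lead_agent("How do MAS scale?", ["cost", "latency"])))
    print(estimated_tokens(2_000))
```

The design point is that each sub-agent spends its own context on one direction and returns only a summary, so the lead agent's context holds the compressed results rather than everything that was read.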
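1. **Search strategy optimization**
   - **Query design principles**:
     - Keep queries short (≤5 words) and moderately broad to improve the hit rate (e.g. "keep queries shorter").
     - Adjust specificity based on result quality (e.g. narrow the query when there are too many results, broaden it when there are too few).
   - **No repeated queries**: avoid calling a tool again with the same query (e.g. "NEVER repeatedly use the exact same queries").
2. **Information quality and source criticism**
   - **Identify unreliable sources**: flag the following problems:
     - Speculative language (such as "could" or "may"), aggregator sites, and passive-voice statements attributed to anonymous sources (e.g. the original prompt lists "speculation" and "news aggregators").
     - Marketing language and one-sided data (e.g. "marketing language for a product").
   - **Handling conflicting information**: prioritize by recency, source quality, and consistency; report the conflict when it cannot be resolved (e.g. "prioritize based on recency").
3. **Computation tool limits**
   - **Do not overuse the REPL tool**: use it only for dependency-free JavaScript computation (e.g. "repl tool does not have access to a DOM").
   - **Handle simple computation yourself**: tasks such as counting need no tool call (e.g. "use your own reasoning to do things like count entities").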
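2. **Termination conditions and resource protection**
   - **Hard termination rule**: stop unconditionally once tool calls reach 20 or sources reach 100 (e.g. "absolute maximum upper limit").
   - **Soft termination judgment**: stop proactively when the information gained per step drops off (e.g. "stop gathering sources when seeing diminishing returns").
3. **Report format and timeliness**
   - **Detailed internal thinking, concise report**: record the reasoning process in detail, but keep the final report information-dense (e.g. "Be detailed in your internal process, but concise in reporting").
   - **Submit results immediately**: call `complete_task` as soon as the task is done and avoid redundant research (e.g. "as soon as the task is done, immediately use complete_task").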
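The query rules and termination rules above are instructions inside a prompt, but the same limits could also be enforced in code around the agent. A minimal sketch, assuming a hypothetical `ResearchBudget` helper: the 20-call and 100-source limits come from the notes, while `min_new_sources` and `patience` are assumed thresholds for "diminishing returns" that the source does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchBudget:
    max_tool_calls: int = 20   # hard limit quoted in the notes
    max_sources: int = 100     # hard limit quoted in the notes
    min_new_sources: int = 1   # assumed threshold for a "low-yield" round
    patience: int = 2          # assumed: consecutive low-yield rounds tolerated
    tool_calls: int = 0
    sources: set = field(default_factory=set)
    seen_queries: set = field(default_factory=set)
    low_yield_rounds: int = 0

    def check_query(self, query: str) -> bool:
        """Accept a query only if it is non-empty, short, and not seen before."""
        normalized = " ".join(query.lower().split())
        words = normalized.split()
        if not words or len(words) > 5 or normalized in self.seen_queries:
            return False
        self.seen_queries.add(normalized)
        return True

    def record(self, new_urls: list) -> None:
        """Record one tool call and the source URLs it returned."""
        self.tool_calls += 1
        fresh = set(new_urls) - self.sources
        self.sources |= fresh
        if len(fresh) < self.min_new_sources:
            self.low_yield_rounds += 1
        else:
            self.low_yield_rounds = 0

    def should_stop(self) -> bool:
        hard = (self.tool_calls >= self.max_tool_calls
                or len(self.sources) >= self.max_sources)
        soft = self.low_yield_rounds >= self.patience
        return hard or soft
```

An agent loop would call `check_query` before each search, `record` after each tool call, and leave the loop once `should_stop` returns `True`.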
---
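### Key mechanisms in the original prompt, with examples

- **Mandatory use of internal tools**: if the user has enabled Slack or Asana tools, the model must prioritize them (e.g. "user intentionally enabled them, so you MUST use these").
- **Web fetch and search working together**: first use `web_search` to get preliminary results, then use `web_fetch` to retrieve the full content of the most promising URLs (the "core loop" design).
- **Marking conflicting information**: if a news site is predicting a future event, the report must label it as a prediction rather than present it as fact (e.g. "note this explicitly in the final report").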
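The search-then-fetch "core loop" can be sketched as below. The tool names `web_search` and `web_fetch` come from the prompt, but their signatures, the `score` field, and the `top_k` parameter are assumptions made for illustration; the stubs stand in for real tool calls.

```python
from typing import Callable


def research_loop(
    query: str,
    web_search: Callable[[str], list],
    web_fetch: Callable[[str], str],
    top_k: int = 2,
) -> dict:
    """Search first, then fetch full text only for the most promising URLs."""
    results = web_search(query)  # assumed shape: [{"url": ..., "score": ...}]
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return {r["url"]: web_fetch(r["url"]) for r in ranked[:top_k]}


# Stub tools so the sketch runs without network access.
def fake_search(query: str) -> list:
    return [
        {"url": "https://example.com/a", "score": 0.4},
        {"url": "https://example.com/b", "score": 0.9},
        {"url": "https://example.com/c", "score": 0.1},
    ]


def fake_fetch(url: str) -> str:
    return f"full text of {url}"


pages = research_loop("multi agent systems", fake_search, fake_fetch)
```

Search snippets are cheap but shallow, so the loop spends fetch calls only on the few results that look worth reading in full.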
You are an agent for adding correct citations to a research report. You are given a report within <synthesized_text> tags, which was generated based on the provided sources. However, the sources are not cited in the <synthesized_text>. Your task is to enhance user trust by generating correct, appropriate citations for this report.
Based on the provided document, add citations to the input text using the format specified earlier. Output the resulting report, unchanged except for the added citations, within <exact_text_with_citation> tags.
**Rules:**
- Do NOT modify the <synthesized_text> in any way - keep all content 100% identical, only add citations
- Pay careful attention to whitespace: DO NOT add or remove any whitespace
- ONLY add citations where the source documents directly support claims in the text

**Citation guidelines:**
- **Avoid citing unnecessarily**: Not every statement needs a citation. Focus on citing key facts, conclusions, and substantive claims that are linked to sources rather than common knowledge. Prioritize citing claims that readers would want to verify, that add credibility to the argument, or where a claim is clearly related to a specific source
- **Cite meaningful semantic units**: Citations should span complete thoughts, findings, or claims that make sense as standalone assertions. Avoid citing individual words or small phrase fragments that lose meaning out of context; prefer adding citations at the end of sentences
- **Minimize sentence fragmentation**: Avoid multiple citations within a single sentence that break up the flow of the sentence. Only add citations between phrases within a sentence when it is necessary to attribute specific claims within the sentence to specific sources
- **No redundant citations close to each other**: Do not place multiple citations to the same source in the same sentence, because this is redundant and unnecessary. If a sentence contains multiple citable claims from the *same* source, use only a single citation at the end of the sentence after the period

**Technical requirements:**
- Citations result in a visual, interactive element being placed at the closing tag. Be mindful of where the closing tag is, and do not break up phrases and sentences unnecessarily
- Output text with citations between <exact_text_with_citation> and </exact_text_with_citation> tags
- Include any of your preamble, thinking, or planning BEFORE the opening <exact_text_with_citation> tag, to avoid breaking the output
- ONLY add the citation tags to the text within <synthesized_text> tags for your <exact_text_with_citation> output
- Text without citations will be collected and compared to the original report from the <synthesized_text>. If the text is not identical, your result will be rejected.
Now, add the citations to the research report and output the <exact_text_with_citation>.
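The prompt's last technical requirement, that the text with citations removed must be identical to the original report, suggests a simple check on the caller's side. A minimal sketch follows; the `<cite ...>` tag pattern is an assumption, because the prompt refers to a citation format "specified earlier" that is not shown here, and `strip_citations` and `is_valid_output` are hypothetical helper names.

```python
import re

# Assumed citation markup. The real format is not shown in the prompt,
# so this pattern is a placeholder to be replaced with the actual one.
CITE_TAG = re.compile(r"</?cite\b[^>]*>")


def strip_citations(cited_text: str) -> str:
    """Remove opening and closing citation tags, leaving all other text as-is."""
    return CITE_TAG.sub("", cited_text)


def is_valid_output(original: str, cited_text: str) -> bool:
    """Accept the output only if removing citations restores the original exactly."""
    return strip_citations(cited_text) == original


original = "MAS uses more tokens. It suits high-value tasks."
good = '<cite source="1">MAS uses more tokens.</cite> It suits high-value tasks.'
bad = '<cite source="1">MAS uses more tokens.</cite>  It suits high-value tasks.'
```

Here `good` passes the check, while `bad` fails because one extra space was introduced, which is the kind of whitespace drift the prompt warns about.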
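3、Agent evaluation lessons
Early on, a small sample is enough, around 10 to 20 cases. Feedback is fast and iteration is efficient, so there is no need to start with hundreds or thousands of cases.
For tasks with a clear answer, LLM-as-judge is highly accurate.
Human testing is still needed in some places, for example to find edge cases that the test set misses, which automated evaluation handles poorly.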
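A small evaluation loop along these lines might look like the sketch below. `run_eval` and `containment_judge` are hypothetical names, and the judge here is a plain string check standing in for a real LLM-as-judge call, so it illustrates only the harness shape, not judge quality.

```python
from typing import Callable


def containment_judge(question: str, expected: str, answer: str) -> bool:
    """Stand-in for an LLM judge: passes if the expected answer appears."""
    return expected.strip().lower() in answer.strip().lower()


def run_eval(cases: list, agent: Callable[[str], str], judge: Callable) -> tuple:
    """Return the pass rate and the failing cases for manual review."""
    failures = []
    for question, expected in cases:
        answer = agent(question)
        if not judge(question, expected, answer):
            failures.append((question, expected, answer))
    pass_rate = 1 - len(failures) / len(cases)
    return pass_rate, failures


# A tiny hand-written case set; the notes suggest starting with 10 to 20.
cases = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
]


def toy_agent(question: str) -> str:
    return "The answer is 4." if "2 + 2" in question else "I am not sure."


pass_rate, failures = run_eval(cases, toy_agent, containment_judge)
```

Returning the failing cases alongside the pass rate keeps a human in the loop: those are the examples worth reading by hand to spot edge cases the test set does not cover.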
4、Production reliability and engineering challenges
【1】Don’t Build Multi-Agents, https://cognition.ai/blog/dont-build-multi-agents#a-theory-of-building-long-running-agents
【2】How we built our multi-agent research system, https://github.com/anthropics/anthropic-cookbook/blob/main/patterns/agents/prompts/research_lead_agent.md