極限套娃，Agent自動設計Agentic系統！

發布于 2024-8-21 11:20

瀏覽

0收藏

Agent智能體系統正在作為通用工具被廣泛研究和應用，解決復雜問題通常需要由多個組件組成的復合智能體系統，而手工設計的解決方案最終會被學習到的更高效的解決方案所取代。

為此，提出了自動化設計智能體系統（ADAS：Automated Design of Agentic Systems，已開源）的新研究領域，目標是自動創建強大的智能體系統設計。

通過代碼定義整個智能體系統，并由一個“元Agent”自動發現新的智能體，理論上允許ADAS算法發現任何可能的構建塊和智能體系統。

元Agent搜索的概述以及發現的Agent示例。指導元Agent迭代地編程新代理，測試它們在任務上的性能，將它們添加到已發現Agent的存檔中，并使用這個存檔來通知后續迭代中的元Agent。展示了三次運行中的三個示例Agent，所有名稱都由元Agent生成。

極限套娃，Agent自動設計Agentic系統！-AI.x社區

自動化設計智能體系統（Automated Design of Agentic Systems）：

ADAS的定義和目標

ADAS旨在自動發明新的構建塊，并設計功能強大的智能體系統。智能體系統涉及使用基礎模型（Foundation Models，簡稱FMs）作為模塊，通過規劃、使用工具和執行多步驟的迭代處理來完成任務。

ADAS的三個關鍵組成部分

自動化智能體系統設計（ADAS）的三個關鍵組成部分。搜索空間決定了ADAS中可以表示哪些Agent系統。搜索算法指定了ADAS方法如何探索搜索空間。評估函數定義了如何根據目標目標（如性能）評估候選Agent。

極限套娃，Agent自動設計Agentic系統！-AI.x社區

搜索空間（Search Space）：定義了ADAS中可以表示哪些智能體系統。例如，一些研究只變異智能體的文本提示，而其他組件（如控制流）保持不變。
搜索算法（Search Algorithm）：指定了ADAS方法如何探索搜索空間。由于搜索空間通常非常大甚至無界，需要考慮探索與利用的權衡。
評估函數（Evaluation Function）：根據ADAS算法的應用，可能考慮不同的目標來優化，如性能、成本、延遲或智能體的安全性。評估函數定義了如何在這些目標上評估候選智能體。

通過在編碼、科學和數學等多個領域的廣泛實驗，展示了該算法能夠逐步發明具有新穎設計的智能體，這些智能體的性能大大超過了手工設計的最先進智能體。

元智能體搜索在ARC挑戰上的結果。(a) 元智能體搜索基于不斷增長的先前發現的存檔，逐步發現高性能智能體。通過五次評估智能體，在保留的測試集上報告中位數準確度和95%的自舉置信區間。(b) 元智能體搜索在ARC挑戰上發現的最佳智能體的可視化。

極限套娃，Agent自動設計Agentic系統！-AI.x社區

來自ARC挑戰的一個示例任務。給定輸入-輸出網格示例，人工智能系統被要求學習轉換規則，然后將這些學到的規則應用于測試網格，以預測最終答案。

極限套娃，Agent自動設計Agentic系統！-AI.x社區

Meta Agent Search與多個領域內最先進的手工設計智能體之間的性能比較。Meta Agent Search在每個領域中發現的智能體都優于基線。報告了在保留的測試集上的測試準確度和95%自舉置信區間。每個領域的搜索是獨立進行的。

極限套娃，Agent自動設計Agentic系統！-AI.x社區

將MGSM中的頂級智能體轉移到其他數學領域時的性能。元智能體搜索發現的智能體在不同數學領域中始終優于基線。我們報告了測試準確度和95%自舉置信區間。頂級智能體的名稱由元智能體搜索生成。

極限套娃，Agent自動設計Agentic系統！-AI.x社區

附錄

Meta Agent系統Prompt

You are a helpful assistant. Make sure to return in a WELL-FORMED JSON object.

使用以下提示來指導元智能體基于先前發現的智能體存檔來設計新智能體。

Meta Agent核心Prompt

You are an expert machine learning researcher testing various agentic systems. Your objective is to design
building blocks such as prompts and control flows within these systems to solve complex tasks. Your aim
is to design an optimal agent performing well on [Brief Description of the Domain].
[Framework Code]
[Output Instructions and Examples]
[Discovered Agent Archive] (initialized with baselines, updated at every iteration)
# Your task
You are deeply familiar with prompting techniques and the agent works from the literature. Your goal is
to maximize the specified performance metrics by proposing interestingly new agents.
Observe the discovered agents carefully and think about what insights, lessons, or stepping stones can be
learned from them.
Be creative when thinking about the next interesting agent to try. You are encouraged to draw inspiration
from related agent papers or academic papers from other research areas.
Use the knowledge from the archive and inspiration from academic literature to propose the next
interesting agentic system design.
THINK OUTSIDE THE BOX.

使用以下提示來指導和格式化元智能體的輸出。在這里，收集并呈現了元智能體在提示中可能犯的一些常見錯誤，這在提高生成代碼的質量方面是有效的。

輸出指令和示例

# Output Instruction and Example:
The first key should be (“thought”), and it should capture your thought process for designing the
next function. In the “thought” section, first reason about what the next interesting agent to try
should be, then describe your reasoning and the overall concept behind the agent design, and
finally detail the implementation steps. The second key (“name”) corresponds to the name of
your next agent architecture. Finally, the last key (“code”) corresponds to the exact “forward()”
function in Python code that you would like to try. You must write COMPLETE CODE in “code”:
Your code will be part of the entire project, so please implement complete, reliable, reusable code snippets.
Here is an example of the output format for the next agent:
{“thought”: “**Insights:** Your insights on what should be the next interesting agent. **Overall Idea:**
your reasoning and the overall concept behind the agent design. **Implementation:** describe the
implementation step by step.”,
“name”: “Name of your proposed agent”,
“code”: “def forward(self, taskInfo): # Your code here”}
## WRONG Implementation examples:
[Examples of potential mistakes the meta agent may make in implementation]

在元智能體的第一次響應之后，進行兩輪自我反思，以使生成的智能體新穎且無錯誤。

自我反思第一輪的提示

[Generated Agent from Previous Iteration]
Carefully review the proposed new architecture and reflect on the following points:
1. **Interestingness**: Assess whether your proposed architecture is interesting or innovative compared
to existing methods in the archive. If you determine that the proposed architecture is not interesting,
suggest a new architecture that addresses these shortcomings.
- Make sure to check the difference between the proposed architecture and previous attempts.
- Compare the proposal and the architectures in the archive CAREFULLY, including their actual differences
in the implementation.
- Decide whether the current architecture is innovative.
- USE CRITICAL THINKING!
2. **Implementation Mistakes**: Identify any mistakes you may have made in the implementation.
Review the code carefully, debug any issues you find, and provide a corrected version. REMEMBER
checking "## WRONG Implementation examples" in the prompt.
3. **Improvement**: Based on the proposed architecture, suggest improvements in the detailed
implementation that could increase its performance or effectiveness. In this step, focus on refining and
optimizing the existing implementation without altering the overall design framework, except if you
want to propose a different architecture if the current is not interesting.
- Observe carefully about whether the implementation is actually doing what it is supposed to do.
- Check if there is redundant code or unnecessary steps in the implementation. Replace them with
effective implementation.
- Try to avoid the implementation being too similar to the previous agent.
And then, you need to improve or revise the implementation, or implement the new proposed architecture
based on the reflection.
Your response should be organized as follows:
"reflection": Provide your thoughts on the interestingness of the architecture, identify any mistakes in the
implementation, and suggest improvements.
"thought": Revise your previous proposal or propose a new architecture if necessary, using the same
format as the example response.
"name": Provide a name for the revised or new architecture. (Don’t put words like "new" or "improved"
in the name.)
"code": Provide the corrected code or an improved implementation. Make sure you actually implement
your fix and improvement in this code.

自我反思第二輪的提示

Using the tips in “## WRONG Implementation examples” section, further revise the code.
Your response should be organized as follows:
Include your updated reflections in the “reflection”. Repeat the previous “thought” and “name”. Update
the corrected version of the code in the “code” section.

當在執行生成的代碼期間遇到錯誤時，會進行反思并重新運行代碼。如果錯誤持續存在，這個過程會重復進行，最多五次。以下是用于自我反思任何運行時錯誤的提示：

運行時錯誤發生時的自我反思提示

Error during evaluation:
[Runtime errors]
Carefully consider where you went wrong in your latest implementation. Using insights from previous
attempts, try to debug the current code to implement the same thought. Repeat your previous thought in
“thought”, and put your thinking for debugging in “debug_thought”.

https://arxiv.org/pdf/2408.08435
Automated Design of Agentic Systems
https://github.com/ShengranHu/ADAS

本文轉載自??PaperAgent??

標簽

Agent

Agentic

系統

贊

回復

舉報

回復

成人免费xxxxx在线视频软件_久久精品久久久_亚洲国产精品久久久_天天色天天色_亚洲人成一区_欧美一级欧美三级在线观看

51CTO

51CTO博客

51CTO學堂

極限套娃，Agent自動設計Agentic系統！

ADAS的定義和目標

ADAS的三個關鍵組成部分

目錄