MarkTechPost@AI · August 24
A Full Code Implementation to Design a Graph-Structured AI Agent with Gemini for Task Planning, Retrieval, Computation, and Self-Critique

This article details how to build an advanced graph-structured AI agent using the GraphAgent framework and the Gemini 1.5 Flash model. The agent processes tasks modularly through nodes for planning, routing, research, math, writing, and critique. The article shows how to integrate the Gemini model for structured JSON prompts, and how local Python functions (such as safe math evaluation and document search) serve as tools that keep computation and retrieval reliable. The pipeline runs end-to-end, combining reasoning, retrieval, and validation into a single cohesive AI system.

✨ **Modular AI agent design**: The article presents a GraphAgent-based AI agent that decomposes complex tasks into a series of interconnected nodes (planning, routing, research, math, writing, critique). Each node handles a specific function, and the graph structure coordinates them, making task processing modular and orderly.

🚀 **Gemini 1.5 Flash integration and tool calling**: At the agent's core is the Gemini 1.5 Flash model, accessed through a dedicated wrapper that handles structured JSON prompts. Local Python functions are integrated as tools for safe math evaluation (using the ast module) and document search, strengthening the agent's computation and information-retrieval capabilities.

🧭 **Dynamic routing and iterative refinement**: The agent includes an intelligent router node that dynamically chooses the next research or math node based on the task and current state. A critic node then checks the output for factuality, clarity, and completeness and refines it iteratively, ensuring the quality of the final result.

📊 **State management and flow visualization**: The article defines a `State` dataclass that persists the task, plan, evidence, and intermediate notes, giving the agent transparent state management. An ASCII diagram visualizes the agent's control flow, making its mechanics easy to follow.

💡 **Generality and extensibility**: The implementation shows how to combine LLM reasoning with graph orchestration to create deterministic workflows. The approach is easy to extend with custom toolchains, multi-turn memory, or parallel node execution, laying a foundation for more complex AI applications.

In this tutorial, we implement an advanced graph-based AI agent using the GraphAgent framework and the Gemini 1.5 Flash model. We define a directed graph of nodes, each responsible for a specific function: a planner to break down the task, a router to control flow, research and math nodes to provide external evidence and computation, a writer to synthesize the answer, and a critic to validate and refine the output. We integrate Gemini through a wrapper that handles structured JSON prompts, while local Python functions act as tools for safe math evaluation and document search. By executing this pipeline end-to-end, we demonstrate how reasoning, retrieval, and validation are modularized within a single cohesive system. Check out the FULL CODES here.

```python
import os, json, time, ast, math, getpass
from dataclasses import dataclass, field
from typing import Dict, List, Callable, Any

import google.generativeai as genai

try:
    import networkx as nx
except ImportError:
    nx = None
```

We begin by importing core Python libraries for data handling, timing, and safe evaluation, along with dataclasses and typing helpers to structure our state. We also load the google.generativeai client to access Gemini and, optionally, NetworkX for graph visualization. Check out the FULL CODES here.
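NetworkX is optional here; even without it, the node topology the agent wires up can be captured in a plain adjacency dict and sanity-checked with the standard library alone. The sketch below is an illustrative addition, not part of the tutorial's code, and mirrors the control-flow diagram the article prints later:

```python
from collections import deque

# Adjacency of the agent's control-flow graph (mirrors the ASCII diagram).
FLOW = {
    "plan": ["route"],
    "route": ["research", "math", "write"],
    "research": ["route"],   # research loops back to the router
    "math": ["route"],       # math loops back to the router
    "write": ["critic"],
    "critic": ["end"],
    "end": [],
}

def reachable(start: str) -> set:
    """Breadth-first search over the flow graph."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in FLOW[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable("plan")))  # every node is reachable from the planner
```

A check like this catches wiring mistakes (e.g., a node whose label never appears in the registry) before any model call is made.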

```python
def make_model(api_key: str, model_name: str = "gemini-1.5-flash"):
    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name, system_instruction=(
        "You are GraphAgent, a principled planner-executor. "
        "Prefer structured, concise outputs; use provided tools when asked."
    ))

def call_llm(model, prompt: str, temperature=0.2) -> str:
    r = model.generate_content(prompt, generation_config={"temperature": temperature})
    return (r.text or "").strip()
```

We define a helper to configure and return a Gemini model with a custom system instruction, and another function that calls the LLM with a prompt while controlling temperature. We use this setup to ensure our agent receives structured, concise outputs consistently. Check out the FULL CODES here.
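Because an LLM often wraps JSON in prose or code fences, the node functions later slice out the substring between the first `{` and the last `}` before parsing, with a fallback when parsing fails. That pattern can be exercised in isolation; this is an illustrative sketch, not part of the tutorial's code:

```python
import json

def extract_json(reply: str, fallback: dict) -> dict:
    """Pull the outermost JSON object out of a free-form LLM reply."""
    try:
        return json.loads(reply[reply.find("{"): reply.rfind("}") + 1])
    except Exception:
        return fallback  # degrade gracefully, as the planner node does

messy = 'Sure! Here is the plan:\n```json\n{"subtasks": ["Research", "Write"]}\n```'
print(extract_json(messy, {"subtasks": []}))
```

This tolerant parsing is what lets the graph keep moving even when the model's output is not perfectly formatted.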

```python
def safe_eval_math(expr: str) -> str:
    node = ast.parse(expr, mode="eval")
    # Note: ast.AST is deliberately excluded from this allow-list; since every
    # node is an ast.AST instance, including it would let any expression through.
    allowed = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Num, ast.Constant,
               ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.Mod,
               ast.USub, ast.UAdd, ast.FloorDiv)
    def check(n):
        if not isinstance(n, allowed): raise ValueError("Unsafe expression")
        for c in ast.iter_child_nodes(n): check(c)
    check(node)
    return str(eval(compile(node, "<math>", "eval"), {"__builtins__": {}}, {}))

DOCS = [
    "Solar panels convert sunlight to electricity; capacity factor ~20%.",
    "Wind turbines harvest kinetic energy; onshore capacity factor ~35%.",
    "RAG = retrieval-augmented generation joins search with prompting.",
    "LangGraph enables cyclic graphs of agents; good for tool orchestration.",
]

def search_docs(q: str, k: int = 3) -> List[str]:
    ql = q.lower()
    scored = sorted(DOCS, key=lambda d: -sum(w in d.lower() for w in ql.split()))
    return scored[:k]
```

We implement two key tools for the agent: a safe math evaluator that parses and checks arithmetic expressions with ast before execution, and a simple document search that retrieves the most relevant snippets from a small in-memory corpus. We use these to give the agent reliable computation and retrieval capabilities without external dependencies. Check out the FULL CODES here.
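Before wiring these tools into the graph, it helps to exercise them directly. The snippet below restates them in compact, standalone form (with `ast.AST` left out of the allow-list, since whitelisting it would let every node type through):

```python
import ast
from typing import List

def safe_eval_math(expr: str) -> str:
    """Evaluate an arithmetic expression after an AST allow-list check."""
    node = ast.parse(expr, mode="eval")
    allowed = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
               ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.Mod,
               ast.USub, ast.UAdd, ast.FloorDiv)
    def check(n):
        if not isinstance(n, allowed):
            raise ValueError("Unsafe expression")
        for c in ast.iter_child_nodes(n):
            check(c)
    check(node)
    return str(eval(compile(node, "<math>", "eval"), {"__builtins__": {}}, {}))

DOCS = [
    "Solar panels convert sunlight to electricity; capacity factor ~20%.",
    "Wind turbines harvest kinetic energy; onshore capacity factor ~35%.",
]

def search_docs(q: str, k: int = 3) -> List[str]:
    """Rank snippets by naive keyword overlap with the query."""
    scored = sorted(DOCS, key=lambda d: -sum(w in d.lower() for w in q.lower().split()))
    return scored[:k]

print(safe_eval_math("5*7"))              # "35"
print(search_docs("wind capacity", k=1))  # the wind snippet ranks first
try:
    safe_eval_math("__import__('os')")    # function calls are rejected
except ValueError as e:
    print("blocked:", e)
```

The allow-list approach is stricter than regex filtering: anything outside plain arithmetic (names, calls, attribute access) fails the AST check before `eval` ever runs.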

```python
@dataclass
class State:
    task: str
    plan: str = ""
    scratch: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    result: str = ""
    step: int = 0
    done: bool = False

def node_plan(state: State, model) -> str:
    prompt = f"""Plan step-by-step to solve the user task.
Task: {state.task}
Return JSON: {{"subtasks": ["..."], "tools": {{"search": true/false, "math": true/false}}, "success_criteria": ["..."]}}"""
    js = call_llm(model, prompt)
    try:
        plan = json.loads(js[js.find("{"): js.rfind("}")+1])
    except Exception:
        plan = {"subtasks": ["Research", "Synthesize"], "tools": {"search": True, "math": False}, "success_criteria": ["clear answer"]}
    state.plan = json.dumps(plan, indent=2)
    state.scratch.append("PLAN:\n" + state.plan)
    return "route"

def node_route(state: State, model) -> str:
    prompt = f"""You are a router. Decide next node.
Context scratch:\n{chr(10).join(state.scratch[-5:])}
If math needed -> 'math', if research needed -> 'research', if ready -> 'write'.
Return one token from [research, math, write].
Task: {state.task}"""
    choice = call_llm(model, prompt).lower()
    if "math" in choice and any(ch.isdigit() for ch in state.task):
        return "math"
    if "research" in choice or not state.evidence:
        return "research"
    return "write"

def node_research(state: State, model) -> str:
    prompt = f"""Generate 3 focused search queries for:
Task: {state.task}
Return as JSON list of strings."""
    qjson = call_llm(model, prompt)
    try:
        queries = json.loads(qjson[qjson.find("["): qjson.rfind("]")+1])[:3]
    except Exception:
        queries = [state.task, "background " + state.task, "pros cons " + state.task]
    hits = []
    for q in queries:
        hits.extend(search_docs(q, k=2))
    state.evidence.extend(list(dict.fromkeys(hits)))
    state.scratch.append("EVIDENCE:\n- " + "\n- ".join(hits))
    return "route"

def node_math(state: State, model) -> str:
    prompt = "Extract a single arithmetic expression from this task:\n" + state.task
    expr = call_llm(model, prompt)
    expr = "".join(ch for ch in expr if ch in "0123456789+-*/().%^ ")
    try:
        val = safe_eval_math(expr)
        state.scratch.append(f"MATH: {expr} = {val}")
    except Exception as e:
        state.scratch.append(f"MATH-ERROR: {expr} ({e})")
    return "route"

def node_write(state: State, model) -> str:
    prompt = f"""Write the final answer.
Task: {state.task}
Use the evidence and any math results below, cite inline like [1],[2].
Evidence:\n{chr(10).join(f'[{i+1}] ' + e for i, e in enumerate(state.evidence))}
Notes:\n{chr(10).join(state.scratch[-5:])}
Return a concise, structured answer."""
    draft = call_llm(model, prompt, temperature=0.3)
    state.result = draft
    state.scratch.append("DRAFT:\n" + draft)
    return "critic"

def node_critic(state: State, model) -> str:
    prompt = f"""Critique and improve the answer for factuality, missing steps, and clarity.
If fix needed, return improved answer.
Else return 'OK'.
Answer:\n{state.result}\nCriteria:\n{state.plan}"""
    crit = call_llm(model, prompt)
    if crit.strip().upper() != "OK" and len(crit) > 30:
        state.result = crit.strip()
        state.scratch.append("REVISED")
    state.done = True
    return "end"

NODES: Dict[str, Callable[[State, Any], str]] = {
    "plan": node_plan, "route": node_route, "research": node_research,
    "math": node_math, "write": node_write, "critic": node_critic
}

def run_graph(task: str, api_key: str) -> State:
    model = make_model(api_key)
    state = State(task=task)
    cur = "plan"
    max_steps = 12
    while not state.done and state.step < max_steps:
        state.step += 1
        nxt = NODES[cur](state, model)
        if nxt == "end": break
        cur = nxt
    return state

def ascii_graph():
    return """START -> plan -> route -> (research <-> route) & (math <-> route) -> write -> critic -> END"""
```

We define a typed State dataclass to persist the task, plan, evidence, scratch notes, and control flags as the graph executes. We then implement the node functions — a planner, a router, research, math, a writer, and a critic — each of which mutates the state and returns the label of the next node. We register them in NODES and iterate in run_graph until the state is marked done, and we expose ascii_graph() to visualize the control flow as we route between research/math and finalize with a critique. Check out the FULL CODES here.
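The plan → route → work → write loop can be demonstrated without any LLM by substituting deterministic stub nodes. This illustrative sketch (a standalone mock, not the tutorial's code) preserves the run_graph contract — each node mutates the state and returns the next label:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class State:
    task: str
    scratch: List[str] = field(default_factory=list)
    result: str = ""
    step: int = 0
    done: bool = False

def plan(s: State) -> str:
    s.scratch.append("PLAN: research then write")
    return "research"

def research(s: State) -> str:
    s.scratch.append("EVIDENCE: stub snippet")
    return "write"

def write(s: State) -> str:
    s.result = f"Answer for: {s.task}"
    s.done = True
    return "end"

NODES: Dict[str, Callable[[State], str]] = {"plan": plan, "research": research, "write": write}

def run_graph(task: str, max_steps: int = 10) -> State:
    """Step through the registry until a node signals 'end' or the budget runs out."""
    state, cur = State(task=task), "plan"
    while not state.done and state.step < max_steps:
        state.step += 1
        nxt = NODES[cur](state)
        if nxt == "end":
            break
        cur = nxt
    return state

s = run_graph("demo task")
print(s.result, "| steps:", s.step)  # three node visits: plan, research, write
```

Swapping the stubs for LLM-backed functions changes nothing about the loop itself, which is what makes the control flow deterministic and easy to test.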

```python
if __name__ == "__main__":
    key = os.getenv("GEMINI_API_KEY") or getpass.getpass(" Enter GEMINI_API_KEY: ")
    task = input(" Enter your task: ").strip() or "Compare solar vs wind for reliability; compute 5*7."
    t0 = time.time()
    state = run_graph(task, key)
    dt = time.time() - t0
    print("\n=== GRAPH ===", ascii_graph())
    print(f"\n Result in {dt:.2f}s:\n{state.result}\n")
    print("---- Evidence ----")
    print("\n".join(state.evidence))
    print("\n---- Scratch (last 5) ----")
    print("\n".join(state.scratch[-5:]))
```

We define the program’s entry point: we securely read the Gemini API key, take a task as input, and then run the graph through run_graph. We measure execution time, print the ASCII graph of the workflow, display the final result, and also output supporting evidence and the last few scratch notes for transparency. Check out the FULL CODES here.

In conclusion, we demonstrate how a graph-structured agent enables the design of deterministic workflows around a probabilistic LLM. We observe how the planner node enforces task decomposition, the router dynamically selects between research and math, and the critic provides iterative improvement for factuality and clarity. Gemini acts as the central reasoning engine, while the graph nodes supply structure, safety checks, and transparent state management. We conclude with a fully functional agent that showcases the benefits of combining graph orchestration with a modern LLM, enabling extensions such as custom toolchains, multi-turn memory, or parallel node execution in more complex deployments.


Check out the FULL CODES here. Feel free to check out our GitHub Page for Tutorials, Codes and Notebooks. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter.

