In traditional automated processes, each tool call is a pre-orchestrated, fixed action. However, when facing complex problems, this rigid structure is like forcing a pianist to mechanically stick to the score. While a workflow is mainly used to constrain how tasks are carried out, the growing reasoning power of LLMs means that parts of the workflow can gradually be entrusted to the LLM. Recently, Dify officially introduced a new plugin type—Agent Strategy—which we’ll explore below.

Core Concept: The Relationship Between Agent Node and Strategy

In a Dify Workflow, the Agent Node takes certain steps out of the fixed flow and tool pattern, handing them over to the LLM for autonomous decisions and judgments. An Agent Strategy is an extensible template that defines the standardized input and output formats. By developing custom interfaces for these strategies, you can implement various solutions such as CoT (Chain-of-Thought), ToT (Tree-of-Thought), GoT (Graph-of-Thought), BoT (Backbone-of-Thought), and even more advanced strategies like semantic kernels.

In Dify, the Agent Node embeds the Agent Strategy and connects with upstream and downstream nodes. Like an LLM node, it tackles a specific task and returns a final response to the next node.

Agent Node (execution unit)

The “decision center” of a workflow. It allocates resources, manages states, and logs the entire reasoning process.

Agent Strategy (decision logic)

A pluggable reasoning algorithm module that defines how tools are used and how problems are solved.

This decoupled design is like separating a car’s engine from its control system—developers can upgrade the “powertrain” without affecting the overall vehicle architecture. We currently provide two classic Agent Strategies:

ReAct: A classic chain of “Think–Act–Observe”
Function Calling: Precise function-based calling
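
To make the decoupling concrete, here is a minimal sketch of how two strategies could sit behind one common interface. Everything in it is an assumption for illustration: the `Strategy` protocol, the `ToolCall` shape, the reply dictionaries, and the `Action:` / `Action Input:` text format are hypothetical and are not Dify's actual plugin API.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class ToolCall:
    name: str
    arguments: dict


class Strategy(Protocol):
    """Hypothetical interface: turn one model reply into a tool call, or None for a final answer."""

    def parse(self, reply: dict) -> Optional[ToolCall]: ...


class ReActStrategy:
    """Reads 'Action:' / 'Action Input:' lines out of free-form text (assumed format)."""

    _pattern = re.compile(r"Action:\s*(\w+)\s*Action Input:\s*(\{.*\})", re.DOTALL)

    def parse(self, reply: dict) -> Optional[ToolCall]:
        match = self._pattern.search(reply.get("text", ""))
        if match is None:
            return None  # no action requested: treat the text as the final answer
        return ToolCall(match.group(1), json.loads(match.group(2)))


class FunctionCallingStrategy:
    """Reads a structured 'tool_call' field that the model API is assumed to return."""

    def parse(self, reply: dict) -> Optional[ToolCall]:
        call = reply.get("tool_call")
        if call is None:
            return None
        return ToolCall(call["name"], call["arguments"])


def next_step(strategy: Strategy, reply: dict) -> str:
    """The node side: it only sees the common ToolCall shape, never the parsing details."""
    call = strategy.parse(reply)
    return "final answer" if call is None else f"run {call.name} with {call.arguments}"


react_reply = {"text": 'Thought: I need the weather.\nAction: get_weather\nAction Input: {"city": "Paris"}'}
fc_reply = {"tool_call": {"name": "get_weather", "arguments": {"city": "Paris"}}}

print(next_step(ReActStrategy(), react_reply))
print(next_step(FunctionCallingStrategy(), fc_reply))
```

Both calls lead to the same tool request, which is the point of the design: the node can stay unchanged while the strategy behind it is swapped.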

You can download both strategies directly from the Marketplace. More importantly, we’ve released an open standard for strategy development. In Dify, any developer can:

Quickly create strategy plugins with the CLI tool
Customize configuration forms and visualization components
Integrate cutting-edge academic algorithms (e.g., Tree-of-Thoughts)

This effectively turns Dify into an “innovation testbed” for AI reasoning strategies, allowing every user to benefit from community-driven advancements.

Feature Overview

Within a Workflow, the Agent Node enables autonomous thinking for multi-step tool reasoning. At a minimum, an Agent Strategy must define how to use the LLM API and how to call tools.

For Non-Technical Users

Drag-and-Drop Setup

Simply drag an Agent Node from the tool panel and configure it in three steps:

Choose a reasoning strategy
Link the tool/model
Set the prompt template

Transparent Reasoning

One of Dify’s powerful features is its built-in logging mechanism, which creates a tree-like structure of the agent’s thought process. This structure enables you to:

Visualize the agent’s execution path for debugging complex multi-step reasoning

View in real time:

Total time and token usage
Each round of reasoning
Tool invocation traces
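
As a rough illustration of what a tree-shaped reasoning log could look like, the sketch below nests tool invocations under reasoning rounds and sums time and token usage up the tree. The `LogNode` class and its fields are hypothetical names chosen for this example. They do not reflect Dify's internal log format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogNode:
    label: str
    elapsed_s: float = 0.0  # time spent in this step itself
    tokens: int = 0         # tokens used by this step itself
    children: List["LogNode"] = field(default_factory=list)

    def add(self, child: "LogNode") -> "LogNode":
        self.children.append(child)
        return child

    def total_tokens(self) -> int:
        return self.tokens + sum(c.total_tokens() for c in self.children)

    def total_elapsed(self) -> float:
        return self.elapsed_s + sum(c.total_elapsed() for c in self.children)

    def render(self, depth: int = 0) -> List[str]:
        # One line per node, indented by its depth in the tree.
        lines = ["  " * depth + self.label]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


run = LogNode("agent run")
round1 = run.add(LogNode("round 1", elapsed_s=0.5, tokens=120))
round1.add(LogNode("tool: search", elapsed_s=0.25))
run.add(LogNode("round 2", elapsed_s=0.25, tokens=80))

print("\n".join(run.render()))
print(run.total_tokens(), run.total_elapsed())
```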

For Developers

Defining an Agent Strategy involves specifying how the language model will:

Handle user queries
Select the right tools
Use the correct parameters to run those tools
Process the results
Decide when the task is complete

We provide a Standardized Development Kit, which includes a strategy configuration component library (e.g., Model Selector/Tool Editor), a structured logging interface, and a sandbox testing environment.

Specifically, a strategy definition covers its identity and metadata, the required parameters (model, tools, query, etc.), parameter types and constraints, and the location of the source code.

An agent executes in three main stages: initialization, iterative looping, and final response. During initialization, the system sets up all necessary parameters, tools, and contexts. Then, in the iterative loop, the system prepares a prompt with the current context and calls the LLM with information about the available tools. It parses the LLM’s response to see if a tool call is needed or if a final answer has already been reached. If a tool call is required, the system executes that tool and updates the context with its output. This loop continues until the task is complete or the preset iteration limit is reached. Finally, in the last phase, the system returns the final answer or result.
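
The three stages can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Dify's implementation: `run_agent`, the tuple-shaped model reply, and `fake_llm` (a stand-in for a real model call) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumed reply shape: ("tool", name, args) or ("final", answer, None).
Reply = Tuple[str, str, Optional[dict]]


def run_agent(
    query: str,
    call_llm: Callable[[List[str], List[str]], Reply],
    tools: Dict[str, Callable[..., str]],
    max_iterations: int = 5,
) -> str:
    # Stage 1, initialization: set up the context and the tool list.
    context: List[str] = [f"user: {query}"]
    tool_names = list(tools)

    # Stage 2, iterative loop: ask the model, then act on its reply.
    for _ in range(max_iterations):
        kind, value, args = call_llm(context, tool_names)
        if kind == "final":
            # Stage 3, final response.
            return value
        # A tool call was requested: run it and feed the output back in.
        output = tools[value](**(args or {}))
        context.append(f"tool {value}: {output}")

    # Iteration limit reached without a final answer.
    return "Stopped: iteration limit reached."


def fake_llm(context: List[str], tool_names: List[str]) -> Reply:
    """Stand-in for a real model: asks for one tool call, then answers."""
    if not any(line.startswith("tool ") for line in context):
        return ("tool", "add", {"a": 2, "b": 3})
    return ("final", f"The result is {context[-1].split(': ')[1]}.", None)


tools = {"add": lambda a, b: str(a + b)}
print(run_agent("What is 2 + 3?", fake_llm, tools))
```

The `max_iterations` guard plays the role of the preset iteration limit: a model that keeps requesting tools cannot loop forever.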

For example, a function_calling.yaml file might look like this:

parameters:
  - name: model
    type: model-selector
    scope: tool-call&llm
  - name: tools
    type: array[tools]
  - name: max_iterations
    type: number
    default: 5
extra:
  python:
    source: function_calling.py

Thanks to this declarative architecture, configuring a strategy feels like filling out a simple form, with support for:

Dynamic parameter validation (type/range/dependencies)
Automatic multilingual label rendering
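
To give a feel for what declarative parameter validation might involve, the sketch below checks user-supplied values against a parameter list modeled on the function_calling.yaml example. It is only an illustration: the `required`, `min`, and `max` keys and the `validate` function are assumptions added for this example and are not taken from Dify's schema.

```python
from typing import Any, Dict, List

# Modeled on the function_calling.yaml example.
# The "required", "min" and "max" keys are hypothetical additions.
PARAMETERS: List[Dict[str, Any]] = [
    {"name": "model", "type": "model-selector", "required": True},
    {"name": "tools", "type": "array[tools]"},
    {"name": "max_iterations", "type": "number", "default": 5, "min": 1, "max": 50},
]


def validate(values: Dict[str, Any]) -> Dict[str, Any]:
    """Return values with defaults applied, or raise ValueError on a bad input."""
    result: Dict[str, Any] = {}
    for spec in PARAMETERS:
        name = spec["name"]
        value = values.get(name, spec.get("default"))
        if value is None:
            if spec.get("required"):
                raise ValueError(f"{name} is required")
            continue  # optional parameter left unset
        if spec["type"] == "number":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{name} must be a number")
            if "min" in spec and value < spec["min"]:
                raise ValueError(f"{name} must be >= {spec['min']}")
            if "max" in spec and value > spec["max"]:
                raise ValueError(f"{name} must be <= {spec['max']}")
        elif spec["type"].startswith("array[") and not isinstance(value, list):
            raise ValueError(f"{name} must be a list")
        result[name] = value
    return result


print(validate({"model": "some-model", "tools": []}))
```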

For more details, read our documentation.

Future Outlook: Expanding Features

We plan to iterate further and add more developer-friendly components, such as:

Knowledge base integration
Memory support in Chatflow
Error handling and retries
Additional official Agent Strategies

Users can freely download different strategies from the community and load them into different nodes to tackle various tasks. To understand the Agent Node’s capabilities, you can start with a simple three-node Chatflow to see how it works and experience something akin to an Agent app. For more complex tasks, consider a routing and handoff approach, treating the Agent Node as an extension of the LLM Node to break down problem-solving into steps.

For instance, the Agent Node can enable OpenAI ChatGPT-4o with Task capabilities (image courtesy of community contributor Pascal).

We look forward to incorporating even more features in Version 1.0.0, and we welcome everyone to contribute additional Agent Strategies!
