Agents¶
Agent is Draive's lightweight abstraction for building reusable async workers that:
- accept typed multimodal input,
- stream visible output chunks and `ProcessingEvent`s,
- preserve conversation thread and metadata through `ctx.scope(...)`, and
- can be exposed to other agents as tools through `AgentsGroup`.
The package is intentionally small. It builds on top of existing Draive primitives:
- `Step` for custom execution pipelines,
- `GenerativeModel` for model-backed execution,
- `Toolbox` for tool handling,
- `MultimodalContent` for input and output transport.
Runtime Model¶
The agents API is intentionally built as a thin layer over existing Draive runtime abstractions.
- `AgentIdentity` describes the agent instance: `uri`, `name`, `description`, `meta`.
- `AgentMessage` is the fully prepared input payload: `thread`, `created`, `content`, `meta`.
- `AgentContext` is the scoped runtime state propagated through `ctx.scope(...)`.
- `AgentExecuting` is the executor protocol: `AgentMessage -> AsyncIterable[MultimodalContentPart | ProcessingEvent]`.
In other words, Agent itself is not a stateful conversation object. It is an immutable wrapper
that runs an executor inside a scoped agent context and streams output.
1. Build An Agent From Steps¶
Use Agent.steps(...) when you already have a Step pipeline and want to expose it as an
agent.
```python
from collections.abc import AsyncIterable

from draive import Agent, ProcessingEvent
from draive.multimodal import TextContent
from draive.steps import Step, StepState


async def execute(
    state: StepState,
) -> AsyncIterable[ProcessingEvent | TextContent | StepState]:
    yield ProcessingEvent.of("progress", "Analyzing request...")
    yield TextContent.of("Done")
    yield state


worker: Agent = Agent.steps(
    Step(execute),
    name="worker",
    description="Handles a small processing task",
)
```
Call the agent inside a context scope and consume the stream.
```python
from collections.abc import AsyncIterable

from draive import ctx
from draive.multimodal import MultimodalContentPart
from draive.utils import ProcessingEvent

async with ctx.scope("agents.step"):
    stream: AsyncIterable[MultimodalContentPart | ProcessingEvent] = worker.call(
        input="Please help"
    )
    async for chunk in stream:
        print(chunk)
```
If you need lower-level control, build AgentMessage yourself and call respond(...) directly.
```python
from collections.abc import AsyncIterable

from draive import AgentMessage, ctx
from draive.multimodal import MultimodalContentPart
from draive.utils import ProcessingEvent

message: AgentMessage = AgentMessage.of("Please help")

async with ctx.scope("agents.respond"):
    stream: AsyncIterable[MultimodalContentPart | ProcessingEvent] = worker.respond(
        message
    )
    async for chunk in stream:
        print(chunk)
```
What steps(...) Does¶
- Prepends the incoming agent message into step state with `Step.appending_input(...)`.
- Executes your step pipeline.
- Filters out `ModelReasoningChunk`.
- Treats leaked `ModelToolRequest` and `ModelToolResponse` chunks as an internal contract violation.
- Streams only user-visible content and `ProcessingEvent`s.
This makes step-backed agents a good fit when you want deterministic orchestration and typed state updates while keeping a clean public output stream.
One important implication: if your step emits reasoning, callers of the agent will not see it. If
your step emits tool protocol chunks, Agent.steps(...) will raise AssertionError in debug mode
instead of exposing them to callers. Agent.steps(...) is intentionally a public-facing wrapper
over a more verbose internal step stream.
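The filtering contract can be sketched in plain Python. The chunk classes below (`VisibleText`, `ReasoningChunk`, `ToolProtocolChunk`) are illustrative stand-ins for draive's real types, not its implementation:

```python
from collections.abc import Iterable


class VisibleText:
    """Stand-in for user-visible content chunks."""

    def __init__(self, text: str) -> None:
        self.text = text


class ReasoningChunk:
    """Stand-in for ModelReasoningChunk."""


class ToolProtocolChunk:
    """Stand-in for ModelToolRequest / ModelToolResponse."""


def filter_step_stream(chunks: Iterable[object]) -> Iterable[object]:
    for chunk in chunks:
        if isinstance(chunk, ReasoningChunk):
            continue  # reasoning is silently dropped, never shown to callers
        # leaked tool protocol chunks violate the internal contract
        assert not isinstance(chunk, ToolProtocolChunk), "tool chunk leaked"
        yield chunk  # only user-visible content (and events) pass through
```

Reasoning chunks disappear from the output, while tool protocol chunks trip the assertion, mirroring the debug-mode behavior described above.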
2. Build A Generative Model-Backed Agent¶
Use Agent.generative(...) when the agent should directly call the configured
GenerativeModel.completion(...).
```python
from collections.abc import AsyncIterable

from draive import Agent, ctx, load_env, tool
from draive.multimodal import MultimodalContentPart
from draive.openai import OpenAI, OpenAIResponsesConfig
from draive.utils import ProcessingEvent

load_env()


@tool(description="Return current system status")
async def system_status() -> str:
    return "All systems operational"


assistant: Agent = Agent.generative(
    name="support",
    description="Answers product support questions",
    instructions="You are a concise support assistant. Use tools when useful.",
    tools=[system_status],
)

async with ctx.scope(
    "agents.generative",
    OpenAIResponsesConfig(model="gpt-5-mini"),
    disposables=(OpenAI(),),
):
    stream: AsyncIterable[MultimodalContentPart | ProcessingEvent] = assistant.call(
        input="Check the current system status"
    )
    async for chunk in stream:
        print(chunk)
```
How The Generative Loop Works¶
For each call, the agent:
- converts the incoming message into `ModelInput`,
- calls `GenerativeModel.completion(...)`,
- collects any `ModelToolRequest`s,
- executes them through `Toolbox.handle(...)`,
- appends `ModelToolResponse`s back into the model context,
- repeats until the model produces a final answer.
If a tool uses handling="output", the tool can stream visible output directly and terminate the
loop early.
This API is request-scoped because the model context is local to a single request. The agent does not persist prior turns by itself. If you need longer-lived conversation semantics, use higher level conversation APIs or pass the required context explicitly.
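The request-scoped loop can be sketched in plain Python. `fake_completion` and `TOOLS` are illustrative stand-ins, not draive APIs:

```python
# A fake toolbox: one tool the "model" can request by name.
TOOLS = {"system_status": lambda: "All systems operational"}


def fake_completion(context: list[dict]) -> dict:
    # First turn: the model asks for a tool; once a tool response is
    # present in context, it produces the final answer.
    if not any(message["role"] == "tool" for message in context):
        return {"role": "assistant", "tool_request": "system_status"}
    return {"role": "assistant", "content": "Status: " + context[-1]["content"]}


def run_loop(user_input: str) -> str:
    # Model context is local to this one request and discarded afterwards.
    context: list[dict] = [{"role": "user", "content": user_input}]
    while True:
        reply = fake_completion(context)  # stands in for GenerativeModel.completion(...)
        context.append(reply)
        request = reply.get("tool_request")
        if request is None:
            return reply["content"]  # final answer terminates the loop
        result = TOOLS[request]()  # stands in for Toolbox.handle(...)
        context.append({"role": "tool", "content": result})  # tool response fed back
```

Because `context` is rebuilt on every call, nothing carries over between requests, which is exactly why persistent conversation history needs a higher-level API.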
3. Preserve Thread And Metadata¶
Agent.call(...) automatically reuses the current AgentContext when present. That allows nested
agent calls to share a logical thread and metadata.
```python
from collections.abc import AsyncIterable
from uuid import uuid4

from draive import Agent, AgentIdentity, AgentMessage, ctx
from draive.agents.types import AgentContext
from draive.multimodal import MultimodalContentPart, TextContent
from draive.utils import ProcessingEvent


async def echo(
    message: AgentMessage,
) -> AsyncIterable[MultimodalContentPart | ProcessingEvent]:
    context = ctx.state(AgentContext)
    yield TextContent.of(
        f"thread={context.thread} source={context.meta.get_str('source')}"
    )


agent: Agent = Agent(
    identity=AgentIdentity.of(name="echo"),
    executing=echo,
)

async with ctx.scope(
    "agents.context",
    AgentContext.of(thread=uuid4(), meta={"source": "outer"}),
):
    stream: AsyncIterable[MultimodalContentPart | ProcessingEvent] = agent.call(
        input="hello",
        meta={"request": "nested"},
    )
    async for chunk in stream:
        print(chunk)
```
In practice:
- `thread=` on `call(...)` overrides the current context thread,
- `meta=` is merged with the current `AgentContext.meta`,
- `respond(...)` is useful when you already have a prepared `AgentMessage`.
This matters when agents call other agents. Nested calls inherit the active thread and metadata by default, which makes it easier to correlate delegation chains in observability and preserve request context without global state.
4. Expose Agents As Tools¶
AgentsGroup lets one agent delegate work to another using the same tool infrastructure used for
regular model tools.
```python
from draive import Agent, AgentsGroup
from draive.multimodal import TextContent
from draive.steps import Step

researcher: Agent = Agent.steps(
    Step.emitting(TextContent.of("Collected facts")),
    name="researcher",
    description="Collects background information",
)

writer: Agent = Agent.generative(
    name="writer",
    description="Writes the final response",
    instructions="Delegate research first, then answer clearly.",
    tools=[AgentsGroup.of(researcher).as_tool()],
)
```
The generated tool takes:
- `agent`: the selected agent name,
- `task`: a plain-text request sent to that agent.
Agent names must be unique inside a group. AgentsGroup.of(...) raises ValueError if duplicate
names are provided.
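The uniqueness check behaves roughly like the following sketch (illustrative only, not `AgentsGroup`'s actual code):

```python
def register_agents(agents: list[tuple[str, str]]) -> dict[str, str]:
    """Map agent name -> description, rejecting duplicate names."""
    registry: dict[str, str] = {}
    for name, description in agents:
        if name in registry:
            # mirrors AgentsGroup.of(...) raising ValueError on duplicates
            raise ValueError(f"duplicate agent name: {name!r}")
        registry[name] = description
    return registry
```

Unique names matter because the generated tool selects the target agent by name, so a duplicate would make the selection ambiguous.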
You can also expose a single agent directly as a tool when you do not need a group registry.
5. Choose Response vs Output Handling¶
Both Agent.as_tool(...) and AgentsGroup.as_tool(...) support two delegation modes through the
handling= argument.
handling="response"¶
Use handling="response" when the delegated agent result should come back as a normal tool
response and be fed into the caller's model loop.
This behaves like any regular handling="response" tool.
Choose this mode when the caller agent should inspect the delegated result and continue its own reasoning loop.
handling="output"¶
Use handling="output" when the delegated agent should stream final output directly to the user.
This behaves like a handling="output" tool. Output chunks from the selected agent are streamed
immediately, and the tool still finishes by returning a ModelToolResponse with
handling="output".
Choose this mode when delegation should feel like a transfer of control rather than a background lookup.
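The contrast between the two modes can be summarized with a toy sketch (a hypothetical helper, not a draive API):

```python
def delegate(
    result: str,
    handling: str,
    visible_output: list[str],
    model_context: list[str],
) -> None:
    """Route a delegated agent's result according to the handling mode."""
    if handling == "response":
        # fed back into the caller's model loop as a tool response
        model_context.append(result)
    elif handling == "output":
        # streamed straight to the user; reads as a transfer of control
        visible_output.append(result)
    else:
        raise ValueError(f"unknown handling mode: {handling!r}")
```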
6. Typical Multi-Agent Pattern¶
```python
from draive import Agent, AgentsGroup

researcher: Agent = Agent.generative(
    name="researcher",
    description="Finds facts and prepares structured findings",
    instructions="Gather only the information needed for the task.",
)

reviewer: Agent = Agent.generative(
    name="reviewer",
    description="Checks output for completeness and correctness",
    instructions="Review the provided answer and suggest corrections.",
)

coordinator: Agent = Agent.generative(
    name="coordinator",
    description="Routes tasks between specialized agents",
    instructions=(
        "Use `agent_request` to delegate work to the specialized agents. "
        "Combine their results into one final answer."
    ),
    tools=[AgentsGroup.of(researcher, reviewer).as_tool()],
)
```
This pattern works well when:
- one model-facing agent orchestrates the workflow,
- specialized agents have narrow responsibilities,
- you want delegation without introducing a separate orchestration framework.
Keep the specialized agents narrow. AgentsGroup is most useful when each delegated agent has a
clear role and the coordinator can select among them by name from the generated tool schema.
7. When To Use Each API¶
- Prefer `Agent.steps(...)` for deterministic pipelines, typed artifacts, and explicit control.
- Try `Agent.generative(...)` for prompt-first, tool-aware model agents.
- Expose one concrete agent via `Agent.as_tool(...)` when a model should call it directly.
- Delegate using `AgentsGroup.as_tool(handling="response")` when the caller should continue reasoning after delegation.
- Delegate using `AgentsGroup.as_tool(handling="output")` when the delegated agent should take over visible output.
Avoid using Agent.generative(...) as a substitute for persistent chat history. By design it loops
only within one request while tools are being resolved.
8. Public Types¶
The public agents API exported from draive includes:
- `Agent`
- `AgentExecuting`
- `AgentIdentity`
- `AgentMessage`
- `ProcessingEvent`
- `AgentsGroup`
Additional agent-specific exceptions are available from draive.agents:
- `AgentException`
- `AgentUnavailable`