Requests Overview
A single Agently request has four moving parts:
- Prompt: what you say to the model. Built from layered slots: role/system, info, instruct, input, output schema. See Prompt Management.
- Output schema: the structure you want back. Authored as nested dicts of (type, "desc", ensure) leaves. See Schema as Prompt.
- Validation pipeline: output() strict parse → ensure_keys → .validate(...) custom handlers → retry. See Output Control.
- Result: text, structured data, metadata, and streaming events. Reusable via get_result(). See Model Result.
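The validation pipeline in the list above (strict parse, required-key check, custom handlers, retry) can be sketched in plain Python. This is a minimal illustration of the idea, not Agently's implementation: `run_with_validation`, `call_model`, and `max_attempts` are hypothetical names, and the model is replaced by a canned list of replies.

```python
import json
from typing import Any, Callable


def run_with_validation(
    call_model: Callable[[], str],
    required_keys: list[str],
    validators: list[Callable[[dict], bool]],
    max_attempts: int = 3,
) -> dict[str, Any]:
    """Hypothetical helper: parse, check keys, run validators, retry on failure."""
    last_error = "no attempts made"
    for attempt in range(1, max_attempts + 1):
        raw = call_model()
        # Step 1: strict parse. The reply must be a JSON object.
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            last_error = f"attempt {attempt}: reply is not valid JSON"
            continue
        if not isinstance(data, dict):
            last_error = f"attempt {attempt}: reply is not a JSON object"
            continue
        # Step 2: every required key must be present.
        missing = [key for key in required_keys if key not in data]
        if missing:
            last_error = f"attempt {attempt}: missing keys {missing}"
            continue
        # Step 3: custom business checks, each returning True or False.
        if not all(check(data) for check in validators):
            last_error = f"attempt {attempt}: custom validation failed"
            continue
        return data
    # Every attempt failed, so surface the most recent reason.
    raise ValueError(last_error)


# Canned replies stand in for a real model: two bad answers, then a good one.
replies = iter([
    "not json",
    '{"title": "A"}',
    '{"title": "A", "bullets": ["x", "y", "z"]}',
])
result = run_with_validation(
    call_model=lambda: next(replies),
    required_keys=["title", "bullets"],
    validators=[lambda d: len(d["bullets"]) == 3],
)
print(result)
```

The first reply fails the parse, the second fails the required-key check, and the third passes all three steps, so the loop returns on its final attempt.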
The minimum shape
```python
from agently import Agently

agent = Agently.create_agent()

result = (
    agent
    .input("Summarize this article in three bullets.")
    .output({
        "title": (str, "Title", True),
        "bullets": [(str, "Bullet point", True)],
    })
    .start()
)
```

This single chain covers all four parts. input() fills the prompt's input slot, output() defines the schema (with ensure flags), and start() runs the request, applies the validation pipeline, retries if needed, and returns the parsed dict.
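To make the ensure flag concrete, the sketch below walks a schema of the same shape and lists which leaves are marked as ensured. It is an illustration only and does not reflect how Agently processes schemas internally: `ensured_paths`, the `[*]` notation for list items, and the extra `source` key are all assumptions made for this example.

```python
def ensured_paths(schema, prefix=""):
    """Return paths of leaves whose third tuple element is True (hypothetical helper)."""
    paths = []
    if isinstance(schema, dict):
        # Descend into each key, building a dotted path.
        for key, value in schema.items():
            child = f"{prefix}.{key}" if prefix else key
            paths.extend(ensured_paths(value, child))
    elif isinstance(schema, list):
        # A list holds the schema of its items; mark them with [*].
        for item in schema:
            paths.extend(ensured_paths(item, f"{prefix}[*]"))
    elif isinstance(schema, tuple):
        # A leaf is (type, "desc") or (type, "desc", ensure).
        if len(schema) >= 3 and schema[2] is True:
            paths.append(prefix)
    return paths


schema = {
    "title": (str, "Title", True),
    "bullets": [(str, "Bullet point", True)],
    "source": (str, "Optional source URL"),  # hypothetical leaf without an ensure flag
}
required = ensured_paths(schema)
print(required)
```

Here `title` and the items of `bullets` are reported, while `source` is skipped because its leaf carries no ensure flag.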
Image input
For VLM requests, use .image(...) when the request is a question plus one or more images. It accepts local image files and remote image URLs:
```python
from agently import Agently

agent = Agently.create_agent()

result = (
    agent
    .image(
        question="Compare these two screenshots and list the visible differences.",
        files=["./before.png", "./after.png"],
    )
    .start()
)
```

Use file="..." or url="..." for a single image, and files=[...] or urls=[...] for multiple images. Local files are converted to data:<mime>;base64,... image URLs before being sent through the existing rich-content prompt path. Supported local image MIME types are PNG, JPEG, WebP, GIF, and BMP.
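The local-file conversion described above follows the standard data URL shape. The sketch below shows one way such a conversion could look using only the standard library. It is not Agently's code: `to_data_url` and `MIME_BY_SUFFIX` are hypothetical names, and detecting the type from the file extension is an assumption made for this example.

```python
import base64
import tempfile
from pathlib import Path

# Assumed extension-to-MIME mapping for the formats listed above.
MIME_BY_SUFFIX = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".gif": "image/gif",
    ".bmp": "image/bmp",
}


def to_data_url(path):
    """Encode a local image file as a data:<mime>;base64,... URL (hypothetical helper)."""
    path = Path(path)
    mime = MIME_BY_SUFFIX.get(path.suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported image type: {path.suffix!r}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Demo with a throwaway file that contains only the 8-byte PNG signature.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "before.png"
    sample.write_bytes(b"\x89PNG\r\n\x1a\n")
    demo_url = to_data_url(sample)
print(demo_url)
```

A real implementation might inspect file contents rather than trust the extension, which this sketch leaves out for brevity.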
.attachment([...]) remains the low-level input for callers that already have provider-style rich content blocks or need exact mixed ordering. Common non-image files such as PDF, Markdown/text, Word, presentations, and spreadsheets are a 4.1.4 target rather than part of the 4.1.3.3 image slice.
When to reach for which page
| You want to … | Read |
|---|---|
| Layer prompts across the agent and one request | Prompt Management |
| Understand the (type, "desc", True) leaf and YAML form | Schema as Prompt |
| Add custom business validation, control retries, fail open or hard | Output Control |
| Reuse one response for text + data + metadata, or stream fields | Model Response |
| Carry chat history and memo across turns | Session Memory |
| Inject background information cleanly | Context Engineering |
Sync vs async
The chain above is sync because it ends in .start(). For services and streaming UI, use .async_start() or pull a reusable result = ....get_result() and consume it with await result.async_get_data(). See Async First.
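One reason services prefer the async path is that several requests can be awaited concurrently. The sketch below shows that pattern with plain asyncio. `fake_request` is a hypothetical stand-in for a request chain awaited via .async_start(), so no model or Agently installation is needed to run it.

```python
import asyncio


async def fake_request(prompt: str) -> dict:
    """Hypothetical stand-in for an awaited request; returns a canned dict."""
    await asyncio.sleep(0.01)  # simulates waiting on the model
    return {"prompt": prompt, "bullets": ["a", "b", "c"]}


async def main() -> list:
    prompts = ["Summarize article A.", "Summarize article B."]
    # gather() runs both coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(fake_request(p) for p in prompts))


results = asyncio.run(main())
print([r["prompt"] for r in results])
```

With a blocking call, the two waits would add up; here they overlap, which is what matters in a service that handles many requests at once.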
Where this fits in the stack
A request is the smallest unit Agently ships. Multiple requests can share a Session (multi-turn). When you need branching, concurrency, or pause/resume across requests, you graduate to TriggerFlow. When you need the model to call out to tools or MCP servers, you wire in Action Runtime.
But every layer above eventually lives or dies on the request layer doing its job. Get this layer right first.