Requests Overview


A single Agently request has four moving parts:

Prompt — what you say to the model. Built from layered slots: role / system, info, instruct, input, output schema. See Prompt Management.
Output schema — the structure you want back. Authored as nested dicts of (type, "desc", ensure) leaves. See Schema as Prompt.
Validation pipeline — output() strict parse → ensure_keys → .validate(...) custom handlers → retry. See Output Control.
Result — text, structured data, metadata, and streaming events. Reusable via get_result(). See Model Result.

The minimum shape

python

from agently import Agently

agent = Agently.create_agent()

result = (
    agent
    .input("Summarize this article in three bullets.")
    .output({
        "title": (str, "Title", True),
        "bullets": [(str, "Bullet point", True)],
    })
    .start()
)

This single chain covers all four parts. input() fills the prompt's input slot, output() defines the schema (with ensure flags), and start() runs the request, applies the validation pipeline, retries if needed, and returns the parsed dict.
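To make the ensure flags concrete, the sketch below walks the same schema and lists the leaves marked `True`. It is an illustration only: `collect_ensured_paths` is a hypothetical helper, and the `bullets[*]` path notation is an assumption made for this example rather than something the library is known to emit.

```python
def collect_ensured_paths(schema, prefix=""):
    """Return the paths of all leaves whose third tuple element is truthy."""
    paths = []
    if isinstance(schema, tuple):
        # Leaf shape: (type, "desc", ensure); the ensure flag may be absent.
        if len(schema) >= 3 and schema[2]:
            paths.append(prefix)
    elif isinstance(schema, dict):
        for key, value in schema.items():
            child = f"{prefix}.{key}" if prefix else key
            paths.extend(collect_ensured_paths(value, child))
    elif isinstance(schema, list) and schema:
        # A one-element list describes the shape of every item in the list.
        paths.extend(collect_ensured_paths(schema[0], f"{prefix}[*]"))
    return paths


schema = {
    "title": (str, "Title", True),
    "bullets": [(str, "Bullet point", True)],
}

ensured = collect_ensured_paths(schema)
print(ensured)  # ['title', 'bullets[*]']
```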

Image input

For VLM requests, use .image(...) when the request is a question plus one or more images. It accepts local image files and remote image URLs:

python

from agently import Agently

agent = Agently.create_agent()

result = (
    agent
    .image(
        question="Compare these two screenshots and list the visible differences.",
        files=["./before.png", "./after.png"],
    )
    .start()
)

Use file="..." or url="..." for a single image, and files=[...] or urls=[...] for multiple images. Local files are converted to data:<mime>;base64,... image URLs before being sent through the existing rich-content prompt path. Supported local image MIME types are PNG, JPEG, WebP, GIF, and BMP.
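The local-file conversion described above can be approximated with the standard library. This is a minimal sketch under stated assumptions, not the code Agently runs: `to_data_url` and `IMAGE_MIME_TYPES` are hypothetical names, and the extension-to-MIME mapping is an assumed way of detecting the five supported types.

```python
import base64
import tempfile
from pathlib import Path

# Assumed mapping from file extension to the supported image MIME types.
IMAGE_MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".gif": "image/gif",
    ".bmp": "image/bmp",
}


def to_data_url(path: str) -> str:
    """Encode a local image file as a data:<mime>;base64,... URL."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()
    if suffix not in IMAGE_MIME_TYPES:
        raise ValueError(f"unsupported image type: {suffix or '(none)'}")
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return f"data:{IMAGE_MIME_TYPES[suffix]};base64,{encoded}"


# Demo with a throwaway file; the bytes are a placeholder, not a real image.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "before.png"
    demo_path.write_bytes(b"demo")
    demo_url = to_data_url(str(demo_path))

print(demo_url)  # data:image/png;base64,ZGVtbw==
```

Base64 grows the payload by roughly one third, so large local images make for noticeably larger requests than remote URLs.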

.attachment([...]) remains the low-level input for callers that already have provider-style rich content blocks or need exact mixed ordering. Support for common non-image files such as PDF, Markdown/text, Word documents, presentations, and spreadsheets is targeted for 4.1.4; it is not part of the 4.1.3.3 image slice.

When to reach for which page

You want to …	Read
Layer prompts across the agent and one request	Prompt Management
Understand the `(type, "desc", True)` leaf and YAML form	Schema as Prompt
Add custom business validation, control retries, fail open or hard	Output Control
Reuse one response for text + data + metadata, or stream fields	Model Response
Carry chat history and memo across turns	Session Memory
Inject background information cleanly	Context Engineering

Sync vs async

The chain above is synchronous because it ends in .start(). For services and streaming UIs, end the chain with .async_start() instead, or call .get_result() to obtain a reusable result object and consume it with await result.async_get_data(). See Async First.
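The practical benefit of the async path in a service is that several requests can be in flight at once. The sketch below shows only that pattern with the standard library: `fake_request` is a hypothetical stand-in for awaiting a chain that ends in .async_start(), and no model is called.

```python
import asyncio


async def fake_request(prompt: str, delay: float = 0.01) -> dict:
    # Hypothetical stand-in for awaiting a request chain ending in .async_start().
    await asyncio.sleep(delay)
    return {"prompt": prompt, "title": prompt.upper()}


async def handle_many(prompts: list[str]) -> list[dict]:
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(fake_request(p) for p in prompts))


results = asyncio.run(handle_many(["first", "second"]))
print([r["title"] for r in results])  # ['FIRST', 'SECOND']
```

With a blocking .start() the two requests would run back to back; here the total wait is close to the slowest single request.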

Where this fits in the stack

A request is the smallest unit Agently ships. Multiple requests can share a Session (multi-turn). When you need branching, concurrency, or pause/resume across requests, you graduate to TriggerFlow. When you need the model to call out to tools or MCP servers, you wire in Action Runtime.

But every layer above eventually lives or dies on the request layer doing its job. Get this layer right first.
