Requests Overview

A single Agently request has four moving parts:

  1. Prompt — what you say to the model. Built from layered slots: role / system, info, instruct, input, output schema. See Prompt Management.
  2. Output schema — the structure you want back. Authored as nested dicts of (type, "desc", ensure) leaves. See Schema as Prompt.
  3. Validation pipeline — output() strict parse → ensure_keys.validate(...) custom handlers → retry. See Output Control.
  4. Result — text, structured data, metadata, and streaming events. Reusable via get_result(). See Model Result.

The minimum shape

```python
from agently import Agently

agent = Agently.create_agent()

result = (
    agent
    .input("Summarize this article in three bullets.")
    .output({
        "title": (str, "Title", True),
        "bullets": [(str, "Bullet point", True)],
    })
    .start()
)
```

This single chain covers all four parts. input() fills the prompt's input slot, output() defines the schema (with ensure flags), and start() runs the request, applies the validation pipeline, retries if needed, and returns the parsed dict.
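To make the parse, check, and retry order concrete, here is a minimal standard-library sketch of such a loop. It is not Agently's implementation: `check`, `run_request`, `fake_model`, and `three_bullets` are hypothetical names invented for illustration, and the model is replaced by a stub that returns canned JSON strings.

```python
import json
from typing import Any, Callable, Optional

# Schema in the leaf style used above: (type, "desc", ensure).
SCHEMA = {
    "title": (str, "Title", True),
    "bullets": [(str, "Bullet point", True)],
}


def is_leaf(node: Any) -> bool:
    """A leaf is a tuple whose first element is a Python type."""
    return isinstance(node, tuple) and len(node) >= 1 and isinstance(node[0], type)


def check(schema: Any, data: Any, path: str = "") -> list[str]:
    """Compare parsed data with the schema and return a list of problems."""
    if is_leaf(schema):
        expected = schema[0]
        ensure = len(schema) > 2 and bool(schema[2])
        if data is None:
            return [f"{path}: missing"] if ensure else []
        if not isinstance(data, expected):
            return [f"{path}: expected {expected.__name__}"]
        return []
    if isinstance(schema, dict):
        if not isinstance(data, dict):
            return [f"{path or '<root>'}: expected object"]
        problems: list[str] = []
        for key, sub in schema.items():
            child = f"{path}.{key}" if path else key
            problems += check(sub, data.get(key), child)
        return problems
    if isinstance(schema, list):
        if not isinstance(data, list):
            return [f"{path}: expected list"]
        problems = []
        for index, item in enumerate(data):
            problems += check(schema[0], item, f"{path}[{index}]")
        return problems
    return []


def run_request(
    call_model: Callable[[int], str],
    schema: Any,
    validators: tuple = (),
    max_retries: int = 2,
) -> Any:
    """Hypothetical loop: strict parse, schema check, custom handlers, retry."""
    last_problems: list[str] = []
    for attempt in range(max_retries + 1):
        raw = call_model(attempt)
        try:
            data = json.loads(raw)  # step 1: strict parse
        except json.JSONDecodeError as exc:
            last_problems = [f"parse error: {exc.msg}"]
            continue
        problems = check(schema, data)  # step 2: ensure flags and types
        if not problems:
            # step 3: custom handlers run only on structurally valid data
            for validate in validators:
                message = validate(data)
                if message:
                    problems.append(message)
        if not problems:
            return data
        last_problems = problems  # step 4: fall through and retry
    raise ValueError(f"validation failed: {last_problems}")


def fake_model(attempt: int) -> str:
    """Stub model: the first reply omits 'bullets', the second is complete."""
    replies = [
        '{"title": "Draft"}',
        '{"title": "Final", "bullets": ["a", "b", "c"]}',
    ]
    return replies[min(attempt, len(replies) - 1)]


def three_bullets(data: dict) -> Optional[str]:
    """Custom business rule: the summary must have exactly three bullets."""
    return None if len(data["bullets"]) == 3 else "expected exactly three bullets"


result = run_request(fake_model, SCHEMA, validators=(three_bullets,))
```

In this sketch the first stub reply fails the schema check because `bullets` is absent, so the loop retries and returns the second reply. Custom handlers run only after the structural check passes, which lets them assume ensured keys exist.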

Image input

For VLM requests, use .image(...) when the request is a question plus one or more images. It accepts local image files and remote image URLs:

```python
from agently import Agently

agent = Agently.create_agent()

result = (
    agent
    .image(
        question="Compare these two screenshots and list the visible differences.",
        files=["./before.png", "./after.png"],
    )
    .start()
)
```

Use file="..." or url="..." for a single image, and files=[...] or urls=[...] for multiple images. Local files are converted to data:<mime>;base64,... image URLs before being sent through the existing rich-content prompt path. Supported local image MIME types are PNG, JPEG, WebP, GIF, and BMP.
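The local-file conversion described above can be sketched with the standard library alone. This is an illustration of the `data:<mime>;base64,...` form, not Agently's internal code: `to_data_url` and `IMAGE_MIME_BY_SUFFIX` are hypothetical names, and the suffix-to-MIME mapping is an assumption based on the five formats listed.

```python
import base64
from pathlib import Path

# Assumed mapping from file suffix to the MIME types listed as supported.
IMAGE_MIME_BY_SUFFIX = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".gif": "image/gif",
    ".bmp": "image/bmp",
}


def to_data_url(path: str) -> str:
    """Read a local image and return it as a base64 data URL."""
    file_path = Path(path)
    mime = IMAGE_MIME_BY_SUFFIX.get(file_path.suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported image type: {file_path.suffix or path}")
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 output is roughly a third larger than the source file, so large screenshots increase the request payload accordingly; a remote URL avoids that overhead when the image is already hosted.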

.attachment([...]) remains the low-level input for callers that already have provider-style rich content blocks or need exact mixed ordering. Common non-image files such as PDF, Markdown/text, Word, presentations, and spreadsheets are a 4.1.4 target rather than part of the 4.1.3.3 image slice.

When to reach for which page

| You want to … | Read |
| --- | --- |
| Layer prompts across the agent and one request | Prompt Management |
| Understand the (type, "desc", True) leaf and YAML form | Schema as Prompt |
| Add custom business validation, control retries, fail open or hard | Output Control |
| Reuse one response for text + data + metadata, or stream fields | Model Response |
| Carry chat history and memo across turns | Session Memory |
| Inject background information cleanly | Context Engineering |

Sync vs async

The chain above is synchronous because it ends in .start(). For services and streaming UIs, end the chain with .async_start() instead, or call .get_result() to obtain a reusable result object and consume it with await result.async_get_data(). See Async First.
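The idea of a reusable result, one request whose outcome several consumers can await, can be illustrated with plain asyncio. The sketch below is a stand-in, not Agently's result class: `ReusableResult` and `fake_request` are hypothetical, and only the method name `async_get_data` is borrowed from the text above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class ReusableResult:
    """Hypothetical stand-in: runs the request once and serves every reader."""

    def __init__(self, request_factory: Callable[[], Awaitable[Any]]) -> None:
        self._request_factory = request_factory
        self._task: Optional[asyncio.Task] = None
        self.calls = 0  # how many times the underlying request actually ran

    async def _run(self) -> Any:
        self.calls += 1
        return await self._request_factory()

    async def async_get_data(self) -> Any:
        # The first caller starts the request; later callers await the same task.
        if self._task is None:
            self._task = asyncio.create_task(self._run())
        return await self._task


async def fake_request() -> dict:
    """Stub for a model call; sleeps briefly instead of using the network."""
    await asyncio.sleep(0.01)
    return {"title": "Final", "bullets": ["a", "b", "c"]}


async def main() -> tuple:
    result = ReusableResult(fake_request)
    # Two consumers read the same result concurrently.
    first, second = await asyncio.gather(
        result.async_get_data(),
        result.async_get_data(),
    )
    return first, second, result.calls


first, second, calls = asyncio.run(main())
```

Both consumers receive the same data while the stubbed request runs only once, which is the property that makes a result object cheaper to share than issuing the request again.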

Where this fits in the stack

A request is the smallest unit Agently ships. Multiple requests can share a Session (multi-turn). When you need branching, concurrency, or pause/resume across requests, you graduate to TriggerFlow. When you need the model to call out to tools or MCP servers, you wire in Action Runtime.

But every layer above eventually lives or dies on the request layer doing its job. Get this layer right first.