
Playwright Browser Tool

Applies to: 4.0.8.1+

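Playwright is Agently's built-in browser tool, suited to pages where plain HTTP fetching is not enough:

  • The front-end JS must run before the content can be read
  • You need the page title, the final URL after redirects, or the status code
  • You need an optional screenshot or link extraction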

1. Initialization parameters

python
from agently.builtins.tools import Playwright

playwright = Playwright(
    headless=True,
    timeout=30000,
    proxy=None,
    user_agent=None,
    response_mode="markdown",  # "markdown" | "text"
    max_content_length=8000,
    include_links=False,
    max_links=120,
    screenshot_path=None,
)

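Core parameters:

  • response_mode: markdown converts <a> elements into markdown links; text returns plain text
  • include_links: whether to additionally return links
  • screenshot_path: when set, a full-page screenshot is saved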
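To make the difference between the two response_mode values concrete, here is a minimal sketch built on the standard library's html.parser. It is not Agently's implementation; the LinkAwareExtractor class and render helper are hypothetical names used only to illustrate what "convert <a> into markdown links" versus "plain text" means for a small HTML fragment.

```python
from html.parser import HTMLParser


class LinkAwareExtractor(HTMLParser):
    """Hypothetical helper: collects text and, in markdown mode,
    rewrites <a href="..."> elements as [text](href)."""

    def __init__(self, mode="markdown"):
        super().__init__()
        self.mode = mode
        self.parts = []
        self._href = None          # href of the <a> currently open, if any
        self._anchor_text = []     # text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._anchor_text = []

    def handle_data(self, data):
        if self._href is not None:
            self._anchor_text.append(data)
        else:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._anchor_text)
            if self.mode == "markdown":
                self.parts.append(f"[{text}]({self._href})")
            else:
                self.parts.append(text)
            self._href = None


def render(html, mode):
    parser = LinkAwareExtractor(mode)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = '<p>See the <a href="https://agently.tech">docs</a> for details.</p>'
markdown_view = render(sample, "markdown")
text_view = render(sample, "text")
print(markdown_view)
print(text_view)
```

In markdown mode the link target survives in the output, which helps when a model needs to follow links; in text mode only the anchor text remains.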

2. Direct call

python
import asyncio
from agently.builtins.tools import Playwright

playwright = Playwright(headless=True, response_mode="markdown")

async def main():
    result = await playwright.open("https://agently.tech")
    print(result)

asyncio.run(main())

3. Using it as an Agent tool

python
from agently import Agently
from agently.builtins.tools import Playwright

agent = Agently.create_agent()
playwright = Playwright(headless=True, response_mode="markdown")

agent.use_tools([playwright.open])
result = agent.input("Browse the Agently website first, then summarize what TriggerFlow does").start()
print(result)

When registered through tool_info_list, the tool name is playwright_open.

4. Return structure (success)

Typical fields:

  • ok
  • requested_url
  • normalized_url
  • url (final URL)
  • status
  • title
  • content_format
  • content
  • screenshot_path
  • links (only when include_links=True)

On failure, the result contains ok=False and error.
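A caller will usually branch on ok before touching the other fields. The sketch below shows one way to do that, assuming the result is a dict keyed by the field names listed above and that links, when present, is a sized collection. The summarize_result helper and both sample dicts are hypothetical and only stand in for real tool output.

```python
def summarize_result(result):
    """Hypothetical helper: one-line summary of a Playwright tool result."""
    if not result.get("ok"):
        # Failure results carry ok=False plus an error field
        return f"failed: {result.get('error', 'unknown error')}"
    summary = f"{result.get('status')} {result.get('url')} | {result.get('title')}"
    links = result.get("links")  # only present when include_links=True
    if links is not None:
        summary += f" | {len(links)} links"
    return summary


# Illustrative values only, not captured from a real run
success = {
    "ok": True,
    "requested_url": "agently.tech",
    "normalized_url": "https://agently.tech",
    "url": "https://agently.tech/",
    "status": 200,
    "title": "Example title",
    "content_format": "markdown",
    "content": "...",
    "screenshot_path": None,
}
failure = {"ok": False, "error": "navigation timeout"}

print(summarize_result(success))
print(summarize_result(failure))
```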

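5. Usage recommendations

  • Install the browser driver before first use (playwright install)
  • When scraping stability is the priority, set timeout and proxy explicitly
  • If you later need to read specific elements precisely, do not rely on the content text alone; combine it with a dedicated scraping workflow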
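Beyond timeout and proxy, a caller that cares about stability may also want to retry a failed open. The following is a minimal sketch of that idea, not a feature of the tool: open_with_retry is a hypothetical wrapper that accepts any async callable returning a dict with an ok field, and fake_open is a stand-in for playwright.open so the example runs without a browser.

```python
import asyncio


async def open_with_retry(open_page, url, attempts=3, delay=0.0):
    """Hypothetical wrapper: call open_page(url) until ok=True or attempts run out."""
    result = {"ok": False, "error": "no attempts made"}
    for attempt in range(attempts):
        result = await open_page(url)
        if result.get("ok"):
            return result
        if attempt < attempts - 1:
            await asyncio.sleep(delay)  # pause before the next try
    return result


calls = []


async def fake_open(url):
    """Stand-in for playwright.open: fails once, then succeeds."""
    calls.append(url)
    if len(calls) < 2:
        return {"ok": False, "error": "timeout"}
    return {"ok": True, "url": url, "status": 200}


final = asyncio.run(open_with_retry(fake_open, "https://agently.tech"))
print(final, len(calls))
```

With the real tool, playwright.open would be passed in place of fake_open; this relies on failures being reported as ok=False rather than raised as exceptions, as described in the return structure section.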