Browser use
An agent capability where the LLM drives a real web browser to read, click, and fill forms on live websites.
Browser use is the capability that lets agents work on the public web. Instead of calling APIs, the agent renders a page, screenshots it (or reads the DOM), decides what to click or type, and observes the result.
It is the slowest and most fragile capability — a single layout change breaks workflows — but it is also the only way to reach the long tail of services without APIs.
In 2026, browser use is most common in research agents (Manus, Perplexity Labs), in autonomous coding agents that need to read documentation, and in sales agents doing account research.
Frequently asked
How reliable is browser use in production?+
Moderate. Well-known sites (Google, GitHub, LinkedIn) work reliably; niche sites with frequent layout changes break weekly. Plan for ~5% failure rate and explicit fallback.