🧰 Capabilities
Also: computer use, Claude computer use, computer use agent

Computer use

An agent capability in which an LLM operates a computer directly: it reads the screen through screenshots and drives the mouse and keyboard to click, type, and navigate arbitrary desktop and browser apps.

Computer use has been one of Anthropic's flagship capabilities since its public beta launch in October 2024: the model takes screenshots, decides where to click and what to type, and executes those actions on a real (usually sandboxed) computer. The agent works from the visual UI the way a human would, so unlike traditional browser automation it needs no DOM scraping.
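
The screenshot, decide, act cycle can be sketched as a short loop. This is a minimal illustration under stated assumptions, not Anthropic's API: `Action`, `Desktop`, `Policy`, `run_agent`, `FakeDesktop`, and `scripted_policy` are all hypothetical names, and the policy function stands in for the model call.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Action:
    """One UI action. Field names are illustrative, not a vendor schema."""
    kind: str            # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class Desktop(Protocol):
    """Hypothetical wrapper around a sandboxed machine."""
    def screenshot(self) -> bytes: ...
    def perform(self, action: Action) -> None: ...


# Stand-in for the model call: (goal, screenshot, history) -> next action.
Policy = Callable[[str, bytes, list[Action]], Action]


def run_agent(goal: str, desktop: Desktop, policy: Policy,
              max_steps: int = 20) -> list[Action]:
    """Observe, decide, act; stop when the policy says "done" or steps run out."""
    history: list[Action] = []
    for _ in range(max_steps):
        image = desktop.screenshot()           # observe the current screen
        action = policy(goal, image, history)  # decide from pixels, not the DOM
        history.append(action)
        if action.kind == "done":
            break
        desktop.perform(action)                # act via mouse or keyboard
    return history


class FakeDesktop:
    """In-memory stub so the sketch runs without a real screen."""
    def __init__(self) -> None:
        self.typed = ""
        self.clicks: list[tuple[int, int]] = []

    def screenshot(self) -> bytes:
        return f"clicks={len(self.clicks)};typed={self.typed}".encode()

    def perform(self, action: Action) -> None:
        if action.kind == "click":
            self.clicks.append((action.x, action.y))
        elif action.kind == "type":
            self.typed += action.text


def scripted_policy(goal: str, image: bytes, history: list[Action]) -> Action:
    """Toy policy: click a field, type the goal, then stop."""
    script = [Action("click", x=100, y=200), Action("type", text=goal), Action("done")]
    return script[min(len(history), len(script) - 1)]


desk = FakeDesktop()
hist = run_agent("hello", desk, scripted_policy)
print([a.kind for a in hist], desk.clicks, desk.typed)
```

In a real deployment the policy would send the screenshot to the model and parse its reply into an action; the `max_steps` cap is one simple guard against an agent that never decides it is finished.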

The capability opens up a very large workflow surface: any app with a UI becomes scriptable by an LLM. That means RPA without brittle selectors, cross-app workflows that tie together five SaaS tools no one bothered to integrate through APIs, and test flows that exercise the real product rather than synthetic events.

As of 2026 the production stack is still maturing. Claude's computer-use mode is the most capable; OpenAI's Operator covers similar ground in browser-only contexts; and tooling such as Anthropic's open-source computer-use reference implementation and OpenAI's Agents SDK makes it deployable. Reliability is the main gap: per-action error rates of 5–15% mean retry logic is required.
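
A quick calculation shows why a 5–15% per-action error rate matters so much for multi-step workflows. The sketch below assumes each action fails independently, which is a simplification, and `workflow_success_rate` is a hypothetical helper name. Under that assumption, a 20-step flow completes without any error only about 36% of the time at a 5% error rate, and about 4% of the time at 15%.

```python
def workflow_success_rate(per_action_error: float, steps: int) -> float:
    """Chance that all `steps` actions succeed, assuming independent failures."""
    if not 0.0 <= per_action_error <= 1.0:
        raise ValueError("per_action_error must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - per_action_error) ** steps


for p in (0.05, 0.15):
    rate = workflow_success_rate(p, 20)
    print(f"{p:.0%} per-action error, 20 steps: {rate:.1%} error-free runs")
```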

Frequently asked

What is the difference between computer use and browser use?

Browser use is scoped to a web browser, whether the agent reads the DOM or works visually from screenshots. Computer use covers the full desktop: any app the operating system runs, the browser included. Computer use is the superset.

Is computer use reliable enough for production?

For high-stakes flows: not without heavy retry logic and human review. For internal automation and prototype work: yes, with the caveat that 5–15% of actions will need re-attempting.
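
One way to combine retry logic with human review is to verify each action after performing it and escalate when verification keeps failing. The following is a minimal sketch, assuming the caller can supply a `verify` check (for example, a comparison against a fresh screenshot); `act_with_retry`, `NeedsHumanReview`, and `flaky_click` are hypothetical names.

```python
from typing import Callable


class NeedsHumanReview(Exception):
    """Raised when an action is still unverified after all attempts."""


def act_with_retry(perform: Callable[[], None],
                   verify: Callable[[], bool],
                   max_attempts: int = 3) -> int:
    """Perform an action, check its effect, and retry on failure.

    Returns the number of attempts used; raises NeedsHumanReview otherwise.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        perform()
        if verify():          # e.g. did the expected dialog appear?
            return attempt
    raise NeedsHumanReview(f"action unverified after {max_attempts} attempts")


# Demo: a click that misses its target the first time.
state = {"calls": 0, "saved": False}


def flaky_click() -> None:
    state["calls"] += 1
    state["saved"] = state["calls"] >= 2


attempts = act_with_retry(flaky_click, lambda: state["saved"])
print(f"succeeded after {attempts} attempts")
```

Blind retries are only safe for actions that can be repeated without side effects. For steps such as submitting a payment, it is safer to verify first and hand off to a person rather than click again.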
