ScreenAgent

Category: AI Agent Development Frameworks

Overview

Open‑source VLM agent that controls computer GUIs by planning and executing mouse and keyboard actions.

Best For Professions:

software developers, AI researchers, automation engineers, HCI researchers

ScreenAgent is an open‑source Vision Language Model (VLM) agent that interacts with real computer screens by observing screenshots and issuing mouse and keyboard actions, following a planning‑execution‑reflection loop. It supports multi‑step GUI tasks and dataset collection, and its positioning accuracy is described as comparable to GPT‑4V.
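The planning‑execution‑reflection loop described above can be sketched roughly as follows. This is a minimal illustration, not ScreenAgent's actual code: the names `run_task`, `Verdict`, and `Action`, and the `observe`/`plan`/`act`/`reflect` callables, are all hypothetical placeholders for a screenshot grabber, a VLM planner, an action executor, and a VLM-based checker.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    """Hypothetical outcomes of the reflection phase."""
    DONE = "done"      # subtask succeeded, move to the next one
    RETRY = "retry"    # try the same subtask again
    REPLAN = "replan"  # discard the remaining plan and plan again


@dataclass
class Action:
    """Hypothetical mouse/keyboard action record."""
    kind: str          # e.g. "click", "type", "key"
    x: int = 0
    y: int = 0
    text: str = ""


def run_task(task, observe, plan, act, reflect, max_steps=20):
    """Drive a plan -> act -> reflect loop.

    observe() returns the current screen state (e.g. a screenshot),
    plan(task, obs) returns a list of subtasks,
    act(subtask, obs) performs actions and returns them as a list,
    reflect(subtask, before, after) returns a Verdict.
    Returns (finished, history_of_actions).
    """
    history = []
    subtasks = list(plan(task, observe()))
    steps = 0
    while subtasks and steps < max_steps:
        current = subtasks[0]
        before = observe()
        history.extend(act(current, before))
        after = observe()
        verdict = reflect(current, before, after)
        if verdict is Verdict.DONE:
            subtasks.pop(0)
        elif verdict is Verdict.REPLAN:
            subtasks = list(plan(task, after))
        # Verdict.RETRY leaves the subtask at the head of the queue.
        steps += 1
    return (not subtasks), history
```

The `max_steps` cap is a design choice of this sketch: it keeps an agent that never reaches `DONE` from looping forever.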

Autonomy level

81%

Reasoning: ScreenAgent demonstrates high autonomy through its three-phase operational framework: planning, acting, and reflecting. It interacts with computer environments continuously without human intervention, autonomously assessing execution status and adjusting its actions in real time. The reflection phase allows self-evaluation of action outcomes,...

Some of the use cases of ScreenAgent:

Automating desktop tasks with VLM-based GUI control.
Building VLM agents that plan, act and reflect over screen state.
Collecting and leveraging GUI interaction datasets.
Researching multi-step visual task execution.
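
For the dataset-collection use case, one plausible way to log interaction steps is one JSON object per line. The sketch below is illustrative only: the `StepRecord` fields and the helper names are assumptions, not ScreenAgent's actual dataset schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StepRecord:
    """Hypothetical record of one GUI interaction step."""
    task: str        # natural-language task description
    screenshot: str  # path to the screenshot taken before the action
    action: str      # e.g. "click", "type", "key"
    x: int = 0
    y: int = 0
    text: str = ""


def to_jsonl(records):
    """Serialize records as JSON Lines (one object per line)."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


def from_jsonl(blob):
    """Parse JSON Lines back into StepRecord objects, skipping blank lines."""
    return [StepRecord(**json.loads(line))
            for line in blob.splitlines() if line.strip()]
```

A line-oriented format like this lets a recorder append steps as they happen and lets a training script stream them back without loading the whole file.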

Pricing model:

free

Code access:

open-source

Popularity level: 44%

Industries:

Software Development, Human‑Computer Interaction, Research, Automation

Tags:

vision-language model, GUI automation, mouse & keyboard control, planning-execution-reflection, open-source
