How Auten Learns: Screen Graphs and Plan Replay

Auten gets faster and cheaper the more you use it. A deep dive into the screen-graph and plan-replay system that turns repeat tasks into instant, free replays — and keeps up as apps change.

Auten Team

May 31, 2026 · 7 min read


Most automation costs the same on the thousandth run as on the first. Auten gets cheaper and faster with use, because it remembers how it solved things. Two mechanisms make that possible: the screen graph and plan replay. This deep dive explains both, why they matter, and how the system stays correct as apps change.

The screen graph

Each screen is reduced to a fingerprint — a stable hash that deliberately ignores noise like clocks, badges, and scroll position, so the same logical screen always maps to the same fingerprint. Every action then records an edge: from this fingerprint, this action led to that fingerprint. Over time, each device builds a map of how its apps actually connect — a graph of screens and the actions that move between them.
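The edge-recording idea above can be sketched as a small directed graph. This is only an illustration, not Auten's actual implementation: the `ScreenGraph` class, its method names, and the fingerprint and action strings are all hypothetical.

```python
from collections import defaultdict


class ScreenGraph:
    """Hypothetical per-device map: fingerprint --action--> fingerprint."""

    def __init__(self):
        # edges[src][action] maps each observed destination to a visit count.
        self.edges = defaultdict(lambda: defaultdict(dict))

    def record(self, src, action, dst):
        """Record that `action` taken on screen `src` led to screen `dst`."""
        counts = self.edges[src][action]
        counts[dst] = counts.get(dst, 0) + 1

    def neighbors(self, src):
        """Yield (action, most frequently observed destination) for `src`."""
        for action, counts in self.edges.get(src, {}).items():
            yield action, max(counts, key=counts.get)


graph = ScreenGraph()
graph.record("fp_home", "tap:Mail", "fp_inbox")
graph.record("fp_inbox", "tap:Compose", "fp_compose")
graph.record("fp_home", "tap:Mail", "fp_inbox")  # same edge seen again

for action, dst in graph.neighbors("fp_home"):
    print(action, "->", dst)
```

Keeping a count per destination is one possible design choice: it lets the graph prefer the transition seen most often when the same action has occasionally led somewhere else.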

Why fingerprints ignore noise

If the fingerprint changed every time a clock ticked or a badge count updated, no two visits to the same screen would match, and nothing could be reused. By canonicalizing away the volatile bits and keeping the structural ones, Auten recognizes "this is the inbox screen" whether you have 3 unread or 30 — which is what makes the graph useful.
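A minimal sketch of that canonicalization, assuming a screen arrives as a nested dictionary of UI nodes. The field names (`badge_count`, `clock`, `scroll_y`), the choice of SHA-256, and the 16-character truncation are assumptions made for illustration; the article does not specify how Auten computes its fingerprints.

```python
import hashlib
import json

# Assumption: fields treated as volatile noise. The real list is not documented here.
VOLATILE_KEYS = {"badge_count", "clock", "scroll_y"}


def canonicalize(node):
    """Keep structural fields, drop volatile ones, and recurse into children."""
    kept = {
        key: value
        for key, value in node.items()
        if key not in VOLATILE_KEYS and key != "children"
    }
    kept["children"] = [canonicalize(child) for child in node.get("children", [])]
    return kept


def fingerprint(screen):
    """Hash the canonical form; sorted keys make the serialization stable."""
    canonical = json.dumps(canonicalize(screen), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


inbox_3_unread = {
    "type": "screen", "id": "inbox", "scroll_y": 0, "clock": "09:41",
    "children": [
        {"type": "tab", "id": "mail", "badge_count": 3},
        {"type": "list", "id": "messages"},
    ],
}
inbox_30_unread = {
    "type": "screen", "id": "inbox", "scroll_y": 420, "clock": "17:05",
    "children": [
        {"type": "tab", "id": "mail", "badge_count": 30},
        {"type": "list", "id": "messages"},
    ],
}
settings = {"type": "screen", "id": "settings", "children": []}

print(fingerprint(inbox_3_unread) == fingerprint(inbox_30_unread))  # same logical screen
print(fingerprint(inbox_3_unread) == fingerprint(settings))         # different screen
```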

Plan replay

When a task succeeds, Auten extracts a clean plan — the minimal sequence of actions that achieved the goal, with the dead ends, re-observations, and false starts stripped out. A similar task later replays that plan directly and deterministically, with no model call in the loop. This is where the dramatic speed-up comes from.
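One plausible way to strip dead ends, shown here as a sketch rather than Auten's actual method, is loop removal: whenever the trace returns to a screen it has already passed through, everything done since that earlier visit is discarded. The `extract_plan` function and the trace format of `(from_fingerprint, action, to_fingerprint)` tuples are hypothetical, and the sketch assumes detours have no side effects beyond the screen they land on.

```python
def extract_plan(trace):
    """Reduce a successful trace to a loop-free sequence of steps.

    `trace` is a contiguous list of (src, action, dst) tuples, where each
    step's src equals the previous step's dst.
    """
    if not trace:
        return []
    plan = []
    visited = [trace[0][0]]  # visited[i] is the screen reached after i kept steps
    for src, action, dst in trace:
        if dst == src:
            continue  # re-observation or no-op: the screen did not change
        if dst in visited:
            # Back on a known screen, so the steps since then were a detour.
            keep = visited.index(dst)
            del plan[keep:]
            del visited[keep + 1:]
        else:
            plan.append((src, action, dst))
            visited.append(dst)
    return plan


trace = [
    ("home", "observe", "home"),           # re-observation
    ("home", "tap:Settings", "settings"),  # false start
    ("settings", "back", "home"),          # dead end undone
    ("home", "tap:Mail", "inbox"),
    ("inbox", "tap:Compose", "compose"),
]

for step in extract_plan(trace):
    print(step)
```

In this example the five recorded steps reduce to two: open Mail from the home screen, then tap Compose.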

What you get from it

Speed — repeat tasks run roughly 30x faster than the first time.
Cost — cached replays are free; you only pay when the AI must think.
Reliability — proven paths accumulate, so success rates climb with use.
Consistency — deterministic replays do the same thing every time.

Fail-safe by design

If an app has been updated and a cached plan no longer fits the screen it expects, Auten does not blindly push forward. It detects the mismatch, falls back to the full AI agent, re-solves the task, and updates the plan, so the learning keeps pace with app changes instead of breaking on them.
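The check-before-each-step behavior can be sketched as follows. Everything here is a hypothetical stand-in: `run_task`, the `observe`, `perform`, and `solve_with_agent` callables, and the `FakeDevice` used to simulate an app update are illustrative names, not part of Auten's API.

```python
def run_task(task, cache, observe, perform, solve_with_agent):
    """Replay a cached plan; on any screen mismatch, re-solve and refresh the cache."""
    plan = cache.get(task)
    if plan is not None:
        for expected, action, _expected_next in plan:
            if observe() != expected:
                break  # the screen is not what the plan expects: stop replaying
            perform(action)
        else:
            # Every step ran; confirm the replay ended on the expected screen.
            if not plan or observe() == plan[-1][2]:
                return "replayed"
    cache[task] = solve_with_agent(task)  # fall back to the full agent
    return "re-solved"


class FakeDevice:
    """Toy device: a lookup table of (screen, action) -> next screen."""

    def __init__(self, transitions, start):
        self.transitions = transitions
        self.screen = start

    def observe(self):
        return self.screen

    def perform(self, action):
        self.screen = self.transitions[(self.screen, action)]


cache = {"compose email": [("home", "tap:Mail", "inbox"),
                           ("inbox", "tap:Compose", "compose")]}
agent_calls = []


def fake_agent(task):
    """Pretend the agent re-solved the task against the updated app."""
    agent_calls.append(task)
    return [("home", "tap:Mail", "inbox_v2"), ("inbox_v2", "tap:New", "compose")]


# Before the update, the cached plan still matches every screen.
old_app = FakeDevice({("home", "tap:Mail"): "inbox",
                      ("inbox", "tap:Compose"): "compose"}, start="home")
first = run_task("compose email", cache, old_app.observe, old_app.perform, fake_agent)

# After the update, the inbox screen has changed, so the second step mismatches.
new_app = FakeDevice({("home", "tap:Mail"): "inbox_v2",
                      ("inbox_v2", "tap:New"): "compose"}, start="home")
second = run_task("compose email", cache, new_app.observe, new_app.perform, fake_agent)

print(first, second)
```

In this toy run the first call replays without touching the agent, while the second detects the changed inbox screen, calls the agent once, and stores the new plan for next time.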

How it stays correct as apps change

Apps ship updates that move things around. The fail-safe above is the key: a replay is only trusted while the screens it expects still match. The moment they do not, the system re-derives the plan with the full agent and replaces the stale one. You get the speed of caching without the brittleness that usually comes with it.

Why this is the real moat

Anyone can call a vision model in a loop. The durable advantage is the accumulated knowledge: the more a device is used, the better and cheaper it gets at the tasks you actually run. That compounding is what makes phone automation practical at scale rather than just impressive in a one-off demo. It is also why cost goes down over time instead of up.

Grab an API key at auten.ai, connect a phone or spin up a hosted virtual device, and send your first natural-language task in minutes. The free tier needs no credit card.


