Auten SDK
A programmable Android phone platform — send natural-language tasks, manage devices, stream the live screen.
Auten lets your code (or your AI agent) drive real Android phones through natural-language tasks. Phones connect to the relay server (https://relay.auten.ai) over an outbound WebSocket — so they work anywhere, even behind NAT or on mobile networks.
    import { Auten } from "@autenai/sdk";

    const auten = new Auten({ apiKey: process.env.AUTEN_API_KEY! });

    const phone = await auten.devices.firstOnline();
    const result = await auten.tasks.run({
      device: phone!.serial,
      prompt: "Open Calculator and compute 999 ÷ 3",
      speed: "lightning",
    });

    console.log(result.verified, result.result?.summary);
    // → true, "Calculator displays 333."

Quickstart
From install to your first task on a phone in a few minutes.
1. Install the package
    npm install -g @autenai/sdk

2. Get an API key
Sign up at auten.ai/settings/api-keys or talk to your relay operator. Keys look like sk_live_<48 hex>.
    $ auten login
    Relay base URL [https://relay.auten.ai]:
    API key: ********************************
    ✓ Authenticated as my-team — 1 device(s), 0 task(s).
    ✓ Saved to ~/.autenrc

The CLI also reads AUTEN_API_KEY and AUTEN_BASE_URL from the environment, so CI doesn't need an .autenrc file.
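A script can resolve the same two environment variables before constructing the client. The sketch below is only an illustration: `resolveAutenConfig` is a hypothetical helper (not part of the SDK), and the assumption that `sk_test_` keys share the `sk_live_<48 hex>` shape is mine, not something the relay documents.

```typescript
// Hypothetical helper: not part of @autenai/sdk.
type AutenConfig = { apiKey: string; baseUrl: string };

const DEFAULT_BASE_URL = "https://relay.auten.ai";
// Assumed shape: "sk_live_" or "sk_test_" followed by 48 hex characters.
const KEY_PATTERN = /^sk_(live|test)_[0-9a-f]{48}$/i;

function resolveAutenConfig(
  env: Record<string, string | undefined>,
): AutenConfig {
  const apiKey = env.AUTEN_API_KEY?.trim();
  if (!apiKey) {
    throw new Error("AUTEN_API_KEY is not set");
  }
  if (!KEY_PATTERN.test(apiKey)) {
    throw new Error("AUTEN_API_KEY does not look like sk_live_<48 hex>");
  }
  // Fall back to the public relay when no override is provided,
  // and strip trailing slashes so paths can be appended safely.
  const rawBaseUrl = env.AUTEN_BASE_URL?.trim() || DEFAULT_BASE_URL;
  return { apiKey, baseUrl: rawBaseUrl.replace(/\/+$/, "") };
}
```

In a Node script this could be used as `new Auten(resolveAutenConfig(process.env))`, which fails fast in CI when the key is missing or malformed.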
3. Pair a phone
Install the Auten APK on an Android phone (build it with auten build-apk or download a prebuilt release), then pair it interactively:
    $ auten add-phone
    → USB pair wizard. Confirm the permissions on the phone.
    ✓ Pixel 7 (P3K2H7N) connected to the relay.

4. Run your first task
    import { Auten } from "@autenai/sdk";

    const auten = new Auten({ apiKey: process.env.AUTEN_API_KEY! });

    const task = await auten.tasks.run({
      device: (await auten.devices.firstOnline())!.serial,
      prompt: "Open Chrome and search for 'auten.ai'",
    });

    console.log(task.status, task.result?.summary);

Concepts
Core ideas worth knowing before you build.
| Concept | Meaning |
|---|---|
| Relay | Server (https://relay.auten.ai) that sits between your code and the phones. Phones connect outbound, so it works behind NAT. |
| Owner | The person or team that owns an API key. All resources (devices, tasks, sessions, credentials) are filtered server-side by ownerId — your key only sees your own data. |
| Device | A physical Android phone running the Auten APK. Registered to a single owner. |
| Task | A natural-language goal sent to a phone ("Open Chrome and search for X"). The relay agent decomposes it into per-tap actions. |
| Plan | A cleaned action sequence extracted from a verified task. Future tasks with similar prompts replay the plan deterministically (cheaper + faster) before falling back to an LLM. |
| Screen graph | A per-device DAG of (fromFP, action, toFP) edges learned from successful taps. Enables cached replay on familiar screens. |
| Speed preset | A single knob (fast / instant / lightning) controlling artificial delays during replay. |
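To make the screen-graph idea concrete, here is a toy model of (fromFP, action, toFP) edges with a shortest-path lookup. It is an illustration only: the `ScreenGraph` class, the fingerprint strings, and the action labels are all hypothetical, and the relay's actual data model may differ.

```typescript
// Illustrative only: the relay's real data model is not documented here.
type Edge = { fromFP: string; action: string; toFP: string };

class ScreenGraph {
  private edges = new Map<string, Edge[]>();

  // Record a transition learned from a successful tap.
  addEdge(edge: Edge): void {
    const out = this.edges.get(edge.fromFP) ?? [];
    out.push(edge);
    this.edges.set(edge.fromFP, out);
  }

  // Breadth-first search for the shortest known action sequence.
  // Returns null when the goal screen is not reachable from cached edges.
  findPath(startFP: string, goalFP: string): string[] | null {
    const queue: Array<{ fp: string; actions: string[] }> = [
      { fp: startFP, actions: [] },
    ];
    const seen = new Set<string>([startFP]);
    while (queue.length > 0) {
      const { fp, actions } = queue.shift()!;
      if (fp === goalFP) return actions;
      for (const edge of this.edges.get(fp) ?? []) {
        if (seen.has(edge.toFP)) continue;
        seen.add(edge.toFP);
        queue.push({ fp: edge.toFP, actions: [...actions, edge.action] });
      }
    }
    return null;
  }
}

const graph = new ScreenGraph();
graph.addEdge({ fromFP: "home", action: "tap:Settings", toFP: "settings" });
graph.addEdge({ fromFP: "settings", action: "tap:Display", toFP: "display" });
```

A cached replay corresponds to `findPath` returning a sequence; a `null` result is the case where a more expensive strategy has to take over.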
Tasks are solved in this order, cheapest first:
- Synthesize from the per-screen Step KB if all required labels are already visible. No LLM call.
- Replay a similar past task's cleanPlanJson. Deterministic, label-based, auto-scrolls.
- Delegate to Claude Opus 4.7 through the engine loop, only when the first two miss.
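The cheapest-first ordering above is a plain fallback chain. The sketch below shows the pattern with toy strategies; the `Strategy` type, the `solve` function, and the stand-in implementations are hypothetical and say nothing about how the relay is actually built.

```typescript
// Hypothetical strategy shape; names are illustrative, not Auten internals.
type Strategy = {
  name: "synthesize" | "replay" | "llm";
  // Returns a list of actions, or null when the strategy cannot handle the prompt.
  attempt: (prompt: string) => string[] | null;
};

function solve(
  prompt: string,
  strategies: Strategy[],
): { via: Strategy["name"]; actions: string[] } {
  // Strategies are tried in the order given, so list them cheapest first.
  for (const strategy of strategies) {
    const actions = strategy.attempt(prompt);
    if (actions !== null) return { via: strategy.name, actions };
  }
  throw new Error("no strategy produced a plan");
}

// Toy data: a cached plan exists for exactly one known prompt.
const cachedPlans = new Map<string, string[]>([
  ["open calculator", ["tap:Calculator"]],
]);

const strategies: Strategy[] = [
  // Pretend the required labels are not all visible, so synthesis misses.
  { name: "synthesize", attempt: () => null },
  { name: "replay", attempt: (p) => cachedPlans.get(p.toLowerCase()) ?? null },
  // Stand-in for the LLM delegate: always produces something.
  { name: "llm", attempt: (p) => [`llm-plan-for:${p}`] },
];
```

With this setup, `solve("Open Calculator", strategies)` resolves through the replay step, while an unseen prompt falls through to the last strategy.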
SDK reference
All methods and types.
new Auten(config)
    new Auten({
      apiKey: string;    // sk_live_... or sk_test_...
      baseUrl?: string;  // default: https://relay.auten.ai
      timeout?: number;  // ms; default 30_000
    })

auten.me()
Identify the calling key:
    const me = await auten.me();
    // { ownerId, name, deviceCount, taskCount, plan }

auten.devices
    auten.devices.list(): Promise<Device[]>
    auten.devices.get(serial: string): Promise<Device | null>
    auten.devices.firstOnline(): Promise<Device | null>
    auten.devices.stats(serial: string): Promise<{
      serial: string;
      online: boolean;
      graph_edges: number;
      task_count: number;
      cache_hit_rate: number; // 0..1
    }>

Device type
    type Device = {
      serial: string;
      model: string | null;
      online: boolean;
      type: string; // "physical" | "emulator"
      lastSeenAt: string | null;
      androidVersion: string | null;
      screenW: number | null;
      screenH: number | null;
    };

online flips to false when the phone disconnects from the relay's WebSocket reverse tunnel. The field can be polled; the relay updates it within a few seconds.
auten.tasks
The main API — creating and observing tasks.
    type Speed = "fast" | "instant" | "lightning";
    type TaskMode = "task" | "explore";
    type TaskStatus =
      | "queued"
      | "running"
      | "completed"
      | "failed"
      | "cancelled";

    auten.tasks.create(input: {
      device: string;
      prompt: string;
      mode?: TaskMode;
      speed?: Speed;
      webhook_url?: string;
      webhook_secret?: string;   // HMAC-SHA256
      timeout_seconds?: number;  // default 300; 0 = unlimited
    }): Promise<{ task_id: string; status: string; watch_url: string }>

    auten.tasks.get(id: string): Promise<Task>
    auten.tasks.list(opts?: {
      device?: string;
      status?: TaskStatus;
      limit?: number; // 1..100, default 25
    }): Promise<Task[]>
    auten.tasks.cancel(id: string): Promise<{ task_id: string; status: string }>

    // Poll until a terminal state. Returns the final Task.
    auten.tasks.wait(id: string, opts?: {
      intervalMs?: number; // default 1000
      timeoutMs?: number;  // default 300_000
    }): Promise<Task>

    // Sugar: create + wait.
    auten.tasks.run(input, waitOpts?): Promise<Task>

Speed presets
| Speed | Per-action delay | When to use |
|---|---|---|
| fast | ~600ms | Default — reliable |
| instant | ~300ms | Faster — for replays on known screens |
| lightning | ~50ms | Max — skips the per-action look() |
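For a rough sense of what each preset costs, the per-action delays from the table can be multiplied by the number of actions in a plan. This is a back-of-the-envelope sketch: `estimateDelayMs` is a hypothetical helper, the delays are the approximate figures listed above, and the result covers only the artificial delay, not the time the phone needs to execute each action.

```typescript
type Speed = "fast" | "instant" | "lightning";

// Approximate per-action delays taken from the table above (milliseconds).
const PER_ACTION_DELAY_MS: Record<Speed, number> = {
  fast: 600,
  instant: 300,
  lightning: 50,
};

// Delay overhead only; excludes the time each action takes on the device.
function estimateDelayMs(speed: Speed, actionCount: number): number {
  if (!Number.isInteger(actionCount) || actionCount < 0) {
    throw new RangeError("actionCount must be a non-negative integer");
  }
  return PER_ACTION_DELAY_MS[speed] * actionCount;
}

console.log(estimateDelayMs("fast", 20));      // 12000
console.log(estimateDelayMs("lightning", 20)); // 1000
```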
auten.keys
Manage API keys and stored credentials.
    auten.keys.list(): Promise<ApiKey[]>
    auten.keys.create(name?: string): Promise<{ id: string; secret: string }>
    auten.keys.revoke(id: string): Promise<void>

    // Encrypted credentials per device:
    auten.keys.credentials.save(opts: {
      device: string;
      service: string;
      fields: Record<string, string>; // e.g. { username, password, 2fa }
    }): Promise<void>
    auten.keys.credentials.list(device: string): Promise<string[]>
    auten.keys.credentials.get(device: string, service: string): Promise<Record<string,string>>
    auten.keys.credentials.remove(device: string, service: string): Promise<void>

All credentials are encrypted at rest with AES-256-GCM, unlocked only when a task needs to auto-login to a specific service.
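For readers unfamiliar with AES-256-GCM, the sketch below shows the primitive itself using Node's built-in `crypto` module. It is not Auten's implementation: the `sealCredentials` and `openCredentials` names, the JSON serialization, and the way the key is supplied are all assumptions made for illustration, and the relay's key management is not described in this document.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

type Sealed = { iv: Buffer; tag: Buffer; ciphertext: Buffer };

// Encrypt a credential record with AES-256-GCM. `key` must be 32 bytes.
function sealCredentials(fields: Record<string, string>, key: Buffer): Sealed {
  const iv = randomBytes(12); // 96-bit nonce; must be unique per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(fields), "utf8"),
    cipher.final(),
  ]);
  // The auth tag is available only after final() has been called.
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

// Decrypt and authenticate; final() throws if the data or tag was altered.
function openCredentials(sealed: Sealed, key: Buffer): Record<string, string> {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAuthTag(sealed.tag);
  const plaintext = Buffer.concat([
    decipher.update(sealed.ciphertext),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8"));
}
```

GCM is an authenticated mode, so a modified ciphertext or tag is rejected at decryption time instead of yielding corrupted plaintext.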
auten.phone(serial)
Low-level per-tap control for precision work (debugging, not production flows).
    const phone = auten.phone(serial);

    await phone.tap(x, y);
    await phone.swipe({ x1, y1, x2, y2, duration_ms });
    await phone.type("hello");
    await phone.key("BACK");            // BACK, HOME, RECENTS, ENTER
    await phone.look();                 // current screen + UI tree
    await phone.screenshot();           // PNG buffer
    await phone.proxy("GET", "/info");  // direct HTTP to the phone
    await phone.clipboardSet("text");
    await phone.appLaunch("com.android.chrome");

CLI reference
Installing the package globally gives you the `auten` command. You can also run it as `npx @autenai/sdk <cmd>`.
| Command | Aliases | Description |
|---|---|---|
| auten login | | Save API key + relay URL to ~/.autenrc. |
| auten me | whoami | Owner info + counts. |
| auten devices | list, ls | List all your devices. |
| auten add-phone | add | Interactive USB pairing wizard. |
| auten build-apk | build | Build a fresh APK on the relay host. |
| auten task "..." | run | Dispatch a task and follow until done. |
| auten tasks | | List recent tasks. |
| auten tasks <id> | | Show one task in detail. |
| auten creds add | save | Save a service login (interactive, password masked). |
| auten creds ls | list | Saved services for a device. |
| auten keys | | List API keys. |
| auten keys create | add, new | New key — secret shown once. |
| auten version | -v | Package version. |
Shared flag for the creds and task commands: --device <serial> pins a specific phone (default: the last one used).
REST API
Direct HTTP API if you don't want to use the Node SDK.
Auth
Every endpoint requires a Bearer header with an API key:
    curl -H "Authorization: Bearer sk_live_..." https://relay.auten.ai/v1/me

Endpoints
| Method | Path | Purpose |
|---|---|---|
| GET | /v1/me | Owner info + counts |
| GET | /v1/devices | List devices |
| POST | /v1/tasks | Create a task |
| GET | /v1/tasks/{id} | Task status + turns |
| POST | /v1/tasks/{id}/cancel | Cancel |
| GET | /v1/devices/{serial}/screen | Live screen PNG |
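The same endpoints can be called from TypeScript without the SDK. The sketch below only builds the request for `POST /v1/tasks` so it can be inspected or tested before anything is sent; `buildCreateTaskRequest` and the `CreateTaskBody` type are hypothetical names, and the body fields shown are the ones this document lists for task creation.

```typescript
// Hypothetical helper: builds the request without sending it.
type CreateTaskBody = {
  device: string;
  prompt: string;
  speed?: "fast" | "instant" | "lightning";
};

type BuiltRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

function buildCreateTaskRequest(
  apiKey: string,
  body: CreateTaskBody,
  baseUrl = "https://relay.auten.ai",
): BuiltRequest {
  return {
    url: `${baseUrl}/v1/tasks`,
    init: {
      method: "POST",
      headers: {
        // Every endpoint expects the API key as a Bearer token.
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (Node 18+ ships a global fetch):
//   const { url, init } = buildCreateTaskRequest(process.env.AUTEN_API_KEY!, {
//     device: "P3K2H7N",
//     prompt: "Open Calculator and compute 999 ÷ 3",
//   });
//   const res = await fetch(url, init);
```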
curl example
    curl -X POST https://relay.auten.ai/v1/tasks \
      -H "Authorization: Bearer sk_live_..." \
      -H "Content-Type: application/json" \
      -d '{
        "device": "P3K2H7N",
        "prompt": "Open Calculator and compute 999 ÷ 3",
        "speed": "lightning"
      }'
    # → {"task_id":"...","status":"running","watch_url":"/w/.../?t=..."}

Webhook deliveries
If you include a webhook_url in the task create body, the relay sends a POST to that URL when the task reaches a terminal state. The request carries an X-Auten-Signature: sha256=... header (HMAC-SHA256 of the body, signed with your webhook_secret).
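On the receiving side, the signature should be checked against the raw request body before the payload is trusted. The following is a minimal sketch using Node's `crypto` module; `verifyAutenSignature` is a hypothetical helper, and it assumes the value after `sha256=` is the hex-encoded digest, which this document does not state explicitly.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Returns true when `signatureHeader` ("sha256=<hex>", assumed encoding)
// matches the HMAC-SHA256 of the raw request body under `secret`.
function verifyAutenSignature(
  rawBody: string | Buffer,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  const prefix = "sha256=";
  if (!signatureHeader || !signatureHeader.startsWith(prefix)) return false;
  const received = Buffer.from(signatureHeader.slice(prefix.length), "hex");
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Pass the body exactly as received (before any JSON parsing and re-serialization), since even a whitespace change alters the digest.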
Recipes
Typical usage patterns.
A multi-step task
    const task = await auten.tasks.run({
      device: phone.serial,
      prompt: `
        Open Gmail.
        Reply with "On it — will send by EOD."
        Mark as read.
      `.trim(),
      speed: "fast",
      timeout_seconds: 120,
    });

Store a login: the agent uses it automatically
    await auten.keys.credentials.save({
      device: phone.serial,
      service: "instagram",
      fields: {
        username: "myhandle",
        password: process.env.IG_PASS!,
        "2fa": process.env.IG_TOTP_SECRET!, // optional TOTP
      },
    });

    // Later, the agent auto-logs in:
    await auten.tasks.run({
      device: phone.serial,
      prompt: "Open Instagram and check my DMs",
    });

Stream live progress to a browser
    const { task_id, watch_url } = await auten.tasks.create({
      device: phone.serial,
      prompt: "Open Settings and toggle Dark Mode",
    });

    // watch_url has a short-lived token; drop it into an iframe or img:
    res.send(`<iframe src="${watch_url}" allow="autoplay"></iframe>`);

Webhook-driven async pipeline
    await auten.tasks.create({
      device: phone.serial,
      prompt: "Take a 30s screen recording of TikTok scrolling",
      webhook_url: "https://my-app.com/webhooks/auten",
      webhook_secret: process.env.WEBHOOK_SECRET!,
      timeout_seconds: 60,
    });
    // → returns immediately with task_id. Webhook fires when finished.

Errors & pitfalls
Common gotchas and how to think about them.
phone.type() returns ok: true but the text doesn't appear
The Android IME can lose the input field between screens. Make sure the input is focused before calling type() — call tap() on it first.
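The tap-then-type advice can be wrapped in a small helper. This is a sketch under assumptions: `PhoneLike` is a structural type covering only the two calls used, `focusAndType` is a hypothetical name, and the 300 ms settle time is an arbitrary starting point, not a documented value.

```typescript
// Minimal structural type covering just the two calls used here.
interface PhoneLike {
  tap(x: number, y: number): Promise<unknown>;
  type(text: string): Promise<unknown>;
}

// Tap the field to focus it, pause briefly so the IME can attach, then type.
async function focusAndType(
  phone: PhoneLike,
  field: { x: number; y: number },
  text: string,
  settleMs = 300, // assumed settle time; tune per device
): Promise<void> {
  await phone.tap(field.x, field.y);
  await new Promise<void>((resolve) => setTimeout(resolve, settleMs));
  await phone.type(text);
}
```

With the SDK, the object returned by `auten.phone(serial)` would be passed as `phone`, for example `await focusAndType(auten.phone(serial), { x: 540, y: 300 }, "hello")` with coordinates taken from a prior `look()`.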
phone.clipboardSet() on Android 10+
Android 10+ restricts clipboard access from the background. If the APK isn't in the foreground this can silently fail — bring the APK to the foreground first with look() or appLaunch() on the Auten package.
"lightning" speed skips the per-action look()
Fast, but risky: if the UI has changed since the plan was recorded (a modal appeared, the screen drifted), task verification will fail. Use it only in well-stabilized flows.
Plans expire when the UI drifts
Auten caches successful plans by screen fingerprint. An app update can change the UI and invalidate the plan. The next task will then be more expensive (it falls back to the LLM delegate), but the flow is learned again from that run.
auten.tasks.wait — polls, no SSE yet
For now wait polls every intervalMs (default 1s). If you need low latency, subscribe to webhooks.
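The polling behaviour described here follows a common pattern, sketched below. This is not the SDK's source: `pollUntilDone` and `TaskLike` are hypothetical names, and only the defaults (1000 ms interval, 300_000 ms timeout) and the status values come from this document.

```typescript
type TaskStatus = "queued" | "running" | "completed" | "failed" | "cancelled";
type TaskLike = { id: string; status: TaskStatus };

const TERMINAL: ReadonlySet<TaskStatus> = new Set<TaskStatus>([
  "completed",
  "failed",
  "cancelled",
]);

// Poll `fetchTask` until the task is terminal or the deadline would be exceeded.
async function pollUntilDone(
  fetchTask: () => Promise<TaskLike>,
  opts: { intervalMs?: number; timeoutMs?: number } = {},
): Promise<TaskLike> {
  const intervalMs = opts.intervalMs ?? 1000;
  const timeoutMs = opts.timeoutMs ?? 300_000;
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const task = await fetchTask();
    if (TERMINAL.has(task.status)) return task;
    // Give up if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`task ${task.id} still ${task.status} after ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In real use `fetchTask` would be something like `() => auten.tasks.get(id)`. Each poll is a separate request, so worst-case latency is roughly one interval, which is why webhooks are the better fit for latency-sensitive pipelines.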
For AI agents
If you are the AI (Claude, GPT, Cursor) — here's the context you need.
Verify you're using the current SDK
    npm ls @autenai/sdk
    # Required: @autenai/[email protected] or higher

Default to high-level prompts, not low-level calls
Use auten.tasks.run({ prompt: "..." }) whenever possible. Avoid auten.phone().tap(x,y) unless you're debugging — coordinates differ between phones and are unstable across app updates.
Owner scoping summary
Your API key only sees your own devices, tasks, sessions, and credentials. Server-side filtering — you don't need to worry about isolation.
Deprecated — DO NOT USE
- auten.providers: the old OAuth provider manager (v0.2). Replaced by auten.tasks.
- auten.execute(): the old AI exec entry point. Replaced by auten.tasks.run().
Questions? Reach out at [email protected]