Auten SDK

A programmable Android phone platform — send natural-language tasks, manage devices, stream the live screen.

Auten lets your code (or your AI agent) drive real Android phones through natural-language tasks. Phones connect to the relay server (https://relay.auten.ai) over an outbound WebSocket — so they work anywhere, even behind NAT or on mobile networks.

typescript
import { Auten } from "@autenai/sdk";
 
const auten = new Auten({ apiKey: process.env.AUTEN_API_KEY! });
 
const phone = await auten.devices.firstOnline();
if (!phone) throw new Error("No device is online");
const result = await auten.tasks.run({
  device: phone.serial,
  prompt: "Open Calculator and compute 999 ÷ 3",
  speed: "lightning",
});
 
console.log(result.verified, result.result?.summary);
// → true, "Calculator displays 333."

@autenai/sdk · SDK + CLI · Node 18+ · TypeScript

Quickstart

From install to your first task on a phone in a few minutes.

1. Install the package

bash
npm install -g @autenai/sdk

2. Get an API key

Sign up at auten.ai/settings/api-keys or talk to your relay operator. Keys look like sk_live_<48 hex>.

bash
$ auten login
Relay base URL [https://relay.auten.ai]:
API key: ********************************
✓ Authenticated as my-team — 1 device(s), 0 task(s).
✓ Saved to ~/.autenrc

The CLI also reads AUTEN_API_KEY and AUTEN_BASE_URL from the environment — CI doesn't need an .autenrc file.
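For SDK code running in CI, the same two variables can be resolved by hand before constructing the client. The following is a minimal sketch: `configFromEnv` is a hypothetical helper (not part of the SDK), and the key pattern assumes that `sk_test_` keys follow the same 48-hex-character shape as `sk_live_` keys.

```typescript
// Hypothetical helper, not part of @autenai/sdk: resolve client settings from
// the same environment variables the CLI reads.
type AutenEnvConfig = { apiKey: string; baseUrl: string };

// Assumption: keys are "sk_live_" or "sk_test_" followed by 48 hex characters.
const KEY_PATTERN = /^sk_(live|test)_[0-9a-f]{48}$/i;

function configFromEnv(env: Record<string, string | undefined>): AutenEnvConfig {
  const apiKey = (env.AUTEN_API_KEY ?? "").trim();
  if (!KEY_PATTERN.test(apiKey)) {
    // Fail early with a clear message instead of an authentication error later.
    throw new Error("AUTEN_API_KEY is missing or malformed");
  }
  // Fall back to the public relay when no override is set.
  const baseUrl = (env.AUTEN_BASE_URL ?? "").trim() || "https://relay.auten.ai";
  return { apiKey, baseUrl };
}
```

In a CI job the result could be passed straight to the constructor, for example `new Auten(configFromEnv(process.env))`.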

3. Pair a phone

Install the Auten APK on an Android phone (build it with auten build-apk or download a prebuilt release), then pair it interactively:

bash
$ auten add-phone
→ USB pair wizard. Confirm the permissions on the phone.
✓ Pixel 7 (P3K2H7N) connected to the relay.

4. Run your first task

typescript
import { Auten } from "@autenai/sdk";
 
const auten = new Auten({ apiKey: process.env.AUTEN_API_KEY! });
 
const phone = await auten.devices.firstOnline();
if (!phone) throw new Error("No device is online; pair a phone first.");

const task = await auten.tasks.run({
  device: phone.serial,
  prompt: "Open Chrome and search for 'auten.ai'",
});
 
console.log(task.status, task.result?.summary);

Concepts

Core ideas worth knowing before you build.

Concept | Meaning
--- | ---
Relay | Server (https://relay.auten.ai) that sits between your code and the phones. Phones connect outbound, so it works behind NAT.
Owner | The person or team that owns an API key. All resources (devices, tasks, sessions, credentials) are filtered server-side by ownerId, so your key only sees your own data.
Device | A physical Android phone running the Auten APK. Registered to a single owner.
Task | A natural-language goal sent to a phone ("Open Chrome and search for X"). The relay agent decomposes it into per-tap actions.
Plan | A cleaned action sequence extracted from a verified task. Future tasks with similar prompts replay the plan deterministically (cheaper and faster) before falling back to an LLM.
Screen graph | A per-device DAG of (fromFP, action, toFP) edges learned from successful taps. Enables cached replay on familiar screens.
Speed preset | A single knob (fast / instant / lightning) controlling artificial delays during replay.
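
To make the screen graph idea concrete, here is one way (fromFP, action, toFP) edges could be stored and searched for a cached route between two screens. This is purely illustrative: the `ScreenGraph` class, the string fingerprints, and the action labels are assumptions, and the relay's real data model is not documented on this page.

```typescript
// Illustrative only; not the relay's implementation.
// Assumption: a fingerprint (FP) is an opaque string identifying a screen.
type Fingerprint = string;
type Edge = { fromFP: Fingerprint; action: string; toFP: Fingerprint };

class ScreenGraph {
  // fromFP -> (action -> toFP)
  private edges = new Map<Fingerprint, Map<string, Fingerprint>>();

  // Record an edge learned from a successful tap.
  learn(edge: Edge): void {
    const out = this.edges.get(edge.fromFP) ?? new Map<string, Fingerprint>();
    out.set(edge.action, edge.toFP);
    this.edges.set(edge.fromFP, out);
  }

  // Breadth-first search for the shortest known action sequence between two
  // screens. Returns null when no path is known (a cache miss).
  path(from: Fingerprint, to: Fingerprint): string[] | null {
    const queue: Array<{ fp: Fingerprint; actions: string[] }> = [{ fp: from, actions: [] }];
    const seen = new Set<Fingerprint>([from]);
    while (queue.length > 0) {
      const { fp, actions } = queue.shift()!;
      if (fp === to) return actions;
      const out = this.edges.get(fp);
      if (!out) continue;
      for (const [action, next] of out) {
        if (!seen.has(next)) {
          seen.add(next);
          queue.push({ fp: next, actions: [...actions, action] });
        }
      }
    }
    return null;
  }
}
```

A null result corresponds to an unfamiliar screen, which is where a more expensive strategy would have to take over.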

Tasks are solved in this order, cheapest first:

  1. Synthesize from the per-screen Step KB if all required labels are already visible. No LLM call.
  2. Replay a similar past task's cleanPlanJson. Deterministic, label-based, auto-scrolls.
  3. Delegate to Claude Opus 4.7 through the engine loop — only when the first two miss.
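
The ordering above is a cheapest-first fallback chain. A generic sketch of that pattern follows; `Strategy` and `solveCheapestFirst` are hypothetical names, and the code illustrates the control flow only, not the relay's actual engine.

```typescript
// Illustrative sketch of a cheapest-first fallback chain; not the relay's code.
type Strategy<T> = { name: string; attempt: () => Promise<T | null> };

// Try each strategy in order and return the first non-null result, together
// with the name of the strategy that produced it. Later (more expensive)
// strategies are never invoked once an earlier one succeeds.
async function solveCheapestFirst<T>(
  strategies: Strategy<T>[],
): Promise<{ via: string; value: T } | null> {
  for (const strategy of strategies) {
    const value = await strategy.attempt();
    if (value !== null) return { via: strategy.name, value };
  }
  return null;
}
```

Ordering the array as synthesis, replay, then LLM delegation reproduces the three steps listed above.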

SDK reference

All methods and types.

new Auten(config)

typescript
new Auten({
  apiKey: string;    // sk_live_... or sk_test_...
  baseUrl?: string;  // default: https://relay.auten.ai
  timeout?: number;  // ms; default 30_000
})

auten.me()

Identify the calling key:

typescript
const me = await auten.me();
// { ownerId, name, deviceCount, taskCount, plan }

auten.devices

typescript
auten.devices.list(): Promise<Device[]>
auten.devices.get(serial: string): Promise<Device | null>
auten.devices.firstOnline(): Promise<Device | null>
auten.devices.stats(serial: string): Promise<{
  serial: string;
  online: boolean;
  graph_edges: number;
  task_count: number;
  cache_hit_rate: number; // 0..1
}>

Device type

typescript
type Device = {
  serial: string;
  model: string | null;
  online: boolean;
  type: string; // "physical" | "emulator"
  lastSeenAt: string | null;
  androidVersion: string | null;
  screenW: number | null;
  screenH: number | null;
};

online flips to false when the phone disconnects from the relay's WebSocket reverse tunnel. The field can be polled; the relay updates it within a few seconds.
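
Since the field can be polled, a small wait loop is enough to block until a phone comes back. The sketch below assumes nothing beyond `devices.get()` as documented above; `waitUntilOnline` is a hypothetical helper, and its interval and timeout defaults are arbitrary choices.

```typescript
// Structural type so the sketch works with any client exposing devices.get().
type DeviceLookup = {
  get(serial: string): Promise<{ online: boolean } | null>;
};

// Hypothetical helper: poll devices.get() until the phone reports online, or
// give up once the deadline would be exceeded. Returns true when online.
async function waitUntilOnline(
  devices: DeviceLookup,
  serial: string,
  opts: { intervalMs?: number; timeoutMs?: number } = {},
): Promise<boolean> {
  const intervalMs = opts.intervalMs ?? 2000; // assumed default
  const deadline = Date.now() + (opts.timeoutMs ?? 60_000); // assumed default
  for (;;) {
    const device = await devices.get(serial);
    if (device?.online) return true;
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the SDK this would be called as `await waitUntilOnline(auten.devices, serial)`.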

auten.tasks

The main API — creating and observing tasks.

typescript
type Speed = "fast" | "instant" | "lightning";
type TaskMode = "task" | "explore";
type TaskStatus =
  | "queued" | "running" | "completed" | "failed" | "cancelled";

auten.tasks.create(input: {
  device: string;
  prompt: string;
  mode?: TaskMode;
  speed?: Speed;
  webhook_url?: string;
  webhook_secret?: string;  // HMAC-SHA256
  timeout_seconds?: number; // default 300; 0 = unlimited
}): Promise<{ task_id: string; status: string; watch_url: string }>

auten.tasks.get(id: string): Promise<Task>
auten.tasks.list(opts?: {
  device?: string;
  status?: TaskStatus;
  limit?: number; // 1..100, default 25
}): Promise<Task[]>
auten.tasks.cancel(id: string): Promise<{ task_id: string; status: string }>

// Poll until a terminal state. Returns the final Task.
auten.tasks.wait(id: string, opts?: {
  intervalMs?: number; // default 1000
  timeoutMs?: number;  // default 300_000
}): Promise<Task>
 
// Sugar: create + wait.
auten.tasks.run(input, waitOpts?): Promise<Task>

Speed presets

Speed | Per-action delay | When to use
--- | --- | ---
fast | ~600 ms | Default; reliable
instant | ~300 ms | Faster; for replays on known screens
lightning | ~50 ms | Maximum speed; skips the per-action look()

auten.keys

Manage API keys and stored credentials.

typescript
auten.keys.list(): Promise<ApiKey[]>
auten.keys.create(name?: string): Promise<{ id: string; secret: string }>
auten.keys.revoke(id: string): Promise<void>
 
// Encrypted credentials per device:
auten.keys.credentials.save(opts: {
  device: string;
  service: string;
  fields: Record<string, string>; // e.g. { username, password, 2fa }
}): Promise<void>
auten.keys.credentials.list(device: string): Promise<string[]>
auten.keys.credentials.get(device: string, service: string): Promise<Record<string,string>>
auten.keys.credentials.remove(device: string, service: string): Promise<void>

All credentials are encrypted at rest with AES-256-GCM, unlocked only when a task needs to auto-login to a specific service.
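
For readers unfamiliar with AES-256-GCM, the sketch below shows the primitive itself using Node's built-in crypto module. It is not a description of how the relay stores credentials: the key handling, the 12-byte nonce, the `Sealed` layout, and the `sealSecret` / `openSecret` names are all assumptions made for illustration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Illustration of authenticated encryption with AES-256-GCM.
// Assumption: this layout (nonce + auth tag + ciphertext) is an example only.
type Sealed = { iv: Buffer; tag: Buffer; data: Buffer };

// Encrypt a UTF-8 string with a 32-byte key.
function sealSecret(key: Buffer, plaintext: string): Sealed {
  const iv = randomBytes(12); // 96-bit nonce; must be unique per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

// Decrypt and authenticate. final() throws if the ciphertext or tag was altered.
function openSecret(key: Buffer, sealed: Sealed): string {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAuthTag(sealed.tag);
  return Buffer.concat([decipher.update(sealed.data), decipher.final()]).toString("utf8");
}
```

The property that matters for stored logins is that GCM is authenticated: a modified blob fails to decrypt instead of yielding silently corrupted fields.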

auten.phone(serial)

Low-level per-tap control for precision work (debugging, not production flows).

typescript
const phone = auten.phone(serial);
 
await phone.tap(x, y);
await phone.swipe({ x1, y1, x2, y2, duration_ms });
await phone.type("hello");
await phone.key("BACK"); // BACK, HOME, RECENTS, ENTER
await phone.look(); // current screen + UI tree
await phone.screenshot(); // PNG buffer
await phone.proxy("GET", "/info"); // direct HTTP to the phone
await phone.clipboardSet("text");
await phone.appLaunch("com.android.chrome");

CLI reference

Installing the package globally gives you the `auten` command. You can also run it as `npx @autenai/sdk <cmd>`.

Command | Aliases | Description
--- | --- | ---
auten login | | Save API key + relay URL to ~/.autenrc.
auten me | whoami | Owner info + counts.
auten devices | list, ls | List all your devices.
auten add-phone | add | Interactive USB pairing wizard.
auten build-apk | build | Build a fresh APK on the relay host.
auten task "..." | run | Dispatch a task and follow until done.
auten tasks | | List recent tasks.
auten tasks <id> | | Show one task in detail.
auten creds add | save | Save a service login (interactive, password masked).
auten creds ls | list | Saved services for a device.
auten keys | | List API keys.
auten keys create | add, new | New key; the secret is shown only once.
auten version | -v | Package version.

Shared flag for the creds and task commands: --device <serial> pins a specific phone (default: the last one used).

REST API

Direct HTTP API if you don't want to use the Node SDK.

Auth

Every endpoint requires a Bearer header with an API key:

bash
curl -H "Authorization: Bearer sk_live_..." https://relay.auten.ai/v1/me

Endpoints

Method | Path | Purpose
--- | --- | ---
GET | /v1/me | Owner info + counts
GET | /v1/devices | List devices
POST | /v1/tasks | Create a task
GET | /v1/tasks/{id} | Task status + turns
POST | /v1/tasks/{id}/cancel | Cancel
GET | /v1/devices/{serial}/screen | Live screen PNG
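
From TypeScript, the endpoints above can be reached with a thin wrapper around fetch. The following is a sketch rather than an official client: `relayRequest` and `FetchLike` are hypothetical names, the fetch implementation is passed in so the function can run without network access, and error handling is reduced to a status check because response error shapes are not documented here.

```typescript
// Minimal shapes so the sketch does not depend on a particular fetch typing.
type RelayInit = { method: string; headers: Record<string, string>; body?: string };
type RelayResponse = { ok: boolean; status: number; json(): Promise<unknown> };
type FetchLike = (url: string, init: RelayInit) => Promise<RelayResponse>;

// Hypothetical helper: send an authenticated JSON request to the relay.
async function relayRequest(
  fetchImpl: FetchLike,
  apiKey: string,
  method: "GET" | "POST",
  path: string,
  body?: unknown,
  baseUrl = "https://relay.auten.ai",
): Promise<unknown> {
  const headers: Record<string, string> = { Authorization: `Bearer ${apiKey}` };
  const init: RelayInit = { method, headers };
  if (body !== undefined) {
    headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(body);
  }
  const res = await fetchImpl(baseUrl + path, init);
  if (!res.ok) throw new Error(`${method} ${path} failed with HTTP ${res.status}`);
  return res.json();
}
```

On Node 18+ the global fetch can be passed as the first argument, for example `relayRequest(fetch, key, "GET", "/v1/me")`.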

curl example

bash
curl -X POST https://relay.auten.ai/v1/tasks \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "device": "P3K2H7N",
    "prompt": "Open Calculator and compute 999 ÷ 3",
    "speed": "lightning"
  }'
 
# → {"task_id":"...","status":"running","watch_url":"/w/.../?t=..."}

Webhook deliveries

If you include a webhook_url in the task create body, the relay sends a POST to that URL when the task reaches a terminal state. The request carries an X-Auten-Signature: sha256=... header (HMAC-SHA256 of the body, signed with your webhook_secret).
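
A receiver should check that signature before trusting the payload. The sketch below assumes the value after `sha256=` is the hex-encoded digest and that the HMAC is computed over the exact raw request bytes; neither detail is spelled out above, so confirm both against a real delivery. `verifyAutenSignature` is a hypothetical helper.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumptions: hex-encoded HMAC-SHA256 over the raw body, "sha256=" prefix.
// Verify against the raw bytes, not a re-serialized JSON object.
function verifyAutenSignature(
  rawBody: string | Buffer,
  header: string | undefined,
  secret: string,
): boolean {
  if (!header || !header.startsWith("sha256=")) return false;
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(header.slice("sha256=".length), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

In an HTTP framework, make sure the handler has access to the unparsed body; verifying a re-stringified object can fail even for genuine deliveries.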

Recipes

Typical usage patterns.

A multi-step task

typescript
const task = await auten.tasks.run({
  device: phone.serial,
  prompt: `
Open Gmail.
Find the latest email from "[email protected]".
Reply with "On it — will send by EOD."
Mark as read.
  `.trim(),
  speed: "fast",
  timeout_seconds: 120,
});

Store a login — the agent uses it automatically

typescript
await auten.keys.credentials.save({
  device: phone.serial,
  service: "instagram",
  fields: {
    username: "myhandle",
    password: process.env.IG_PASS!,
    "2fa": process.env.IG_TOTP_SECRET!, // optional TOTP
  },
});

// Later, the agent auto-logs in:
await auten.tasks.run({
  device: phone.serial,
  prompt: "Open Instagram and check my DMs",
});

Stream live progress to a browser

typescript
const { task_id, watch_url } = await auten.tasks.create({
  device: phone.serial,
  prompt: "Open Settings and toggle Dark Mode",
});
 
// watch_url has a short-lived token — drop it into an iframe or img:
res.send(`<iframe src="${watch_url}" allow="autoplay"></iframe>`);
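
In the snippet above, `res` appears to be an Express-style response object. Two details are worth handling when embedding the URL: the REST example on this page shows a watch_url that is relative to the relay ("/w/..."), and any value interpolated into HTML should be attribute-escaped. The helpers below are a hedged sketch; their names are hypothetical, and whether the SDK already returns an absolute URL is not stated here.

```typescript
// Assumption: watch_url may be relative to the relay origin, as in the REST
// example response. Absolute URLs pass through unchanged.
function absoluteWatchUrl(watchUrl: string, baseUrl = "https://relay.auten.ai"): string {
  return new URL(watchUrl, baseUrl).toString();
}

// Escape the characters that matter inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the iframe markup from a watch_url.
function watchIframe(watchUrl: string, baseUrl?: string): string {
  const src = escapeAttr(absoluteWatchUrl(watchUrl, baseUrl));
  return `<iframe src="${src}" allow="autoplay"></iframe>`;
}
```

Because the token in the URL is short-lived, generate the markup at request time rather than caching it.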

Webhook-driven async pipeline

typescript
await auten.tasks.create({
  device: phone.serial,
  prompt: "Take a 30s screen recording of TikTok scrolling",
  webhook_url: "https://my-app.com/webhooks/auten",
  webhook_secret: process.env.WEBHOOK_SECRET!,
  timeout_seconds: 60,
});
// → returns immediately with task_id. Webhook fires when finished.

Errors & pitfalls

Common gotchas and how to think about them.

phone.type() returns ok: true but the text doesn't appear

The Android IME can lose the input field between screens. Make sure the input is focused before calling type() — call tap() on it first.
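
A small wrapper makes the tap-then-type order hard to forget. This is a sketch built only on `phone.tap()` and `phone.type()` as listed in the SDK reference; `focusAndType` is a hypothetical helper, and the 300 ms settle delay is an assumption, not a documented requirement.

```typescript
// Structural type covering only the two calls this sketch needs.
type Typable = {
  tap(x: number, y: number): Promise<unknown>;
  type(text: string): Promise<unknown>;
};

// Hypothetical helper: tap the field to focus it, pause briefly so the
// keyboard can attach, then type the text.
async function focusAndType(
  phone: Typable,
  x: number,
  y: number,
  text: string,
  settleMs = 300, // assumed delay
): Promise<void> {
  await phone.tap(x, y);
  await new Promise<void>((resolve) => setTimeout(resolve, settleMs));
  await phone.type(text);
}
```

It would be called as `await focusAndType(auten.phone(serial), x, y, "hello")`, with coordinates taken from a fresh `look()`.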

phone.clipboardSet() on Android 10+

Android 10+ restricts clipboard access from the background. If the APK isn't in the foreground this can silently fail — bring the APK to the foreground first with look() or appLaunch() on the Auten package.

"lightning" speed skips the per-action look()

Fast, but risky: if the UI has changed since the replay was recorded (a modal appeared, the screen drifted), task verification will fail. Use it only in stable, well-tested flows.
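
One way to get the speed without giving up reliability is to try the quickest preset first and rerun at a slower one when verification fails. The sketch below is an assumption-laden illustration: `runWithFallback` is hypothetical, and it relies on the returned task exposing a boolean `verified` field, as in the first example on this page.

```typescript
type SpeedPreset = "fast" | "instant" | "lightning";
type RunInput = { device: string; prompt: string; speed?: SpeedPreset };
type RunResult = { status: string; verified?: boolean };
type TaskRunner = { run(input: RunInput): Promise<RunResult> };

// Hypothetical helper: run the task at each speed in turn and stop at the
// first verified result. Returns the last attempt if none verified.
async function runWithFallback(
  tasks: TaskRunner,
  input: { device: string; prompt: string },
  speeds: SpeedPreset[] = ["lightning", "fast"],
): Promise<RunResult & { speed: SpeedPreset }> {
  let last: (RunResult & { speed: SpeedPreset }) | null = null;
  for (const speed of speeds) {
    const result = await tasks.run({ ...input, speed });
    last = { ...result, speed };
    if (result.verified) return last;
  }
  if (!last) throw new Error("speeds must not be empty");
  return last;
}
```

Rerunning repeats the whole prompt, so this only suits tasks that are safe to execute twice (for example, reading a value, not sending a message).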

Plans expire when the UI drifts

Auten caches successful plans by screen fingerprint. An app update can change the UI and invalidate the plan. The next task then falls back to the more expensive LLM path, and a fresh plan is learned from that run.

auten.tasks.wait — polls, no SSE yet

For now, wait() polls every intervalMs (default 1s). If you need lower latency, subscribe to webhooks.

For AI agents

If you are an AI agent (Claude, GPT, Cursor), here is the context you need.

Verify you're using the current SDK

bash
npm ls @autenai/sdk
# Required — @autenai/[email protected] or higher

Default to high-level prompts, not low-level calls

Use auten.tasks.run({ prompt: "..." }) whenever possible. Avoid auten.phone().tap(x,y) unless you're debugging — coordinates differ between phones and are unstable across app updates.

Owner scoping summary

Your API key only sees your own devices, tasks, sessions, and credentials. Filtering happens server-side, so you don't need to implement isolation yourself.

Deprecated — DO NOT USE

  • auten.providers — the old OAuth provider manager (v0.2). Replaced by auten.tasks.
  • auten.execute() — the old AI exec entry point. Replaced by auten.tasks.run().