Autonomy Levels
A developmental taxonomy of agent independence—from fully supervised infancy to unsupervised autonomy, with the stages between.
In developmental psychology, we track the stages through which children gain independence: from total dependence on caregivers, to supervised exploration, to adolescent semi-autonomy, to adult self-direction.
Autonomy levels provide an analogous framework for AI agents—a taxonomy of how much independence an agent exercises and how much human oversight it requires.
The Autonomy Spectrum
Agents don’t exist in a binary state of “autonomous” or “not autonomous.” They occupy positions on a spectrum:
```mermaid
graph LR
L0["Level 0<br/>TOOL<br/>human does all"]
L1["Level 1<br/>ASSISTED<br/>human decides<br/>agent assists"]
L2["Level 2<br/>PARTIAL<br/>shared control"]
L3["Level 3<br/>SUPERVISED<br/>agent decides<br/>human approves"]
L4["Level 4<br/>FULL<br/>agent decides<br/>and acts"]
L0 --> L1 --> L2 --> L3 --> L4
style L0 fill:#0a0a0a,stroke:#666666,stroke-width:1px,color:#cccccc
style L1 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
style L2 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
style L3 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
style L4 fill:#0a0a0a,stroke:#00ff00,stroke-width:2px,color:#cccccc
```
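For concreteness, the spectrum can be encoded as an ordered enum. This is a minimal sketch; the names mirror the diagram above and are illustrative, not drawn from any standard.

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """Illustrative encoding of the five-level spectrum."""
    TOOL = 0        # human does all
    ASSISTED = 1    # human decides, agent assists
    PARTIAL = 2     # shared control
    SUPERVISED = 3  # agent decides, human approves/monitors
    FULL = 4        # agent decides and acts
```

Using `IntEnum` keeps the levels ordered, so policies can compare them directly (e.g. `level >= AutonomyLevel.PARTIAL`).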
Level 0: Tool Mode
The agent has no autonomy. It responds to specific queries with specific outputs. The human maintains complete control over all decisions and actions.
Characteristics:
- Single-turn interactions
- No persistent state
- No action-taking capability
- Human interprets and acts on outputs
Examples:
- Basic chatbots
- Code completion (suggesting, not executing)
- Translation services
Developmental parallel: An infant, entirely dependent on caregivers for all needs.
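In code, tool mode reduces to a stateless, single-turn call. A minimal sketch, where `complete` stands in for any hypothetical text-in/text-out model client:

```python
def tool_mode(query: str, complete) -> str:
    """Level 0 sketch: one query in, one output out. No state
    persists between calls and nothing is executed; the human
    reads the output and decides what, if anything, to do."""
    return complete(query)
```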
Level 1: Assisted Mode
The agent provides recommendations, but the human makes all decisions and takes all actions. The agent augments human capability without replacing human judgment.
Characteristics:
- Can propose multi-step plans
- Recommendations require human approval
- No direct environmental access
- Human remains fully in the loop
Examples:
- Code review assistants
- Research summarizers
- Decision support systems
Developmental parallel: A young child who can express preferences but whose guardians make all significant decisions.
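The defining property is that proposing and acting are split between agent and human. A sketch under assumed interfaces (`agent.propose`, `human.approve`, and `human.execute` are hypothetical, not a real API):

```python
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    summary: str
    steps: list[str] = field(default_factory=list)  # multi-step plan

def assisted_mode(task: str, agent, human) -> None:
    """Level 1 sketch: the agent proposes, the human disposes."""
    rec = agent.propose(task)   # the agent can plan...
    if human.approve(rec):      # ...but every decision is human's
        human.execute(rec)      # ...and so is every action
```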
Level 2: Partial Autonomy
The agent can take some actions independently, while others require human approval. The boundary between autonomous and supervised actions is explicitly defined.
Characteristics:
- Some tools available without approval
- High-stakes actions require confirmation
- Operates within defined constraints
- Human can intervene at any point
Examples:
- Coding agents that can read/write files but need approval to execute
- Email drafting agents that require send confirmation
- Research agents with read access but no write access
Developmental parallel: An older child with some independence (can walk to school alone) but restrictions on major decisions.
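The key design decision at this level is where the approval boundary sits, and that it is written down rather than implicit. A sketch; the tool names and the `ask_human` callback are illustrative assumptions:

```python
AUTO_ALLOWED = {"read_file", "write_file", "search"}  # safe without approval
NEEDS_APPROVAL = {"execute_code", "send_email"}       # high stakes: confirm first

def run_tool(name: str, args: dict, tools: dict, ask_human) -> object:
    """Level 2 sketch: the autonomous/supervised boundary is explicit."""
    if name in AUTO_ALLOWED:
        return tools[name](**args)
    if name in NEEDS_APPROVAL and ask_human(f"Allow {name}({args})?"):
        return tools[name](**args)
    # anything else falls outside the agent's defined constraints
    raise PermissionError(f"{name} not permitted")
```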
Level 3: Supervised Autonomy
The agent operates independently but under observation. It makes decisions and takes actions, but humans monitor its behavior and can intervene if needed.
Characteristics:
- Broad action-taking authority
- Logging and transparency requirements
- Human review of outcomes rather than pre-approval of individual actions
- Intervention capability preserved
Examples:
- Autonomous coding sessions with audit logs
- Customer service agents with escalation protocols
- Trading systems with position limits
Developmental parallel: A teenager with significant freedom but ongoing parental oversight and the possibility of intervention.
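Operationally, this level trades pre-approval for transparency and interruptibility. A sketch; `agent.done`, `agent.next_action`, and `kill_switch.triggered` are hypothetical interfaces:

```python
import logging

log = logging.getLogger("agent.audit")

def supervised_run(agent, task, kill_switch) -> list:
    """Level 3 sketch: the agent acts without pre-approval, but every
    action is logged and a human can halt the run at any point."""
    outcomes = []
    while not agent.done():
        if kill_switch.triggered():        # intervention preserved
            log.warning("human intervention: halting run")
            break
        action = agent.next_action(task)   # no approval gate
        result = action.run()
        log.info("action=%r result=%r", action, result)
        outcomes.append(result)
    return outcomes  # reviewed by humans after the fact
```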
Level 4: Full Autonomy
The agent operates independently without real-time human oversight. It sets its own subgoals, takes actions, and handles consequences. Humans may set high-level objectives but don’t supervise execution.
Characteristics:
- Self-directed goal pursuit
- No approval requirements
- Long time horizons
- Human involvement only at setup and review
Examples:
- Long-running research agents
- Fully autonomous vehicles
- Self-improving systems (theoretical)
Developmental parallel: An adult, fully independent and responsible for their own decisions.
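Here the control flow collapses to setup and review. A sketch under assumed interfaces (all method names are hypothetical):

```python
def full_autonomy(agent, objective: str):
    """Level 4 sketch: humans set the objective and read the final
    report; nothing in between requires their involvement."""
    agent.set_objective(objective)       # human involvement: setup
    while not agent.objective_met():
        subgoal = agent.plan_subgoal()   # self-directed goal pursuit
        agent.pursue(subgoal)            # no approval requirements
    return agent.report()                # human involvement: review
```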
Factors Determining Appropriate Level
What autonomy level is appropriate for a given agent? Key factors include:
Capability
Can the agent reliably accomplish the task? Higher capability enables higher autonomy.
Reversibility
Can mistakes be undone? Reversible actions (editing a draft) tolerate more autonomy than irreversible ones (sending an email).
Stakes
What’s the cost of failure? Higher stakes demand more oversight.
Domain
Is the domain well-specified or open-ended? Constrained domains are safer for autonomy.
Trust
Has the agent demonstrated reliability? Trust is earned through track record.
```mermaid
graph TD
subgraph Primary["primary_factor"]
S1["high_stakes → lower_autonomy"]
S2["low_stakes → higher_autonomy"]
end
subgraph Secondary["secondary_factors"]
F1["irreversible → lower"]
F2["reversible → higher"]
F3["low_trust → lower"]
F4["high_trust → higher"]
F5["open_domain → lower"]
F6["constrained → higher"]
F7["low_capability → lower"]
F8["high_capability → higher"]
end
style Primary fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
style Secondary fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#cccccc
style S1 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
style S2 fill:#0a0a0a,stroke:#00ff00,stroke-width:1px,color:#cccccc
style F1 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
style F2 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
style F3 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
style F4 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
style F5 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
style F6 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
style F7 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
style F8 fill:#0a0a0a,stroke:#333333,stroke-width:1px,color:#666666
```
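These factors can be folded into a first-pass heuristic. The sketch below treats stakes as the primary factor, per the diagram; the specific weights are illustrative assumptions, not a validated policy:

```python
def recommended_level(stakes: str, reversible: bool, trust: str,
                      constrained: bool, capable: bool) -> int:
    """Toy heuristic mapping the five factors to a level (0-4)."""
    level = 4
    if stakes == "high":
        level -= 2      # primary factor: high stakes cap autonomy hardest
    if not reversible:
        level -= 1      # irreversible actions demand oversight
    if trust == "low":
        level -= 1      # trust is earned through track record
    if not constrained:
        level -= 1      # open-ended domains are riskier
    if not capable:
        level -= 1      # unreliable agents need a human in the loop
    return max(level, 0)
```

For example, a high-stakes, irreversible task given to a low-trust agent scores 0 under this sketch: pure tool mode.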
Dynamic Autonomy
Autonomy levels need not be static. Sophisticated systems adjust autonomy based on:
- Task phase: More oversight during critical moments
- Confidence: Higher autonomy when the agent is certain
- Track record: Earned autonomy over time
- Anomaly detection: Reduced autonomy when behavior is unusual
This mirrors human development: we grant children more independence as they demonstrate readiness.
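A sketch of such an adjustment policy; the thresholds (0.9 confidence, 100 prior successes) are invented for illustration:

```python
def adjust_autonomy(base: int, critical_phase: bool, confidence: float,
                    successes: int, anomaly: bool) -> int:
    """Dynamic-autonomy sketch: shift the level up or down at runtime."""
    level = base
    if critical_phase:
        level -= 1             # more oversight during critical moments
    if confidence > 0.9:
        level += 1             # higher autonomy when the agent is certain
    if successes > 100:
        level += 1             # earned autonomy over time
    if anomaly:
        level = min(level, 1)  # unusual behavior: fall back to assisted
    return max(0, min(level, 4))
```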
The Autonomy-Alignment Relationship
A crucial insight: autonomy and alignment must scale together.
An agent with high autonomy but poor alignment is dangerous. An agent with high alignment but low autonomy is merely underutilized.
| Alignment | Low Autonomy | High Autonomy |
|---|---|---|
| Low | Safe but limited | Dangerous |
| High | Underutilized | Ideal |
This is why the field progresses carefully: we increase autonomy only as alignment techniques improve.
Classification in Practice
When categorizing an agent, consider:
- What actions can it take without approval?
- What approval mechanisms exist?
- What oversight is applied during operation?
- How are outcomes reviewed?
- Can autonomy level be adjusted dynamically?
The answers position the agent on the spectrum—and reveal the implicit trust assumptions built into its design.
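The questions also lend themselves to a structured audit. A sketch, with illustrative field names and an assumed rough mapping to the spectrum:

```python
from dataclasses import dataclass

@dataclass
class AutonomyAudit:
    unapproved_actions: list[str]   # actions taken without approval
    approval_mechanisms: list[str]  # gates on high-stakes actions
    runtime_oversight: bool         # is operation monitored live?
    outcome_review: bool            # are results reviewed afterward?
    dynamic: bool                   # can the level shift at runtime?

    def rough_level(self) -> int:
        """Crude placement on the 0-4 spectrum."""
        if not self.unapproved_actions:
            return 1 if self.approval_mechanisms else 0
        if self.approval_mechanisms:
            return 2
        return 3 if self.runtime_oversight else 4
```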
See Also
- Scaffolding — the structures that constrain and enable agent autonomy
- Human-in-the-Loop — the role of human oversight in agent systems
- The Agent Loop — the fundamental cycle autonomy is applied to