PMAI PM Playbook

Agent PM starter pack

Guide 09

Use this when the AI can call tools, take actions, or run multi-step workflows. Agents need stricter product definition than chatbots because they create side effects.

The starter checklist

| Question | Artifact |
| --- | --- |
| What actions can the agent take, and at what autonomy level? | AI PRD |
| Which tools can it call, with what constraints? | AI PRD |
| What can the agent never do? | AI PRD + Operating AI Products |
| Which actions require approval, rollback, or audit trail? | Human Review Workflow |
| How will trajectory quality be evaluated? | Eval Plan + Agentic Products |
| What is the cost ceiling per task? | Cost Model |
| What happens when the agent gets stuck or confidence is low? | AI PRD + Launch Gate Checklist |

1. Map autonomy by action

Do not assign one autonomy level to the whole agent. Assign a level to each action:

| Action | Recommended starting level | Why |
| --- | --- | --- |
| Draft content | Draft | User can review before anything happens |
| Recommend a next step | Suggest | User approves or rejects |
| Update reversible internal state | Act | Requires audit trail and rollback |
| Send external communication | Draft or Suggest | Customer-facing mistakes damage trust |
| Process money, permissions, identity, legal, health, or deletion | Human-only unless proven otherwise | High-severity harm or irreversible action |

Start lower than you think. Let reliability, user trust, and operations earn higher autonomy over time.
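One way to make the per-action mapping concrete is a small lookup that the agent runtime consults before acting. This is a minimal sketch, not a prescribed design: the action names, the `Autonomy` enum, and the ordering of levels (Draft below Suggest below Act) are assumptions made for illustration.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """Autonomy levels, ordered least to most autonomous (ordering is an assumption)."""
    HUMAN_ONLY = 0
    DRAFT = 1
    SUGGEST = 2
    ACT = 3


# Hypothetical action names; the levels follow the starting levels in the table above.
ACTION_LEVELS = {
    "draft_content": Autonomy.DRAFT,
    "recommend_next_step": Autonomy.SUGGEST,
    "update_internal_state": Autonomy.ACT,
    "send_external_message": Autonomy.DRAFT,
    "process_payment": Autonomy.HUMAN_ONLY,
}


def allowed_level(action: str) -> Autonomy:
    """Return the level for an action; unmapped actions get the most restrictive level."""
    return ACTION_LEVELS.get(action, Autonomy.HUMAN_ONLY)


def may_execute(action: str) -> bool:
    """Only actions mapped to ACT may run without a human in the loop."""
    return allowed_level(action) >= Autonomy.ACT
```

Defaulting unknown actions to `HUMAN_ONLY` reflects the "start lower than you think" rule: a new action gets no autonomy until someone assigns it a level.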

2. Define tool boundaries

For every tool, document:

  • What data it can read
  • What data it can write
  • Which tenants, accounts, or objects it can access
  • Whether the action has side effects
  • Whether the action is reversible
  • What happens when the tool fails

If a tool can send, delete, charge, approve, publish, or change permissions, it needs confirmation or rollback.
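The documentation fields above can also be captured as a record per tool, which makes the confirmation-or-rollback rule checkable. The sketch below is illustrative only: `ToolSpec`, its field names, and the verb labels are hypothetical, and choosing rollback for reversible tools and confirmation for irreversible ones is one possible reading of the rule.

```python
from dataclasses import dataclass

# Verbs taken from the rule above; the exact string labels are assumptions.
HIGH_IMPACT_VERBS = frozenset(
    {"send", "delete", "charge", "approve", "publish", "change_permissions"}
)


@dataclass(frozen=True)
class ToolSpec:
    """One record per tool, mirroring the documentation checklist."""
    name: str
    verb: str               # what kind of action the tool performs
    reads: tuple            # data it can read
    writes: tuple           # data it can write
    scope: str              # tenants, accounts, or objects it can access
    has_side_effects: bool
    reversible: bool
    on_failure: str         # what happens when the tool fails


def required_control(tool: ToolSpec) -> str:
    """Return the control a tool needs before it can be exposed to the agent."""
    if tool.verb not in HIGH_IMPACT_VERBS:
        return "none"
    # Assumption: reversible high-impact tools rely on rollback,
    # irreversible ones require confirmation before the call.
    return "rollback" if tool.reversible else "confirmation"
```

For example, a hypothetical `send_email` tool with `verb="send"` and `reversible=False` would come back as `"confirmation"`.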

3. Design rollback before action

An agent should not take an action unless the team knows how to undo it or has decided the action must be reviewed first.

| Action type | Required control |
| --- | --- |
| Reversible internal update | Change log, undo path, monitoring owner |
| Customer-visible output | Human approval or safe draft state |
| Irreversible action | Human approval before action |
| Tool failure or partial completion | Stop, report state, and ask for review |
| Repeated failed attempts | Cost ceiling and loop termination |

Rollback is part of the product design, not an implementation detail.
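For reversible internal updates, "change log plus undo path" can be as simple as recording an undo function next to every applied change. The following is a minimal sketch under that assumption; the `ChangeLog` class and its method names are hypothetical, and a real system would also persist the log and name a monitoring owner.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ChangeLog:
    """Records each applied change together with the function that undoes it."""
    entries: List[Tuple[str, Callable[[], None]]] = field(default_factory=list)

    def apply(self, description: str, do: Callable[[], None],
              undo: Callable[[], None]) -> None:
        """Run the change, then log it so it can be reversed later."""
        do()
        self.entries.append((description, undo))

    def rollback(self) -> List[str]:
        """Undo all logged changes, most recent first; return what was undone."""
        undone = []
        while self.entries:
            description, undo = self.entries.pop()
            undo()
            undone.append(description)
        return undone


# Usage: a hypothetical ticket status update that can be reverted.
ticket = {"status": "open"}
log = ChangeLog()
log.apply(
    "close ticket",
    do=lambda: ticket.update(status="closed"),
    undo=lambda: ticket.update(status="open"),
)
```

The design choice worth noting is that `apply` refuses to exist without an `undo` argument: an action with no known undo path cannot be logged as reversible and has to go through approval instead.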

4. Evaluate the trajectory, not only the outcome

For agents, a correct final answer can still be a product failure if the path was unsafe.

Evaluate each of the following:

  • Tool calls: did the agent use only allowed tools?
  • Parameters: did it pass safe and correct arguments?
  • Sequence: did each step follow from the prior state?
  • Efficiency: did it avoid unnecessary retries and loops?
  • Boundary handling: did it stop or escalate when out of scope?
  • Final output: did the result satisfy the user goal?

Add trajectory failures to the eval plan and error analysis loop.
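A trajectory check can start as a plain function over the recorded steps. The sketch below covers only three of the dimensions above (allowed tools, repeated identical calls, and final outcome); the `Step` record, the `max_retries` threshold, and the failure message wording are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Step:
    """One recorded tool call in an agent trajectory (hypothetical shape)."""
    tool: str
    args: dict
    ok: bool


def evaluate_trajectory(steps: List[Step], allowed_tools: Set[str],
                        max_retries: int, goal_met: bool) -> List[str]:
    """Return a list of trajectory failures; an empty list means no issue found."""
    failures = []

    # Tool calls: flag any tool outside the allowlist.
    for i, step in enumerate(steps):
        if step.tool not in allowed_tools:
            failures.append(f"step {i}: disallowed tool {step.tool}")

    # Efficiency: flag a run of identical consecutive calls that exceeds max_retries.
    run = 0
    for prev, cur in zip(steps, steps[1:]):
        if (cur.tool, cur.args) == (prev.tool, prev.args):
            run += 1
            if run == max_retries + 1:  # report each run only once
                failures.append(f"repeated call to {cur.tool} exceeded {max_retries} retries")
        else:
            run = 0

    # Final output: the outcome still counts, it is just not the only check.
    if not goal_met:
        failures.append("final output did not satisfy the user goal")

    return failures
```

A run can return the right answer and still produce a non-empty failure list, which is the case this section is about.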

5. Set cost ceilings

Agent costs compound through tool calls, retries, and growing context. Define:

  • Maximum cost per task
  • Maximum number of tool calls
  • Maximum retries per tool
  • Timeout or stop condition
  • User-facing behavior when the ceiling is hit

If the agent reaches the ceiling, it should stop, summarize progress, and ask whether to continue or hand off.
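The ceilings listed above can be tracked with a small per-task budget object. This is a sketch under stated assumptions: the `TaskBudget` class, its field names, and the use of US dollars are hypothetical, and the timeout condition is left out for brevity.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskBudget:
    """Tracks spend for one task against its ceilings."""
    max_cost_usd: float
    max_tool_calls: int
    max_retries_per_tool: int
    spent_usd: float = 0.0
    tool_calls: int = 0
    retries: Counter = field(default_factory=Counter)

    def record(self, tool: str, cost_usd: float, is_retry: bool = False) -> None:
        """Record one tool call and its cost."""
        self.spent_usd += cost_usd
        self.tool_calls += 1
        if is_retry:
            self.retries[tool] += 1

    def exceeded(self) -> Optional[str]:
        """Return the name of the first ceiling that was hit, or None."""
        if self.spent_usd > self.max_cost_usd:
            return "cost ceiling"
        if self.tool_calls > self.max_tool_calls:
            return "tool-call ceiling"
        if any(n > self.max_retries_per_tool for n in self.retries.values()):
            return "retry ceiling"
        return None
```

The agent loop would call `exceeded()` after each step; a non-`None` result is the trigger to stop, summarize progress, and ask whether to continue or hand off.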

6. Make handoff explicit

The agent needs a clear path when it cannot safely continue:

  • Hand off to a human reviewer
  • Ask the user for missing information
  • Produce a draft instead of taking action
  • Show uncertainty and stop
  • Fall back to the existing non-AI workflow

"Try again" is not an escalation policy.

Launch blockers for agents

Do not launch an agent beyond internal testing if:

  • Tool boundaries are not defined
  • Any high-impact action has no approval or rollback path
  • Trajectory evals do not exist
  • Cost ceilings are missing
  • There is no audit trail for tool calls and actions
  • The human handoff path is not staffed or tested

Use the Launch Gates guide when deciding whether the agent can move from prototype to pilot, limited production, or scale.
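The blocker list above can double as a pre-launch check that a team runs against its own readiness record. This is a minimal sketch: the dictionary keys are hypothetical names for the six blockers, and anything not explicitly marked ready is treated as a blocker.

```python
# Hypothetical readiness keys mapped to the blocker each one clears.
LAUNCH_BLOCKERS = {
    "tool_boundaries_defined": "Tool boundaries are not defined",
    "high_impact_actions_controlled": "A high-impact action has no approval or rollback path",
    "trajectory_evals_exist": "Trajectory evals do not exist",
    "cost_ceilings_set": "Cost ceilings are missing",
    "audit_trail_in_place": "There is no audit trail for tool calls and actions",
    "handoff_staffed_and_tested": "The human handoff path is not staffed or tested",
}


def launch_blockers(readiness: dict) -> list:
    """Return the blockers still open; missing keys count as not ready."""
    return [
        message
        for key, message in LAUNCH_BLOCKERS.items()
        if not readiness.get(key, False)
    ]
```

An empty result means none of the six blockers applies; it does not by itself decide which launch stage the agent is ready for.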
