PMAI PM Playbook

Post-launch review (week 2)

Customer Support Copilot

Feature: Customer Support Copilot
Pilot week: Week 2
Dates: 2026-03-04 to 2026-03-10
Author: Maya Patel, PM

Summary

The pilot is useful but not ready to expand beyond the top 10 intents. Agents like the drafts when KB coverage is strong. Quality drops on multi-intent tickets and when retrieval surfaces stale KB articles, so the next move is quality cleanup, not broader rollout.

What shipped

  • Low-confidence fallback now suppresses drafts with a confidence score below 0.6 and shows the top 3 KB articles instead.
  • Accept, edit, reject, escalation, latency, and cost events are live in the dashboard.
  • Added 18 pilot failures to the regression set.
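The low-confidence fallback in the first bullet can be sketched roughly as below. This is a minimal illustration, not the shipped implementation: the 0.6 threshold and the top-3 article count come from this review, while `CopilotResponse`, `build_response`, the `"draft"` / `"kb_only"` mode labels, and the choice to return no articles alongside a draft are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

CONFIDENCE_THRESHOLD = 0.6   # from this review: drafts below 0.6 are suppressed
FALLBACK_ARTICLE_COUNT = 3   # from this review: fallback shows the top 3 KB articles


@dataclass
class CopilotResponse:
    """Hypothetical response shape; field names are assumptions."""
    mode: str                  # "draft" or "kb_only" (assumed labels)
    draft: Optional[str]       # None when the draft is suppressed
    kb_articles: List[str]     # populated only in fallback mode (assumption)


def build_response(draft: str, confidence: float,
                   ranked_articles: List[str]) -> CopilotResponse:
    """Show the draft, or fall back to KB articles when confidence is low."""
    if confidence < CONFIDENCE_THRESHOLD:
        # Suppress the draft and surface the highest-ranked articles only.
        top_articles = ranked_articles[:FALLBACK_ARTICLE_COUNT]
        return CopilotResponse("kb_only", None, top_articles)
    return CopilotResponse("draft", draft, [])
```

A score of exactly 0.6 still shows the draft here, since the rule as written suppresses only drafts below the threshold.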

Metrics

Metric | Week 2 | Target | Status
Active pilot agents | 8 | 8 | On track
Drafts shown | 219 | N/A | Informational
Accept rate | 68% | >= 70% | Slightly below target
Edit rate | 24% | < 25% | On track
Reject rate | 8% | < 8% | Watch (at threshold)
Low-confidence fallback rate | 14% | 10-20% | On track
Escalation rate | 4% | < 5% | On track
Hallucinated policy claims | 0 | 0 | On track
p95 latency | 4.8s | < 4s | Misses target
Cost per draft | $0.011 | < $0.03 | On track
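The status column can be cross-checked mechanically. The sketch below copies the Week 2 values and targets from the table above and flags every metric that does not strictly meet its target. The structure (`METRICS`, `meets_target`, `missed_targets`) is an assumption for illustration and not part of the pilot dashboard. Note that the reject rate sits exactly at 8%, which does not satisfy a strict `< 8%` target.

```python
import operator

# Comparison operators used by the targets in the metrics table.
OPS = {
    ">=": operator.ge,
    "<": operator.lt,
    "==": operator.eq,
    # Inclusive range check for targets such as 10-20%.
    "between": lambda value, bounds: bounds[0] <= value <= bounds[1],
}

# (metric, week-2 value, comparison, target), copied from the table above.
# "Drafts shown" is omitted because it has no target.
METRICS = [
    ("Active pilot agents", 8, "==", 8),
    ("Accept rate (%)", 68, ">=", 70),
    ("Edit rate (%)", 24, "<", 25),
    ("Reject rate (%)", 8, "<", 8),
    ("Low-confidence fallback rate (%)", 14, "between", (10, 20)),
    ("Escalation rate (%)", 4, "<", 5),
    ("Hallucinated policy claims", 0, "==", 0),
    ("p95 latency (s)", 4.8, "<", 4),
    ("Cost per draft ($)", 0.011, "<", 0.03),
]


def meets_target(value, comparison, target):
    """Return True when the value satisfies its target."""
    return OPS[comparison](value, target)


def missed_targets(metrics=METRICS):
    """Names of metrics that do not meet their target, in table order."""
    return [name for name, value, comparison, target in metrics
            if not meets_target(value, comparison, target)]


print(missed_targets())
# ['Accept rate (%)', 'Reject rate (%)', 'p95 latency (s)']
```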

Quality Review

Manual review sample: 40 draft-vs-sent pairs.

Finding | Count | Notes
Accurate and sent with minor edits | 27 | Strongest on billing and plan-change intents
Missing secondary question | 6 | Mostly multi-intent tickets
Tone too generic | 4 | Agents edited before sending
Cited stale KB article | 3 | All from cancellation policy articles

No confirmed hallucinations. The main issue is incompleteness, not fabrication.
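For readers who prefer shares to raw counts, the review sample can be converted as in the sketch below. The counts and the sample size of 40 come from the table above; the function name `finding_shares` is an assumption. The check that the counts sum to the sample size guards against a finding being double-counted or dropped.

```python
# Counts copied from the manual review table (40 draft-vs-sent pairs).
FINDINGS = {
    "Accurate and sent with minor edits": 27,
    "Missing secondary question": 6,
    "Tone too generic": 4,
    "Cited stale KB article": 3,
}
SAMPLE_SIZE = 40


def finding_shares(findings, sample_size):
    """Return each finding's share of the review sample, in percent."""
    total = sum(findings.values())
    if total != sample_size:
        raise ValueError(f"counts sum to {total}, expected {sample_size}")
    return {name: 100 * count / sample_size
            for name, count in findings.items()}


for name, share in finding_shares(FINDINGS, SAMPLE_SIZE).items():
    print(f"{name}: {share:.1f}%")
# Accurate and sent with minor edits: 67.5%
# Missing secondary question: 15.0%
# Tone too generic: 10.0%
# Cited stale KB article: 7.5%
```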

User Feedback

Agents say the drafts save time on routine tickets, but they do not trust the feature for multi-topic tickets yet. Two agents asked for clearer visual separation between “ready to edit” drafts and “use KB only” fallback states.

Incidents

None.

Top Failure Modes

  1. Multi-intent tickets: the draft answers the first issue and misses the second.
  2. Stale KB retrieval: old cancellation articles still appear in retrieval results.
  3. Latency spikes: p95 misses target during morning queue peaks.
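To confirm that the p95 miss is concentrated in the morning peak, latency can be bucketed by hour of day. The sketch below is one possible approach, assuming latency events are available as `(hour_of_day, latency_seconds)` pairs. That event shape, the function names, and the nearest-rank percentile method are assumptions, not a description of the dashboard's actual calculation. Only the 4s target comes from this review.

```python
from collections import defaultdict

P95_TARGET_SECONDS = 4.0  # from this review: p95 latency target is < 4s


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("p95 needs at least one value")
    ordered = sorted(values)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def p95_by_hour(events):
    """Group (hour_of_day, latency_seconds) pairs and compute p95 per hour."""
    buckets = defaultdict(list)
    for hour, latency in events:
        buckets[hour].append(latency)
    return {hour: p95(latencies) for hour, latencies in sorted(buckets.items())}


def hours_over_target(events, target_seconds=P95_TARGET_SECONDS):
    """Hours whose p95 does not meet the strict '< target' requirement."""
    return [hour for hour, value in p95_by_hour(events).items()
            if value >= target_seconds]
```

If the hours returned cluster around the morning queue peak, that supports treating the issue as load-related rather than a general regression.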

Decisions

  • Do not expand to top 50 intents yet.
  • Keep the pilot on the current 8-agent cohort for one more week.
  • Treat multi-intent handling and KB freshness as blockers before production.

Next Actions

Action | Owner | Due
Add 25 multi-intent examples to eval set | ML Lead | 2026-03-14
Remove deprecated cancellation articles from retrieval index | Support Lead | 2026-03-12
Investigate p95 latency during morning peak | Engineering | 2026-03-13
Update review UI state labels for draft vs fallback | Design | 2026-03-16
Re-run full eval after KB cleanup | PM + ML Lead | 2026-03-17