The Prompt Improvements page closes the loop on the failures TruLayer observes in production. The platform clusters production failures, asks an LLM to synthesize a candidate prompt diff, replays a held-out trace set against both the live prompt and the candidate, and proposes the diff for review.
State machine
A prompt deployment moves through a fixed state machine. Every transition is recorded; you can see the current state on every row of the list.

| State | What it means |
|---|---|
| proposed | The cluster→diff worker synthesized a candidate prompt. Awaiting A/B replay. |
| ab_running | The A/B harness is replaying the held-out trace set. |
| ab_passed | A/B replay finished and the candidate is recommended for ship. |
| ab_failed | A/B replay finished and the candidate did not improve the metric set. |
| shipped | An owner approved the candidate and it is now live for the project. |
| monitoring | The shipped prompt is being observed against the post-ship rolling window. |
| regressed | The rolling window crossed the regression threshold. A regression banner appears across the dashboard. |
| rolled_back | An owner (or the auto-rollback monitor) rolled the shipped prompt back to its predecessor. |
| rejected | An owner rejected the candidate before it was shipped. |
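If you mirror deployment states in your own tooling, the table above can be modeled as a small enum plus a transition map. The sketch below is illustrative only: the state names come from the table, but the `ALLOWED` map, `can_transition`, and `is_terminal` are hypothetical, and the set of permitted transitions is an assumption inferred from the state descriptions rather than a documented TruLayer contract.

```python
from enum import Enum


class State(str, Enum):
    """Deployment states, named exactly as in the table above."""
    PROPOSED = "proposed"
    AB_RUNNING = "ab_running"
    AB_PASSED = "ab_passed"
    AB_FAILED = "ab_failed"
    SHIPPED = "shipped"
    MONITORING = "monitoring"
    REGRESSED = "regressed"
    ROLLED_BACK = "rolled_back"
    REJECTED = "rejected"


# ASSUMPTION: this transition map is inferred from the state descriptions.
# The platform may allow other paths (for example, shipping without a
# passing report), so treat it as a reading aid, not a specification.
ALLOWED = {
    State.PROPOSED: {State.AB_RUNNING, State.REJECTED},
    State.AB_RUNNING: {State.AB_PASSED, State.AB_FAILED},
    State.AB_PASSED: {State.SHIPPED, State.REJECTED},
    State.AB_FAILED: {State.REJECTED},
    State.SHIPPED: {State.MONITORING, State.ROLLED_BACK},
    State.MONITORING: {State.REGRESSED, State.ROLLED_BACK},
    State.REGRESSED: {State.ROLLED_BACK},
    State.ROLLED_BACK: set(),
    State.REJECTED: set(),
}


def can_transition(current: State, target: State) -> bool:
    """Return True if the assumed map permits moving from current to target."""
    return target in ALLOWED[current]


def is_terminal(state: State) -> bool:
    """A state is terminal when the assumed map lists no outgoing transitions."""
    return not ALLOWED[state]
```

Keeping the map as data rather than branching logic makes it easy to adjust if the real transition rules turn out to differ from this reading.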
Reviewing a candidate
Click any row to open the detail page. The detail page shows:
- Candidate diff: a unified-diff view of the proposed prompt against the currently shipped prompt. Added lines are highlighted green; removed lines red.
- A/B replay report: per-metric baseline, candidate, and delta. A passing report (green badge) is the default precondition for shipping.
- Decision controls: owner-only buttons to approve, reject, or roll back.
- Post-ship monitoring: the rolling regression metric over time. A dashed reference line marks the A/B candidate baseline so divergence is visible at a glance.
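To make the A/B replay report concrete, the sketch below computes the per-metric delta and applies one possible pass rule. Everything here is hypothetical: `MetricRow`, `report_passes`, and the `tolerance` parameter are illustrative names, and the rule itself (higher is better, no metric drops beyond the tolerance, at least one metric improves) is an assumption. The page does not specify how the platform decides that a report passes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRow:
    """One row of the report: a metric's baseline and candidate values."""
    name: str
    baseline: float
    candidate: float

    @property
    def delta(self) -> float:
        # Positive delta means the candidate scored higher than the baseline.
        return self.candidate - self.baseline


def report_passes(rows: list[MetricRow], tolerance: float = 0.0) -> bool:
    """ASSUMED pass rule, not the platform's documented one.

    Passes when no metric drops by more than `tolerance` and at least
    one metric strictly improves. Assumes higher values are better.
    """
    if not rows:
        return False
    no_regression = all(row.delta >= -tolerance for row in rows)
    some_gain = any(row.delta > 0 for row in rows)
    return no_regression and some_gain


# Usage with made-up metric names and values.
rows = [
    MetricRow("faithfulness", baseline=0.5, candidate=0.75),
    MetricRow("relevance", baseline=0.5, candidate=0.5),
]
passed = report_passes(rows)
```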
Approving and rolling back
Approve a candidate by clicking Approve & ship. The prompt becomes live immediately and enters the monitoring state. Until you approve, no production traffic is affected; the candidate is sandboxed in the A/B replay only.
Reject a candidate or roll back a shipped prompt by clicking Reject or Roll back. Both require a free-text reason that is recorded on the deployment.
Only users with the owner role can approve, reject, or roll back. See Team settings for roles.
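The rules in this section (owner-only decisions, and a mandatory free-text reason for reject and roll back) can be expressed as a small validation step. This is a minimal sketch for client-side tooling, assuming hypothetical names throughout: `Decision`, `validate_decision`, and the action strings `approve`, `reject`, and `roll_back` are illustrative and do not describe a TruLayer API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical action names; the reason requirement follows the text above.
REASON_REQUIRED = {"reject", "roll_back"}
ACTIONS = {"approve"} | REASON_REQUIRED


@dataclass(frozen=True)
class Decision:
    action: str
    actor_role: str
    reason: Optional[str] = None


def validate_decision(decision: Decision) -> list[str]:
    """Return a list of problems; an empty list means the decision is valid."""
    errors = []
    if decision.action not in ACTIONS:
        errors.append(f"unknown action: {decision.action}")
    if decision.actor_role != "owner":
        errors.append("only owners can approve, reject, or roll back")
    # Whitespace-only reasons are treated as missing.
    if decision.action in REASON_REQUIRED and not (decision.reason or "").strip():
        errors.append("a free-text reason is required")
    return errors
```

Returning a list of errors rather than raising on the first one lets a UI show every problem at once.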
Regression alerts
When a shipped prompt’s rolling regression metric crosses the configured threshold, the deployment moves to regressed and a banner appears across every dashboard page. Click the banner to jump to the regressed deployment and decide whether to roll back manually or let the auto-rollback monitor act.
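The threshold check described above can be pictured as a rolling mean compared against a fixed limit. The sketch below is an assumption-laden illustration, not the platform's monitor: `RollingRegressionMonitor`, `window_size`, and `threshold` are hypothetical names, and it assumes a higher-is-better score where "crossing" means the window mean falls below the threshold once the window is full.

```python
from collections import deque


class RollingRegressionMonitor:
    """Tracks the mean of the most recent scores over a fixed-size window."""

    def __init__(self, window_size: int, threshold: float) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.threshold = threshold
        # deque with maxlen drops the oldest score automatically.
        self._scores = deque(maxlen=window_size)

    def observe(self, score: float) -> bool:
        """Record a score; return True if the deployment looks regressed.

        ASSUMPTION: regression means the mean of a full window is below
        the threshold. Partial windows never trigger.
        """
        self._scores.append(score)
        if len(self._scores) < self._scores.maxlen:
            return False
        mean = sum(self._scores) / len(self._scores)
        return mean < self.threshold


# Usage: a three-score window with a 0.5 threshold.
monitor = RollingRegressionMonitor(window_size=3, threshold=0.5)
results = [monitor.observe(s) for s in [1.0, 1.0, 1.0, 0.0, 0.0]]
```

Waiting for a full window before evaluating avoids flagging a regression from the first one or two post-ship scores.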