The Prompt Improvements page closes the loop on the failures TruLayer observes in production. The platform clusters production failures, asks an LLM to synthesize a candidate prompt diff, replays a held-out trace set against both the live prompt and the candidate, and proposes the diff for review.

State machine

A prompt deployment moves through a fixed state machine. Every transition is recorded; you can see the current state on every row of the list.
State        What it means
proposed     The cluster→diff worker synthesized a candidate prompt. Awaiting A/B replay.
ab_running   The A/B harness is replaying the held-out trace set.
ab_passed    A/B replay finished and the candidate is recommended for ship.
ab_failed    A/B replay finished and the candidate did not improve the metric set.
shipped      An owner approved the candidate and it is now live for the project.
monitoring   The shipped prompt is being observed against the post-ship rolling window.
regressed    The rolling window crossed the regression threshold. A regression banner appears across the dashboard.
rolled_back  An owner (or the auto-rollback monitor) rolled the shipped prompt back to its predecessor.
rejected     An owner rejected the candidate before it was shipped.
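
The states above can be modelled as a small transition table. The following is a minimal sketch, not TruLayer's implementation: the `ALLOWED` map is an assumption inferred from the state descriptions (for example, that `ab_failed` candidates can only be rejected), and the `State` enum and `transition` helper are hypothetical names used for illustration.

```python
from enum import Enum


class State(str, Enum):
    PROPOSED = "proposed"
    AB_RUNNING = "ab_running"
    AB_PASSED = "ab_passed"
    AB_FAILED = "ab_failed"
    SHIPPED = "shipped"
    MONITORING = "monitoring"
    REGRESSED = "regressed"
    ROLLED_BACK = "rolled_back"
    REJECTED = "rejected"


# Assumed transition map, inferred from the state descriptions.
# The real platform may allow other moves (e.g. shipping after ab_failed).
ALLOWED = {
    State.PROPOSED: {State.AB_RUNNING, State.REJECTED},
    State.AB_RUNNING: {State.AB_PASSED, State.AB_FAILED},
    State.AB_PASSED: {State.SHIPPED, State.REJECTED},
    State.AB_FAILED: {State.REJECTED},
    State.SHIPPED: {State.MONITORING},
    State.MONITORING: {State.REGRESSED, State.ROLLED_BACK},
    State.REGRESSED: {State.ROLLED_BACK},
    State.ROLLED_BACK: set(),
    State.REJECTED: set(),
}


def transition(history, target):
    """Record a move to `target` if it is allowed from the current state."""
    current = history[-1]
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    history.append(target)  # every transition is kept in the history list
    return history
```

Keeping the full history list, rather than only the current state, mirrors the statement that every transition is recorded.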

Reviewing a candidate

Click any row to open the detail page. The detail page shows:
  • Candidate diff — a unified-diff view of the proposed prompt against the currently shipped prompt. Added lines are highlighted green; removed lines red.
  • A/B replay report — per-metric baseline, candidate, and delta. A passing report (green badge) is the default precondition for shipping.
  • Decision controls — owner-only buttons to approve, reject, or roll back.
  • Post-ship monitoring — the rolling regression metric over time. A dashed reference line marks the A/B candidate baseline so divergence is visible at a glance.
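
To make the A/B replay report concrete, here is a rough sketch of how per-metric baseline, candidate, and delta rows could be computed. The pass rule is an assumption for illustration only (every metric is treated as higher-is-better, none may regress, and at least one must improve); the actual criteria TruLayer uses are not described here. `MetricRow`, `build_report`, and `passes` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class MetricRow:
    name: str
    baseline: float   # score of the live prompt on the held-out traces
    candidate: float  # score of the candidate prompt on the same traces

    @property
    def delta(self) -> float:
        return self.candidate - self.baseline


def build_report(baseline: dict, candidate: dict) -> list:
    """Pair up metrics by name and return one row per metric."""
    return [MetricRow(n, baseline[n], candidate[n]) for n in sorted(baseline)]


def passes(rows) -> bool:
    # Assumed rule: higher is better, nothing regresses, something improves.
    return all(r.delta >= 0 for r in rows) and any(r.delta > 0 for r in rows)


rows = build_report(
    {"accuracy": 0.80, "groundedness": 0.70},
    {"accuracy": 0.85, "groundedness": 0.70},
)
for r in rows:
    print(f"{r.name}: baseline={r.baseline:.2f} candidate={r.candidate:.2f} delta={r.delta:+.2f}")
print("badge:", "green" if passes(rows) else "red")
```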

Approving and rolling back

Approve a candidate by clicking Approve & ship. The prompt becomes live immediately and enters the monitoring state. Until you approve, no production traffic is affected; the candidate is sandboxed in the A/B replay only.

Reject a candidate or roll back a shipped prompt by clicking Reject or Roll back. Both require a free-text reason that is recorded on the deployment.

Only users with the owner role can approve, reject, or roll back. See Team settings for roles.
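
The decision rules in this section (owner-only actions, and a mandatory reason for reject and roll back) can be expressed as a small validation function. This is an illustrative sketch only: `check_decision`, the action strings, and the role string `"owner"` are hypothetical and do not correspond to a documented TruLayer API.

```python
REASON_REQUIRED = {"reject", "roll_back"}
ACTIONS = {"approve"} | REASON_REQUIRED


def check_decision(role: str, action: str, reason: str = "") -> None:
    """Raise if the decision would not be permitted; return None otherwise."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if role != "owner":
        # Only owners can approve, reject, or roll back.
        raise PermissionError(f"role '{role}' cannot {action}")
    if action in REASON_REQUIRED and not reason.strip():
        # Reject and roll back both need a free-text reason.
        raise ValueError(f"{action} requires a reason")


check_decision("owner", "approve")
check_decision("owner", "roll_back", reason="Latency regressed after ship")
```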

Regression alerts

When a shipped prompt’s rolling regression metric crosses the configured threshold, the deployment moves to regressed and a banner appears across every dashboard page. Click the banner to jump to the regressed deployment and decide whether to roll back manually or let the auto-rollback monitor act.
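
As a rough illustration of the threshold check described above, the sketch below keeps a fixed-size rolling window of post-ship scores and flags a regression when the window mean falls more than a threshold below the A/B candidate baseline. The window size, the use of a mean, the higher-is-better metric, and the class name `RollingRegressionMonitor` are all assumptions; the platform's actual metric and threshold semantics may differ.

```python
from collections import deque


class RollingRegressionMonitor:
    def __init__(self, baseline: float, threshold: float, window: int):
        self.baseline = baseline      # candidate's score from the A/B replay
        self.threshold = threshold    # allowed drop before flagging
        self.scores = deque(maxlen=window)

    def observe(self, score: float) -> bool:
        """Add a post-ship score; return True once the window has regressed."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # wait until the window is full
        rolling_mean = sum(self.scores) / len(self.scores)
        return (self.baseline - rolling_mean) > self.threshold


monitor = RollingRegressionMonitor(baseline=0.85, threshold=0.05, window=3)
for score in [0.85, 0.84, 0.86, 0.60, 0.60]:
    if monitor.observe(score):
        print("deployment would move to regressed")
        break
```

Waiting for a full window before evaluating avoids flagging a regression on the first few noisy samples, at the cost of a slower first alert.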

Plan availability

Closed-loop prompt improvement is part of the Control plane feature set and requires the Team plan or above (the same gate as Control, Policies, Alert Rules, and Model Routing). Starter and Pro projects see an upgrade prompt in place of the page.