Every AI response receives a confidence score: a percentage reflecting how well the knowledge base supports the answer. This score drives automated decisions about whether to deliver the response, flag it for review, or escalate it to a human. Together, confidence scoring and escalation rules help users get reliable answers and let your team know when to intervene.
Confidence Bands
Confidence scores fall into three configurable bands. You set the thresholds in Settings; the table below shows the defaults.
| Band | Default range | Typical behavior |
|---|---|---|
| High | 90% and above | The assistant is confident in its answer. Response is delivered normally. |
| Medium | 60% – 89% | The assistant has a reasonable answer but some uncertainty. Response may be delivered with a review flag. |
| Low | Below 60% | The assistant cannot adequately answer. The conversation is typically escalated. |
Threshold boundaries are fully configurable per tenant. When you adjust them, a reclassification preview shows how existing conversations would be recategorized — so you can see the impact before saving.
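As a rough illustration of how thresholds partition scores, the following Python sketch classifies a score into a band and counts how a set of past scores would move under new thresholds. The names `Thresholds`, `classify`, and `reclassification_preview` are hypothetical and not part of the platform. The sketch also assumes each band's lower bound is inclusive, so a fractional score such as 89.5 lands in Medium; the platform's actual handling may differ.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical band boundaries in percent (defaults taken from the table above)."""
    high: float = 90.0    # scores at or above this are High
    medium: float = 60.0  # scores at or above this, but below `high`, are Medium


def classify(score: float, t: Thresholds = Thresholds()) -> str:
    """Map a confidence score (0-100) to a band name."""
    if score >= t.high:
        return "High"
    if score >= t.medium:
        return "Medium"
    return "Low"


def reclassification_preview(scores, old: Thresholds, new: Thresholds) -> Counter:
    """Count scores that would change band, keyed by (old_band, new_band)."""
    moves = Counter()
    for s in scores:
        before, after = classify(s, old), classify(s, new)
        if before != after:
            moves[(before, after)] += 1
    return moves


# Illustrative scores only, not real conversation data.
scores = [95, 88, 72, 61, 45]
preview = reclassification_preview(scores, Thresholds(), Thresholds(high=85, medium=65))
for (before, after), count in preview.items():
    print(f"{before} -> {after}: {count}")
```

With these sample scores, lowering the High threshold to 85 and raising the Medium threshold to 65 moves one conversation from Medium to High and one from Medium to Low.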
Escalation Rules
Escalation rules define what happens when a response falls into a particular confidence band. Each rule specifies a condition (confidence band) and an action to take.
Four Escalation Actions
| Action | What it does |
|---|---|
| Create Ticket | Generates a support ticket in the Tickets queue for human follow-up. The ticket includes the original question, the AI response, the confidence score, and the conversation context. |
| Flag for Review | Marks the conversation for curator review in the Conversation Logs. The response is still delivered to the user. |
| Suppress Response | Blocks the AI response entirely and notifies the user that their question requires human assistance. |
| Auto Respond | Delivers the AI response without any escalation, regardless of confidence. Use this for bands where you trust the assistant's judgment. |
Rule Ordering
When more than one escalation rule matches a response (for example, two rules that both apply to the Medium band), the platform evaluates the rules in priority order and executes the action of the first matching rule.
Every confidence band should have at least one matching escalation rule. If no rule matches a given band, the platform defaults to delivering the response without escalation, which may not be the behavior you want for low-confidence answers.
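The ordering and fallback behavior described above can be sketched in Python. This is a minimal model, not the platform's implementation: `Rule`, `resolve`, and `uncovered_bands` are hypothetical names, and the sketch assumes that a lower `priority` number is evaluated first and that the no-match default behaves like Auto Respond.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Action(Enum):
    CREATE_TICKET = "Create Ticket"
    FLAG_FOR_REVIEW = "Flag for Review"
    SUPPRESS_RESPONSE = "Suppress Response"
    AUTO_RESPOND = "Auto Respond"


@dataclass(frozen=True)
class Rule:
    name: str
    band: str      # "High", "Medium", or "Low"
    action: Action
    priority: int  # assumption: lower number is evaluated first


def resolve(band: str, rules: List[Rule]) -> Tuple[Optional[str], Action]:
    """Return (matched rule name, action); the first match in priority order wins."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.band == band:
            return rule.name, rule.action
    # No rule matched: model the documented default of delivering
    # the response without escalation (assumed equivalent to Auto Respond).
    return None, Action.AUTO_RESPOND


def uncovered_bands(rules: List[Rule], bands=("High", "Medium", "Low")) -> List[str]:
    """List bands that no rule targets, so gaps can be spotted before going live."""
    covered = {r.band for r in rules}
    return [b for b in bands if b not in covered]


# Hypothetical rule set: two rules compete for Medium, none covers Low.
rules = [
    Rule("medium-ticket", "Medium", Action.CREATE_TICKET, priority=2),
    Rule("medium-flag", "Medium", Action.FLAG_FOR_REVIEW, priority=1),
    Rule("high-auto", "High", Action.AUTO_RESPOND, priority=3),
]

print(resolve("Medium", rules))  # the priority-1 rule wins over the priority-2 rule
print(resolve("Low", rules))     # no match, so the response is delivered
print(uncovered_bands(rules))    # bands that still need a rule
```

In this rule set the Low band has no rule, so a low-confidence response would be delivered without escalation. A check like `uncovered_bands` is one way to catch that gap while tuning rules.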
Confidence in Practice
The confidence score is not a simple keyword-match percentage. It reflects how well the retrieved knowledge base content answers the user's specific question, taking into account:
- Relevance — how closely the retrieved content matches the query
- Coverage — whether the knowledge base has enough information to fully answer the question
- Consistency — whether multiple retrieved items agree or contradict each other
A high score means the knowledge base has strong, relevant, consistent content for this question. A low score means the assistant is uncertain — perhaps the question is outside the knowledge base's scope, or the available content is ambiguous.
Testing Escalation Rules
Use the Sandbox to test how confidence scoring and escalation rules behave with real queries. The Sandbox's audit preview tab shows:
- The confidence score the query received
- Which escalation rule matched
- What action would be taken in production
This lets you tune thresholds and rules before they affect live conversations.
Next Steps
- Settings — configure confidence thresholds and escalation rules
- Tickets — view and manage escalated conversations
- Sandbox — test confidence and escalation behavior
- Governance & Audit — learn how confidence data feeds into analytics