Guardrails | AskRAI Docs

Active AI Guardrails (AAIG) are the safety layer between user queries and AI responses. Every message is evaluated against your configured guardrail rules before the assistant responds. If any rule is violated, the response is blocked — protecting your organization from unsafe, non-compliant, or off-topic interactions.

How Guardrails Work

Each guardrail contains a natural-language rule that a language model evaluates independently. Guardrails run concurrently (up to three at a time) to minimize response latency, and evaluations are recorded in the audit trail according to each guardrail's reporting level.

Three key principles govern how guardrails operate:

Independent evaluation — each guardrail is evaluated by a separate LLM call, so rules never interfere with one another. A safety guardrail cannot affect how a compliance guardrail evaluates the same message.
Tripwire blocking — if any single guardrail fails, the entire response is blocked. There is no "majority rules" — one violation is enough to prevent delivery.
Concurrent execution — guardrails run in parallel to keep response times low. The system does not wait for one guardrail to finish before starting the next.
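The three principles above can be sketched in a few lines of Python. This is an illustrative sketch only, not AskRAI's implementation: the Guardrail class, the check_message function, and the keyword_evaluator stand-in for the LLM call are all hypothetical names. The concurrency cap of three is taken from the "up to three at a time" limit mentioned earlier.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

MAX_CONCURRENT = 3  # assumed cap, from "up to three at a time"


@dataclass(frozen=True)
class Guardrail:
    name: str
    rule: str  # natural-language rule text


# Hypothetical evaluator signature: returns True when the message passes the rule.
Evaluator = Callable[[Guardrail, str], Awaitable[bool]]


async def check_message(
    guardrails: list[Guardrail], message: str, evaluator: Evaluator
) -> tuple[bool, list[str]]:
    """Return (allowed, names of failed guardrails)."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def run_one(guardrail: Guardrail) -> tuple[str, bool]:
        # Independent evaluation: one separate call per guardrail.
        async with semaphore:
            return guardrail.name, await evaluator(guardrail, message)

    # Concurrent execution: all evaluations are scheduled together,
    # with at most MAX_CONCURRENT in flight at once.
    results = await asyncio.gather(*(run_one(g) for g in guardrails))

    # Tripwire blocking: a single failure blocks the response.
    failed = [name for name, passed in results if not passed]
    return (not failed, failed)


async def keyword_evaluator(guardrail: Guardrail, message: str) -> bool:
    """Toy stand-in for the LLM call: fails if the rule text appears in the message."""
    await asyncio.sleep(0)  # placeholder for network latency
    return guardrail.rule.lower() not in message.lower()
```

This sketch waits for every evaluation to finish before deciding, so that each result is available for logging. Whether the real system cancels remaining evaluations after the first failure is not stated in this page.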

Five Categories

Guardrails are organized into five categories, each targeting a different type of risk:

Safety guardrails prevent harmful, abusive, or dangerous content from entering or leaving the system.

Example rules:

Block requests that contain threats or harassment
Prevent the assistant from providing medical, legal, or financial advice
Flag messages that discuss self-harm

Compliance guardrails enforce regulatory and organizational policy requirements.

Example rules:

Ensure responses include required disclaimers for regulated topics
Block discussion of topics outside the organization's mandate
Enforce data handling policies for sensitive information

Quality guardrails maintain the standard and relevance of AI responses.

Example rules:

Reject queries that are too vague to produce a useful response
Ensure responses stay on-topic for the organization's domain
Block nonsensical or spam-like input

Privacy guardrails protect personally identifiable information (PII) and sensitive data.

Example rules:

Block messages containing social security numbers, credit card numbers, or addresses
Prevent the assistant from requesting personal information
Redact or reject queries that expose sensitive employee data

Custom guardrails address organization-specific requirements that don't fit the other categories.

Example rules:

Restrict responses to a specific language or dialect
Enforce branding guidelines in AI-generated text
Block discussion of competitors or sensitive internal topics

Priority and Reporting

Each guardrail has two additional configuration options:

Priority (0–10) — indicates criticality for admin triage. Higher-priority guardrails appear first in dashboards and alerts. Priority does not affect evaluation order — all guardrails run in parallel regardless of priority.
Reporting level — controls what gets logged to the audit trail:

Level	What is recorded
None	Evaluate silently — no audit entries
Alerts	Log only violations (failed evaluations)
All	Log every evaluation result, pass or fail
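As a rough illustration of the two options, the table and the priority rule can be written as a pair of small functions. The names ReportingLevel, should_log, and triage_order are hypothetical and chosen for this sketch; only the three level names and the 0 to 10 priority range come from the text above.

```python
from enum import Enum


class ReportingLevel(Enum):
    NONE = "None"
    ALERTS = "Alerts"
    ALL = "All"


def should_log(level: ReportingLevel, passed: bool) -> bool:
    """Decide whether one evaluation result is written to the audit trail."""
    if level is ReportingLevel.ALL:
        return True  # log every result, pass or fail
    if level is ReportingLevel.ALERTS:
        return not passed  # log violations only
    return False  # NONE: evaluate silently


def triage_order(guardrails: list[tuple[str, int]]) -> list[str]:
    """Order (name, priority) pairs for display, highest priority first.

    This affects presentation only; it says nothing about evaluation order.
    """
    for name, priority in guardrails:
        if not 0 <= priority <= 10:
            raise ValueError(f"priority for {name!r} must be between 0 and 10")
    ranked = sorted(guardrails, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked]
```

For example, should_log(ReportingLevel.ALERTS, passed=True) is False, so a passing evaluation under the Alerts level leaves no audit entry.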

Guardrails and Groups

Guardrails are assigned to groups, not applied globally. This means different user populations can have different safety rules. For example:

An internal employees group might have relaxed quality guardrails but strict compliance rules
A public access group might have aggressive safety and privacy guardrails
An admin group might bypass certain guardrails for testing purposes

A user's effective guardrails are the union of all guardrails from every group they belong to.
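The union rule can be shown with plain sets. This is a minimal sketch with hypothetical group and guardrail names; the mapping from groups to guardrails is assumed to be available as a dictionary.

```python
def effective_guardrails(
    user_groups: list[str], group_guardrails: dict[str, set[str]]
) -> set[str]:
    """Union of the guardrails assigned to every group the user belongs to."""
    effective: set[str] = set()
    for group in user_groups:
        # Groups with no assigned guardrails contribute nothing.
        effective |= group_guardrails.get(group, set())
    return effective


# Hypothetical assignments for illustration.
GROUP_GUARDRAILS = {
    "internal-employees": {"compliance-disclaimers", "data-handling"},
    "public-access": {"block-threats", "block-pii", "data-handling"},
    "admins": set(),
}
```

Under this reading of the union rule, joining an additional group can only add guardrails. A user in both "admins" and "public-access" in the example above would still be subject to all of the public access guardrails.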

False Positive Tracking

Not every guardrail trigger is a real violation. AskRAI supports false positive marking in the conversation logs, allowing you to:

Identify guardrails that fire too aggressively
Track false positive rates over time in the guardrail analytics
Refine guardrail prompts to reduce false triggers without weakening protection
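One way to reason about which guardrails fire too aggressively is to compute a per-guardrail false positive rate from the marked logs. The sketch below assumes the rate is defined as triggers marked false positive divided by all triggers for that guardrail; both that definition and the (name, marked_false_positive) record shape are assumptions, not the format AskRAI's analytics use.

```python
from collections import Counter


def false_positive_rates(triggers: list[tuple[str, bool]]) -> dict[str, float]:
    """Map each guardrail name to the share of its triggers marked false positive.

    Each record is (guardrail_name, marked_false_positive).
    """
    total: Counter[str] = Counter()
    false_positives: Counter[str] = Counter()
    for name, marked_false_positive in triggers:
        total[name] += 1
        if marked_false_positive:
            false_positives[name] += 1
    # Only guardrails that triggered at least once appear, so no division by zero.
    return {name: false_positives[name] / total[name] for name in total}
```

A guardrail with a high rate is a candidate for prompt refinement, while one with a rate near zero is probably firing on real violations.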

Next Steps

Guardrails — create, edit, and monitor guardrails in the admin console
Confidence & Escalation — understand what happens after guardrails pass
Governance & Audit — learn how guardrail evaluations feed into the audit trail
Sandbox — test guardrail behavior before deploying to production
