Model Routing | AskRAI Docs

Model Routing controls which AI models process each stage of the response pipeline. Configure default models, enable automatic complexity-based routing, and create rules that match requests by channel, authentication status, device type, or user group. Changes follow a draft-and-publish workflow with full version history and rollback support.

The Routing page displays four configuration sections: publish controls, default models, auto-route, and routing rules.

Key Concepts

Before configuring routing, understand the three layers that determine which model handles a request. They are evaluated in order of precedence:

Layer	Description	Precedence
Routing Rules	Match requests by context (channel, auth status, device type, group, role, complexity) and assign specific models	Highest
Auto-Route	Classify query complexity as simple, moderate, or complex, then map each tier to a model	Medium
Defaults	Tenant-level fallback models for each pipeline stage	Lowest

If no layer assigns a model for a given stage, the platform default is used.
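The precedence order above can be sketched as a per-stage lookup. This is a minimal illustration, not the platform's implementation: the function name, the dictionary shapes, and the model names are all hypothetical.

```python
from typing import Dict, Optional, Tuple

STAGES = ("summarization", "query_refinement", "guardrail")


def resolve_stage(
    stage: str,
    rule_models: Dict[str, Optional[str]],
    auto_route_models: Dict[str, Optional[str]],
    tenant_defaults: Dict[str, Optional[str]],
    platform_defaults: Dict[str, str],
) -> Tuple[str, str]:
    """Return (model, source) for one stage, checking layers from highest to lowest precedence."""
    layers = (
        ("Routing Rule", rule_models),
        ("Auto-Route", auto_route_models),
        ("Tenant Default", tenant_defaults),
    )
    for source, models in layers:
        model = models.get(stage)
        if model is not None:
            return model, source
    # No layer assigned this stage, so the platform default applies.
    return platform_defaults[stage], "Platform Default"


# Hypothetical model names, for illustration only.
resolved = {
    stage: resolve_stage(
        stage,
        rule_models={"guardrail": "model-c"},
        auto_route_models={"summarization": "model-b"},
        tenant_defaults={"summarization": "model-a"},
        platform_defaults={s: "platform-model" for s in STAGES},
    )
    for stage in STAGES
}
```

In this example the guardrail stage comes from a routing rule, summarization comes from auto-route (which outranks the tenant default), and query refinement falls through to the platform default.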

Pipeline Stages

Each request passes through up to three pipeline stages. You can assign a different model to each:

Stage	Purpose	Typical Model
Summarization	Synthesizes the final response from search results	High-quality model (e.g., GPT 5.2)
Query Refinement	Resolves conversational references into self-contained queries	Fast model (e.g., GPT 4.1 Mini)
Guardrail	Evaluates input against safety policies before processing	Fast model (e.g., GPT 4.1 Nano)

Draft and Publish Workflow

Routing configuration uses a draft-and-publish workflow. Changes you make on the page are local edits until you save them as a draft. The draft must then be published to take effect in production.

The publish bar at the top of the page shows:

Control	Description
Validate	Checks the configuration for errors (e.g., rules without a target model or empty conditions)
Save Draft	Persists your edits as a draft without affecting production
Discard	Reverts all unsaved local edits back to the last saved draft
History	Opens the version history drawer to compare or roll back versions
Publish	Promotes the saved draft to production (disabled until a valid draft is saved)
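As a rough illustration of the kind of check Validate performs, the sketch below flags the two example errors named in the table. The `Rule` shape and the error messages are hypothetical, and treating "empty conditions" as "a rule with no conditions" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Rule:
    name: str
    target_model: Optional[str] = None
    conditions: List[Dict[str, str]] = field(default_factory=list)


def validate(rules: List[Rule]) -> List[str]:
    """Collect a human-readable error for each problem found; an empty list means valid."""
    errors = []
    for rule in rules:
        if not rule.target_model:
            errors.append(f"{rule.name}: no target model")
        if not rule.conditions:  # assumption: a rule needs at least one condition
            errors.append(f"{rule.name}: no conditions")
    return errors


errors = validate([
    Rule("Mobile users", "model-a",
         [{"field": "device_type", "operator": "equals", "value": "Mobile"}]),
    Rule("Incomplete rule"),
])
```

Here `errors` contains two entries, both for "Incomplete rule", while the first rule passes.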

Publishing a routing configuration takes effect immediately for all new requests in your tenant. Test changes in the Sandbox before publishing to production.
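The relationship between local edits, the saved draft, and the published configuration can be modelled as three copies of the same data. The class below is a hypothetical sketch of that state machine, not the product's API. In particular, falling back to the published configuration when no draft exists yet is an assumption.

```python
import copy
from typing import Callable, Optional


class RoutingConfigEditor:
    def __init__(self, published: dict):
        self.published = copy.deepcopy(published)
        self.saved_draft: Optional[dict] = None
        self.local = copy.deepcopy(published)  # what the page currently shows

    def save_draft(self) -> None:
        """Persist local edits as the draft; production is unaffected."""
        self.saved_draft = copy.deepcopy(self.local)

    def discard(self) -> None:
        """Revert local edits to the last saved draft."""
        # Assumption: with no saved draft, revert to the published configuration.
        base = self.saved_draft if self.saved_draft is not None else self.published
        self.local = copy.deepcopy(base)

    def can_publish(self, is_valid: Callable[[dict], bool]) -> bool:
        """Publish is only available once a valid draft has been saved."""
        return self.saved_draft is not None and is_valid(self.saved_draft)

    def publish(self, is_valid: Callable[[dict], bool]) -> None:
        if not self.can_publish(is_valid):
            raise RuntimeError("Publish is disabled until a valid draft is saved")
        self.published = copy.deepcopy(self.saved_draft)


editor = RoutingConfigEditor({"summarization": "model-a"})
editor.local["summarization"] = "model-b"
publishable_before_save = editor.can_publish(lambda cfg: True)
editor.save_draft()
editor.local["summarization"] = "model-c"  # an unsaved edit
editor.discard()                           # back to the saved draft
editor.publish(lambda cfg: True)
```

Note that the unsaved edit to `model-c` never reaches production: only the saved draft is promoted.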

Default Models

The Default Models section sets tenant-level fallback models for each pipeline stage. If no routing rule or auto-route tier assigns a model for a stage, the default is used.

Select a model from the Summarization Model dropdown to set the primary default. Expand the Advanced section to override the Query Refinement and Guardrail models separately.

Leave any dropdown set to Platform Default to use the model configured by your platform administrator.

Model dropdowns only show models that support the relevant pipeline stage. If no models appear, contact your platform administrator to verify model availability.

Auto-Route

Auto-route automatically classifies each incoming query by complexity and routes it to the most appropriate model. This optimizes cost and performance by sending simple queries to faster models and reserving powerful models for complex requests.

When auto-route is enabled, a classifier model and three tier mappings appear.

Enabling Auto-Route

Toggle Enable auto-route classification to activate this feature. When enabled, two additional configuration areas appear:

Classifier Model: the model that evaluates query complexity. Choose a model with the fast capability to keep latency overhead low.
Tier Mapping: assign a summarization model (and optionally query refinement and guardrail models) for each complexity tier.

Tier Configuration

Tier	When Used
Simple	Straightforward factual queries that require minimal reasoning
Moderate	Multi-step questions that need some analysis or comparison
Complex	Queries requiring deep reasoning, multi-document synthesis, or specialized knowledge

For each tier, select a primary model from the dropdown. Expand Advanced under each tier to set stage-specific overrides for query refinement and guardrail models.
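One way to picture a tier mapping is as a primary summarization model plus optional per-stage overrides. The sketch below uses hypothetical names throughout. It also assumes that a stage without an override is simply left unassigned by auto-route, so it falls through to the default models.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TierMapping:
    summarization: str                      # primary model for the tier
    query_refinement: Optional[str] = None  # optional Advanced override
    guardrail: Optional[str] = None         # optional Advanced override


def models_for_tier(tier: str, mappings: Dict[str, TierMapping]) -> Dict[str, str]:
    """Return only the stages this tier explicitly assigns."""
    mapping = mappings[tier]
    candidates = {
        "summarization": mapping.summarization,
        "query_refinement": mapping.query_refinement,
        "guardrail": mapping.guardrail,
    }
    return {stage: model for stage, model in candidates.items() if model is not None}


# Hypothetical model names, for illustration only.
mappings = {
    "simple": TierMapping("fast-model"),
    "moderate": TierMapping("balanced-model"),
    "complex": TierMapping("large-model", guardrail="guard-model"),
}
complex_models = models_for_tier("complex", mappings)
simple_models = models_for_tier("simple", mappings)
```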

Routing Rules

Routing rules let you assign models based on request context. Rules are evaluated in priority order, lowest number first. Multiple matching rules can each contribute a different stage: for example, one rule might set the summarization model while another sets the guardrail model.
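The sketch below illustrates how several matching rules might be merged into one set of stage assignments. The data shapes are hypothetical. It also assumes that when two matching rules set the same stage, the rule with the lower priority number wins, which this page does not state explicitly.

```python
from typing import Dict, List, Tuple


def merge_matched_rules(matched_rules: List[dict]) -> Dict[str, Tuple[str, str]]:
    """Map each stage to (model, rule name), visiting rules in ascending priority order."""
    resolved: Dict[str, Tuple[str, str]] = {}
    for rule in sorted(matched_rules, key=lambda r: r["priority"]):
        for stage, model in rule["models"].items():
            # Assumption: the first rule to set a stage keeps it.
            resolved.setdefault(stage, (model, rule["name"]))
    return resolved


# Hypothetical rules and model names, for illustration only.
merged = merge_matched_rules([
    {"name": "Internal staff", "priority": 20,
     "models": {"summarization": "model-b", "guardrail": "guard-model"}},
    {"name": "Teams channel", "priority": 10,
     "models": {"summarization": "model-a"}},
])
```

Here "Teams channel" supplies the summarization model because it has the lower priority number, while "Internal staff" still contributes the guardrail model.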

Creating a Rule

Open the Rule Editor

Click Add Rule in the top-right corner of the Routing Rules section.

The rule editor dialog with fields for name, priority, conditions, and model assignments.

Set the Rule Name and Priority

Enter a descriptive name for the rule. Set the Priority number — rules with lower priority values are evaluated first. Toggle Enabled to activate or deactivate the rule without deleting it.

Choose the Target Model

Select a Target Model (Summarization) from the dropdown. This is the primary model assignment for matching requests.

Expand Stage Overrides to optionally set different models for query refinement and guardrail stages.

Define Conditions

Add one or more conditions that a request must match for this rule to apply. All conditions must match (AND logic).

Save the Rule

Click Save to add the rule to your configuration. The rule appears in the list sorted by priority.

Condition Fields

Each condition matches a request attribute against a value:

Field	Description	Example Values
Channel	The communication channel the request arrived on	Teams, Web Chat, Mobile App, WhatsApp, SMS
Auth Status	Whether and how the user is authenticated	Authenticated, Unauthenticated, API Key, Executive SSO
Device Type	The device category	Desktop, Mobile, Tablet, API
Group ID	The user's assigned group identifier	Free-text group ID
Role Category	The user's role classification	Standard, Premium, Internal, Admin
Complexity Tier	The auto-route complexity classification (requires auto-route enabled)	Simple, Moderate, Complex

Operators

Operator	Description
Equals	Exact match against a single value
Not Equals	Matches any value except the specified one
In List	Matches any of the specified values
Not In List	Matches none of the specified values
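The four operators, combined with the AND logic across a rule's conditions, can be sketched as follows. The operator keys, field names, and request shape are hypothetical stand-ins for whatever the platform uses internally.

```python
from typing import Any, Dict, List


def condition_matches(operator: str, actual: Any, expected: Any) -> bool:
    """Evaluate one condition; `expected` is a list for the two list operators."""
    if operator == "equals":
        return actual == expected
    if operator == "not_equals":
        return actual != expected
    if operator == "in_list":
        return actual in expected
    if operator == "not_in_list":
        return actual not in expected
    raise ValueError(f"Unknown operator: {operator}")


def rule_matches(conditions: List[Dict[str, Any]], request: Dict[str, Any]) -> bool:
    """A rule applies only if every condition matches (AND logic)."""
    return all(
        condition_matches(c["operator"], request.get(c["field"]), c["value"])
        for c in conditions
    )


conditions = [
    {"field": "channel", "operator": "in_list", "value": ["Teams", "Web Chat"]},
    {"field": "auth_status", "operator": "not_equals", "value": "Unauthenticated"},
]
teams_request = {"channel": "Teams", "auth_status": "Authenticated"}
sms_request = {"channel": "SMS", "auth_status": "Authenticated"}

teams_result = rule_matches(conditions, teams_request)
sms_result = rule_matches(conditions, sms_request)
```

The SMS request fails the first condition, so the rule does not apply even though the second condition matches.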

Rules override auto-route and defaults. If a rule assigns a summarization model, it takes precedence over the auto-route tier mapping and the default model for that stage.

Testing Rules

The Test Rules section at the bottom of the page lets you simulate a request and see which rules match and which models are resolved for each stage.

Select values for the context fields (channel, auth status, device type, group ID, role category, and complexity tier if auto-route is enabled), then click Evaluate. The result shows:

Source — where the final model selection came from (Platform Default, Tenant Default, Routing Rule, Auto-Route, or Merged Rules)
Resolved Models — which model handles each pipeline stage (summarization, refinement, guardrail) and the source of each assignment
Matched Rules — which rules matched the test inputs, sorted by priority

The rule tester evaluates against your current unsaved configuration. This lets you test changes before saving or publishing.

Version History

Click History in the publish bar to open the version history drawer. Each published configuration is stored as a numbered version.

For each version, you can:

Compare an archived version with the current published version to see what changed
Rollback to restore an archived version as the active configuration

Rolling back to an earlier version permanently deletes all versions newer than the rollback target. This action cannot be undone.
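To make the effect of a rollback concrete, the sketch below models the version history as a list of version numbers. The function and its return shape are hypothetical. It only illustrates that versions newer than the target are dropped from the history.

```python
from typing import List, Tuple


def roll_back(versions: List[int], target: int) -> Tuple[int, List[int]]:
    """Return the new active version and the history that remains after the rollback."""
    if target not in versions:
        raise ValueError(f"Version {target} does not exist")
    # Versions newer than the target are removed and cannot be recovered.
    remaining = [v for v in versions if v <= target]
    return target, remaining


active, history = roll_back([1, 2, 3, 4, 5], target=3)
```

After rolling back to version 3, versions 4 and 5 no longer appear in the history, so compare versions first if you might need anything from them.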

Settings — configure confidence thresholds and escalation rules
Sandbox — test AI responses with different model configurations
Conversation Logs — review audit records showing which models and rules were used
Guardrails — configure safety policies evaluated by the guardrail model
Users, Roles & Groups — manage groups and roles used as routing rule conditions

