Astrolabe Cloud routing is stack-driven. The requested model defines the candidate set, and the active routing stack filters and scores those candidates.

Routing inputs

The gateway considers:
  • requested model or virtual model
  • workspace and API-key context
  • active routing stack
  • request overrides in metadata.astrolabe
  • estimated input and output tokens
  • task category, action class, complexity, and route modifiers
  • hosted model catalog and pricing

Stack resolution

Resolution order:
  1. Request override: metadata.astrolabe.stack
  2. API-key assignment
  3. Workspace assignment
  4. Managed default
The selected stack version is included in route traces and response headers.
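
The resolution order above can be sketched as a first-match lookup. This is a minimal illustration, not the gateway's implementation: the function name, argument names, and the "managed-default" placeholder are assumptions made for the example.

```python
from typing import Optional, Tuple


def resolve_stack(
    request_override: Optional[str],
    api_key_stack: Optional[str],
    workspace_stack: Optional[str],
    managed_default: str = "managed-default",  # hypothetical placeholder name
) -> Tuple[str, str]:
    """Return (stack_name, source) using the documented precedence."""
    ordered_sources = [
        ("request_override", request_override),  # metadata.astrolabe.stack
        ("api_key", api_key_stack),
        ("workspace", workspace_stack),
    ]
    for source, stack in ordered_sources:
        if stack:  # the first configured source wins
            return stack, source
    return managed_default, "managed_default"
```

For example, a request with no override, sent with an API key assigned to one stack from a workspace assigned to another, resolves to the API-key assignment.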

Candidate selection

For a virtual model, Astrolabe starts from that model's routed candidate list. For a concrete model request, the candidate set is that single model. The stack then applies:
  • provider and model allowlists
  • provider and model blocklists
  • required capabilities
  • data and retention policy labels
  • expected cost ceilings
  • quality thresholds
  • latency and reliability preferences
If every candidate is rejected, the request is blocked before provider execution.
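
A rough sketch of this filtering step follows. It covers only some of the filters listed above, and every name in it (Candidate, StackPolicy, RequestBlocked, and their fields) is hypothetical; the real stack schema is not shown on this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Candidate:
    provider: str
    model: str
    capabilities: Set[str] = field(default_factory=set)
    expected_cost_usd: float = 0.0
    quality: float = 0.0


@dataclass
class StackPolicy:
    provider_allowlist: Optional[Set[str]] = None  # None means no allowlist
    model_blocklist: Set[str] = field(default_factory=set)
    required_capabilities: Set[str] = field(default_factory=set)
    cost_ceiling_usd: Optional[float] = None  # None means no ceiling
    quality_threshold: float = 0.0


class RequestBlocked(Exception):
    """Raised when no candidate survives the stack filters."""


def filter_candidates(candidates: List[Candidate], policy: StackPolicy) -> List[Candidate]:
    kept = []
    for c in candidates:
        if policy.provider_allowlist is not None and c.provider not in policy.provider_allowlist:
            continue
        if c.model in policy.model_blocklist:
            continue
        if not policy.required_capabilities <= c.capabilities:
            continue  # candidate lacks a required capability
        if policy.cost_ceiling_usd is not None and c.expected_cost_usd > policy.cost_ceiling_usd:
            continue
        if c.quality < policy.quality_threshold:
            continue
        kept.append(c)
    if not kept:
        # Mirrors the documented behavior: block before any provider call.
        raise RequestBlocked("every candidate was rejected by stack policy")
    return kept
```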

Decision strategy

Stack objectives include:
  • lowest-cost model that clears the quality bar
  • fastest model that clears the quality bar
  • highest quality within budget
  • fixed priority order
  • custom rules
The stack returns a selected provider, selected model, decision kind, policy status, and candidate explanation.
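
To make the objectives concrete, here is a small sketch of how four of them might pick a winner from already-filtered candidates. The objective strings, the Scored tuple, and the pick function are assumptions for illustration; custom rules are omitted.

```python
from typing import List, NamedTuple, Optional


class Scored(NamedTuple):
    model: str
    cost_usd: float
    latency_ms: float
    quality: float


def pick(
    candidates: List[Scored],
    objective: str,
    quality_bar: float = 0.0,
    budget_usd: float = float("inf"),
) -> Optional[Scored]:
    """Return the selected candidate, or None if nothing qualifies."""
    if objective == "lowest_cost":
        pool = [c for c in candidates if c.quality >= quality_bar]
        return min(pool, key=lambda c: c.cost_usd, default=None)
    if objective == "fastest":
        pool = [c for c in candidates if c.quality >= quality_bar]
        return min(pool, key=lambda c: c.latency_ms, default=None)
    if objective == "highest_quality":
        pool = [c for c in candidates if c.cost_usd <= budget_usd]
        return max(pool, key=lambda c: c.quality, default=None)
    if objective == "fixed_priority":
        # The list order is treated as the priority order.
        return candidates[0] if candidates else None
    raise ValueError(f"unknown objective: {objective}")
```

With three candidates of increasing cost, latency, and quality, a quality bar that excludes only the cheapest one makes both "lowest_cost" and "fastest" choose the middle candidate.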

Fallback

Fallback can be disabled or limited by stack policy. When enabled, fallback may run after provider errors, schema failures, or low-confidence verification. Fallback candidates remain constrained by stack boundaries when fallbackMustStayInPolicy is enabled.
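
The effect of fallbackMustStayInPolicy can be pictured with a short sketch. The function and its arguments are hypothetical; only the flag name comes from the text above.

```python
from typing import List, Set


def fallback_order(
    all_models: List[str],
    in_policy: Set[str],
    failed_model: str,
    fallback_enabled: bool,
    fallback_must_stay_in_policy: bool,
) -> List[str]:
    """List the models that may be tried after failed_model, in order."""
    if not fallback_enabled:
        return []  # fallback disabled by stack policy
    remaining = [m for m in all_models if m != failed_model]
    if fallback_must_stay_in_policy:
        # Keep only models that passed the stack's filters.
        remaining = [m for m in remaining if m in in_policy]
    return remaining
```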

Verification

Verification modes:
  • off
  • auto
  • light
  • strict
  • strong
Verification can be configured globally for a stack and separately for schema or high-stakes requests. Request-level metadata (metadata.astrolabe.verification) can override the verification mode for a single call.

Request overrides

Supported request metadata:
{
  "metadata": {
    "astrolabe": {
      "stack": "code-agent",
      "quality": "high",
      "cost_mode": "balanced",
      "latency_slo_ms": 3000,
      "verification": "strict",
      "fallback": true,
      "explain": true
    }
  }
}
See Request Stack Selection for examples.
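
As a client-side sketch, the overrides can be merged into an existing request body before it is sent. The helper name is hypothetical, and the key set is taken only from the example above, so it may not be exhaustive; the rest of the request body is left to the caller because it is not specified here.

```python
import copy

# Keys shown in the example above; assumed, not confirmed, to be the full set.
SUPPORTED_KEYS = {
    "stack", "quality", "cost_mode", "latency_slo_ms",
    "verification", "fallback", "explain",
}


def with_astrolabe_overrides(body: dict, **overrides) -> dict:
    """Return a copy of body with overrides set under metadata.astrolabe."""
    unknown = sorted(set(overrides) - SUPPORTED_KEYS)
    if unknown:
        raise ValueError(f"unsupported override keys: {unknown}")
    merged = copy.deepcopy(body)  # leave the caller's dict untouched
    astrolabe = merged.setdefault("metadata", {}).setdefault("astrolabe", {})
    astrolabe.update(overrides)
    return merged
```

Rejecting unknown keys locally is a design choice for this sketch: it surfaces typos such as "verfication" before the request leaves the client.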

Response headers

Important routing headers include:
  • x-astrolabe-cloud-request-id
  • x-astrolabe-model
  • x-astrolabe-stack-id
  • x-astrolabe-stack-name
  • x-astrolabe-stack-version
  • x-astrolabe-policy-status
  • x-astrolabe-decision
  • x-astrolabe-selected-provider
  • x-astrolabe-selected-model
  • x-astrolabe-category
  • x-astrolabe-action-class
  • x-astrolabe-complexity
  • x-astrolabe-lane
  • x-astrolabe-verification-status
  • x-astrolabe-confidence-score
  • x-astrolabe-estimated-baseline-cost-usd
  • x-astrolabe-estimated-savings-usd
Log the x-astrolabe-cloud-request-id value for every request; it is the identifier to use for support and trace lookup.
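
One way to collect these headers for logging is a small helper that works on any header mapping. The function name is an assumption, and the header values in the usage note are made up for illustration.

```python
from typing import Dict, Mapping

PREFIX = "x-astrolabe-"


def routing_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return the x-astrolabe-* headers with the prefix stripped.

    Header names are matched case-insensitively, since HTTP header
    names are not case-sensitive.
    """
    return {
        name.lower()[len(PREFIX):]: value
        for name, value in headers.items()
        if name.lower().startswith(PREFIX)
    }
```

Given a response whose headers include X-Astrolabe-Cloud-Request-Id set to a hypothetical value such as "req_123", routing_headers(...)["cloud-request-id"] yields that value, ready to be written to application logs.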