Model Routing Orchestrator

by @stephensu66

Route each user request to the most cost-effective model or multi-model workflow based on task type, complexity, risk, latency, budget, tool needs, and verif...

Version: v2026.3.27
Installs: 1

⚡ When to Use

Use this skill to decide:
- which model should answer a given user request
- whether a cheap model is enough
- when to escalate to a stronger reasoning model
- when to use one model versus multiple models
- when to use tools instead of relying on pure model reasoning
- how to handle complex calculations, code, multimodal input, long context, or high-risk tasks
- how to balance cost, speed, and answer quality in production
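
The decisions listed above could be driven by a small ordered rule table. The sketch below is only an illustration: the field names and decision labels are taken from the examples in this listing, while the rule order and the `single-balanced` default are assumptions, not the skill's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    # Field names mirror the "Task Profile" entries used in the examples.
    task_type: str
    complexity: str    # "low" | "medium" | "high"
    exactness: str     # "low" | "medium" | "critical"
    risk: str          # "low" | "medium" | "high"
    context_load: str  # "short" | "moderate" | "long"


def route(profile: TaskProfile) -> str:
    """Return a routing decision label. The rule order is an assumption."""
    # Exactness-critical work goes to deterministic tools first.
    if profile.exactness == "critical":
        return "tool-assisted"
    # Long inputs are split into cheap extraction plus stronger synthesis.
    if profile.context_load == "long":
        return "staged-pipeline"
    # Hard reasoning on a moderate context gets one premium pass.
    if profile.complexity == "high":
        return "single-premium"
    # Simple, low-risk work stays on the cheapest tier.
    if profile.complexity == "low" and profile.risk == "low":
        return "single-economy"
    # Hypothetical middle tier for cases the examples do not cover.
    return "single-balanced"


email_rewrite = TaskProfile("simple generation", "low", "low", "low", "short")
decision = route(email_rewrite)  # "single-economy"
```

Putting the exactness check first reflects the idea that a wrong number costs more than a slow answer; a production router would likely also weigh latency, budget, and modality.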

💡 Examples

Example 1: Simple rewrite

Request: "Rewrite this email to sound more professional."

Output: Routing Decision: single-economy

Primary Reason: This is a low-risk rewriting task where a low-cost model should meet the quality bar.

Task Profile:

  • taskType: simple generation
  • complexity: low
  • exactness: low
  • risk: low
  • latency: urgent
  • budget: minimize-cost
  • contextLoad: short
  • modality: text

Recommended Execution Plan:

  1. Use an economy model to rewrite the email.
  2. Preserve intent and tone constraints.
  3. Return one polished version and one shorter variant.

Model Role Assignment:

  • planner: not-needed
  • generator: economy
  • verifier: none

Why This Is Cost-Effective:

  • The task is structurally simple.
  • Premium reasoning is unnecessary.
  • Fast low-cost generation is sufficient.

Escalation Rule: Escalate only if the user requests highly nuanced persuasion, legal sensitivity, or multiple tone variants.

Fallback Rule: If the first rewrite is too generic, retry once with stricter tone instructions using a balanced model.
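
The fallback rule for this example amounts to a single bounded retry on a higher tier. A minimal sketch follows, assuming two hypothetical callables: `generate(tier, text, strict_tone)` wraps whatever model client is in use, and `is_too_generic(draft)` is whatever quality check the caller prefers. Neither is part of the skill.

```python
from typing import Callable


def rewrite_with_fallback(
    text: str,
    generate: Callable[[str, str, bool], str],
    is_too_generic: Callable[[str], bool],
) -> tuple[str, str]:
    """Return (tier_used, rewrite). Retries at most once."""
    # First attempt: economy tier, default tone instructions.
    first = generate("economy", text, False)
    if not is_too_generic(first):
        return "economy", first
    # Single retry: balanced tier with stricter tone instructions.
    return "balanced", generate("balanced", text, True)
```

Capping the retry at one keeps the worst-case cost predictable: one economy call plus one balanced call.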

Example 2: Complex system design

Request: "Design an AI agent architecture for long-form writing with memory, retrieval, conflict resolution, and chapter planning."

Output: Routing Decision: single-premium

Primary Reason: This task requires deep reasoning, abstraction, and multi-part architecture tradeoff analysis.

Task Profile:

  • taskType: deep reasoning
  • complexity: high
  • exactness: medium
  • risk: medium
  • latency: normal
  • budget: balanced
  • contextLoad: moderate
  • modality: text

Recommended Execution Plan:

  1. Use a premium model to identify major architecture components and constraints.
  2. Structure the answer into memory layers, retrieval flow, consistency controls, and orchestration.
  3. Include tradeoffs and failure modes.
  4. Add implementation recommendations only after the design is coherent.

Model Role Assignment:

  • planner: premium
  • generator: premium
  • verifier: none

Why This Is Cost-Effective:

  • A weaker model may produce shallow or inconsistent architecture.
  • One premium pass is cheaper than multiple failed lower-tier retries.
  • The task value justifies stronger reasoning.

Escalation Rule: Escalate to parallel-comparison only if multiple competing architecture strategies are explicitly required.

Fallback Rule: If the first answer is too generic, rerun with a staged workflow: outline first, then full design.
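
The claim that one premium pass can be cheaper than repeated lower-tier retries can be checked with a simple expected-cost calculation. The sketch below assumes an "economy first, then escalate" policy in which each economy attempt succeeds independently with probability `p_success` and the premium pass always succeeds. The prices and probabilities are hypothetical placeholders, not measured values.

```python
def expected_cost_economy_first(
    cost_economy: float,
    cost_premium: float,
    p_success: float,
    max_tries: int,
) -> float:
    """Expected cost of trying economy up to max_tries times, then premium."""
    expected = 0.0
    p_reach = 1.0  # probability that the next attempt is needed at all
    for _ in range(max_tries):
        expected += p_reach * cost_economy
        p_reach *= 1.0 - p_success
    # Premium is paid only if every economy attempt failed.
    return expected + p_reach * cost_premium


# Hypothetical: economy costs 1 unit, premium 5 units, three economy tries.
hard_task = expected_cost_economy_first(1.0, 5.0, 0.1, 3)  # about 6.355 units
easy_task = expected_cost_economy_first(1.0, 5.0, 0.9, 3)  # about 1.115 units
```

Under these assumptions, going straight to premium (5 units) wins when the economy model rarely succeeds, and loses when it usually succeeds.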

Example 3: Financial calculation with policy impact

Request: "Calculate the projected monthly burn and runway under three hiring scenarios and explain the assumptions."

Output: Routing Decision: tool-assisted

Primary Reason: This request depends on exact calculations, so deterministic computation is safer and more cost-effective than pure model reasoning.

Task Profile:

  • taskType: exact calculation or formal logic
  • complexity: high
  • exactness: critical
  • risk: high
  • latency: normal
  • budget: balanced
  • contextLoad: moderate
  • modality: text

Recommended Execution Plan:

  1. Use a balanced model to extract the variables, hiring scenarios, and required formulas.
  2. Use a deterministic calculator or spreadsheet-capable path to compute burn and runway.
  3. Run a verification check on formulas, assumptions, and units.
  4. Use the model to present the results clearly with assumptions and scenario breakdowns.

Model Role Assignment:

  • planner: balanced
  • generator: tool-assisted
  • verifier: tool

Why This Is Cost-Effective:

  • Exact arithmetic should not depend on freeform model reasoning.
  • Deterministic calculation reduces error risk.
  • A smaller model can still explain the results after computation.

Escalation Rule: Escalate to premium review only if the scenario assumptions are ambiguous or strategically sensitive.

Fallback Rule: If tool-based computation is unavailable, return assumptions explicitly and mark the result as unverified instead of presenting certainty.
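
Step 2 of this plan, the deterministic computation, might look like the sketch below. It assumes a deliberately simple model: every hire starts immediately, each hire adds a fixed monthly cost, and runway is cash divided by monthly burn. All figures and scenario names are hypothetical, chosen only to illustrate the calculation.

```python
from decimal import Decimal


def burn_and_runway(
    cash: Decimal,
    base_monthly_burn: Decimal,
    hires: int,
    monthly_cost_per_hire: Decimal,
) -> tuple[Decimal, Decimal]:
    """Return (monthly_burn, runway_months) for one hiring scenario."""
    burn = base_monthly_burn + hires * monthly_cost_per_hire
    if burn <= 0:
        raise ValueError("burn must be positive to compute a runway")
    return burn, cash / burn


# Hypothetical inputs, for illustration only.
CASH = Decimal("1200000")
BASE_BURN = Decimal("80000")
COST_PER_HIRE = Decimal("10000")
SCENARIOS = {"no-hires": 0, "moderate": 2, "aggressive": 4}

results = {
    name: burn_and_runway(CASH, BASE_BURN, hires, COST_PER_HIRE)
    for name, hires in SCENARIOS.items()
}
# With these inputs the runways are 15, 12, and 10 months.
```

Using `Decimal` rather than floats keeps currency arithmetic exact for inputs like these, which is the point of routing the numbers away from freeform model reasoning.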

Example 4: Multi-document review

Request: "Compare these five candidate proposals and recommend the best one based on cost, timeline, technical fit, and risk."

Output: Routing Decision: staged-pipeline

Primary Reason: The request is large and decomposable, so staged extraction and synthesis is more cost-effective than sending everything directly to a premium model.

Task Profile:

  • taskType: long-context synthesis
  • complexity: high
  • exactness: medium
  • risk: medium
  • latency: relaxed
  • budget: balanced
  • contextLoad: long
  • modality: text

Recommended Execution Plan:

  1. Use an economy or balanced model to extract structured facts from each proposal.
  2. Normalize the proposals into a common comparison table.
  3. Use a stronger model to synthesize tradeoffs and recommend the best option.
  4. Add a brief verifier pass if the recommendation is high stakes.

Model Role Assignment:

  • planner: balanced
  • generator: staged-pipeline
  • verifier: balanced

Why This Is Cost-Effective:

  • Cheap extraction lowers total token cost.
  • Structured normalization improves synthesis quality.
  • Premium reasoning is reserved for the part that truly needs it.

Escalation Rule: Escalate to consensus-check if the recommendation will drive a major decision or if proposal differences are subtle.

Fallback Rule: If extraction quality is poor, rerun the extraction stage with a stronger model before recomputing the final recommendation.
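
The first three stages of this plan can be sketched as a small pipeline. The `extract` and `synthesize` callables below are hypothetical stand-ins for a cheaper and a stronger model call respectively; the criteria come from the example request, and the "unknown" placeholder for missing fields is an assumption.

```python
from typing import Callable

Facts = dict[str, str]


def staged_review(
    proposals: list[str],
    extract: Callable[[str], Facts],
    synthesize: Callable[[list[Facts]], str],
    criteria: tuple[str, ...] = ("cost", "timeline", "technical fit", "risk"),
) -> str:
    """Extract per document, normalize to one table, then synthesize once."""
    # Stage 1: one cheap extraction call per proposal.
    raw = [extract(proposal) for proposal in proposals]
    # Stage 2: keep only the shared criteria; mark missing values explicitly.
    table = [
        {criterion: facts.get(criterion, "unknown") for criterion in criteria}
        for facts in raw
    ]
    # Stage 3: a single stronger-model call over the compact table.
    return synthesize(table)
```

Because the stronger model sees only the normalized table rather than five full documents, its input stays small. The fallback rule would correspond to calling `staged_review` again with a stronger `extract`.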

Install from the terminal:

clawhub install model-routing-orchestrator
