Model Routing Orchestrator
by @stephensu66
Route each user request to the most cost-effective model or multi-model workflow based on task type, complexity, risk, latency, budget, tool needs, and verif...
Example 1: Simple rewrite
Request: "Rewrite this email to sound more professional."
Output:
Routing Decision: single-economy
Primary Reason: This is a low-risk rewriting task where a low-cost model should meet the quality bar.
Task Profile:
Recommended Execution Plan:
1. Use an economy model to rewrite the email.
2. Preserve intent and tone constraints.
3. Return one polished version and one shorter variant.
Model Role Assignment:
Why This Is Cost-Effective:
Escalation Rule: Escalate only if the request calls for highly nuanced persuasion, is legally sensitive, or asks for multiple tone variants.
Fallback Rule: If the first rewrite is too generic, retry once with stricter tone instructions using a balanced model.
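The escalation and fallback rules in Example 1 can be sketched in Python. This is a minimal illustration, not the skill's implementation: the names `choose_tier` and `rewrite_with_fallback` and the `rewrite` and `is_too_generic` callables are hypothetical, and escalating to the "balanced" tier is an assumption, since the example does not name the escalation target.

```python
# Tier names follow the wording of the examples; real model IDs are not given.
TIERS = ("economy", "balanced", "premium")


def choose_tier(nuanced_persuasion=False, legally_sensitive=False,
                many_tone_variants=False):
    """Apply the escalation rule: stay on economy unless a trigger is present."""
    if nuanced_persuasion or legally_sensitive or many_tone_variants:
        return "balanced"  # assumption: escalate one tier up
    return "economy"


def rewrite_with_fallback(email, rewrite, is_too_generic):
    """Try the economy tier first; retry once on balanced with stricter tone.

    rewrite(email, tier, strict_tone) -> str   (hypothetical model call)
    is_too_generic(text) -> bool               (hypothetical quality check)
    """
    first = rewrite(email, "economy", False)
    if not is_too_generic(first):
        return first, "economy"
    # Fallback rule: exactly one retry, stricter tone instructions.
    return rewrite(email, "balanced", True), "balanced"
```

Capping the fallback at a single retry keeps the worst-case cost at one economy call plus one balanced call.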
Example 2: Complex system design
Request: "Design an AI agent architecture for long-form writing with memory, retrieval, conflict resolution, and chapter planning."
Output:
Routing Decision: single-premium
Primary Reason: This task requires deep reasoning, abstraction, and multi-part architecture tradeoff analysis.
Task Profile:
Recommended Execution Plan:
1. Use a premium model to identify major architecture components and constraints.
2. Structure the answer into memory layers, retrieval flow, consistency controls, and orchestration.
3. Include tradeoffs and failure modes.
4. Add implementation recommendations only after the design is coherent.
Model Role Assignment:
Why This Is Cost-Effective:
Escalation Rule: Escalate to parallel-comparison only if multiple competing architecture strategies are explicitly required.
Fallback Rule: If the first answer is too generic, rerun with a staged workflow: produce an outline first, then expand it into the full design.
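As a rough sketch of Example 2, the route choice and the outline-first fallback might look like the following. The function names, the prompt wording, and the `call_premium` and `is_too_generic` callables are all hypothetical stand-ins for whatever model interface is actually in use.

```python
def choose_route(competing_strategies_required=False):
    """Escalation rule: compare in parallel only when explicitly required."""
    if competing_strategies_required:
        return "parallel-comparison"
    return "single-premium"


def design_with_fallback(request, call_premium, is_too_generic):
    """Single premium pass; if too generic, rerun as outline, then full design.

    call_premium(prompt) -> str     (hypothetical premium-model call)
    is_too_generic(answer) -> bool  (hypothetical quality check)
    """
    answer = call_premium(request)
    if not is_too_generic(answer):
        return answer
    # Fallback rule: stage the work so the outline constrains the design.
    outline = call_premium("Produce an outline only for: " + request)
    return call_premium(
        "Expand this outline into a full design.\n"
        "Outline:\n" + outline + "\nRequest: " + request
    )
```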
Example 3: Financial calculation with policy impact
Request: "Calculate the projected monthly burn and runway under three hiring scenarios and explain the assumptions."
Output:
Routing Decision: tool-assisted
Primary Reason: This request depends on exact calculations, so deterministic computation is safer and more cost-effective than pure model reasoning.
Task Profile:
Recommended Execution Plan:
1. Use a balanced model to extract the variables, hiring scenarios, and required formulas.
2. Use a deterministic calculator or spreadsheet-capable path to compute burn and runway.
3. Run a verification check on formulas, assumptions, and units.
4. Use the model to present the results clearly with assumptions and scenario breakdowns.
Model Role Assignment:
Why This Is Cost-Effective:
Escalation Rule: Escalate to premium review only if the scenario assumptions are ambiguous or strategically sensitive.
Fallback Rule: If tool-based computation is unavailable, return assumptions explicitly and mark the result as unverified instead of presenting certainty.
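The deterministic step in Example 3 could be as small as the sketch below. It assumes a simple model in which monthly burn is a base burn plus a fixed monthly cost per new hire, and runway is cash divided by monthly burn; the example itself does not specify formulas, so these, along with every name and number shown, are illustrative assumptions.

```python
def check_inputs(cash, base_burn, cost_per_hire, scenarios):
    """Verification pass: all amounts are in the same currency, per month."""
    if min(cash, base_burn, cost_per_hire) < 0:
        raise ValueError("amounts must be non-negative")
    if any(hires < 0 for hires in scenarios.values()):
        raise ValueError("hire counts must be non-negative")


def project(cash, base_burn, cost_per_hire, scenarios):
    """Return burn and runway per scenario, with the assumptions attached.

    scenarios maps a scenario name to its number of new hires.
    """
    check_inputs(cash, base_burn, cost_per_hire, scenarios)
    results = {}
    for name, hires in scenarios.items():
        burn = base_burn + cost_per_hire * hires  # assumed burn formula
        runway = cash / burn if burn > 0 else float("inf")
        results[name] = {
            "monthly_burn": burn,
            "runway_months": runway,
            "assumptions": {"cash": cash, "base_burn": base_burn,
                            "cost_per_hire": cost_per_hire, "new_hires": hires},
        }
    return results


# Illustrative inputs only; none of these figures come from the example.
plan = project(1_200_000, 100_000, 10_000,
               {"freeze": 0, "moderate": 5, "aggressive": 10})
```

With these inputs the three scenarios come out to runways of 12, 8, and 6 months. Keeping the assumptions next to each result makes the fallback rule easy to honor, because the same record can be returned with an unverified flag when no computation path is available.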
Example 4: Multi-document review
Request: "Compare these five candidate proposals and recommend the best one based on cost, timeline, technical fit, and risk."
Output:
Routing Decision: staged-pipeline
Primary Reason: The request is large and decomposable, so staged extraction and synthesis is more cost-effective than sending everything directly to a premium model.
Task Profile:
Recommended Execution Plan:
1. Use an economy or balanced model to extract structured facts from each proposal.
2. Normalize the proposals into a common comparison table.
3. Use a stronger model to synthesize tradeoffs and recommend the best option.
4. Add a brief verifier pass if the recommendation is high stakes.
Model Role Assignment:
Why This Is Cost-Effective:
Escalation Rule: Escalate to consensus-check if the recommendation will drive a major decision or if proposal differences are subtle.
Fallback Rule: If extraction quality is poor, rerun the extraction stage with a stronger model before recomputing the final recommendation.
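The staged pipeline in Example 4 can be sketched as follows. The `extract` and `synthesize` callables are hypothetical model wrappers. Treating a missing field as poor extraction quality, and using the "premium" tier as the stronger model, are assumptions made for illustration.

```python
# Comparison criteria taken from the request in Example 4.
FIELDS = ("cost", "timeline", "technical_fit", "risk")


def extract_all(proposals, extract, tier):
    """Stage 1: pull structured facts from each proposal on the given tier."""
    return [extract(p, tier) for p in proposals]


def extraction_ok(rows):
    """Assumed quality check: every row has a value for every field."""
    return all(row.get(f) is not None for row in rows for f in FIELDS)


def run_pipeline(proposals, extract, synthesize):
    """Extract cheaply, normalize, then synthesize with a stronger model.

    extract(proposal, tier) -> dict   (hypothetical model call)
    synthesize(table, tier) -> str    (hypothetical model call)
    """
    rows = extract_all(proposals, extract, "economy")
    if not extraction_ok(rows):
        # Fallback rule: redo extraction with a stronger model first.
        rows = extract_all(proposals, extract, "premium")
    # Stage 2: normalize into a common comparison table.
    table = [{f: row.get(f) for f in FIELDS} for row in rows]
    # Stage 3: the stronger model only sees the compact table.
    return synthesize(table, "premium")
```

The cost saving comes from stage 3 receiving a small normalized table rather than five full documents.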
clawhub install model-routing-orchestrator