Automatically route queries to the optimal AI model based on task requirements
Stop choosing between dozens of AI models for every query. AutoLLM Router analyzes each request and automatically selects the best model based on cost, performance, and capabilities.
import asyncio

from autollm_router import LLMRouter

# Initialize with models from your config
router = LLMRouter()

async def main():
    # Let AutoLLM choose the optimal model
    response = await router.generate(
        "Explain quantum computing to a 10-year-old"
    )
    print(f"Selected model: {response.model_id}")
    print(f"Cost: ${response.estimated_cost:.5f}")
    print(response.content)

asyncio.run(main())
Reduce your API costs by up to 70% by automatically routing to cost-effective models when premium capabilities aren't needed.
Get better responses by leveraging the unique strengths of different models for specific types of queries.
One API to access all major LLM providers, with smart routing handled automatically behind the scenes.
AutoLLM Router intelligently selects more affordable models for simple tasks, automatically switching to premium models only when needed.
Save up to 70% on API costs
Your query is analyzed to determine its requirements
The best model is selected based on capabilities and constraints
The request is handled with appropriate provider-specific settings
The response is returned with metadata on model selection
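The steps above can be sketched in a few lines of Python. This is an illustrative heuristic, not AutoLLM Router's actual internals: the two-model registry, the `select_model` helper, and the scoring rule are all hypothetical.

```python
# Illustrative sketch of the routing steps; the registry shape mirrors the
# excerpt below, but the scoring heuristic and names are hypothetical.
MODELS = [
    {"id": "gpt-4-turbo",
     "capabilities": {"coding": 0.95, "writing": 0.97},
     "cost_per_1k_tokens": 0.01},
    {"id": "llama3-70b-8192",
     "capabilities": {"coding": 0.93, "writing": 0.92},
     "cost_per_1k_tokens": 0.0001},
]

def select_model(task: str, max_cost_per_1k: float) -> str:
    """Filter by the cost constraint, then rank by capability for the task."""
    candidates = [m for m in MODELS if m["cost_per_1k_tokens"] <= max_cost_per_1k]
    if not candidates:
        raise ValueError("no model satisfies the cost constraint")
    # Highest capability score wins; ties break toward the cheaper model.
    best = max(candidates, key=lambda m: (m["capabilities"].get(task, 0.0),
                                          -m["cost_per_1k_tokens"]))
    return best["id"]

print(select_model("coding", max_cost_per_1k=0.001))  # tight budget -> cheap model
print(select_model("coding", max_cost_per_1k=0.02))   # loose budget -> strongest model
```

A real router would add latency constraints and richer task detection, but the shape is the same: filter by hard constraints, then rank the survivors.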
AutoLLM Router maintains a registry of models with detailed capability scores, performance metrics, and cost data.
When a query arrives, the system analyzes its characteristics and matches them against available models, considering:
# Excerpt from the AutoLLM Router model registry
models = [
    {
        "id": "gpt-4-turbo",
        "provider": "OPENAI",
        "capabilities": {
            "coding": 0.95,
            "math": 0.92,
            "writing": 0.97,
            "creative": 0.95,
            "analysis": 0.96
        },
        "performance": {
            "avg_latency": 2.8,
            "cost_per_1k_tokens": 0.01
        }
    },
    {
        "id": "claude-3-opus",
        "provider": "ANTHROPIC",
        "capabilities": {
            "coding": 0.94,
            "math": 0.88,
            "writing": 0.98,
            "creative": 0.92,
            "analysis": 0.95
        }
    },
    {
        "id": "llama3-70b-8192",
        "provider": "GROQ",
        "capabilities": {
            "coding": 0.93,
            "math": 0.86,
            "writing": 0.92,
            "creative": 0.86,
            "analysis": 0.89
        },
        "performance": {
            "avg_latency": 0.8,
            "cost_per_1k_tokens": 0.0001
        }
    }
]
# The Query Analyzer in action
async def analyze_query(query: str, constraints: dict):
    """Analyze query characteristics to find the best model."""
    # models_formatted is the registry above rendered as text,
    # e.g. json.dumps(models, indent=2)
    prompt = f"""You are an expert AI model selector.
Available LLMs and their metrics:
{models_formatted}
User query: "{query}"
Constraints: {constraints}
Analyze this query and select the most appropriate model.
Consider query domain, complexity, and user constraints.
"""
    # Use a small, fast model for the selection process
    selector_model = "gpt-3.5-turbo"
    response = await client.generate(selector_model, prompt)

    # Parse the response to get the selected model
    selected = parse_selection(response)
    return {
        "selected_model": selected["model_id"],
        "reasoning": selected["reasoning"],
        "estimated_cost": selected["estimated_cost"]
    }
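The `parse_selection` helper above is not shown in the excerpt. One common approach is to have the selector model answer in JSON and parse that; the sketch below assumes that convention (the key names and the fence-stripping logic are hypothetical, not AutoLLM Router's actual implementation).

```python
import json

def parse_selection(response_text: str) -> dict:
    """Parse the selector model's reply into a selection record.

    Assumes the prompt instructed the model to answer with a JSON object
    like {"model_id": ..., "reasoning": ..., "estimated_cost": ...}.
    """
    cleaned = response_text.strip()
    # Models sometimes wrap JSON in a Markdown code fence; strip it defensively.
    if cleaned.startswith("```"):
        cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0]
    data = json.loads(cleaned)
    # Fail fast if required keys are missing rather than routing blindly.
    for key in ("model_id", "reasoning", "estimated_cost"):
        if key not in data:
            raise ValueError(f"selector reply missing {key!r}")
    return data
```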
"Write a Python function to calculate the Fibonacci sequence using dynamic programming"
Selected: llama3-70b-8192 (via Groq)
Reasoning: Coding task with medium complexity, fast execution preferred, low cost solution adequate
"Explain different approaches to solving the P vs NP problem"
Selected: claude-3-opus (via Anthropic)
Reasoning: Complex theoretical CS topic requiring advanced reasoning and accuracy
Comprehensive catalog of available LLMs with detailed capability scores and performance metrics.
Using an LLM to analyze queries and determine the best model for each specific task.
Unified API for multiple LLM providers with built-in token counting and cost estimation.
Flexible interfaces for both command line usage and direct integration into your applications.
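The cost estimation mentioned above typically multiplies a token count by the registry's per-1k-token rate. A minimal sketch of that arithmetic, assuming a single blended rate (real providers usually price prompt and completion tokens separately, which a fuller registry would track):

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  cost_per_1k_tokens: float) -> float:
    """Estimate a request's cost from token counts and a per-1k-token rate.

    Illustrative formula; providers often charge different rates for input
    and output tokens, so treat the single rate here as a simplification.
    """
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens / 1000 * cost_per_1k_tokens

# E.g. 245 tokens at the registry's llama3-70b-8192 rate of $0.0001/1k
print(estimate_cost(180, 65, 0.0001))
```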
$ autollm-router query "Explain quantum entanglement simply"
AutoLLM Router: Analyzing query...
Selected model: claude-3-haiku (Anthropic)
Reason: Explanatory query requiring clarity; medium complexity; cost efficiency prioritized
Quantum entanglement is like having two magical coins that always match each other...
Tokens: 245 | Cost: $0.00025 | Processing time: 0.7s
Create more powerful and cost-effective AI products by leveraging the right model for each task.
Example: An AI writing assistant that uses affordable models for drafting but premium models for final editing.
Optimize AI costs while maintaining quality across different business applications.
Example: An internal tool that routes customer service queries to affordable models but uses specialized models for technical or complex issues.
Experiment with multiple models without constantly switching APIs and configurations.
Example: A research project comparing model performance across different tasks, with unified data collection and processing.
Start saving on API costs while delivering better results with intelligent model routing.