SDK Reference

Complete SDK reference — track costs, attribute spend, and optimize routing across providers.

SDK Setup

  1. Install
    npm install costlens openai
  2. Set environment variables
    # .env
    COSTLENS_API_KEY=cl_your_api_key_here
    OPENAI_API_KEY=sk-...
    ANTHROPIC_API_KEY=sk-ant-...
  3. Make your first request
    import OpenAI from 'openai';
    import { CostLens } from 'costlens';
    
    const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
    const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, enableCache: true });
    const tracked = costlens.wrapOpenAI(openai);
    
    const res = await tracked.chat.completions.create({
      model: 'gpt-5.4-mini',
      messages: [{ role: 'user', content: 'Hello' }]
    });

Installation

npm install costlens

API Mode

Requires an API key. Get yours at costlens.dev/settings.

Basic Usage

import { CostLens } from 'costlens';
import OpenAI from 'openai';

const costlens = new CostLens();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const ai = costlens.wrapOpenAI(openai);

const response = await ai.chat.completions.create({
  model: 'gpt-5.4',
  messages: [{ role: 'user', content: 'What is 2+2?' }]
});

// Check potential savings
const savings = await costlens.calculateSavings('gpt-5.4', [
  { role: 'user', content: 'What is 2+2?' }
]);
console.log(`Potential savings: ${savings.savingsPercentage}% with ${savings.recommendedModel}`);

Features

  • Smart model routing (GPT-5.4 → GPT-5.4-mini for simple tasks)
  • Cost calculations and savings estimates
  • Works in any environment
  • Zero configuration required
  • No cloud tracking (upgrade for analytics)

Constructor

new CostLens(config?: CostLensConfig)

Parameters

Name              Type      Required  Description
apiKey            string    No        Your CostLens API key (omit for API Mode)
autoOptimize      boolean   No        Cost tracking and analytics (feature in development)
smartRouting      boolean   No        Route to cheapest model (20x savings)
enableCache       boolean   No        Cache responses (savings on repeats)
costLimit         number    No        Max cost per request (prevents overruns)
autoFallback      boolean   No        Auto-fallback on rate limits
maxRetries        number    No        Max retry attempts (default: 3)
baseUrl           string    No        Custom base URL (default: https://api.costlens.dev)
routingPolicy     function  No        Custom routing decisions
qualityValidator  function  No        Custom quality scoring

Per-request options like requestId, correlationId, promptId, and userId are passed as the second argument to wrapper methods, not the constructor.

Example

const costlens = new CostLens({
  apiKey: process.env.COSTLENS_API_KEY,
  autoOptimize: true,
  smartRouting: true,
  enableCache: true,
  costLimit: 0.10,
  autoFallback: true,    // Auto-retry on failures
  maxRetries: 3,         // Retry up to 3 times

  routingPolicy: (requestedModel, messages) => {
    // Custom routing logic
    if (messages.length > 10) return 'gpt-5.4-mini';
    return requestedModel;
  },
  qualityValidator: (responseText, messagesJson) => {
    // Custom quality scoring (1-5)
    return responseText.length > 100 ? 4 : 2;
  }
});

Provider Wrappers

Wrap your provider clients so CostLens can route, cache and track usage automatically.

OpenAI

import OpenAI from 'openai';
import { CostLens } from 'costlens';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, smartRouting: true });
const tracked = costlens.wrapOpenAI(openai);

const res = await tracked.chat.completions.create({
  model: 'gpt-5.4-mini',
  messages: [{ role: 'user', content: 'Hello' }]
});

Anthropic

import Anthropic from '@anthropic-ai/sdk';
import { CostLens } from 'costlens';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, smartRouting: true });
const trackedClaude = costlens.wrapAnthropic(anthropic);

const res = await trackedClaude.messages.create({
  model: 'claude-haiku-4.5',
  max_tokens: 1024, // required by the Anthropic Messages API
  messages: [{ role: 'user', content: 'Hello' }]
});

Safety & Rate Limiting

Built-in protection against runaway agents and accidental cost spikes.

Protection         Default            Configurable
Burst rate limit   50 req/min         Per key in Settings → Alerts
Spend velocity     $10 / 5 min        Per key in Settings → Alerts
Daily budget       Unlimited          Per key in Settings → Alerts
Slack kill switch  Pauses key 30 min  Slack interactivity webhook

Response headers on every request:

Header                       Description
X-CostLens-Spend-Today       Running daily spend for this key ($)
X-CostLens-Budget-Remaining  Remaining budget or "unlimited"
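
The two headers can be read into numbers before deciding whether to keep sending requests. The following is a minimal sketch, not part of the SDK: the helper names readSpendHeaders and parseAmount are hypothetical, and the exact value format (a plain decimal, possibly with a leading "$") is an assumption, since only the header names and the "unlimited" literal are documented here.

```typescript
// Minimal view of a headers object; the fetch API's Headers satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

interface SpendSnapshot {
  spendToday: number | null;      // null if the header is missing or not numeric
  budgetRemaining: number | null; // Infinity when the header says "unlimited"
}

// Parse a dollar amount such as "4.25"; a leading "$" is tolerated (assumption).
function parseAmount(raw: string | null): number | null {
  if (raw === null) return null;
  const cleaned = raw.trim().replace(/^\$/, '');
  if (cleaned === '') return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}

function readSpendHeaders(headers: HeaderReader): SpendSnapshot {
  const remaining = headers.get('X-CostLens-Budget-Remaining');
  const unlimited = remaining !== null && remaining.trim().toLowerCase() === 'unlimited';
  return {
    spendToday: parseAmount(headers.get('X-CostLens-Spend-Today')),
    budgetRemaining: unlimited ? Infinity : parseAmount(remaining),
  };
}
```

With a fetch response, this could be called as readSpendHeaders(response.headers); a null field means the header was absent or could not be parsed.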

When a limit is hit, the API returns 429 with retryAfter in the response body. Agents should back off and retry.
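
One way an agent might back off is sketched below. It is an illustration under stated assumptions, not SDK behavior: the helper names backoffDelayMs and withBackoff are hypothetical, the error is assumed to expose status and a parsed body, and retryAfter is assumed to be in seconds (the unit is not stated above, so confirm it before relying on this).

```typescript
// Assumed shape of the 429 response body: { retryAfter: <seconds> }.
interface RateLimitBody {
  retryAfter?: number;
}

// Hypothetical error shape carrying the HTTP status and parsed body.
interface HttpError {
  status?: number;
  body?: RateLimitBody;
}

// Prefer the server's hint; otherwise use capped exponential backoff (1s, 2s, 4s, ...).
function backoffDelayMs(attempt: number, body: RateLimitBody = {}, capMs = 30000): number {
  if (typeof body.retryAfter === 'number' && body.retryAfter >= 0) {
    return Math.min(body.retryAfter * 1000, capMs);
  }
  return Math.min(1000 * 2 ** attempt, capMs);
}

// Retry `call` on 429 responses only, up to maxRetries times.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = ms => new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const e = err as HttpError;
      if (e.status !== 429 || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, e.body));
    }
  }
}
```

Errors other than 429 are rethrown immediately, so authentication or validation failures are not retried.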

Error handling

Return a helpful message and optionally record the failure for visibility.

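async function createWithTracking(params: OpenAI.Chat.ChatCompletionCreateParams) {
  const start = Date.now(); // outside try so the catch block can report latency
  try {
    const result = await tracked.chat.completions.create(params);
    await costlens.trackOpenAI(params, result, Date.now() - start, 'prompt-42');
    return result;
  } catch (err) {
    // Record the failure with the time elapsed before it occurred
    await costlens.trackError('openai', params.model as string, JSON.stringify(params.messages), err as Error, Date.now() - start);
    throw err; // surface to caller
  }
}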

Methods

trackOpenAI()

Track an OpenAI API call.

trackOpenAI(
  params: OpenAI.Chat.ChatCompletionCreateParams,
  result: OpenAI.Chat.ChatCompletion,
  latency: number,
  promptId?: string
): Promise<void>

Parameters

  • params - The parameters passed to OpenAI
  • result - The response from OpenAI
  • latency - Time taken in milliseconds
  • promptId - Optional tag to group related prompts

Example

const start = Date.now();
const result = await openai.chat.completions.create(params);
await costlens.trackOpenAI(
  params, 
  result, 
  Date.now() - start,
  'my-prompt-v1' // optional
);

trackAnthropic()

Track an Anthropic (Claude) API call.

trackAnthropic(
  params: Anthropic.MessageCreateParams,
  result: Anthropic.Message,
  latency: number,
  promptId?: string
): Promise<void>

Parameters

  • params - The parameters passed to Anthropic
  • result - The response from Anthropic
  • latency - Time taken in milliseconds
  • promptId - Optional tag to group related prompts

Example

const start = Date.now();
const result = await anthropic.messages.create(params);
await costlens.trackAnthropic(
  params,
  result,
  Date.now() - start,
  'my-prompt-v1' // optional
);

trackError()

Track a failed API call.

trackError(
  provider: string,
  model: string,
  input: string,
  error: Error,
  latency: number
): Promise<void>

Parameters

  • provider - The provider (openai, anthropic)
  • model - The model that was attempted
  • input - The input that was sent
  • error - The error object
  • latency - Time taken before failure

Example

const start = Date.now();
try {
  const result = await openai.chat.completions.create(params);
  await costlens.trackOpenAI(params, result, Date.now() - start);
} catch (error) {
  await costlens.trackError(
    'openai',
    params.model,
    JSON.stringify(params.messages),
    error as Error,
    Date.now() - start // time elapsed before the failure
  );
  throw error;
}

trackBatch()

Process multiple AI requests in a single call for 3-5x better performance.

trackBatch(
  requests: Array<{ provider?: string; model: string; prompt?: string; tokens?: number; latency?: number }>
): Promise<BatchResult[]>

Parameters

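  • requests - Array of request records to process in one batch. Each record has the fields below.
  • provider - The AI provider (openai, anthropic, etc.)
  • model - The model used (the only required field)
  • prompt - The prompt text that was sent
  • tokens - Number of tokens used
  • latency - Request latency in milliseconds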

Performance Benefits

  • 3-5x faster than individual requests
  • Reduced HTTP overhead with batching
  • 90% more reliable with fewer failure points
  • Automatic batching with queue management

Example

// Process multiple requests efficiently
const requests = [
  { provider: 'openai', model: 'gpt-5.4', tokens: 150, latency: 1200 },
  { provider: 'anthropic', model: 'claude-haiku-4.5', tokens: 200, latency: 1000 },
  { provider: 'openai', model: 'gpt-5.4-mini', tokens: 100, latency: 800 }
];

// Single batch call - 3-5x faster than individual requests
await costlens.trackBatch(requests);

// Automatic queue management for optimal performance
// SDK automatically batches requests when possible
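
To make the idea of queue-based batching concrete, here is a minimal sketch of a size-triggered buffer. It is an illustration of the general technique, not the SDK's internal queue: the BatchQueue class, its maxSize default, and the send callback are all hypothetical.

```typescript
type UsageRecord = { provider?: string; model: string; prompt?: string; tokens?: number; latency?: number };
type SendBatch = (batch: UsageRecord[]) => Promise<unknown>;

class BatchQueue {
  private pending: UsageRecord[] = [];
  private readonly send: SendBatch;
  private readonly maxSize: number;

  constructor(send: SendBatch, maxSize = 20) {
    this.send = send;
    this.maxSize = maxSize;
  }

  // Buffer a record and flush once the buffer reaches maxSize.
  async add(record: UsageRecord): Promise<void> {
    this.pending.push(record);
    if (this.pending.length >= this.maxSize) await this.flush();
  }

  // Send everything buffered so far in a single call.
  async flush(): Promise<void> {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = []; // swap first so records added during the send are not lost
    await this.send(batch);
  }
}
```

A queue like this could be wired to the method above with new BatchQueue(batch => costlens.trackBatch(batch)), calling flush() once more before the process exits so the last partial batch is sent.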

getCostAnalytics()

Get real-time performance metrics and savings data.

getCostAnalytics(): {
  cacheHitRate: number;    // Cache hit rate (0-1)
  totalSavings: number;    // Total money saved
  averageLatency: number;  // Average request latency
  errorRate: number;       // Error rate (0-1)
}

Example

const analytics = costlens.getCostAnalytics();
console.log('Cache Hit Rate:', analytics.cacheHitRate * 100 + '%');
console.log('Total Savings: $' + analytics.totalSavings);
console.log('Average Latency:', analytics.averageLatency + 'ms');
console.log('Error Rate:', analytics.errorRate * 100 + '%');

calculateSavings()

Calculate potential savings before making a request.

calculateSavings(
  requestedModel: string,
  messages: any[]
): Promise<{
  currentCost: number;
  optimizedCost: number;
  savings: number;
  savingsPercentage: number;
  recommendedModel: string;
}>

Example

const savings = await costlens.calculateSavings('gpt-5.4', messages);
console.log('Current Cost: $' + savings.currentCost);
console.log('Optimized Cost: $' + savings.optimizedCost);
console.log('Savings: $' + savings.savings);
console.log('Savings %: ' + savings.savingsPercentage + '%');
console.log('Recommended Model: ' + savings.recommendedModel);

OpenAI & Anthropic Support

CostLens currently supports OpenAI and Anthropic APIs with smart routing between models.

// OpenAI routing: GPT-5.4 → GPT-5.4-mini for simple tasks
const openaiResult = await costlens.openai.chat.completions.create({
  model: 'gpt-5.4',
  messages: [{ role: 'user', content: 'Simple task' }]
});
// Automatically routed to GPT-5.4-mini (20x cheaper)

// Anthropic routing: Claude Sonnet → Haiku for simple tasks
const anthropicResult = await costlens.anthropic.messages.create({
  model: 'claude-sonnet-4.6',
  max_tokens: 1024, // required by the Anthropic Messages API
  messages: [{ role: 'user', content: 'Simple task' }]
});
// Automatically routed to Claude Haiku

Advanced Analytics & Forecasting

CostLens provides ML-powered cost forecasting and routing analytics through the dashboard.

Predictive Analytics

  • ML-based cost forecasting
  • 7-day and 30-day predictions
  • Confidence scoring (45-85%)
  • Trend analysis & seasonality

Smart Routing

  • Context-aware model selection
  • Quality vs cost optimization
  • Real-time routing decisions
  • Provider performance tracking

// All analytics available through dashboard API
const stats = await fetch('/api/dashboard/stats', {
  headers: { 'Authorization': 'Bearer ' + process.env.COSTLENS_API_KEY }
});

const data = await stats.json();
console.log('Cost forecast:', data.costForecast);
console.log('Routing decisions:', data.routingDecisions);
console.log('Provider stats:', data.providerStats);

The SDK also exposes a 30-day cost forecast, cost alerts, and optimization recommendations:

const forecast = await costlens.getCostForecast();
console.log(forecast.projectedMonthlyCost, forecast.trend, forecast.confidence);

const alerts = await costlens.checkCostAlerts();
console.log(alerts);

const recs = await costlens.getOptimizationRecommendations();
console.log(recs);

Advanced Features

Custom Routing Policy

Override default routing decisions with custom logic based on request context.

const costlens = new CostLens({
  apiKey: process.env.COSTLENS_API_KEY,
  routingPolicy: (requestedModel, messages) => {
    // Route complex queries to better models
    const complexity = messages.reduce((acc, msg) => acc + msg.content.length, 0);
    
    if (complexity > 1000) {
      return 'gpt-5.4'; // Use premium model for complex tasks
    }
    
    if (requestedModel === 'gpt-5.4' && complexity < 100) {
      return 'gpt-5.4-mini'; // Downgrade simple tasks
    }
    
    return requestedModel; // Keep original choice
  }
});
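
A routing policy is a plain function, so it can be written standalone and unit-tested before it is passed to the constructor. The sketch below mirrors the length-based logic above; the names lengthBasedPolicy and textLength are hypothetical, and the message shape (string content or an array of parts with optional text) is an assumption about what the policy may receive.

```typescript
type ContentPart = { type: string; text?: string };
type ChatMessage = { role: string; content: string | ContentPart[] | null };

// Count characters of text, tolerating non-string content such as image parts.
function textLength(message: ChatMessage): number {
  if (typeof message.content === 'string') return message.content.length;
  if (Array.isArray(message.content)) {
    return message.content.reduce((n, part) => n + (part.text ? part.text.length : 0), 0);
  }
  return 0;
}

function lengthBasedPolicy(requestedModel: string, messages: ChatMessage[]): string {
  const complexity = messages.reduce((acc, msg) => acc + textLength(msg), 0);
  if (complexity > 1000) return 'gpt-5.4';                 // long input: premium model
  if (requestedModel === 'gpt-5.4' && complexity < 100) {
    return 'gpt-5.4-mini';                                 // short input: downgrade
  }
  return requestedModel;                                   // otherwise keep the caller's choice
}
```

Guarding against non-string content avoids a runtime error when a message carries images or other structured parts.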

Custom Quality Validation

Implement custom quality scoring to improve routing decisions over time.

const costlens = new CostLens({
  apiKey: process.env.COSTLENS_API_KEY,
  qualityValidator: (responseText, messagesJson) => {
    const messages = JSON.parse(messagesJson);
    
    // Score based on response completeness
    let score = 3; // baseline
    
    if (responseText.length > 200) score += 1;
    if (responseText.includes('```')) score += 1; // code examples
    if (messages.some(m => m.content.includes('?')) && 
        responseText.includes('?')) score -= 1; // answered with question
    
    return Math.max(1, Math.min(5, score)); // clamp 1-5
  }
});

Request Correlation

Track related requests across your application with correlation IDs.

// Track user session (userId comes from your application)
const sessionId = 'session_' + userId;
const requestId = 'req_' + Date.now();

const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY });
const tracked = costlens.wrapOpenAI(openai);

// Per-request IDs go in the second argument, not the constructor
await tracked.chat.completions.create(
  {
    model: 'gpt-5.4-mini',
    messages: [{ role: 'user', content: 'Hello' }]
  },
  { requestId, correlationId: sessionId }
);

// Get analytics using SDK method
const analytics = costlens.getCostAnalytics();
console.log('Cache Hit Rate:', analytics.cacheHitRate * 100 + '%');
console.log('Total Savings: $' + analytics.totalSavings);

Money-Saving Features

Redis Caching

Automatically cache responses to save money on repeated requests. Check the hit rate you actually achieve with getCostAnalytics().cacheHitRate.

const costlens = new CostLens({
  apiKey: process.env.COSTLENS_API_KEY,
  enableCache: true, // Enable Redis caching
});

const tracked = costlens.wrapOpenAI(openai);

// First call - cache miss, costs $0.05
await tracked.chat.completions.create({
  model: 'gpt-5.4',
  messages: [{ role: 'user', content: 'What is 2+2?' }],
});

// Second call - cache hit, costs $0.00!
await tracked.chat.completions.create({
  model: 'gpt-5.4',
  messages: [{ role: 'user', content: 'What is 2+2?' }],
});
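
Why the second call is free can be illustrated with a small in-memory cache. This is only a sketch of the general technique: the ResponseCache class is hypothetical, it uses a Map rather than Redis, and keying on the model plus the exact messages is an assumption about how identical requests are matched.

```typescript
type ChatRequest = { model: string; messages: { role: string; content: string }[] };

class ResponseCache<T> {
  private store = new Map<string, T>();
  private hits = 0;
  private misses = 0;

  // Identical model + messages produce the same key (assumed keying scheme).
  private keyFor(request: ChatRequest): string {
    return JSON.stringify([request.model, request.messages.map(m => [m.role, m.content])]);
  }

  // Return the cached response, or call `create` once and remember its result.
  async getOrCreate(request: ChatRequest, create: () => Promise<T>): Promise<T> {
    const key = this.keyFor(request);
    const cached = this.store.get(key);
    if (cached !== undefined) {
      this.hits++;
      return cached;
    }
    this.misses++;
    const fresh = await create();
    this.store.set(key, fresh);
    return fresh;
  }

  // Fraction of lookups served from the cache (0-1).
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Any change to the model or to a single message produces a different key, so only exact repeats are served from the cache in this sketch.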

Quality Monitoring

Smart routing automatically disables if response quality drops below 3.5/5 stars.

// SDK checks quality status before routing
const tracked = costlens.wrapOpenAI(openai);

// If quality is good: GPT-5.4 → GPT-5.4-mini (saves money)
// If quality dropped: Uses GPT-5.4 (protects quality)
await tracked.chat.completions.create({
  model: 'gpt-5.4',
  messages: [{ role: 'user', content: 'Complex task...' }],
});

// Note: Quality feedback is handled automatically by the SDK
// The SDK tracks routing decisions and learns from them internally
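
The 3.5/5 threshold can be pictured as a gate over recent quality scores. The sketch below is one possible reading, not the SDK's implementation: the QualityGate class, the rolling window, and its size are hypothetical, and only the 3.5 threshold comes from the text above.

```typescript
class QualityGate {
  private scores: number[] = [];
  private readonly threshold: number;
  private readonly windowSize: number;

  constructor(threshold = 3.5, windowSize = 20) {
    this.threshold = threshold;
    this.windowSize = windowSize;
  }

  // Record a 1-5 quality score, keeping only the most recent windowSize scores.
  record(score: number): void {
    this.scores.push(score);
    if (this.scores.length > this.windowSize) this.scores.shift();
  }

  // Mean of the scores in the window, or null before any score is recorded.
  average(): number | null {
    if (this.scores.length === 0) return null;
    return this.scores.reduce((sum, s) => sum + s, 0) / this.scores.length;
  }

  // Routing stays enabled until the rolling average falls below the threshold.
  routingAllowed(): boolean {
    const avg = this.average();
    return avg === null || avg >= this.threshold;
  }
}
```

Using a rolling window means a few poor responses pause routing, and it resumes on its own once recent scores recover.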

AI-Powered Optimization

Automatically compress prompts by 30-50% while preserving meaning.

const costlens = new CostLens({
  apiKey: process.env.COSTLENS_API_KEY,
  autoOptimize: true, // Enable AI compression
});

const tracked = costlens.wrapOpenAI(openai);

// Original: 200 tokens
// Optimized: 100 tokens (50% reduction)
// Savings: $0.009 per request
await tracked.chat.completions.create({
  model: 'gpt-5.4',
  messages: [{
    role: 'user',
    content: 'Please kindly help me understand what the weather will be like tomorrow in San Francisco, California, USA'
  }],
});
// Compressed to: "Weather forecast for San Francisco tomorrow?"

Real-Time Savings

Track exactly how much money you're saving with baseline cost comparison.

// Calculate potential savings using SDK method
const savings = await costlens.calculateSavings('gpt-5.4', messages);

console.log(`Current Cost: $${savings.currentCost}`);
console.log(`Optimized Cost: $${savings.optimizedCost}`);
console.log(`Savings: $${savings.savings} (${savings.savingsPercentage.toFixed(1)}%)`);
console.log(`Recommended Model: ${savings.recommendedModel}`);

// Example output:
// Current Cost: $0.15
// Optimized Cost: $0.03
// Savings: $0.12 (80.0%)
// Recommended Model: gpt-5.4-mini
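
The example output is consistent with savings being the difference between the two costs and the percentage being that difference relative to the current cost. The sketch below spells out that arithmetic; the helper summarizeSavings is hypothetical, and the relationship is inferred from the sample numbers rather than stated by the SDK.

```typescript
interface SavingsSummary {
  savings: number;           // currentCost - optimizedCost, in USD
  savingsPercentage: number; // savings as a percentage of currentCost
}

function summarizeSavings(currentCost: number, optimizedCost: number): SavingsSummary {
  const savings = currentCost - optimizedCost;
  // Avoid dividing by zero when the current cost is zero.
  const savingsPercentage = currentCost > 0 ? (savings / currentCost) * 100 : 0;
  return { savings, savingsPercentage };
}
```

For the figures shown, summarizeSavings(0.15, 0.03) gives roughly $0.12 and 80%, up to floating-point rounding, which is why the example formats the percentage with toFixed(1).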

Types

CostLensConfig

interface CostLensConfig {
  apiKey?: string;              // Get from Settings
  baseUrl?: string;             // Default: https://api.costlens.dev
  enableCache?: boolean;        // Default: true
  smartRouting?: boolean;       // Enable model routing
  autoOptimize?: boolean;       // Auto cost optimization
  costLimit?: number;           // Max cost per request in USD
  autoFallback?: boolean;       // Auto-fallback on rate limits
  maxRetries?: number;          // Default: 3
  logLevel?: 'debug' | 'info' | 'warn' | 'error';
  routingPolicy?: (requestedModel: string, messages: any[]) => string; // Return the model to use
  qualityValidator?: (response: string, messages: string) => number; // Return 1-5
}

Environment Variables

# .env
COSTLENS_API_KEY=cl_your_api_key_here
OPENAI_API_KEY=sk-your_openai_key
ANTHROPIC_API_KEY=sk-ant-your_anthropic_key

Troubleshooting

  • 401 Unauthorized: Check COSTLENS_API_KEY and header formatting.
  • Missing data: Ensure server-side usage for caching/optimization features.
  • Model mismatch: Enable enforcement and verify allowed models.
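
For the 401 case, a quick startup check can rule out a missing or malformed key before any request is sent. This is a minimal sketch with hypothetical helper names (missingKeys, looksLikeCostLensKey); the "cl_" prefix is an assumption taken from the placeholder key used in these docs.

```typescript
type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function missingKeys(env: Env, required: string[] = ['COSTLENS_API_KEY', 'OPENAI_API_KEY']): string[] {
  return required.filter(name => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Assumption: keys start with "cl_", as in the placeholder shown in these docs.
function looksLikeCostLensKey(key: string | undefined): boolean {
  return typeof key === 'string'
    && key === key.trim()      // stray whitespace often breaks the Authorization header
    && key.startsWith('cl_')
    && key.length > 3;
}
```

At startup this could be called as missingKeys(process.env), logging the returned names and exiting early if the list is not empty.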
