Complete reference for the CostLens SDK - Save 50-80% on AI costs automatically.
💰 Money-Saving Features
Redis caching, quality monitoring, AI optimization, and real-time savings tracking
⚠️ Server-Side Only
Caching and optimization require server-side environment (Node.js, Next.js API routes). Browser usage works but skips these features for security.
npm install costlens
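As noted above, caching and optimization need a server-side runtime, so a natural home for the client is an API route. Here is a minimal quick-start sketch of a Next.js App Router handler (the route path, request body shape, and option choices are illustrative):
// app/api/chat/route.ts - runs on the server, so caching and optimization apply
import OpenAI from 'openai';
import CostLens from 'costlens';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const costlens = new CostLens({
  apiKey: process.env.COSTLENS_API_KEY!,
  enableCache: true,
  autoOptimize: true,
});
const tracked = costlens.wrapOpenAI(openai);
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const res = await tracked.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: prompt }],
  });
  return Response.json(res.choices[0].message);
}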
new CostLens(config: CostLensConfig)
| Name | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Your CostLens API key |
| autoOptimize | boolean | No | 💰 Auto-optimize prompts (50-80% token reduction) |
| smartRouting | boolean | No | 💰 Route to cheapest model (20x savings) |
| enableCache | boolean | No | 💰 Cache responses (80% savings on repeats) |
| costLimit | number | No | 💰 Max cost per request (prevents overruns) |
| autoFallback | boolean | No | Auto-fallback on rate limits |
| maxRetries | number | No | Max retry attempts (default: 3) |
| baseUrl | string | No | Custom base URL (default: https://api.costlens.dev) |
| routingPolicy | function | No | 🆕 Custom routing decisions |
| qualityValidator | function | No | 🆕 Custom quality scoring |
| requestId | string | No | 🆕 Request tracking ID |
| correlationId | string | No | 🆕 Correlation tracking ID |
const costlens = new CostLens({
apiKey: 'cl_your_api_key_here',
autoOptimize: true, // 💰 Save 50-80% on tokens
smartRouting: true, // 💰 Route to cheapest model
enableCache: true, // 💰 Cache repeated queries
costLimit: 0.10, // 💰 Max $0.10 per request
autoFallback: true, // Auto-retry on failures
maxRetries: 3, // Retry up to 3 times
// 🆕 New SDK Features
routingPolicy: (requestedModel, messages) => {
// Custom routing logic
if (messages.length > 10) return 'gpt-4o-mini';
return requestedModel;
},
qualityValidator: (responseText, messagesJson) => {
// Custom quality scoring (1-5)
return responseText.length > 100 ? 5 : 3;
},
requestId: 'req_' + Date.now(),
correlationId: 'session_abc123'
});
Wrap your provider clients so CostLens can route, cache and track usage automatically.
import OpenAI from 'openai';
import CostLens from 'costlens';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, smartRouting: true });
const tracked = costlens.wrapOpenAI(openai);
const res = await tracked.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello' }]
});
import Anthropic from '@anthropic-ai/sdk';
import CostLens from 'costlens';
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, smartRouting: true });
const trackedClaude = costlens.wrapAnthropic(anthropic);
const res = await trackedClaude.messages.create({
  model: 'claude-3-haiku-20240307', // Anthropic expects a dated model ID
  max_tokens: 1024, // required by the Anthropic Messages API
  messages: [{ role: 'user', content: 'Hello' }]
});
Record the failure for visibility, then either rethrow so the caller can handle it, or return a helpful fallback message (see the sketch after this example).
const start = Date.now();
try {
  const result = await tracked.chat.completions.create(params);
  await costlens.trackOpenAI(params, result, Date.now() - start, 'prompt-42');
  return result;
} catch (err) {
  await costlens.trackError('openai', params.model as string, JSON.stringify(params.messages), err as Error, Date.now() - start);
  throw err; // surface to caller
}
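For the fallback path, here is a sketch that swallows the error and returns friendly text instead; the chatOrFallback helper and the fallback wording are illustrative, not part of the SDK:
const FALLBACK = 'Sorry, something went wrong. Please try again.';
async function chatOrFallback(params: OpenAI.Chat.ChatCompletionCreateParamsNonStreaming): Promise<string> {
  const start = Date.now();
  try {
    const result = await tracked.chat.completions.create(params);
    return result.choices[0].message.content ?? FALLBACK;
  } catch (err) {
    // Still record the failure so it shows up in CostLens analytics
    await costlens.trackError('openai', params.model, JSON.stringify(params.messages), err as Error, Date.now() - start);
    return FALLBACK; // helpful message instead of a crash
  }
}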
Track an OpenAI API call.
trackOpenAI(
params: OpenAI.Chat.ChatCompletionCreateParams,
result: OpenAI.Chat.ChatCompletion,
latency: number,
promptId?: string
): Promise<void>
params - The parameters passed to OpenAI
result - The response from OpenAI
latency - Time taken in milliseconds
promptId - Optional tag to group related prompts
const start = Date.now();
const result = await openai.chat.completions.create(params);
await costlens.trackOpenAI(
params,
result,
Date.now() - start,
'my-prompt-v1' // optional
);
Track an Anthropic (Claude) API call.
trackAnthropic(
params: Anthropic.MessageCreateParams,
result: Anthropic.Message,
latency: number,
promptId?: string
): Promise<void>
params - The parameters passed to Anthropic
result - The response from Anthropic
latency - Time taken in milliseconds
promptId - Optional tag to group related prompts
const start = Date.now();
const result = await anthropic.messages.create(params);
await costlens.trackAnthropic(
params,
result,
Date.now() - start,
'my-prompt-v1' // optional
);
Track a Google Gemini API call.
trackGemini(
params: any,
result: any,
latency: number,
promptId?: string
): Promise<void>
params - The parameters passed to Gemini
result - The response from Gemini
latency - Time taken in milliseconds
promptId - Optional tag to group related prompts
const start = Date.now();
const result = await model.generateContent(params);
await costlens.trackGemini(
{ model: 'gemini-pro', ...params },
result.response,
Date.now() - start,
'my-prompt-v1' // optional
);
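The snippet above assumes a model handle from Google's SDK; for example, with the @google/generative-ai package:
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: 'gemini-pro' });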
Track an xAI Grok API call.
trackGrok(
params: any,
result: any,
latency: number,
promptId?: string
): Promise<void>
params - The parameters passed to Grok
result - The response from Grok
latency - Time taken in milliseconds
promptId - Optional tag to group related prompts
const start = Date.now();
const result = await grok.chat.completions.create(params);
await costlens.trackGrok(
params,
result,
Date.now() - start,
'my-prompt-v1' // optional
);
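The grok client above is assumed to be an OpenAI-compatible client pointed at xAI's endpoint, for example:
import OpenAI from 'openai';
const grok = new OpenAI({ apiKey: process.env.XAI_API_KEY, baseURL: 'https://api.x.ai/v1' });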
Track a failed API call.
trackError(
provider: string,
model: string,
input: string,
error: Error,
latency: number
): Promise<void>
provider - The provider (openai, anthropic, gemini, grok)
model - The model that was attempted
input - The input that was sent
error - The error object
latency - Time taken in milliseconds before the failure
const start = Date.now();
try {
  const result = await openai.chat.completions.create(params);
  await costlens.trackOpenAI(params, result, Date.now() - start);
} catch (error) {
  await costlens.trackError(
    'openai',
    params.model,
    JSON.stringify(params.messages),
    error,
    Date.now() - start
  );
  throw error;
}
Override default routing decisions with custom logic based on request context.
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
routingPolicy: (requestedModel, messages) => {
// Route complex queries to better models
const complexity = messages.reduce((acc, msg) => acc + msg.content.length, 0);
if (complexity > 1000) {
return 'gpt-4o'; // Use premium model for complex tasks
}
if (requestedModel === 'gpt-4' && complexity < 100) {
return 'gpt-4o-mini'; // Downgrade simple tasks
}
return requestedModel; // Keep original choice
}
});
Implement custom quality scoring to improve routing decisions over time.
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
qualityValidator: (responseText, messagesJson) => {
const messages = JSON.parse(messagesJson);
// Score based on response completeness
let score = 3; // baseline
if (responseText.length > 200) score += 1;
if (responseText.includes('```')) score += 1; // code examples
if (messages.some(m => m.content.includes('?')) &&
responseText.includes('?')) score -= 1; // answered with question
return Math.max(1, Math.min(5, score)); // clamp 1-5
}
});
Track related requests across your application with correlation IDs.
// Track user session
const sessionId = 'session_' + userId;
const requestId = 'req_' + Date.now();
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
requestId: requestId,
correlationId: sessionId
});
const tracked = costlens.wrapOpenAI(openai);
// All requests will be tagged with these IDs
await tracked.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello' }]
});
// Query analytics by correlation ID
const analytics = await fetch(`/api/analytics?correlationId=${sessionId}`);
Automatically cache responses to save money on repeated requests. Achieves 60-80% hit rates in production.
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
enableCache: true, // Enable Redis caching
});
const tracked = costlens.wrapOpenAI(openai);
// First call - cache miss, costs $0.05
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'What is 2+2?' }],
});
// Second call - cache hit, costs $0.00!
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'What is 2+2?' }],
});
Smart routing disables itself automatically if response quality drops below 3.5/5 stars.
// SDK checks quality status before routing
const tracked = costlens.wrapOpenAI(openai);
// If quality is good: GPT-4 → GPT-3.5 (saves money)
// If quality dropped: Uses GPT-4 (protects quality)
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Complex task...' }],
});
// Submit feedback to improve routing
await fetch('/api/quality/feedback', {
method: 'POST',
body: JSON.stringify({
runId: 'run_123',
rating: 5, // 1-5 stars
feedback: 'Great response!'
})
});
Automatically compress prompts by 30-50% while preserving meaning.
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
autoOptimize: true, // Enable AI compression
});
const tracked = costlens.wrapOpenAI(openai);
// Original: 200 tokens
// Optimized: 100 tokens (50% reduction)
// Savings: $0.009 per request
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{
role: 'user',
content: 'Please kindly help me understand what the weather will be like tomorrow in San Francisco, California, USA'
}],
});
// Compressed to: "Weather forecast for San Francisco tomorrow?"
Track exactly how much money you're saving with baseline cost comparison.
// View savings in dashboard
const response = await fetch('/api/savings?period=today');
const savings = await response.json();
console.log(`Saved today: $${savings.totalSaved}`);
console.log(`Breakdown:`);
console.log(` Smart Routing: $${savings.smartRouting}`);
console.log(` Caching: $${savings.caching}`);
console.log(` Optimization: $${savings.optimization}`);
console.log(`Savings Rate: ${savings.savingsRate}%`);
// Example output:
// Saved today: $45.67
// Breakdown:
// Smart Routing: $30.00
// Caching: $12.50
// Optimization: $3.17
// Savings Rate: 68%
interface CostLensConfig {
  apiKey: string;
  baseUrl?: string;
  autoOptimize?: boolean;
  smartRouting?: boolean;
  enableCache?: boolean;
  costLimit?: number;
  autoFallback?: boolean;
  maxRetries?: number;
  routingPolicy?: (requestedModel: string, messages: Array<{ role: string; content: string }>) => string;
  qualityValidator?: (responseText: string, messagesJson: string) => number;
  requestId?: string;
  correlationId?: string;
}
# .env
COSTLENS_API_KEY=cl_your_api_key_here
OPENAI_API_KEY=sk-your_openai_key
ANTHROPIC_API_KEY=sk-ant-your_anthropic_key
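If you load these with dotenv (one common option; any environment loader works), import it before constructing the client:
import 'dotenv/config'; // populates process.env from .env
import CostLens from 'costlens';
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY! });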