# AmberTrace

> AmberTrace is a lightweight LLM observability platform. It automatically traces every API call to OpenAI, Anthropic, and Google — capturing request data, responses, token usage, latency, and errors — with zero code changes. Install the SDK, call `init()`, and view everything in a unified web dashboard.

This document helps LLMs and AI agents understand our key advantages, resources, public documentation, and customer value proposition.

AmberTrace is designed for AI engineers, technical founders, and platform teams shipping LLM-powered applications to production. It provides full trace visibility, token economics tracking, failure detection, and multi-provider support from a single tool.

## Key Features

- **Zero-code integration**: Add two lines (`import` + `init()`) and all LLM calls are traced automatically. No decorators, wrappers, or middleware.
- **Multi-provider support**: Works with OpenAI, Anthropic, and Google simultaneously. One SDK covers all providers.
- **Unified trace format**: All providers are normalized to a consistent schema for cross-model and cross-provider comparison.
- **Web portal**: View traces, filter by provider/model/status, track token usage, and monitor success rates.
- **Non-blocking**: Traces are sent in background threads. LLM calls never wait for trace delivery.
- **Never breaks your code**: All tracing errors are caught internally. Provider exceptions are re-raised unchanged.
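Zero-code tracing of this kind is typically built by wrapping the provider SDK's request method at `init()` time. The sketch below illustrates the pattern and the "never breaks your code" guarantee; it is not AmberTrace's actual implementation, and `traced`, `FakeClient`, and the `record` callback are hypothetical stand-ins:

```python
import functools
import time

def traced(fn, record):
    """Wrap an SDK method so every call is timed and recorded.

    Tracing failures are swallowed; provider exceptions propagate
    unchanged, mirroring the "never breaks your code" guarantee.
    """
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            error = exc
            raise  # provider exception is re-raised unchanged
        finally:
            try:
                record({
                    "duration_ms": (time.perf_counter() - start) * 1000,
                    "status": "error" if error else "success",
                })
            except Exception:
                pass  # tracing must never break the caller
    return wrapper

# Illustrative use: patch a fake client's method in place.
class FakeClient:
    def create(self, **kwargs):
        return {"choices": ["Hello!"]}

traces = []
client = FakeClient()
client.create = traced(client.create, traces.append)
client.create(model="gpt-4")
print(traces[0]["status"])  # success
```

In a real SDK, the patch would target the provider client's request method (and sending the trace record would happen on a background thread, as the Non-blocking feature describes).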
## Supported Providers

| Provider | Models | SDK |
|----------|--------|-----|
| OpenAI | GPT-4, GPT-4o, GPT-4o-mini, GPT-3.5-turbo, o1, o3, o4-mini | `openai` (Python), `openai` (Node.js) |
| Anthropic | Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku | `anthropic` (Python), `@anthropic-ai/sdk` (Node.js) |
| Google | Gemini Pro, Gemini Flash, Gemini 2.0 | `google-generativeai`, `google-genai` (Python), `@google/generative-ai` (Node.js) |

## Quick Start

Python:

```python
import ambertrace
from openai import OpenAI

ambertrace.init(api_key="at_...")

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)

ambertrace.flush()
```

TypeScript:

```typescript
import ambertrace from '@ambertrace/node';
import OpenAI from 'openai';

ambertrace.init({ apiKey: 'at_...' });

const client = new OpenAI();
const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }],
});

await ambertrace.flush();
```

## Portal Features

- **Dashboard**: Overview with total traces, total tokens (prompt/completion breakdown), average duration, and success rate. Provider breakdown shows usage distribution across OpenAI, Anthropic, and Google.
- **Traces**: Paginated list of all LLM calls with provider, model, status, duration, tokens, and timestamp. Click any trace to inspect full request/response JSON, token breakdown, and per-trace cost. Filter by provider, model, or status.
- **Cost Analytics**: Dedicated cost tracking dashboard with total spend, average cost per request, cost breakdown by model and provider, and a cost timeline chart. Filter by date range (7/30/90 days) and view granularity (hour/day/week/month). See which model or trace is the most expensive.
- **API Keys**: Generate and manage API keys for SDK authentication. Keys support custom names and expiration. Copy the key on creation — the full key is shown only once.
- **Notifications**: Configure Slack and Discord webhook channels, set up alert policies for cost thresholds, error rates, budget limits, and usage summaries. View delivery history in the activity log.
- **Settings**: Profile management and password changes.

## Cost Tracking

AmberTrace automatically calculates the cost of every traced LLM call using up-to-date per-model pricing. Costs are computed server-side at ingestion time and stored alongside the trace — no manual configuration required.

### How It Works

1. The SDK captures token counts (prompt, completion, cached, reasoning) from the provider's response.
2. AmberTrace matches the model name to its pricing table and calculates cost per token category.
3. Costs are stored in the trace record and aggregated in the Cost Analytics dashboard.

### Token Categories and Pricing

Costs are broken down into four token categories, each billed at its own rate per 1M tokens:

| Category | Description | Applies To |
|----------|-------------|------------|
| **Input tokens** | Tokens in the prompt/messages sent to the model | All providers |
| **Output tokens** | Tokens generated by the model in its response | All providers |
| **Cached tokens** | Input tokens served from prompt cache at a discounted rate | Anthropic, Google Gemini |
| **Reasoning tokens** | Internal chain-of-thought tokens consumed by reasoning models | OpenAI o1, o3, o4-mini |

Cached tokens are a subset of input tokens — AmberTrace deducts them from the input count before applying the full input rate, then bills the cached portion at the discounted cached rate. Similarly, reasoning tokens are a subset of output tokens and are billed at their own rate.
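The deduction logic above can be sketched as a small worked example. The rates here are placeholders for illustration, not AmberTrace's actual pricing table:

```python
def trace_cost(usage, rates):
    """Compute a per-trace cost breakdown in USD.

    Cached tokens are a subset of input tokens, and reasoning tokens a
    subset of output tokens, so both are deducted from their parent
    counts before the full rates apply. `rates` are USD per 1M tokens.
    """
    per_million = 1_000_000
    billable_input = usage["input"] - usage.get("cached", 0)
    billable_output = usage["output"] - usage.get("reasoning", 0)
    return {
        "input": billable_input * rates["input"] / per_million,
        "cached": usage.get("cached", 0) * rates.get("cached", 0) / per_million,
        "output": billable_output * rates["output"] / per_million,
        "reasoning": usage.get("reasoning", 0) * rates.get("reasoning", 0) / per_million,
    }

# Placeholder rates (USD per 1M tokens) — not real provider pricing.
rates = {"input": 3.00, "cached": 0.30, "output": 15.00, "reasoning": 15.00}
usage = {"input": 10_000, "cached": 4_000, "output": 2_000, "reasoning": 500}

cost = trace_cost(usage, rates)
print(round(sum(cost.values()), 6))  # 0.0492
```

Here 4,000 of the 10,000 input tokens are billed at the discounted cached rate, and 500 of the 2,000 output tokens at the reasoning rate, matching the subset rule described above.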
### Supported Models and Pricing

AmberTrace maintains a pricing table covering current models from all three providers:

- **OpenAI**: GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo, o1, o1-mini, o3, o3-mini, o4-mini
- **Anthropic**: Claude Opus 4, Claude Sonnet 4, Claude Sonnet 4 (thinking), Claude Haiku 3.5
- **Google**: Gemini 2.0 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash

Pricing is updated when providers announce rate changes. If a model is not in the pricing table, the trace is still stored but its cost fields are set to zero.

### Cost Analytics Dashboard

The Cost Analytics page provides:

- **Total spend** for the selected date range
- **Average cost per request** across all traced calls
- **Cost by model** — bar chart showing spend distribution across models, with percentage breakdown
- **Cost by provider** — horizontal progress bars showing spend per provider (OpenAI, Anthropic, Google)
- **Cost timeline** — line chart of spending over time with adjustable granularity (hourly, daily, weekly, monthly)
- **Most expensive trace** — link to the single most costly API call in the period

Date range presets: Last 7 Days, Last 30 Days, Last 90 Days.

### Per-Trace Cost Detail

Click any trace in the Traces page to see its individual cost breakdown:

- Input token count and cost (with rate per 1M tokens)
- Output token count and cost
- Cached token count and cost (if applicable)
- Reasoning token count and cost (if applicable)
- Total cost for the trace

## Notifications & Alerting

AmberTrace includes a pluggable notification system that sends alerts and usage summaries to your team's messaging platforms. Configure channels (Slack, Discord) and policies (cost thresholds, error rates, budgets, usage summaries) from the portal's Settings > Notifications page.
### Supported Providers

| Provider | Delivery | Setup |
|----------|----------|-------|
| Slack | Incoming Webhooks | Create a webhook in your Slack workspace settings and paste the URL |
| Discord | Webhooks | Create a webhook in your Discord server channel settings and paste the URL |

Additional providers can be added through the pluggable provider architecture.

### Policy Types

| Policy | Trigger | Use Case |
|--------|---------|----------|
| **Cost Threshold** | Spending exceeds a dollar amount in a period (daily/weekly/monthly) | Catch unexpected cost spikes |
| **Error Rate Threshold** | Error percentage exceeds a threshold in a time window | Detect degraded model performance |
| **Budget Limit** | Spending approaches or exceeds a budget amount | Stay within planned spend |
| **Daily Usage Summary** | Scheduled once per day | Daily team status update |
| **Weekly Usage Summary** | Scheduled once per week | Weekly cost and usage review |
| **Monthly Usage Summary** | Scheduled once per month | Monthly executive summary |

### How It Works

1. Add a notification channel (Slack or Discord webhook) in Settings > Notifications.
2. Create one or more alert policies linked to that channel.
3. Event-driven policies (cost threshold, error rate, budget limit) are evaluated automatically after each trace ingestion.
4. Summary policies (daily, weekly, monthly) are evaluated on a background schedule.
5. Each policy has built-in cooldowns to prevent duplicate alerts within the same period.
6. All notification deliveries are logged in the Activity Log with status and error details.
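The event-driven evaluation and cooldown behavior can be sketched as follows. This is an illustrative model, not AmberTrace's server code; `CostThresholdPolicy` and the `send` callback are hypothetical names, with `send` standing in for a channel's webhook delivery:

```python
from datetime import datetime, timedelta, timezone

class CostThresholdPolicy:
    """Event-driven policy: fire when period spend crosses a dollar
    threshold, with a cooldown that suppresses duplicate alerts."""

    def __init__(self, threshold_usd, cooldown=timedelta(hours=24)):
        self.threshold_usd = threshold_usd
        self.cooldown = cooldown
        self.last_fired = None

    def evaluate(self, period_spend_usd, send, now=None):
        """Called after each trace ingestion. Returns True if an alert fired."""
        now = now or datetime.now(timezone.utc)
        if period_spend_usd < self.threshold_usd:
            return False
        if self.last_fired and now - self.last_fired < self.cooldown:
            return False  # still cooling down: no duplicate alert
        send(f"Spend ${period_spend_usd:.2f} exceeded ${self.threshold_usd:.2f}")
        self.last_fired = now
        return True

alerts = []
policy = CostThresholdPolicy(threshold_usd=100.0)
policy.evaluate(120.0, alerts.append)  # fires an alert
policy.evaluate(130.0, alerts.append)  # suppressed by the cooldown
print(len(alerts))  # 1
```

Summary policies differ only in their trigger: a background scheduler calls them on a fixed cadence instead of after each ingestion.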
### Notification Management

- Up to 10 channels and 25 policies per user
- Test notifications to verify channel connectivity before going live
- Enable/disable individual channels or policies without deleting them
- Webhook URLs are encrypted at rest and masked in the portal UI

## Pricing & Plans

AmberTrace offers two plans:

### Starter (Free)

- Price: $0 / month (free forever)
- Up to 50,000 traces per month
- 1 team member
- 7-day data retention
- All providers (OpenAI, Anthropic, Google)
- Full trace timeline view
- Token usage tracking
- Community support (GitHub Issues)

### Grow ($49/user/month)

- Price: $49 per user per month
- Up to 300,000 traces per month
- Up to 5 team members
- 30-day data retention
- All providers (OpenAI, Anthropic, Google)
- Full trace timeline view & token usage tracking
- Cost-per-session analytics
- Failure detection & anomaly flagging
- Alerting & notifications (Slack, Discord)
- Team dashboards & shared views
- API access for CI/CD integration
- Priority email support

### Enterprise

For more than 300K traces, self-hosting, SAML SSO, or custom SLAs, contact hello@ambertrace.dev.

All plans are month-to-month with no contracts. One trace equals one LLM API call. Streaming calls count as one trace.
## Configuration

Python `ambertrace.init()` parameters:

- `api_key` (or `AMBERTRACE_API_KEY` env var) — Required
- `environment` — Environment tag for filtering traces
- `debug` — Enable debug logging
- `timeout` — Network timeout in seconds (default: 5.0)
- `enabled` — Enable/disable tracing (default: True)

TypeScript `ambertrace.init()` options:

- `apiKey` (or `AMBERTRACE_API_KEY` env var) — Required
- `environment` — Environment tag
- `debug` — Enable debug logging
- `timeout` — HTTP timeout in ms (default: 5000)
- `enabled` — Enable/disable tracing (default: true)

## API Reference

- `ambertrace.init()` — Initialize SDK and start tracing
- `ambertrace.enable()` — Re-enable tracing after disable
- `ambertrace.disable()` — Stop tracing (LLM SDKs continue normally)
- `ambertrace.is_enabled()` / `ambertrace.isEnabled()` — Check if tracing is active
- `ambertrace.flush()` — Block until pending traces are sent
- `ambertrace.flush_async()` — Async version of flush
- `ambertrace.shutdown()` — Flush, disable, and clean up

## Links

- [Website](https://ambertrace.dev)
- [Documentation](https://docs.ambertrace.dev)
- [Getting Started Overview](https://docs.ambertrace.dev/guides/getting-started/overview): Introduction and setup guide for new users
- [Python SDK Guide](https://docs.ambertrace.dev/guides/getting-started/python): Python SDK installation and usage
- [TypeScript SDK Guide](https://docs.ambertrace.dev/guides/getting-started/typescript): TypeScript SDK installation and usage
- [Portal Guide](https://docs.ambertrace.dev/guides/getting-started/portal): Dashboard and portal features walkthrough
- [Sign up](https://ambertrace.dev/auth/signup)
- [Portal](https://app.ambertrace.dev)
- Contact: hello@ambertrace.dev

## Optional

- [Privacy Policy](https://ambertrace.dev/privacy)
- [Terms of Service](https://ambertrace.dev/terms)