API · Updated 2026-03-18

Cache, Cost, and Usage Endpoints

Inspect cache behavior, understand cost, and measure API activity with the operational endpoints most teams reach for after the first successful integration.

These endpoints are for teams that already have RetainDB running and now want to answer operational questions like:

Is the cache doing useful work?
What is this project costing?
Which request types are driving usage?

They are not part of the initial setup path, but they become useful quickly once traffic is real.

Endpoints covered here

GET /v1/cache/stats
GET /v1/cost/summary
GET /v1/cost/breakdown
GET /v1/cost/savings
GET /v1/usage
GET /v1/usage/timeseries

Cache stats

Use GET /v1/cache/stats to answer a simple question: is the cache helping or just existing?

bash

curl "https://api.retaindb.com/v1/cache/stats" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Example response:

json

{
  "cache_type": "redis",
  "hit_rate": 0.78,
  "total_requests": 19700,
  "hits": 15420,
  "misses": 4280,
  "size_bytes": 1843200,
  "keys_count": 913,
  "average_latency_ms": 5,
  "uptime_seconds": 86400
}

The fields that matter most at first:

hit_rate
total_requests
average_latency_ms
cache_type
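
One way to sanity-check these fields is to recompute the hit rate from the raw counters. The sketch below uses the example response above. The helper name `summarize_cache` and the `min_hit_rate` threshold are illustrative assumptions, not part of the RetainDB API.

```python
import json

# Example payload copied from the response shown above.
raw = """{
  "cache_type": "redis",
  "hit_rate": 0.78,
  "total_requests": 19700,
  "hits": 15420,
  "misses": 4280,
  "size_bytes": 1843200,
  "keys_count": 913,
  "average_latency_ms": 5,
  "uptime_seconds": 86400
}"""


def summarize_cache(stats: dict, min_hit_rate: float = 0.5) -> dict:
    """Recompute the hit rate from hits and misses and flag a weak cache.

    min_hit_rate is a hypothetical threshold chosen for illustration,
    not a RetainDB recommendation.
    """
    lookups = stats["hits"] + stats["misses"]
    computed = stats["hits"] / lookups if lookups else 0.0
    return {
        "computed_hit_rate": round(computed, 4),
        # The reported rate is rounded, so allow a small tolerance.
        "matches_reported": abs(computed - stats["hit_rate"]) < 0.01,
        "counters_consistent": lookups == stats["total_requests"],
        "healthy": computed >= min_hit_rate,
    }


cache_summary = summarize_cache(json.loads(raw))
print(cache_summary)
```

If the counters do not add up, or the computed rate drifts from the reported one, compare the numbers against traffic from the usage endpoints before drawing conclusions.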

Cost summary

Use GET /v1/cost/summary for the big picture.

Query parameters:

project: optional
start_date: optional, ISO datetime
end_date: optional, ISO datetime

bash

curl "https://api.retaindb.com/v1/cost/summary?project=retaindb-quickstart" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Example response:

json

{
  "org_id": "org_123",
  "project_id": "proj_456",
  "period": {
    "start": "2026-03-01T00:00:00.000Z",
    "end": "2026-03-09T00:00:00.000Z"
  },
  "total_cost_usd": 125.5,
  "total_requests": 125000,
  "cost_by_model": {
    "claude-sonnet": 74.2
  },
  "cost_by_task": {
    "query": 80.1,
    "ingest": 45.4
  },
  "average_cost_per_request": 0.001,
  "estimated_monthly_cost": 512.0
}
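
To see how the summary fields relate to each other, the sketch below derives each task's share of spend and the per-request average from the example response above. The helper name `task_shares` is an assumption made for illustration; only the field names come from the response.

```python
# Subset of the example cost summary shown above.
summary = {
    "total_cost_usd": 125.5,
    "total_requests": 125000,
    "cost_by_task": {"query": 80.1, "ingest": 45.4},
    "average_cost_per_request": 0.001,
}


def task_shares(cost_summary: dict) -> dict:
    """Return each task's fraction of total spend, rounded to 3 places."""
    total = cost_summary["total_cost_usd"]
    if total <= 0:
        return {}
    return {
        task: round(cost / total, 3)
        for task, cost in cost_summary["cost_by_task"].items()
    }


shares = task_shares(summary)
average = summary["total_cost_usd"] / summary["total_requests"]

print(shares)
print(round(average, 6))
```

In this example the task costs add up to the total, while cost_by_model lists a single model that does not, so the per-task view is the safer one to reconcile against total_cost_usd.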

Cost breakdown

Use GET /v1/cost/breakdown when the summary is not enough and you need to see where spend is concentrating.

Query parameters:

project: optional
group_by: optional, one of model, task, day, or hour
start_date: optional
end_date: optional

bash

curl "https://api.retaindb.com/v1/cost/breakdown?group_by=task" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Good first choice: group_by=task

That usually answers the operational question faster than grouping by model.
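
When these requests are scripted, building the query string programmatically avoids encoding mistakes in the date parameters. The following is a minimal sketch: the function name `cost_breakdown_url` is hypothetical, and it assumes that start_date and end_date take the same ISO datetime format as the cost summary endpoint.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.retaindb.com"
# Allowed group_by values as listed in the query parameters above.
GROUP_BY_VALUES = {"model", "task", "day", "hour"}


def cost_breakdown_url(group_by="task", project=None,
                       start_date=None, end_date=None):
    """Build a GET /v1/cost/breakdown URL, omitting unset parameters."""
    if group_by not in GROUP_BY_VALUES:
        raise ValueError(f"group_by must be one of {sorted(GROUP_BY_VALUES)}")
    params = {
        "group_by": group_by,
        "project": project,
        "start_date": start_date,  # assumed ISO datetime, as in /v1/cost/summary
        "end_date": end_date,
    }
    # urlencode percent-encodes reserved characters such as ':' in datetimes.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/v1/cost/breakdown?{query}"


default_url = cost_breakdown_url()
filtered_url = cost_breakdown_url(
    group_by="day",
    project="retaindb-quickstart",
    start_date="2026-03-01T00:00:00.000Z",
)
print(default_url)
print(filtered_url)
```

Send the resulting URL with the same Authorization: Bearer header used in the curl examples.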

Cost savings

Use GET /v1/cost/savings when you want RetainDB's optimization story, not just raw spend.

bash

curl "https://api.retaindb.com/v1/cost/savings" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

The response compares actual cost against an “always use the expensive path” baseline.

Look for:

actual_cost_usd
opus_only_cost_usd
savings_usd
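
A relative figure is often easier to track than the raw dollar amounts. The sketch below turns the three fields into a savings ratio. The sample values are invented for illustration, and the relationship savings_usd = opus_only_cost_usd - actual_cost_usd is an assumption based on the field names, not something this page confirms.

```python
def savings_ratio(actual_cost_usd: float, opus_only_cost_usd: float) -> float:
    """Fraction of the baseline cost that was avoided (0.0 if no baseline)."""
    if opus_only_cost_usd <= 0:
        return 0.0
    return (opus_only_cost_usd - actual_cost_usd) / opus_only_cost_usd


# Hypothetical values; a real response from /v1/cost/savings will differ.
savings = {
    "actual_cost_usd": 125.5,
    "opus_only_cost_usd": 410.0,
    "savings_usd": 284.5,
}

ratio = savings_ratio(savings["actual_cost_usd"], savings["opus_only_cost_usd"])

# Assumed consistency check: savings_usd should equal baseline minus actual.
expected_savings = savings["opus_only_cost_usd"] - savings["actual_cost_usd"]
fields_agree = abs(expected_savings - savings["savings_usd"]) < 0.01

print(round(ratio, 3), fields_agree)
```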

Usage summary

Use GET /v1/usage for aggregate usage over the last n days, where n is set by the days query parameter.

bash

curl "https://api.retaindb.com/v1/usage?days=30" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

The response groups usage by event type and includes:

request count
total tokens
total embedding tokens
average latency

Usage timeseries

Use GET /v1/usage/timeseries when you need a trend line instead of a single rollup.

Query parameters:

days: optional, defaults to 7
project_id: optional

bash

curl "https://api.retaindb.com/v1/usage/timeseries?days=7" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

This is the better endpoint for dashboards and regression checks.
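
A regression check over timeseries data could look like the sketch below. This page does not show the timeseries response shape, so the code works on a plain list of daily request counts that you would extract from the response yourself. The function name `is_regression`, the 1.5x threshold, and the sample numbers are all illustrative assumptions.

```python
from statistics import mean


def is_regression(daily_requests: list, threshold: float = 1.5) -> bool:
    """Flag the latest day if it exceeds threshold x the mean of earlier days."""
    if len(daily_requests) < 2:
        return False  # not enough history to compare against
    *history, latest = daily_requests
    baseline = mean(history)
    return baseline > 0 and latest > threshold * baseline


# Hypothetical daily request counts for a 7-day window (days=7).
steady_week = [1000, 1100, 950, 1050, 1000, 1020, 1080]
spiky_week = [1000, 1100, 950, 1050, 1000, 1020, 2400]

print(is_regression(steady_week))
print(is_regression(spiky_week))
```

The same comparison works for token counts or latency. Pick the threshold from your own traffic history rather than from this example.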

Good defaults

start with the org-wide summary before filtering by project
use group_by=task for the first cost investigation
use days=7 or days=30 before reaching for custom ranges
check cache hit rate and usage trends together; one without the other is easy to misread

Common mistakes

Treating usage as billing

Usage and cost are related, but they are separate endpoint families. Use the usage endpoints to measure volume and the cost endpoints to measure spend.

Filtering too early

If you jump straight to a project filter, you can miss whether the problem is systemic across the org.

Debugging latency from cost pages

These endpoints help with operational visibility, not request-level trace debugging. For single-request behavior, use endpoint-specific responses and trace IDs.

Next step

If you want the dashboard view of similar data, go to usage analytics. If you are tuning request behavior rather than monitoring spend, continue to latency accounting.
