Cache, Cost, and Usage Endpoints
Inspect cache behavior, understand cost, and measure API activity with the operational endpoints most teams reach for after the first successful integration.
These endpoints are for teams that already have RetainDB running and now want to answer operational questions like:
- Is cache doing useful work?
- What is this project costing?
- Which request types are driving usage?
They are not part of the initial setup path, but they become useful quickly once traffic is real.
Endpoints covered here
- GET /v1/cache/stats
- GET /v1/cost/summary
- GET /v1/cost/breakdown
- GET /v1/cost/savings
- GET /v1/usage
- GET /v1/usage/timeseries
Cache stats
Use GET /v1/cache/stats to answer a simple question: is cache helping or just existing?
curl "https://api.retaindb.com/v1/cache/stats" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
Example response:
{
"cache_type": "redis",
"hit_rate": 0.78,
"total_requests": 19700,
"hits": 15420,
"misses": 4280,
"size_bytes": 1843200,
"keys_count": 913,
"average_latency_ms": 5,
"uptime_seconds": 86400
}
The fields that matter most at first:
- hit_rate
- total_requests
- average_latency_ms
- cache_type
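One way to sanity-check these fields is to recompute the hit rate from the raw hits and misses counters. The sketch below assumes the response has already been parsed into a Python dict with the field names shown in the example. The helper name summarize_cache_stats and the 0.5 threshold are illustrative choices, not part of the RetainDB API.

```python
def summarize_cache_stats(stats: dict, min_hit_rate: float = 0.5) -> dict:
    """Recompute the hit rate from raw counters and flag a weak cache.

    min_hit_rate is an arbitrary illustrative threshold, not a RetainDB default.
    """
    hits = stats["hits"]
    misses = stats["misses"]
    lookups = hits + misses
    # Guard against an empty window so we never divide by zero.
    computed = hits / lookups if lookups else 0.0
    return {
        "computed_hit_rate": round(computed, 4),
        "reported_hit_rate": stats["hit_rate"],
        "is_helping": computed >= min_hit_rate,
    }


# Values copied from the example response above.
example = {
    "hit_rate": 0.78,
    "total_requests": 19700,
    "hits": 15420,
    "misses": 4280,
}
summary = summarize_cache_stats(example)
print(summary)
```

If the computed and reported rates diverge noticeably, the counters and the rate may cover different time windows, which is worth confirming before drawing conclusions.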
Cost summary
Use GET /v1/cost/summary for the big picture.
Query parameters:
- project: optional
- start_date: optional ISO datetime
- end_date: optional ISO datetime
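As a rough sketch, the parameters above can be assembled into a request URL with the Python standard library. The helper name cost_summary_url is hypothetical. The exact datetime format the API accepts is not stated here, so this example assumes UTC timestamps with a trailing Z, mirroring the period values in the example response.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "https://api.retaindb.com"


def _iso_utc(value: datetime) -> str:
    # Assumption: the API accepts UTC ISO datetimes with a "Z" suffix.
    # Pass timezone-aware datetimes; naive ones are treated as local time.
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def cost_summary_url(project=None, start_date=None, end_date=None) -> str:
    """Build a GET /v1/cost/summary URL, omitting unset optional parameters."""
    params = {}
    if project is not None:
        params["project"] = project
    if start_date is not None:
        params["start_date"] = _iso_utc(start_date)
    if end_date is not None:
        params["end_date"] = _iso_utc(end_date)
    query = urlencode(params)
    return f"{BASE_URL}/v1/cost/summary" + (f"?{query}" if query else "")


url = cost_summary_url(
    project="retaindb-quickstart",
    start_date=datetime(2026, 3, 1, tzinfo=timezone.utc),
)
print(url)
```

Leaving all three arguments unset yields the org-wide URL with no query string.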
curl "https://api.retaindb.com/v1/cost/summary?project=retaindb-quickstart" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
Example response:
{
"org_id": "org_123",
"project_id": "proj_456",
"period": {
"start": "2026-03-01T00:00:00.000Z",
"end": "2026-03-09T00:00:00.000Z"
},
"total_cost_usd": 125.5,
"total_requests": 125000,
"cost_by_model": {
"claude-sonnet": 74.2
},
"cost_by_task": {
"query": 80.1,
"ingest": 45.4
},
"average_cost_per_request": 0.001,
"estimated_monthly_cost": 512.0
}
Cost breakdown
Use GET /v1/cost/breakdown when the summary is not enough and you need to see where spend is concentrating.
Query parameters:
- project: optional
- group_by: optional; one of model, task, day, or hour
- start_date: optional
- end_date: optional
curl "https://api.retaindb.com/v1/cost/breakdown?group_by=task" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
Good first choice: group_by=task
That usually answers the operational question faster than grouping by model.
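The shape of the breakdown response is not shown on this page, so the sketch below uses the cost_by_task map from the cost summary example as a stand-in to illustrate how to spot where spend is concentrating. The helper name rank_spend is hypothetical.

```python
def rank_spend(costs: dict) -> list:
    """Return (name, cost_usd, share_of_total) tuples, largest cost first."""
    total = sum(costs.values())
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, cost, round(cost / total, 3) if total else 0.0)
        for name, cost in ranked
    ]


# Values copied from cost_by_task in the cost summary example.
cost_by_task = {"query": 80.1, "ingest": 45.4}
ranking = rank_spend(cost_by_task)
for name, cost, share in ranking:
    print(f"{name}: ${cost:.2f} ({share:.1%} of spend)")
```

The same ranking works for any name-to-cost mapping, for example a per-model map, as long as the values are in the same currency.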
Cost savings
Use GET /v1/cost/savings when you want to see how much RetainDB's optimizations saved, not just raw spend.
curl "https://api.retaindb.com/v1/cost/savings" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
The response compares actual cost against an “always use the expensive path” baseline.
Look for:
- actual_cost_usd
- opus_only_cost_usd
- savings_usd
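To put these three fields in proportion, a savings ratio against the baseline is often easier to read than the raw dollar figure. The sketch below assumes savings_usd equals opus_only_cost_usd minus actual_cost_usd, which this page implies but does not state outright. The numbers are invented for illustration and the helper name savings_ratio is hypothetical.

```python
def savings_ratio(actual_cost_usd: float, opus_only_cost_usd: float) -> float:
    """Fraction of the baseline cost that was avoided (0.0 when no baseline)."""
    if opus_only_cost_usd <= 0:
        return 0.0
    return (opus_only_cost_usd - actual_cost_usd) / opus_only_cost_usd


# Hypothetical response values, for illustration only.
response = {
    "actual_cost_usd": 125.5,
    "opus_only_cost_usd": 400.0,
    "savings_usd": 274.5,
}
ratio = savings_ratio(response["actual_cost_usd"], response["opus_only_cost_usd"])
print(f"Saved {ratio:.1%} versus the baseline")
```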
Usage summary
Use GET /v1/usage for aggregate usage over the last n days.
curl "https://api.retaindb.com/v1/usage?days=30" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
The response groups usage by event type and includes:
- request count
- total tokens
- total embedding tokens
- average latency
Usage timeseries
Use GET /v1/usage/timeseries when you need a trend line instead of a single rollup.
Query parameters:
- days: optional, defaults to 7
- project_id: optional
curl "https://api.retaindb.com/v1/usage/timeseries?days=7" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
This is the better endpoint for dashboards and regression checks.
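A regression check over timeseries data can be as simple as comparing the latest point with the average of the earlier ones. The timeseries response shape is not shown on this page, so the sketch below assumes you have already extracted one number per day (for example, request count or average latency), ordered oldest first. The helper name detect_regression and the 25% threshold are illustrative assumptions.

```python
from statistics import mean


def detect_regression(values: list, threshold: float = 0.25):
    """Compare the latest value with the mean of the earlier values.

    Returns None when there is not enough data or the baseline is zero.
    threshold is an arbitrary illustrative cutoff for relative change.
    """
    if len(values) < 2:
        return None
    baseline = mean(values[:-1])
    if baseline == 0:
        return None
    change = (values[-1] - baseline) / baseline
    return {
        "baseline": baseline,
        "latest": values[-1],
        "change": change,
        "flagged": abs(change) > threshold,
    }


# Hypothetical daily values for a 7-day window, oldest first.
daily_values = [100, 110, 90, 100, 105, 95, 150]
result = detect_regression(daily_values)
print(result)
```

Using the absolute change flags sudden drops as well as spikes, which matters when a fall in traffic is as suspicious as a rise.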
Good defaults
- start with org-wide summary before filtering by project
- use group_by=task for the first cost investigation
- use days=7 or days=30 before reaching for custom ranges
- check cache hit rate and usage trends together; one without the other is easy to misread
Common mistakes
Treating usage as billing
Usage and cost are related, but they are not the same endpoint family. Start with usage for volume and cost for spend.
Filtering too early
If you jump straight to a project filter, you can miss whether the problem is systemic across the org.
Debugging latency from cost pages
These pages help with operational visibility, not request-level trace debugging. For single-request behavior, use endpoint-specific responses and trace IDs.
Next step
If you want the dashboard view of similar data, go to usage analytics. If you are tuning request behavior rather than monitoring spend, continue to latency accounting.