Token Usage

If you are an administrator responsible for monitoring AI spend, this page is for you. The Token Usage page (/token-usage) shows how many tokens your LLM integrations consume, broken down by model, day, and month, so you can identify heavy usage patterns, set budgets, and avoid unexpected bills. It is the primary tool for tracking AI costs.

Availability

This page only appears in the sidebar when at least one LLM instance is enabled. If no LLM is configured, the Generative AI section will not show Token Usage.


Period Selector

At the top of the page, a month navigator lets you move forward and backward through calendar months. All cards and tables update instantly when the period changes.

◀  January 2025  ▶

Summary Cards

Four cards provide an at-a-glance snapshot of the selected month:

| Card | Description |
| --- | --- |
| Total Requests | Number of LLM API calls made during the period |
| Input Tokens | Total tokens sent to the model (prompts, context, system messages) |
| Output Tokens | Total tokens received from the model (generated responses) |
| Total Tokens | Sum of input + output tokens |

Values are displayed in a human-readable format — for example 1.2M, 45K, or 8.3K — rather than raw numbers.
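As a rough sketch of how this kind of compact formatting works (the `humanize` helper below is illustrative, not part of Turing ES):

```python
def humanize(n: int) -> str:
    """Render a token count compactly, e.g. 1_200_000 -> '1.2M'."""
    for threshold, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if n >= threshold:
            value = n / threshold
            # One decimal place, but drop a trailing '.0' ('45K', not '45.0K').
            text = f"{value:.1f}".rstrip("0").rstrip(".")
            return f"{text}{suffix}"
    return str(n)

print(humanize(1_200_000))  # -> 1.2M
print(humanize(45_000))     # -> 45K
print(humanize(8_300))      # -> 8.3K
```

Counts below one thousand are shown as-is.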


Summary by Model

A monthly aggregation table grouping consumption by LLM instance:

| Column | Description |
| --- | --- |
| Instance | The LLM instance name as configured in Administration |
| Vendor | Provider (e.g., OpenAI, Anthropic, Google, Azure, Ollama) |
| Model | Specific model identifier (e.g., gpt-4o, claude-3-5-sonnet) |
| Requests | Total requests to this instance in the period |
| Input | Total input tokens for this instance |
| Output | Total output tokens for this instance |
| Total | Combined token count for this instance |

Use this table to compare token consumption across different providers and models, and to identify which instances account for the largest share of usage.
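Conceptually, the table is a group-by over per-call usage records. A minimal sketch, assuming records shaped like the columns above (the field names and sample data are illustrative, not the actual schema):

```python
from collections import defaultdict

# Hypothetical per-call usage records; field names are illustrative.
records = [
    {"instance": "main-gpt", "vendor": "OpenAI", "model": "gpt-4o", "input": 1200, "output": 300},
    {"instance": "main-gpt", "vendor": "OpenAI", "model": "gpt-4o", "input": 800, "output": 200},
    {"instance": "helper", "vendor": "Anthropic", "model": "claude-3-5-sonnet", "input": 500, "output": 250},
]

# Aggregate requests and token counts per (instance, vendor, model).
summary = defaultdict(lambda: {"requests": 0, "input": 0, "output": 0})
for r in records:
    row = summary[(r["instance"], r["vendor"], r["model"])]
    row["requests"] += 1
    row["input"] += r["input"]
    row["output"] += r["output"]

for (instance, vendor, model), row in summary.items():
    total = row["input"] + row["output"]
    print(instance, vendor, model, row["requests"], row["input"], row["output"], total)
```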


Daily Breakdown

A day-by-day table showing consumption per model across the selected month:

| Column | Description |
| --- | --- |
| Date | Calendar day |
| Instance | LLM instance name |
| Vendor | Provider |
| Model | Model identifier |
| Input Tokens | Tokens sent that day |
| Output Tokens | Tokens received that day |
| Total Tokens | Combined token count for that day |
| Requests | Number of requests on that day |

Days with no LLM activity do not appear in the table. This breakdown helps spot daily spikes — for example, a batch re-indexing job or a high-traffic event.
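One simple way to spot such spikes is to flag days well above the period's average. A sketch with invented figures (the data and the 2x threshold are illustrative, not a Turing ES feature):

```python
from statistics import mean

# Hypothetical daily total-token figures for one model (illustrative data).
daily_totals = {
    "2025-01-06": 40_000,
    "2025-01-07": 42_000,
    "2025-01-08": 310_000,  # e.g. a batch re-indexing job
    "2025-01-09": 39_000,
}

baseline = mean(daily_totals.values())
# Flag days that consumed more than double the average for the period.
spikes = [day for day, total in daily_totals.items() if total > 2 * baseline]
print(spikes)  # -> ['2025-01-08']
```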


How Token Recording Works

Turing ES records token usage automatically on every LLM call — no extra configuration is needed.

What gets recorded on each call:

| Field | Description |
| --- | --- |
| Instance | Which LLM instance handled the request |
| Vendor | Provider name |
| Model | Model identifier from the response metadata |
| Username | The authenticated user who triggered the call |
| Input Tokens | Extracted from the LLM response metadata |
| Output Tokens | Extracted from the LLM response metadata |
| Total Tokens | Input + output |
| Timestamp | Date and time of the request |
Note: Responses where all token counts are zero are not recorded. This filters out failed or incomplete LLM calls that would skew usage statistics.
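The recording logic amounts to a guard before persisting each record. A minimal sketch, where `record_usage` and the record shape are hypothetical, not Turing ES internals:

```python
def record_usage(store: list, instance: str, vendor: str, model: str,
                 username: str, input_tokens: int, output_tokens: int,
                 timestamp: str) -> bool:
    """Append a usage record unless every token count is zero."""
    total = input_tokens + output_tokens
    if total == 0:
        # Skip failed or incomplete calls so they don't skew statistics.
        return False
    store.append({
        "instance": instance, "vendor": vendor, "model": model,
        "username": username, "input_tokens": input_tokens,
        "output_tokens": output_tokens, "total_tokens": total,
        "timestamp": timestamp,
    })
    return True

store = []
record_usage(store, "main-gpt", "OpenAI", "gpt-4o", "alice", 120, 35, "2025-01-15T10:00:00Z")
record_usage(store, "main-gpt", "OpenAI", "gpt-4o", "bob", 0, 0, "2025-01-15T10:01:00Z")
print(len(store))  # -> 1 (the zero-token call was dropped)
```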


API Endpoint

Token usage data is also available via the REST API. See REST API Reference → Token Usage API for endpoint details and examples.
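A hedged sketch of calling such an endpoint from Python. The route, query parameter, and authentication header here are assumptions for illustration only; consult the Token Usage API reference for the actual contract:

```python
import urllib.request

# Hypothetical base URL and endpoint path; check the Token Usage API
# reference for the real route and authentication scheme.
BASE_URL = "https://turing.example.com"
req = urllib.request.Request(
    f"{BASE_URL}/api/token-usage?period=2025-01",
    headers={"Authorization": "Bearer <your-api-key>", "Accept": "application/json"},
)
# urllib.request.urlopen(req) would then return the monthly usage as JSON.
print(req.full_url)
```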