Niccanor Dhas edited this page Feb 22, 2026 · 1 revision

# Analytics

The Analytics section of the dashboard provides aggregated views into your AI infrastructure across multiple dimensions. All analytics are scoped to your selected organization and project and can be filtered by time range.


LLM Analytics

Analytics → LLM gives you visibility into how your language models are being used and what they cost.

Token Usage Over Time

A time-series chart showing input tokens, output tokens, and total tokens broken down by hour, day, or month. Use this to understand usage patterns and plan capacity.
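
The bucketing behind such a chart can be sketched in plain Python. The span record shape here (`timestamp`, `input_tokens`, `output_tokens` keys) is a hypothetical schema for illustration, not the platform's actual one:

```python
from collections import defaultdict
from datetime import datetime

def bucket_token_usage(spans, granularity="hour"):
    """Aggregate token counts into hour/day/month buckets.

    `spans` is a list of dicts with assumed keys: "timestamp"
    (ISO 8601 string), "input_tokens", and "output_tokens".
    """
    fmt = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "month": "%Y-%m"}[granularity]
    buckets = defaultdict(lambda: {"input": 0, "output": 0, "total": 0})
    for span in spans:
        key = datetime.fromisoformat(span["timestamp"]).strftime(fmt)
        buckets[key]["input"] += span["input_tokens"]
        buckets[key]["output"] += span["output_tokens"]
        buckets[key]["total"] += span["input_tokens"] + span["output_tokens"]
    return dict(buckets)

usage = bucket_token_usage(
    [
        {"timestamp": "2026-02-22T10:15:00", "input_tokens": 120, "output_tokens": 80},
        {"timestamp": "2026-02-22T10:45:00", "input_tokens": 60, "output_tokens": 40},
        {"timestamp": "2026-02-22T11:05:00", "input_tokens": 30, "output_tokens": 10},
    ],
    granularity="hour",
)
# The two 10:xx spans land in one bucket; the 11:05 span in another.
```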

Cost by Application

A breakdown of estimated spend by `application_name`, helping you identify which parts of your system are the most expensive.
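
Conceptually, the breakdown multiplies token counts by per-token prices and sums per application. The price values below are hypothetical placeholders, not the dashboard's actual pricing data:

```python
from collections import defaultdict

# Hypothetical per-1K-token prices; real figures come from your provider.
PRICES = {"gpt-4o": {"input": 0.0025, "output": 0.01}}

def cost_by_application(spans):
    """Sum estimated spend per application_name from span token counts."""
    totals = defaultdict(float)
    for s in spans:
        price = PRICES[s["model"]]
        totals[s["application_name"]] += (
            s["input_tokens"] / 1000 * price["input"]
            + s["output_tokens"] / 1000 * price["output"]
        )
    return dict(totals)

costs = cost_by_application([
    {"application_name": "chatbot", "model": "gpt-4o",
     "input_tokens": 1000, "output_tokens": 500},
    {"application_name": "search", "model": "gpt-4o",
     "input_tokens": 2000, "output_tokens": 100},
])
```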

Top Models

A ranked list of the models used most frequently. Shows request count, average tokens, and estimated cost per model.

Model Usage Over Time

Time-series data for each model's usage — useful for tracking migrations (e.g., from GPT-3.5 to GPT-4o) or monitoring adoption of new models.

Requests Over Time

Total request volume over time, filterable by operation type (chat, embeddings, image, audio, agent, tool).


GPU Analytics

Analytics → GPU visualizes GPU resource consumption from your instrumented workloads.

Metrics shown:

  • GPU utilization % over time
  • Memory used vs. total (per GPU)
  • Encoder/decoder utilization
  • Temperature and power draw
  • Per-GPU breakdown for multi-GPU systems

Requires `collect_gpu_stats=True` in your SDK `init()` call. See [GPU Monitoring](GPU-Monitoring).
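
A minimal configuration sketch. The module name `your_sdk` and the other keyword arguments are placeholders; only the `collect_gpu_stats=True` flag in `init()` is what this page requires:

```python
import your_sdk  # placeholder; substitute your instrumentation SDK's module

your_sdk.init(
    application_name="inference-worker",  # appears in per-application breakdowns
    environment="production",
    collect_gpu_stats=True,               # enables the GPU metrics listed above
)
```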


Vector DB Analytics

Analytics → Vector shows performance metrics from instrumented vector database operations.

Views available:

  • Operations over time (inserts, queries, deletes)
  • Latency percentiles
  • Breakdown by collection/index
  • Breakdown by system (Chroma, Pinecone, Qdrant, etc.)
  • Breakdown by environment and application
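
Latency percentiles like those above can be computed with the simple nearest-rank estimator, sketched here (production dashboards may use interpolated or streaming estimators instead):

```python
import math

def percentile(latencies_ms, q):
    """Nearest-rank percentile: the ceil(q/100 * N)-th smallest value."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

samples = [12, 15, 11, 240, 13, 14, 16, 18, 13, 12]
p50 = percentile(samples, 50)  # median of the sorted samples
p95 = percentile(samples, 95)  # captures the slow outlier
```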

Models Analytics

Analytics → Models provides a view into the AI models you've configured in Settings → Models — their usage, cost, and performance as used in OpenGround comparisons and AI Arbiter evaluations.


Guardrail Analytics

Analytics → Guardrails shows the effectiveness and activity of your guardrails:

  • Detection rate over time
  • Breakdown by guard type (Prompt Injection, Sensitive Topics, etc.)
  • Breakdown by classification category
  • Per-application guardrail metrics
  • Flagged vs. passed ratio
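
The flagged-vs-passed ratio and detection rate reduce to a simple count over guardrail events. The boolean `flagged` key here is an assumed event shape for illustration:

```python
def guardrail_summary(events):
    """Compute flagged/passed counts and the detection rate.

    `events` is a list of dicts with an assumed boolean "flagged" key.
    """
    flagged = sum(1 for e in events if e["flagged"])
    total = len(events)
    return {
        "flagged": flagged,
        "passed": total - flagged,
        "detection_rate": flagged / total if total else 0.0,
    }

summary = guardrail_summary(
    [{"flagged": True}, {"flagged": False}, {"flagged": False}, {"flagged": True}]
)
```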

Scores Analytics

Analytics → Scores shows quality evaluation scores over time:

  • Average score per dataset or experiment run
  • Score distribution
  • Comparison across experiment runs
  • AI Arbiter evaluation trends

Filtering and Time Ranges

All analytics views support filtering by:

| Filter | Description |
| --- | --- |
| Time range | Preset (last hour, 24h, 7d, 30d) or custom date range |
| Application name | Filter by your `application_name` value |
| Environment | Filter by environment (production, staging, etc.) |
| Model | Filter to a specific model |
| Provider | Filter by LLM provider |
| Operation type | chat, embeddings, image, audio, vectordb, agent |
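
Combining these filters amounts to an exact-match conjunction over span attributes. A sketch, using assumed span keys that mirror the filter names:

```python
def apply_filters(spans, **filters):
    """Keep only spans whose attributes match every given filter value."""
    return [
        s for s in spans
        if all(s.get(key) == value for key, value in filters.items())
    ]

spans = [
    {"model": "gpt-4o", "environment": "production", "operation_type": "chat"},
    {"model": "gpt-4o", "environment": "staging", "operation_type": "chat"},
    {"model": "text-embedding-3-small", "environment": "production",
     "operation_type": "embeddings"},
]
prod_chat = apply_filters(spans, environment="production", operation_type="chat")
```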

Data Aggregation

Analytics data is aggregated server-side from the raw traces and metrics stored in MongoDB. The aggregation pipeline groups spans by time buckets, model, application, and environment.
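
A sketch of the kind of aggregation pipeline described above, expressed as a Python list of stages. The field names (`timestamp`, `model`, `applicationName`, `totalTokens`) are assumptions about the span schema, not the platform's actual one; `$dateTrunc` requires MongoDB 5.0+:

```python
def usage_pipeline(org_id, project_id):
    """Build a pipeline grouping spans by hour bucket, model, app, and env."""
    return [
        {"$match": {"orgId": org_id, "proId": project_id}},
        {
            "$group": {
                "_id": {
                    "bucket": {"$dateTrunc": {"date": "$timestamp", "unit": "hour"}},
                    "model": "$model",
                    "application": "$applicationName",
                    "environment": "$environment",
                },
                "requests": {"$sum": 1},
                "totalTokens": {"$sum": "$totalTokens"},
            }
        },
        {"$sort": {"_id.bucket": 1}},
    ]

pipeline = usage_pipeline("org-1", "proj-1")
```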

For high-traffic deployments, consider:

  • Adding MongoDB indexes on timestamp, orgId, proId, and spanAttributes
  • Setting up TTL indexes to expire old raw traces while preserving aggregated analytics
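
As a sketch of what those indexes might look like with pymongo, where the database/collection names and the 30-day retention window are assumptions based on the bullets above:

```python
from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")
spans = client["observability"]["spans"]  # assumed database/collection names

# Compound index matching the common filter pattern: org + project + time.
spans.create_index(
    [("orgId", ASCENDING), ("proId", ASCENDING), ("timestamp", ASCENDING)]
)

# TTL index: expire raw traces 30 days after their timestamp.
spans.create_index([("timestamp", ASCENDING)], expireAfterSeconds=30 * 24 * 3600)
```

Note that TTL expiry removes the raw documents, so any aggregated analytics you want to keep beyond the window should be materialized into a separate collection first.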