378 changes: 378 additions & 0 deletions examples/Cost_Estimation_And_Health_Monitoring.ipynb
@@ -0,0 +1,378 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "LWWFilxLgFvH"
},
"source": [
"##### Copyright 2025 Google LLC."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "zKZnDihChHk4"
},
"outputs": [],
"source": [
"# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XS_CBAC0gNsd"
},
"source": [
"# Cost Estimation and Health Monitoring with Gemini\n",
"\n",
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Cost_Estimation_And_Health_Monitoring.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" height=30/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Vk5Chb-XgPpf"
},
"source": [
"## Overview\n",
"\n",
"Cost observability is key for scaling Gemini API applications. Without tracking token usage and costs, you can't optimize your spending or identify expensive operations.\n",
"\n",
"This notebook demonstrates how to build an observability layer for Gemini API applications. You will learn how to:\n",
"\n",
"1. Extract token usage metadata from API responses\n",
"2. Calculate real-time USD costs based on model pricing\n",
"3. Perform health checks to verify API availability and quota\n",
"\n",
"By the end of this notebook, you'll have a reusable class that you can integrate into any Gemini API application to monitor costs and health."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zcEAmXVLgRNc"
},
"source": [
"## Setup\n",
"\n",
"First, install the Gemini API Python library."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "qR32cW4ugSyv"
},
"outputs": [],
"source": [
"%pip install -U -q \"google-genai>=1.0.0\"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "G0i9yHPogUa6"
},
"source": [
"### Grab an API Key\n",
"\n",
"Before you can use the Gemini API, you must first obtain an API key. If you don't already have one, create a key with one click in Google AI Studio.\n",
"\n",
"<a class=\"button button-primary\" href=\"https://aistudio.google.com/app/apikey\" target=\"_blank\" rel=\"noopener noreferrer\">Get an API key</a>\n",
"\n",
"In Colab, add the key to the secrets manager under the \"\ud83d\udd11\" in the left panel. Give it the name `GOOGLE_API_KEY`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "vnfyGQ_IgV3W"
},
"outputs": [],
"source": [
"from google import genai\n",
"from google.genai import types\n",
"from typing import Dict\n",
"from google.colab import userdata\n",
"\n",
"# Get API key from Colab secrets\n",
"api_key = userdata.get('GOOGLE_API_KEY')\n",
"client = genai.Client(api_key=api_key)\n",
"print(\"\u2705 API key configured successfully!\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pinhZ6S8gXz2"
},
"source": [
"## Cost Estimator Class\n",
"\n",
"The `GeminiObservability` class provides a simple way to track costs and monitor the health of your Gemini API usage. It looks up per-model prices in a dictionary that you maintain, so each request is costed at the rate for the model that served it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "29M5u1SuuTW9"
},
"outputs": [],
"source": [
"# @title\n",
"from google.genai import errors\n",
"\n",
"\n",
"class GeminiObservability:\n",
"    \"\"\"A class to monitor costs and health of Gemini API usage.\"\"\"\n",
"\n",
"    def __init__(self, client):\n",
"        \"\"\"\n",
"        Initialize the observability class.\n",
"\n",
"        Args:\n",
"            client: The genai.Client instance\n",
"        \"\"\"\n",
"        self.client = client\n",
"        # Example prices in USD per 1 million tokens (2025 rates, may be outdated)\n",
"        # Update these prices based on current pricing at https://ai.google.dev/pricing\n",
"        self.prices = {\n",
"            \"gemini-2.5-flash\": {\"input\": 0.075, \"output\": 0.30},\n",
"            \"gemini-2.5-pro\": {\"input\": 3.50, \"output\": 10.50},\n",
"            \"gemini-2.5-flash-lite\": {\"input\": 0.0375, \"output\": 0.15},\n",
"            \"gemini-3-flash-preview\": {\"input\": 0.10, \"output\": 0.40},\n",
"            \"gemini-3-pro-preview\": {\"input\": 3.50, \"output\": 10.50},\n",
"        }\n",
"\n",
"    def estimate_cost(self, usage_metadata: types.GenerateContentResponseUsageMetadata, model_name: str) -> float:\n",
"        \"\"\"\n",
"        Calculate the cost in USD based on token usage.\n",
"\n",
"        Args:\n",
"            usage_metadata: The usage_metadata object from a Gemini API response\n",
"            model_name: The name of the model used (e.g., \"gemini-2.5-flash\")\n",
"\n",
"        Returns:\n",
"            The estimated cost in USD\n",
"        \"\"\"\n",
"        if model_name not in self.prices:\n",
"            print(f\"Warning: No pricing data for model {model_name}. Returning 0.0\")\n",
"            return 0.0\n",
"\n",
"        # Token counts can be None in the response, so fall back to 0\n",
"        input_tokens = usage_metadata.prompt_token_count or 0\n",
"        # Thinking tokens, when reported, are billed at the output rate (see the pricing page)\n",
"        output_tokens = (usage_metadata.candidates_token_count or 0) + (\n",
"            getattr(usage_metadata, \"thoughts_token_count\", None) or 0\n",
"        )\n",
"\n",
"        input_cost = (input_tokens / 1_000_000) * self.prices[model_name][\"input\"]\n",
"        output_cost = (output_tokens / 1_000_000) * self.prices[model_name][\"output\"]\n",
"\n",
"        return input_cost + output_cost\n",
"\n",
"    def check_health(self, model_name: str = \"gemini-2.5-flash\") -> bool:\n",
"        \"\"\"\n",
"        Perform a simple health check by making a minimal API call.\n",
"\n",
"        Args:\n",
"            model_name: The model to test (default: \"gemini-2.5-flash\")\n",
"\n",
"        Returns:\n",
"            True if the API is healthy, False otherwise\n",
"        \"\"\"\n",
"        try:\n",
"            self.client.models.generate_content(\n",
"                model=model_name,\n",
"                contents=\"ping\",\n",
"                config=types.GenerateContentConfig(max_output_tokens=1),\n",
"            )\n",
"            return True\n",
"        except errors.APIError as e:\n",
"            # The google-genai SDK reports HTTP failures as APIError; 429 means quota or rate limit\n",
"            if e.code == 429:\n",
"                print(f\"Health Check Failed: Quota limit reached. {e}\")\n",
"            else:\n",
"                print(f\"Health Check Failed: {e}\")\n",
"            return False\n",
"        except Exception as e:\n",
"            print(f\"Health Check Failed: {e}\")\n",
"            return False\n",
"\n",
"    def get_usage_summary(self, usage_metadata: types.GenerateContentResponseUsageMetadata, model_name: str) -> Dict:\n",
"        \"\"\"\n",
"        Get a summary of token usage and cost.\n",
"\n",
"        Args:\n",
"            usage_metadata: The usage_metadata object from a Gemini API response\n",
"            model_name: The name of the model used\n",
"\n",
"        Returns:\n",
"            A dictionary with usage details and cost\n",
"        \"\"\"\n",
"        cost = self.estimate_cost(usage_metadata, model_name)\n",
"        return {\n",
"            \"model\": model_name,\n",
"            \"input_tokens\": usage_metadata.prompt_token_count,\n",
"            \"output_tokens\": usage_metadata.candidates_token_count,\n",
"            \"thinking_tokens\": getattr(usage_metadata, \"thoughts_token_count\", None),\n",
"            \"total_tokens\": usage_metadata.total_token_count,\n",
"            \"estimated_cost_usd\": round(cost, 6)\n",
"        }"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PG6VQiyigeIQ"
},
"source": [
"## Example Usage\n",
"\n",
"You can now see the observability class in action. First, create an instance of the class.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create an instance of the observability class\n",
"observability = GeminiObservability(client)\n"
]
},
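{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Health Check Example\n",
"\n",
"Before sending real requests, verify that the API is reachable and that quota is available for the selected model.\n"
]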
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"MODEL_ID = \"gemini-2.5-flash\" # @param [\"gemini-2.5-flash-lite\", \"gemini-2.5-flash\", \"gemini-2.5-pro\", \"gemini-3-flash-preview\", \"gemini-3-pro-preview\"] {\"allow-input\": true, \"isTemplate\": true}\n",
"\n",
"# Check if the API is healthy\n",
"is_healthy = observability.check_health(MODEL_ID)\n",
"print(f\"API Health Status: {'\u2705 Healthy' if is_healthy else '\u274c Unhealthy'}\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Cost Estimation Example\n",
"\n",
"Make a simple API call and track the cost.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Make a simple API call\n",
"response = client.models.generate_content(\n",
"    model=MODEL_ID,\n",
"    contents=\"Explain quantum computing in one sentence.\"\n",
")\n",
"\n",
"# Get usage metadata from the response\n",
"usage_metadata = response.usage_metadata\n",
"\n",
"# Calculate and display the cost\n",
"summary = observability.get_usage_summary(usage_metadata, MODEL_ID)\n",
"print(\"Usage Summary:\")\n",
"for key, value in summary.items():\n",
"    print(f\" {key}: {value}\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tracking Multiple Requests\n",
"\n",
"You can track costs across multiple API calls to monitor your total spending.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Example: Track costs across multiple requests\n",
"total_cost = 0.0\n",
"queries = [\n",
"    \"What is machine learning?\",\n",
"    \"Explain neural networks briefly.\",\n",
"    \"What is the difference between AI and ML?\"\n",
"]\n",
"\n",
"print(\"Tracking costs for multiple queries:\\n\")\n",
"for i, query in enumerate(queries, 1):\n",
"    response = client.models.generate_content(\n",
"        model=MODEL_ID,\n",
"        contents=query\n",
"    )\n",
"    usage_metadata = response.usage_metadata\n",
"    cost = observability.estimate_cost(usage_metadata, MODEL_ID)\n",
"    total_cost += cost\n",
"\n",
"    total_tokens = usage_metadata.total_token_count\n",
"    print(f\"Query {i}: {query[:50]}...\")\n",
"    print(f\" Tokens: {total_tokens} | Cost: ${cost:.6f}\\n\")\n",
"\n",
"print(f\"Total cost for all queries: ${total_cost:.6f}\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next Steps\n",
"\n",
"Now that you have a working cost estimation and health monitoring system, you can:\n",
"\n",
"1. **Integrate into your applications**: Add the `GeminiObservability` class to your production code to track costs in real time\n",
"2. **Set up alerts**: Monitor total costs and trigger an alert when spending exceeds a threshold\n",
"3. **Optimize usage**: Use the token counts to identify expensive operations and optimize your prompts\n",
"4. **Update pricing**: Keep the pricing dictionary up to date with current rates from [Google AI Pricing](https://ai.google.dev/pricing)\n",
"\n",
"For more information about the Gemini API, check out the [quickstarts](https://github.com/google-gemini/cookbook/tree/main/quickstarts) and other [examples](https://github.com/google-gemini/cookbook/tree/main/examples).\n"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.14.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
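
A possible follow-up for the "Set up alerts" item in the notebook's Next Steps: the sketch below keeps a running total of per-request costs and reports when a budget is exceeded. It uses the same per-million-token arithmetic as `estimate_cost`, but `BudgetTracker`, its fields, and the prices and token counts are hypothetical placeholders chosen for easy arithmetic. They are not part of the notebook or the google-genai SDK, and the block makes no API calls.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetTracker:
    """Accumulates per-request costs and flags when a budget is exceeded.

    Hypothetical helper for illustration only.
    """

    budget_usd: float
    total_usd: float = 0.0
    history: list = field(default_factory=list)

    def record(self, prompt_tokens: int, output_tokens: int,
               input_price: float, output_price: float) -> float:
        """Add one request; prices are in USD per 1 million tokens."""
        cost = (prompt_tokens / 1_000_000) * input_price
        cost += (output_tokens / 1_000_000) * output_price
        self.total_usd += cost
        self.history.append(cost)
        return cost

    @property
    def over_budget(self) -> bool:
        """True once the running total is strictly above the budget."""
        return self.total_usd > self.budget_usd


# Placeholder prices and token counts, not real Gemini rates.
tracker = BudgetTracker(budget_usd=1.00)
first = tracker.record(1_000_000, 0, input_price=0.50, output_price=2.00)
second = tracker.record(0, 500_000, input_price=0.50, output_price=2.00)
print(f"Total: ${tracker.total_usd:.2f} | Over budget: {tracker.over_budget}")
```

In the notebook, the token counts would come from `response.usage_metadata` and the prices from the `prices` dictionary. A real alert would likely replace the `print` with a logging or notification call.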