Add Cost Estimation and Health Monitoring Example #1088
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
haripriyarao26 wants to merge 7 commits into google-gemini:main from haripriyarao26:feat-cost-observability
7 commits
329a70d add cost estimation and health monitoring notebook (haripriyarao26)
8383d8e update notebook to follow rules (haripriyarao26)
916de56 update readme (haripriyarao26)
fee8866 add exception (haripriyarao26)
28e4060 update file name in colab (haripriyarao26)
3560e50 solve the minor grammatical comments (haripriyarao26)
f593f52 Merge branch 'main' into feat-cost-observability (haripriyarao26)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,378 @@ | ||
| { | ||
| "cells": [ | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": { | ||
| "id": "LWWFilxLgFvH" | ||
| }, | ||
| "source": [ | ||
| "##### Copyright 2025 Google LLC." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": { | ||
| "cellView": "form", | ||
| "id": "zKZnDihChHk4" | ||
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n", | ||
| "# you may not use this file except in compliance with the License.\n", | ||
| "# You may obtain a copy of the License at\n", | ||
| "#\n", | ||
| "# https://www.apache.org/licenses/LICENSE-2.0\n", | ||
| "#\n", | ||
| "# Unless required by applicable law or agreed to in writing, software\n", | ||
| "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", | ||
| "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", | ||
| "# See the License for the specific language governing permissions and\n", | ||
| "# limitations under the License.\n", | ||
| "\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": { | ||
| "id": "XS_CBAC0gNsd" | ||
| }, | ||
| "source": [ | ||
| "# Cost Estimation and Health Monitoring with Gemini\n", | ||
| "\n", | ||
| "<a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Cost_Estimation_And_Health_Monitoring.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" height=30/></a>" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": { | ||
| "id": "Vk5Chb-XgPpf" | ||
| }, | ||
| "source": [ | ||
| "## Overview\n", | ||
| "\n", | ||
| "Cost observability is key for scaling Gemini API applications. Without tracking token usage and costs, you can't optimize your spending or identify expensive operations.\n", | ||
| "\n", | ||
| "This notebook demonstrates how to build an observability layer for Gemini API applications. You will learn how to:\n", | ||
| "\n", | ||
| "1. Extract token usage metadata from API responses\n", | ||
| "2. Calculate real-time USD costs based on model pricing\n", | ||
| "3. Perform health checks to verify API availability and quota\n", | ||
| "\n", | ||
| "By the end of this notebook, you'll have a reusable class that you can integrate into any Gemini API application to monitor costs and health." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": { | ||
| "id": "zcEAmXVLgRNc" | ||
| }, | ||
| "source": [ | ||
| "## Setup\n", | ||
| "\n", | ||
| "First, install the Gemini API Python library." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": { | ||
| "id": "qR32cW4ugSyv" | ||
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "%pip install -U -q \"google-genai>=1.0.0\"\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": { | ||
| "id": "G0i9yHPogUa6" | ||
| }, | ||
| "source": [ | ||
| "### Grab an API Key\n", | ||
| "\n", | ||
| "Before you can use the Gemini API, you must first obtain an API key. If you don't already have one, create a key with one click in Google AI Studio.\n", | ||
| "\n", | ||
| "<a class=\"button button-primary\" href=\"https://aistudio.google.com/app/apikey\" target=\"_blank\" rel=\"noopener noreferrer\">Get an API key</a>\n", | ||
| "\n", | ||
| "In Colab, add the key to the secrets manager under the \"\ud83d\udd11\" in the left panel. Give it the name `GOOGLE_API_KEY`." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": { | ||
| "id": "vnfyGQ_IgV3W" | ||
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "from google import genai\n", | ||
| "from google.genai import types\n", | ||
| "from typing import Dict\n", | ||
| "from google.colab import userdata\n", | ||
| "\n", | ||
| "# Get API key from Colab secrets\n", | ||
| "api_key = userdata.get('GOOGLE_API_KEY')\n", | ||
| "client = genai.Client(api_key=api_key)\n", | ||
| "print(\"\u2705 API key configured successfully!\")" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": { | ||
| "id": "pinhZ6S8gXz2" | ||
| }, | ||
| "source": [ | ||
| "## Cost Estimator Class\n", | ||
| "\n", | ||
| "The `GeminiObservability` class tracks costs and monitors the health of your Gemini API usage. It looks up per-model prices in a dictionary, so each request is costed at the rate of the model that served it." | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": { | ||
| "id": "29M5u1SuuTW9" | ||
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# @title\n", | ||
| "from google.genai import errors\n", | ||
| "\n", | ||
| "\n", | ||
| "class GeminiObservability:\n", | ||
| "    \"\"\"A class to monitor costs and health of Gemini API usage.\"\"\"\n", | ||
| "\n", | ||
| "    def __init__(self, client):\n", | ||
| "        \"\"\"\n", | ||
| "        Initialize the observability class.\n", | ||
| "\n", | ||
| "        Args:\n", | ||
| "            client: The genai.Client instance\n", | ||
| "        \"\"\"\n", | ||
| "        self.client = client\n", | ||
| "        # Prices in USD per 1 million tokens. These are example values only;\n", | ||
| "        # update them from the current pricing at https://ai.google.dev/pricing\n", | ||
| "        self.prices = {\n", | ||
| "            \"gemini-2.5-flash\": {\"input\": 0.075, \"output\": 0.30},\n", | ||
| "            \"gemini-2.5-pro\": {\"input\": 3.50, \"output\": 10.50},\n", | ||
| "            \"gemini-2.5-flash-lite\": {\"input\": 0.0375, \"output\": 0.15},\n", | ||
| "            \"gemini-3-flash-preview\": {\"input\": 0.10, \"output\": 0.40},\n", | ||
| "            \"gemini-3-pro-preview\": {\"input\": 3.50, \"output\": 10.50},\n", | ||
| "        }\n", | ||
| "\n", | ||
| "    def estimate_cost(self, usage_metadata: types.GenerateContentResponseUsageMetadata, model_name: str) -> float:\n", | ||
| "        \"\"\"\n", | ||
| "        Calculate the cost in USD based on token usage.\n", | ||
| "\n", | ||
| "        Args:\n", | ||
| "            usage_metadata: The usage_metadata object from a Gemini API response\n", | ||
| "            model_name: The name of the model used (e.g., \"gemini-2.5-flash\")\n", | ||
| "\n", | ||
| "        Returns:\n", | ||
| "            The estimated cost in USD\n", | ||
| "        \"\"\"\n", | ||
| "        if model_name not in self.prices:\n", | ||
| "            print(f\"Warning: No pricing data for model {model_name}. Returning 0.0\")\n", | ||
| "            return 0.0\n", | ||
| "\n", | ||
| "        # Token counts are optional fields in the SDK, so treat None as 0\n", | ||
| "        input_tokens = usage_metadata.prompt_token_count or 0\n", | ||
| "        output_tokens = usage_metadata.candidates_token_count or 0\n", | ||
| "\n", | ||
| "        # Calculate input cost (prompt tokens)\n", | ||
| "        input_cost = (input_tokens / 1_000_000) * self.prices[model_name][\"input\"]\n", | ||
| "\n", | ||
| "        # Calculate output cost (response tokens)\n", | ||
| "        output_cost = (output_tokens / 1_000_000) * self.prices[model_name][\"output\"]\n", | ||
| "\n", | ||
| "        return input_cost + output_cost\n", | ||
| "\n", | ||
| "    def check_health(self, model_name: str = \"gemini-2.5-flash\") -> bool:\n", | ||
| "        \"\"\"\n", | ||
| "        Perform a simple health check by making a minimal API call.\n", | ||
| "\n", | ||
| "        Args:\n", | ||
| "            model_name: The model to test (default: \"gemini-2.5-flash\")\n", | ||
| "\n", | ||
| "        Returns:\n", | ||
| "            True if the API is healthy, False otherwise\n", | ||
| "        \"\"\"\n", | ||
| "        try:\n", | ||
| "            self.client.models.generate_content(\n", | ||
| "                model=model_name,\n", | ||
| "                contents=\"ping\",\n", | ||
| "                config=types.GenerateContentConfig(max_output_tokens=1),\n", | ||
| "            )\n", | ||
| "            return True\n", | ||
| "        except errors.APIError as e:\n", | ||
| "            # The SDK raises APIError for HTTP errors; 429 signals a quota or rate limit\n", | ||
| "            if e.code == 429:\n", | ||
| "                print(f\"Health Check Failed: Quota limit reached. {e}\")\n", | ||
| "            else:\n", | ||
| "                print(f\"Health Check Failed: {e}\")\n", | ||
| "            return False\n", | ||
| "        except Exception as e:\n", | ||
| "            print(f\"Health Check Failed: {e}\")\n", | ||
| "            return False\n", | ||
| "\n", | ||
| "    def get_usage_summary(self, usage_metadata: types.GenerateContentResponseUsageMetadata, model_name: str) -> Dict:\n", | ||
| "        \"\"\"\n", | ||
| "        Get a summary of token usage and cost.\n", | ||
| "\n", | ||
| "        Args:\n", | ||
| "            usage_metadata: The usage_metadata object from a Gemini API response\n", | ||
| "            model_name: The name of the model used\n", | ||
| "\n", | ||
| "        Returns:\n", | ||
| "            A dictionary with usage details and cost\n", | ||
| "        \"\"\"\n", | ||
| "        cost = self.estimate_cost(usage_metadata, model_name)\n", | ||
| "        return {\n", | ||
| "            \"model\": model_name,\n", | ||
| "            \"input_tokens\": usage_metadata.prompt_token_count or 0,\n", | ||
| "            \"output_tokens\": usage_metadata.candidates_token_count or 0,\n", | ||
| "            \"total_tokens\": usage_metadata.total_token_count or 0,\n", | ||
| "            \"estimated_cost_usd\": round(cost, 6)\n", | ||
| "        }" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": { | ||
| "id": "PG6VQiyigeIQ" | ||
| }, | ||
| "source": [ | ||
| "## Example Usage\n", | ||
| "\n", | ||
| "You can now see the observability class in action. First, create an instance of the class.\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# Create an instance of the observability class\n", | ||
| "observability = GeminiObservability(client)\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
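| "source": [ | ||
| "### Health Check Example\n", | ||
| "\n", | ||
| "Before sending real traffic, run a health check to confirm that the API is reachable and that quota is available.\n" | ||
| ] | ||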
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "MODEL_ID = \"gemini-2.5-flash\" # @param [\"gemini-2.5-flash-lite\", \"gemini-2.5-flash\", \"gemini-2.5-pro\", \"gemini-3-flash-preview\", \"gemini-3-pro-preview\"] {\"allow-input\": true, \"isTemplate\": true}\n", | ||
| "\n", | ||
| "# Check if the API is healthy\n", | ||
| "is_healthy = observability.check_health(MODEL_ID)\n", | ||
| "print(f\"API Health Status: {'\u2705 Healthy' if is_healthy else '\u274c Unhealthy'}\")\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "### Cost Estimation Example\n", | ||
| "\n", | ||
| "Make a simple API call and track the cost.\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# Make a simple API call\n", | ||
| "response = client.models.generate_content(\n", | ||
| "    model=MODEL_ID,\n", | ||
| "    contents=\"Explain quantum computing in one sentence.\"\n", | ||
| ")\n", | ||
| "\n", | ||
| "# Get usage metadata from the response\n", | ||
| "usage_metadata = response.usage_metadata\n", | ||
| "\n", | ||
| "# Calculate and display the cost\n", | ||
| "summary = observability.get_usage_summary(usage_metadata, MODEL_ID)\n", | ||
| "print(\"Usage Summary:\")\n", | ||
| "for key, value in summary.items():\n", | ||
| "    print(f\"  {key}: {value}\")\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "### Tracking Multiple Requests\n", | ||
| "\n", | ||
| "You can track costs across multiple API calls to monitor your total spending.\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "# Example: Track costs across multiple requests\n", | ||
| "total_cost = 0.0\n", | ||
| "queries = [\n", | ||
| "    \"What is machine learning?\",\n", | ||
| "    \"Explain neural networks briefly.\",\n", | ||
| "    \"What is the difference between AI and ML?\"\n", | ||
| "]\n", | ||
| "\n", | ||
| "print(\"Tracking costs for multiple queries:\\n\")\n", | ||
| "for i, query in enumerate(queries, 1):\n", | ||
| "    response = client.models.generate_content(\n", | ||
| "        model=MODEL_ID,\n", | ||
| "        contents=query\n", | ||
| "    )\n", | ||
| "    usage_metadata = response.usage_metadata\n", | ||
| "    cost = observability.estimate_cost(usage_metadata, MODEL_ID)\n", | ||
| "    total_cost += cost\n", | ||
| "\n", | ||
| "    total_tokens = usage_metadata.total_token_count\n", | ||
| "    print(f\"Query {i}: {query[:50]}...\")\n", | ||
| "    print(f\"  Tokens: {total_tokens} | Cost: ${cost:.6f}\\n\")\n", | ||
| "\n", | ||
| "print(f\"Total cost for all queries: ${total_cost:.6f}\")\n" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## Next Steps\n", | ||
| "\n", | ||
| "Now that you have a working cost estimation and health monitoring system, you can:\n", | ||
| "\n", | ||
| "1. **Integrate into your applications**: Add the `GeminiObservability` class to your production code to track costs in real time\n", | ||
| "2. **Set up alerts**: Accumulate per-request costs and raise an alert when total spending exceeds a threshold\n", | ||
| "3. **Optimize usage**: Use the token counts to identify expensive operations and optimize your prompts\n", | ||
| "4. **Update pricing**: Keep the pricing dictionary up to date with current rates from [Google AI Pricing](https://ai.google.dev/pricing)\n", | ||
| "\n", | ||
| "For more information about the Gemini API, check out the [quickstarts](https://github.com/google-gemini/cookbook/tree/main/quickstarts) and other [examples](https://github.com/google-gemini/cookbook/tree/main/examples).\n" | ||
| ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
| "colab": { | ||
| "provenance": [] | ||
| }, | ||
| "kernelspec": { | ||
| "display_name": ".venv", | ||
| "language": "python", | ||
| "name": "python3" | ||
| }, | ||
| "language_info": { | ||
| "name": "python", | ||
| "version": "3.14.0" | ||
| } | ||
| }, | ||
| "nbformat": 4, | ||
| "nbformat_minor": 0 | ||
| } | ||
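
Review note on the "Set up alerts" next step: the notebook describes spending alerts but does not show one. Below is a minimal sketch of how that could look, assuming per-request token counts are already available (for example from `response.usage_metadata`). The names `cost_usd` and `BudgetTracker` are hypothetical and not part of this PR or of the google-genai SDK, the prices are the example values from the notebook's pricing dictionary, and the token counts are made up for illustration. The sketch makes no API calls.

```python
from dataclasses import dataclass, field


def cost_usd(prompt_tokens: int, output_tokens: int,
             input_price: float, output_price: float) -> float:
    """Cost in USD for one request, given prices per 1 million tokens."""
    input_cost = (prompt_tokens / 1_000_000) * input_price
    output_cost = (output_tokens / 1_000_000) * output_price
    return input_cost + output_cost


@dataclass
class BudgetTracker:
    """Hypothetical helper: accumulates request costs against a USD budget."""
    budget_usd: float
    total_usd: float = 0.0
    history: list = field(default_factory=list)

    def record(self, cost: float) -> bool:
        """Add one request's cost; return True once the total exceeds the budget."""
        self.total_usd += cost
        self.history.append(cost)
        return self.total_usd > self.budget_usd


# Example prices copied from the notebook's dictionary for "gemini-2.5-flash".
# They are placeholders, not authoritative rates.
INPUT_PRICE, OUTPUT_PRICE = 0.075, 0.30

tracker = BudgetTracker(budget_usd=0.001)

# Made-up (prompt_tokens, output_tokens) pairs standing in for real usage metadata.
for prompt_tokens, output_tokens in [(1_000, 2_000), (2_000, 1_000), (1_000, 1_000)]:
    cost = cost_usd(prompt_tokens, output_tokens, INPUT_PRICE, OUTPUT_PRICE)
    over_budget = tracker.record(cost)
    print(f"cost=${cost:.6f} total=${tracker.total_usd:.6f} over_budget={over_budget}")
```

In the notebook, `record` could be called with the value returned by `observability.estimate_cost(...)` inside the multi-request loop, with the print replaced by whatever alerting mechanism the application already uses.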