
Add Ecologits-based LLM inference energy estimation for AWS Bedrock #163

Open
dpol1 wants to merge 23 commits into DigitalPebble:main from dpol1:feature/bedrock-ai-estimates

Conversation

@dpol1
Contributor

@dpol1 dpol1 commented Mar 5, 2026

Summary

Adds a new EnrichmentModule for estimating the energy consumption and embodied emissions of LLM inference on AWS Bedrock, using static coefficients derived from the EcoLogits project.

This follows the same pattern as BoaviztAPIstatic: a static data file bundled in the JAR is loaded at init() time, and the module matches Bedrock CUR rows to per-model coefficients to compute base energy (kWh) and embodied emissions (gCO2eq).

Key design decisions (discussed in #143)

  • The static file provides only base energy and embodied emissions per token — final operational emissions are computed by the existing downstream pipeline (PUE -> AverageCarbonIntensity -> OperationalEmissions).
  • CUR-specific filtering (AWS Bedrock) is kept separate from the core coefficient lookup logic, so it can be easily reused when Azure or GCP support is added.
  • pricing_unit is parsed to handle unit normalization (e.g., converting "1K tokens" or "1M tokens" to individual tokens).
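The unit normalization described in the last bullet could look roughly like this sketch. The method name parseTokenMultiplier comes from the PR, but the exact signature, class name, and set of handled units are assumptions:

```java
// Sketch of a pricing_unit normalizer. Assumes units like "1K tokens",
// "1M tokens", or plain "tokens"; anything else falls back to 1.
public class TokenUnits {
    static long parseTokenMultiplier(String pricingUnit) {
        if (pricingUnit == null) {
            return 1L; // no unit information: assume individual tokens
        }
        String unit = pricingUnit.trim().toLowerCase();
        if (unit.startsWith("1k")) {
            return 1_000L; // e.g. "1K tokens"
        }
        if (unit.startsWith("1m")) {
            return 1_000_000L; // e.g. "1M tokens"
        }
        return 1L; // already expressed in individual tokens
    }
}
```

With this, a usage amount of 7.631 billed in "1K tokens" corresponds to 7,631 individual tokens.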

Current state

  • Define JSON static data contract
  • Create the decoupled class architecture
  • Implement JSON parsing
  • Implement CUR parsing logic
  • Write unit tests
  • Add module to default-config.json

Resolved Uncertainties & Technical Notes

  • Coefficients: Values are derived from EcoLogits research (to be validated before the final merge).
  • Model ID Extraction: Confirmed via the sample CUR data. Extracted using the product map (key: model).
  • Pricing Unit: Handled real-world values (e.g., 1K tokens) seen in the sample data by adding a parseTokenMultiplier normalizer.
  • Spark Map Compatibility: Added safe casting for the product column. Since Spark SQL passes a scala.collection.Map at runtime but local tests might use a java.util.Map, the code handles both using JavaConverters and @SuppressWarnings("unchecked") to prevent ClassCastException.

dpol1 added 3 commits March 4, 2026 18:12
Modeled the data using a `ModelImpacts` static inner class, inspired by the `Impacts` class in `BoaviztAPIClient` to maintain codebase consistency.
@jnioche
Member

jnioche commented Mar 6, 2026

a good start @dpol1!
below is a real world example of usage (with some edits)

                                        bill_bill_type = Anniversary
                                             bill_billing_entity = AWS
                                    bill_billing_period_end_date = 2025-09-01 01:00:00+01
                                  bill_billing_period_start_date = 2025-08-01 01:00:00+01
                                                 bill_invoice_id = NULL
                                           bill_invoicing_entity = Amazon Web Services EMEA SARL
                                           bill_payer_account_id = xxxxxxxxxxxxxxxx
                                         bill_payer_account_name = xxxxxxxxxxxxxxxx
                                                   cost_category = {}
                                                        discount = {}
                                       discount_bundled_discount = NULL
                                         discount_total_discount = NULL
                                           identity_line_item_id = r6mzzhm7ibfzamcnbqcg237am3hxzldhwtxsprtozonhlygebu3q
                                          identity_time_interval = 2025-08-12T15:00:00Z/2025-08-12T16:00:00Z
                                     line_item_availability_zone = NULL
                                          line_item_blended_cost = 0.015262
                                          line_item_blended_rate = 0.002
                                         line_item_currency_code = USD
                                          line_item_legal_entity = Amazon Web Services EMEA SARL
                                 line_item_line_item_description = $0.002 per 1K input tokens for Pixtral Large 25.02 in EU (Stockholm)
                                        line_item_line_item_type = Usage
                                    line_item_net_unblended_cost = NULL
                                    line_item_net_unblended_rate = NULL
                                  line_item_normalization_factor = 0.0
                               line_item_normalized_usage_amount = 0.0
                                             line_item_operation = InvokeModelStreamingInference
                                          line_item_product_code = AmazonBedrock
                                           line_item_resource_id = arn:aws:bedrock:eu-north-1:xxxxxxxxxxxxxxxx:inference-profile/eu.mistral.pixtral-large-2502-v1:0
                                              line_item_tax_type = NULL
                                        line_item_unblended_cost = 0.015262
                                        line_item_unblended_rate = 0.002
                                      line_item_usage_account_id = xxxxxxxxxxxxxxxx
                                    line_item_usage_account_name = xxxxxxxxxxxxxxxx
                                          line_item_usage_amount = 7.631
                                        line_item_usage_end_date = 2025-08-12 17:00:00+01
                                      line_item_usage_start_date = 2025-08-12 16:00:00+01
                                            line_item_usage_type = EUN1-PixtralLarge2502-input-tokens
                                                pricing_currency = USD
                                   pricing_lease_contract_length = NULL
                                          pricing_offering_class = NULL
                                   pricing_public_on_demand_cost = 0.015262
                                   pricing_public_on_demand_rate = 0.002
                                         pricing_purchase_option = NULL
                                               pricing_rate_code = YGJDU88BVRJ52VPV.JRTCKXETXF.6YS6EN2CT7
                                                 pricing_rate_id = 220240620176
                                                    pricing_term = OnDemand
                                                    pricing_unit = 1K tokens
                                                         product = {"feature":"On-demand Inference","provider":"Mistral","model":"Pixtral Large 25.02","inference_type":"Input tokens","product_name":"Amazon Bedrock","region":"eu-north-1","servicename":"Amazon Bedrock"}
                                                 product_comment = NULL
                                                product_fee_code = NULL
                                         product_fee_description = NULL
                                           product_from_location = NULL
                                      product_from_location_type = NULL
                                        product_from_region_code = NULL
                                         product_instance_family = NULL
                                           product_instance_type = NULL
                                             product_instancesku = NULL
                                                product_location = EU (Stockholm)
                                           product_location_type = AWS Region
                                               product_operation = NULL
                                            product_pricing_unit = NULL
                                          product_product_family = Amazon Bedrock
                                             product_region_code = eu-north-1
                                             product_servicecode = AmazonBedrock
                                                     product_sku = YGJDU88BVRJ52VPV
                                             product_to_location = NULL
                                        product_to_location_type = NULL
                                          product_to_region_code = NULL
                                               product_usagetype = EUN1-PixtralLarge2502-input-tokens

here are the variations of pricing_unit / usage_type I found in my dataset

        pricing_unit = 1K tokens
line_item_usage_type = EUN1-NovaLite-input-tokens

        pricing_unit = 1K tokens
line_item_usage_type = EUN1-PixtralLarge2502-output-tokens

        pricing_unit = 1K tokens
line_item_usage_type = EUN1-NovaPro-input-tokens

        pricing_unit = video
line_item_usage_type = USE1-NovaReel-T2V-Medfps-HDRes

        pricing_unit = image
line_item_usage_type = USE1-NovaCanvas-T2I-2048-Standard

        pricing_unit = 1K tokens
line_item_usage_type = EUN1-NovaPro-output-tokens

        pricing_unit = 1K tokens
line_item_usage_type = EUN1-NovaLite-output-tokens

        pricing_unit = 1K tokens
line_item_usage_type = EUN1-PixtralLarge2502-input-tokens

@jnioche
Member

jnioche commented Mar 6, 2026

Why did you choose the name EcoLogitsStore? Not sure what the Store bit is about. Why not just call it EcoLogitsModel?

@dpol1
Contributor Author

dpol1 commented Mar 6, 2026

You make a fair point! Actually your feedback is valuable, I didn't know how to call it. EcoLogitsModel sounds good and keeps it simple. I'll rename the class in my next commit. Thanks!
However, I avoided Model because I wanted to distinguish between the 'manager' class and the inner class (ModelImpacts).

@jnioche
Member

jnioche commented Mar 6, 2026

You make a fair point! Actually your feedback is valuable,

you sound surprised :-)

I didn't know how to call it. EcoLogitsModel sounds good and keeps it simple. I'll rename the class in my next commit. Thanks! However, I avoided Model because I wanted to distinguish between the 'manager' class and the inner class (ModelImpacts).

what about simply EcoLogits ? you are right that it deals with Models and calling it something model can create confusion.

@dpol1
Contributor Author

dpol1 commented Mar 6, 2026

Haha, you got me! Just wanted to make sure we were perfectly aligned. EcoLogits sounds perfect and perfectly clean. Renaming it ASAP, and then I'll dive into the CUR parsing logic in BedrockEcoLogits! Yes that is how I wanna call the other class, you're ok with that?

@jnioche
Member

jnioche commented Mar 6, 2026

no offence but this interaction feels like I am refining prompts from chatGPT.
I'll let you use your human intuition to make progress on this :-)

dpol1 and others added 10 commits March 6, 2026 16:00
Set up config init, column mapping, and Javadoc. Enrich logic pending.
Compute energy and CO₂ for Bedrock models from CUR metadata, normalizing token counts and splitting input/output tokens.
Use usageType to split input/output tokens for energy calculation.
Fallback to ratio split when ambiguous.
Fix modelId lookup key and aggregate results instead of overwriting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dpol1 dpol1 marked this pull request as ready for review March 9, 2026 09:46
@dpol1 dpol1 changed the title WIP: Add Ecologits-based LLM inference energy estimation for AWS Bedrock Add Ecologits-based LLM inference energy estimation for AWS Bedrock Mar 9, 2026
@dpol1
Contributor Author

dpol1 commented Mar 9, 2026

Looking at the sample CUR data you shared, I want to highlight two edge cases based on how the module is currently built:

  1. Model Name Mismatch: The CUR product map uses commercial names (e.g., "Pixtral Large 25.02"). If our bedrock.json uses API IDs, the lookup will fail and safely skip the row. Going forward, we just need to ensure the JSON keys exactly match these CUR commercial names.
  2. Multimodal Pricing Units: I saw pricing_unit = image and video in your sample. The current math is strictly tailored for LLM tokens (doing the / 1000.0 conversion). Multimodal rows will be safely ignored for now as long as they aren't in the JSON, but we'll need to branch the logic if we decide to support them in the future.
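The two edge cases above can be sketched together: an unknown pricing unit (image, video) makes the computation bail out safely, and the / 1000.0 conversion is generalized to a per-unit multiplier. Method and class names here are illustrative, not the PR's actual code:

```java
public class BedrockEnergy {
    // Returns base energy in kWh, or null when the row should be skipped
    // (e.g. multimodal pricing units we don't support yet).
    // kWhPerToken is a hypothetical per-model coefficient from bedrock.json.
    static Double baseEnergyKWh(double usageAmount, String pricingUnit,
                                double kWhPerToken) {
        long multiplier;
        if ("1K tokens".equals(pricingUnit)) {
            multiplier = 1_000L;
        } else if ("1M tokens".equals(pricingUnit)) {
            multiplier = 1_000_000L;
        } else if ("tokens".equals(pricingUnit)) {
            multiplier = 1L;
        } else {
            return null; // "image", "video", etc.: safely ignored for now
        }
        return usageAmount * multiplier * kWhPerToken;
    }
}
```

For the sample row above (usage 7.631, unit "1K tokens"), a hypothetical coefficient of 1e-6 kWh/token would give about 0.0076 kWh of base energy.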

I think this point needs a new PR; @jnioche, please give me feedback on this.

I will also push the general documentation for the new ecologits module shortly.

@jnioche
Member

jnioche commented Mar 9, 2026

Looking at the sample CUR data you shared, I want to highlight two edge cases based on how the module is currently built:

  1. Model Name Mismatch: The CUR product map uses commercial names (e.g., "Pixtral Large 25.02"). If our bedrock.json uses API IDs, the lookup will fail and safely skip the row. Going forward, we just need to ensure the JSON keys exactly match these CUR commercial names.

indeed

  2. Multimodal Pricing Units: I saw pricing_unit = image and video in your sample. The current math is strictly tailored for LLM tokens (doing the / 1000.0 conversion). Multimodal rows will be safely ignored for now as long as they aren't in the JSON, but we'll need to branch the logic if we decide to support them in the future.

I think this point needs a new PR; @jnioche, please give me feedback on this.

yes, this would be a separate feature

I will also push the general documentation for the new ecologits module shortly.

great, let me know when you want me to review this PR

dpol1 and others added 3 commits March 9, 2026 12:36
- Add human-readable model names to bedrock.json alongside AWS model IDs.
- Update unit tests to reflect new fields.
Co-authored-by: Claude AI <claude@anthropic.com>
@dpol1
Contributor Author

dpol1 commented Mar 9, 2026

@jnioche Ready for review now.
Sorry for the mess - I forgot to merge main before pushing the docs. Conflicts should be resolved now, but please double-check the modules.md file.

Member

@jnioche jnioche left a comment


thanks @dpol1 - have left a few comments

What do you mean by: The EcoLogits coefficients are derived from research data and should be validated before use in production reporting?

it is true of any external data we use like Boavizta - no need to specify it

did you compare the results you are getting here with examples from Ecologits?

@dpol1
Contributor Author

dpol1 commented Mar 10, 2026

thanks @dpol1 - have left a few comments

What do you mean by: The EcoLogits coefficients are derived from research data and should be validated before use in production reporting ?

it is true of any external data we use like Boavizta - no need to specify it

Added it out of an abundance of caution since AI energy estimation is a relatively new field, but you're 100% right. It's redundant. I've removed it to keep the docs consistent with other modules.

did you compare the results you are getting here with examples from Ecologits?

I realized the coefficients I put in bedrock.json are too heavily rounded. I probably made a mistake during the extraction. I'm working on this!

dpol1 added 2 commits March 11, 2026 09:48
- Add warning logs for missing model IDs or coefficients to aid debugging.
- Simplify extraction of usage types and product maps from row data.
- Update energy and emissions values via direct assignment instead of accumulating from previous values.
- Replace placeholder values in bedrock.json with more precise estimates for all supported models.
- Update unit tests to reflect the new coefficients.
@dpol1
Contributor Author

dpol1 commented Mar 11, 2026

Morning @jnioche, I've now extracted the values precisely from the EcoLogits Python library and updated bedrock.json. I'm linking a Python script on Colab that documents the extraction workflow: GoogleColab

Also worth noting - EcoLogits already has direct/proxy support for the latest models (like Claude 4.6 Sonnet and Opus).
For now, I kept bedrock.json strictly focused on the models we know the exact AWS CUR billing strings for. Because of how the architecture is set up, as soon as we start seeing these newer models in our actual AWS billing data, supporting them will be as simple as adding a single line to the JSON. No code changes required!

(Of course, if you already have the exact Bedrock CUR strings for the newer models, let me know and I'll add them to the JSON right away).
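To illustrate the "single line" claim above, a new model entry could look something like this sketch. All field names and coefficient values here are hypothetical placeholders, not the PR's actual bedrock.json schema:

```json
{
  "model_name": "Claude Sonnet 4.6",
  "energy_kwh_per_token": 1.1e-6,
  "embodied_gco2eq_per_token": 2.0e-7
}
```

The key point is that the lookup is data-driven: as long as model_name matches the commercial name in the CUR product map, no Java changes are needed.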

@dpol1 dpol1 requested a review from jnioche March 12, 2026 08:21
@jnioche jnioche added this to the 0.10 milestone Mar 12, 2026
@jnioche jnioche added the enhancement New feature or request label Mar 12, 2026
@jnioche
Member

jnioche commented Mar 12, 2026

thanks @dpol1
I will review and test shortly. This is an important feature and I want it to get the attention it deserves, so I will first release what we currently have - and that's a lot! Otherwise it would get drowned in the mass of changes, and that would be a shame.

@dpol1
Contributor Author

dpol1 commented Mar 12, 2026

Glad to contribute to this - Makes total sense to release the current batch first so this feature gets the right spotlight in 0.10. Take your time with the review and testing, and let me know if you need any further tweaks!

@jnioche
Member

jnioche commented Mar 12, 2026

Glad to contribute to this - Makes total sense to release the current batch first so this feature gets the right spotlight in 0.10. Take your time with the review and testing, and let me know if you need any further tweaks!

one thing worth investigating is that the billing can have references to batch inference, which I know has quite an impact on the energy used. Maybe worth checking if batching is taken into account by Ecologits?

https://bentoml.com/llm/inference-optimization/static-dynamic-continuous-batching

this site is on my to read list and contains loads of useful explanations.

@dpol1
Contributor Author

dpol1 commented Mar 12, 2026

one thing worth investigating is that the billing can have references to batch inference, which I know has quite an impact on the energy used. Maybe worth checking if batching is taken into account by Ecologits?

think this article from EcoLogits might be useful: https://ecologits.ai/latest/methodology/llm_inference/

https://bentoml.com/llm/inference-optimization/static-dynamic-continuous-batching

this site is on my to read list and contains loads of useful explanations.

will have a look and report back the findings for both

@dpol1
Contributor Author

dpol1 commented Mar 13, 2026

EcoLogits & batch size — TL;DR

EcoLogits does model batch size in its formulas (energy/token and latency/token), but then hardcodes B = 64 (B = batch size = number of requests the GPU processes concurrently) for all estimates. No distinction between batching strategies, no provider-specific adjustment.
What production actually looks like (BentoML handbook):

  • Static → waits for full batch, GPU idles on short sequences
  • Dynamic → time-window collection, batches not always full
  • Continuous (vLLM, TGI, SGLang) → completed slots replaced immediately, GPU occupancy ~100%

Concrete numbers (CMU/HuggingFace, ACL 2025): continuous batching + vLLM cuts energy by up to 73% vs unoptimized baselines on real workloads (BurstGPT, Azure traces). Batch size effect on energy is non-linear: drops until ~B=64, then plateaus + KV cache pressure kicks in.

Practical implication: EcoLogits numbers should be read as a "mid-batch estimate (B=64)". They underestimate cost in low-traffic scenarios (effective B << 64) and overestimate for async batch inference (e.g. Bedrock batch mode, priced at 50% of on-demand — a clear signal of a different infrastructure path and energy profile).

Suggested action: add a note in the docs flagging the B=64 assumption. For now, the most useful patch is just to document the assumption. Dynamic implementation requires data that we don't have.

@jnioche What are your thoughts on this? How would you prefer to proceed regarding a potential dynamic implementation in the future?

Member

@jnioche jnioche left a comment


a few suggestions

@jnioche
Member

jnioche commented Mar 13, 2026

@jnioche What are your thoughts on this? How would you prefer to proceed regarding a potential dynamic implementation in the future?

best to document for now as you suggested and raise the issue with the Ecologits project itself. We can just mirror what they do for now and refine later. Worst case we could add a factor to modify the values to compensate for the under/overestimation depending on batch or non batch
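The compensating-factor idea above could be sketched as a simple per-mode multiplier on the B=64 baseline. The factor values are placeholders to be calibrated later, not measured data, and the names are illustrative:

```java
public class BatchAdjustment {
    // Scales the EcoLogits B=64 baseline estimate by a configurable factor
    // depending on the inference mode, to compensate for the under/over-
    // estimation discussed above. Factors would come from configuration.
    static double adjustedEnergyKWh(double baseEnergyKWh, boolean batchInference,
                                    double batchFactor, double onDemandFactor) {
        return baseEnergyKWh * (batchInference ? batchFactor : onDemandFactor);
    }
}
```

For example, with placeholder factors of 0.5 for batch and 1.0 for on-demand, a 2.0 kWh baseline would be reported as 1.0 kWh for batch rows and left unchanged otherwise.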

dpol1 added 4 commits March 14, 2026 16:47
- Use `row.getJavaMap()` instead of manual `instanceof` checks for map extraction from `Row`, aligning with `Networking.java`.
- Update tests to convert Java maps to Scala maps when building `GenericRowWithSchema`.
- Refactor bedrock.json to use aliases for model variants, reducing redundancy.
- Update EcoLogits to register multiple model IDs from an optional aliases field.
Document the limitations of the EcoLogits `B=64` baseline, as shown in DigitalPebble#163.
@jnioche jnioche linked an issue Mar 17, 2026 that may be closed by this pull request
@jnioche
Member

jnioche commented Mar 21, 2026

hi @dpol1 are you planning more work on this PR or is it ready to review? Thanks!

@dpol1
Contributor Author

dpol1 commented Mar 21, 2026

Hi @jnioche, ready for review!

@dpol1 dpol1 requested a review from jnioche March 21, 2026 19:49

Labels

enhancement New feature or request


Development

Successfully merging this pull request may close these issues.

Provide estimates for AI models

2 participants