Skip to content

fix: correct UCUM units for TiBy and kilobytes in unitMap#66

Open
Naman-B-Parlecha wants to merge 6 commits into
prometheus:mainfrom
Naman-B-Parlecha:main
Open

fix: correct UCUM units for TiBy and kilobytes in unitMap#66
Naman-B-Parlecha wants to merge 6 commits into
prometheus:mainfrom
Naman-B-Parlecha:main

Conversation

@Naman-B-Parlecha

Copy link
Copy Markdown

Fixes two incorrect UCUM unit mappings as mentioned in #65

  1. TiBy should map to tebibytes and not tibibytes
  2. kilobytes should map to kBy and not KBy

Signed-off-by: Naman-B-Parlecha <naman.parlecha@finalroundai.com>
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
@Naman-B-Parlecha

Copy link
Copy Markdown
Author

PTAL @ArthurSens @dashpole
Thanks!

@aknuds1 aknuds1 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Naman-B-Parlecha @dashpole Some questions:

  1. Won't we need to update the tebibytes spelling simultaneously also in opentelemetry-collector-contrib?
  2. Won't renaming "tibibytes" metrics units to "tebibytes" be a backwards compatible change in Prometheus, that might break things for users?
  3. If we stop supporting "KBy" as an OTLP unit, won't that silently break support for producers sending the "KBy" spelling?
  4. Won't we need to update the "KBy" spelling simultaneously also in opentelemetry-collector-contrib?

Comment thread metric_namer.go Outdated
Comment thread metric_namer_test.go
@dashpole

Copy link
Copy Markdown
Contributor

Sure. I'm OK keeping KBy for backwards-compatibility. We can also coordinate the update to the collector.

FYI @Naman-B-Parlecha that is here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/138fbe1878d1b8c8bd31e3c942ec0d34e08c0c87/pkg/translator/prometheus/unit_to_ucum.go#L24

@aknuds1

aknuds1 commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

Thanks @dashpole - WDYT about my concern that renaming "tibibytes" metrics units to "tebibytes" will be a backwards compatible change in Prometheus, potentially breaking things for users though?

@Naman-B-Parlecha

Naman-B-Parlecha commented Apr 12, 2026

Copy link
Copy Markdown
Author

Thanks @dashpole - WDYT about my concern that renaming "tibibytes" metrics units to "tebibytes" will be a backwards compatible change in Prometheus, potentially breaking things for users though?

Worth adding a warning log where this conversion happens in prometheus/prometheus for user visibility?

@dashpole @aknuds1

@aknuds1

aknuds1 commented Apr 12, 2026

Copy link
Copy Markdown
Contributor

I'm wondering whether this backwards compatible change should be possible to disable, e.g. for configurability from Grafana Mimir. As for Prometheus itself, maybe the old behaviour should be enabled until v4? As backwards incompatible changes shouldn't be made until the next major version (of Prometheus). @dashpole @Naman-B-Parlecha.

WDYT @ArthurSens @krajorama @jesusvazquez @ywwg?

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
@ArthurSens

ArthurSens commented Apr 14, 2026

Copy link
Copy Markdown
Member

Hmmmm, while I understand the potential negative effect, I'm not sure the units being changed are popular enough for us to be THIS careful.

I can see kylobytes being used somehow (and we're keeping the backward compatible behavior), but Tebibytes? 😅

I don't think this is used by any Prometheus exporter, and it's not part of any semantic conventions in OTel either. The only way this is used is when someone creates a custom metric and intentionally measures things in Terabytes. There are very few companies in the world that measure things on this scale.

I'd be super happy if we could just do a patch release for this, but if you're feeling very strongly opinionated about this, we could do a 2.0. What do you think?

@aknuds1

aknuds1 commented Apr 14, 2026

Copy link
Copy Markdown
Contributor

@ArthurSens I'm not suggesting a major bump for otlptranslator, but potential configurability. I'm further suggesting that we might want to disable this new behaviour in Prometheus until v4, for backwards compatibility. It's a question from my side though, and I'd like feedback from other maintainers.

As for Grafana Mimir, the configurability could be good to have to protect users from potential metric renames.

@dashpole

Copy link
Copy Markdown
Contributor

I'm not that worried about this this breaking users. I think this is a case where it is clearly a bug fix, and unlikely to have a significant negative impact. If upstream projects want configurability, it should be pretty easy for them to add special handling for TiBy prior to calling into this library to preserve existing behavior.

I would probably advocate for using a feature gate to allow users to temporarily revert back to the old name (and provide feedback, if necessary) for a period of time.

@aknuds1

aknuds1 commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

@dashpole That's what I'm suggesting too. We already have precedence for such configuration parameters in otlptranslator, see:

  • PreserveMultipleUnderscores
  • UnderscoreLabelSanitization

@Naman-B-Parlecha The above two are examples of configuration parameters controlling backwards compatible translation behaviour.

@dashpole

Copy link
Copy Markdown
Contributor

Sure. Lets do that then

…suffixes

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
@Naman-B-Parlecha

Copy link
Copy Markdown
Author

Hey @dashpole @aknuds1 @ArthurSens!
Apologies for the delay, have added the configurable flag for updated metric namer

one concern i have is naming currently added UpdatedMetricMapping, if u guys any better convention happy to use the suggestion!!

Comment thread metric_namer.go Outdated
…capingWithUpdatedSuffixes

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
@Naman-B-Parlecha

Copy link
Copy Markdown
Author

@dashpole PTAL added two new strategies UnderscoreEscapingWithUpdatedSuffixes and NoUTF8EscapingWithUpdatedSuffixes

@dashpole

Copy link
Copy Markdown
Contributor

We shouldn't need a new strategy, right? Just booleans on the Namer structs (they should probably be "legacyUnitMapping" or something like that so that the default value of false uses the new behavior).

@aknuds1 aknuds1 self-requested a review May 20, 2026 06:34

@aknuds1 aknuds1 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dashpole is right. Adding new strategies is not the right approach. Please see my suggestions for a starting point.

Comment thread README.md
- **Namespace Support**: Add configurable namespace prefixes
- **UTF-8 Support**: Choose between Prometheus legacy scheme compliant metric/label names (`[a-zA-Z0-9:_]`) or untranslated metric/label names
- **Translation Strategy Configuration**: Select a translation strategy with a standard set of strings.
- **Updated UCUM Unit Mappings**: Opt in to corrected UCUM unit suffixes (`TiBy`: `tebibytes`, `kBy`: `kilobytes`) by selecting the `UnderscoreEscapingWithUpdatedSuffixes` or `NoUTF8EscapingWithUpdatedSuffixes` translation strategy.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently reads like a changelog item. Please see my suggestion for line 10 instead.

Suggested change
- **Updated UCUM Unit Mappings**: Opt in to corrected UCUM unit suffixes (`TiBy`: `tebibytes`, `kBy`: `kilobytes`) by selecting the `UnderscoreEscapingWithUpdatedSuffixes` or `NoUTF8EscapingWithUpdatedSuffixes` translation strategy.

Comment thread README.md
## Features

- **Metric Name and Label Translation**: Convert OTLP metric names and attributes to Prometheus-compliant format
- **Unit Handling**: Translate OTLP units to Prometheus unit conventions

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- **Unit Handling**: Translate OTLP units to Prometheus unit conventions, using spec-correct UCUM mappings by default with an opt-out (`LegacyUnitMapping`) for callers pinned to pre-correction names
- ```

Comment thread README.md
Comment on lines +35 to +37
// Use UnderscoreEscapingWithUpdatedSuffixes to opt into corrected UCUM unit mappings:
// TiBy -> "tebibytes" instead of "tibibytes" and kBy -> "kilobytes".
strategy := otlptranslator.UnderscoreEscapingWithUpdatedSuffixes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please undo this change.

Comment thread README.md
unitNamer.Build("By") // bytes
unitNamer.Build("requests/s") // requests_per_second
unitNamer.Build("1") // "" (dimensionless)
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding a corresponding migration sub-section:

Suggested change
### Migration: Pre-correction unit suffixes
This library now uses the spec-correct UCUM unit suffixes by default. Two mappings changed compared to earlier releases:
- `TiBy``tebibytes` (previously `tibibytes``tebi-` is the SI binary prefix; the older spelling was a misspelling).
- `kBy``kilobytes` is **newly** recognised (previously not mapped). `KBy` continues to map to `kilobytes` as before, for backwards compatibility.
**Behavior change for existing consumers**: dashboards, alerts, and recording rules that reference `*_tibibytes` will silently stop receiving new datapoints once your Prometheus / OTel Collector deployment picks up this library version.
To preserve the pre-correction names while you migrate, opt in to the legacy mapping (combine with `UTF8Allowed: true` if your callers use UTF-8 metric names):
\\\go
namer := otlptranslator.MetricNamer{
WithMetricSuffixes: true,
LegacyUnitMapping: true,
}
\\\

Comment thread metric_namer.go
"KiBy": "kibibytes",
"MiBy": "mebibytes",
"GiBy": "gibibytes",
"TiBy": "tibibytes",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current units should live in the canonical unitMap:

Suggested change
"TiBy": "tebibytes",
"kBy": "kilobytes",

Comment thread metric_namer_test.go
},
wantMetricName: "requêsts_per_second",
wantUnitName: "per_second",
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding more test cases:

Suggested change
},
{
name: "utf8 metric with TiBy under default unit mapping",
namer: MetricNamer{
UTF8Allowed: true,
WithMetricSuffixes: true,
},
metric: Metric{
Name: "capacity",
Unit: "TiBy",
Type: MetricTypeGauge,
},
wantMetricName: "capacity_tebibytes",
wantUnitName: "tebibytes",
},
{
name: "utf8 metric with TiBy under legacy unit mapping",
namer: MetricNamer{
UTF8Allowed: true,
WithMetricSuffixes: true,
LegacyUnitMapping: true,
},
metric: Metric{
Name: "capacity",
Unit: "TiBy",
Type: MetricTypeGauge,
},
wantMetricName: "capacity_tibibytes",
wantUnitName: "tibibytes",
},
{
name: "utf8 metric with kBy under legacy unit mapping passes through unmapped",
namer: MetricNamer{
UTF8Allowed: true,
WithMetricSuffixes: true,
LegacyUnitMapping: true,
},
metric: Metric{
Name: "transfer",
Unit: "kBy",
Type: MetricTypeGauge,
},
wantMetricName: "transfer_kBy",
wantUnitName: "kBy",
},
{

Comment thread metric_namer_test.go
unitNamer := UnitNamer{
UTF8Allowed: tt.namer.UTF8Allowed,
UTF8Allowed: tt.namer.UTF8Allowed,
UpdatedMetricMapping: tt.namer.UpdatedMetricMapping,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
UpdatedMetricMapping: tt.namer.UpdatedMetricMapping,
LegacyUnitMapping: tt.namer.LegacyUnitMapping,

Comment thread strategy.go
Comment on lines +60 to +70
// NoUTF8EscapingWithUpdatedSuffixes will accept metric/label names as they are. Unit
// and type suffixes may be added to metric names, according to certain rules.
// This option includes an updated mapping for the UCUM metrics suffixes tebibyte and kBy.
NoUTF8EscapingWithUpdatedSuffixes TranslationStrategyOption = "NoUTF8EscapingWithUpdatedSuffixes"
// UnderscoreEscapingWithSuffixes is the default option for translating OTLP
// to Prometheus. This option will translate metric name characters that are
// not alphanumerics/underscores/colons to underscores, and label name
// characters that are not alphanumerics/underscores to underscores. Unit and
// type suffixes may be appended to metric names, according to certain rules.
// This option includes an updated mapping for the UCUM metrics suffixes tebibyte and kBy.
UnderscoreEscapingWithUpdatedSuffixes TranslationStrategyOption = "UnderscoreEscapingWithUpdatedSuffixes"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New strategies is the wrong approach:

Suggested change
// NoUTF8EscapingWithUpdatedSuffixes will accept metric/label names as they are. Unit
// and type suffixes may be added to metric names, according to certain rules.
// This option includes an updated mapping for the UCUM metrics suffixes tebibyte and kBy.
NoUTF8EscapingWithUpdatedSuffixes TranslationStrategyOption = "NoUTF8EscapingWithUpdatedSuffixes"
// UnderscoreEscapingWithSuffixes is the default option for translating OTLP
// to Prometheus. This option will translate metric name characters that are
// not alphanumerics/underscores/colons to underscores, and label name
// characters that are not alphanumerics/underscores to underscores. Unit and
// type suffixes may be appended to metric names, according to certain rules.
// This option includes an updated mapping for the UCUM metrics suffixes tebibyte and kBy.
UnderscoreEscapingWithUpdatedSuffixes TranslationStrategyOption = "UnderscoreEscapingWithUpdatedSuffixes"

Comment thread unit_namer.go
type UnitNamer struct {
UTF8Allowed bool
UTF8Allowed bool
UpdatedMetricMapping bool

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
UpdatedMetricMapping bool
// LegacyUnitMapping selects the pre-correction UCUM unit mappings (e.g.
// "TiBy" -> "tibibytes" instead of the spec-correct "tebibytes"). The
// default value (false) uses the spec-correct mappings. Set to true to
// preserve metric names produced by older versions of this library.
LegacyUnitMapping bool

Comment thread unit_namer.go
Comment on lines +77 to +84
if updatedSuffix {
// checks for updatd values, TiBy <-> tebibytes and kBy <-> kilobytes.
if promUnit, ok := updatedUnitMap[unit]; ok {
return promUnit
}
// doesnt return cause what if the unit is not in the updated map but is
// in the original map, we want to check that as well.
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if updatedSuffix {
// checks for updatd values, TiBy <-> tebibytes and kBy <-> kilobytes.
if promUnit, ok := updatedUnitMap[unit]; ok {
return promUnit
}
// doesnt return cause what if the unit is not in the updated map but is
// in the original map, we want to check that as well.
}
if legacy {
if promUnit, ok := legacyUnitMap[unit]; ok {
return promUnit
}
if _, excluded := legacyUnitExclusions[unit]; excluded {
return unit
}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants