Tags: extensible model grouping #440

Draft
Seth Fitzsimmons (mojodna) wants to merge 3 commits into dev from tags

Conversation

@mojodna
Collaborator

Draft design for replacing the hardcoded namespace concept with tags -- string labels declared by package authors and derived by tag providers.

Motivating goals:

  • Move discover_models from core into system
  • Support richer model categorization and filtering for CLI, code generation, and JSON Schema output
  • Simplify the extension mechanism

Design doc: docs/designs/tags.md

Draft design for replacing the hardcoded namespace concept with
extensible tags declared by package authors and derived by tag
providers.
@vcschapp
Collaborator

Victor Schappert (vcschapp) commented Feb 12, 2026

Before reading this (to avoid any kind of subtle anchoring bias), I want to make sure I channel the ideas I was taking out of yesterday's conversation.

tl;dr

  1. System package does discovery and assigns tags.
  2. The following tags are reserved:
    1. feature - Automatically assigned by the system package if the model derives from Feature (not OvertureFeature).
    2. extension - Automatically assigned by the system package if the model meets the definition of extension.
    3. attributes (?) - Automatically assigned by the system package to non-models (???).
    4. overture - Automatically assigned by the system package to models discovered from a hard-coded list of designated Overture packages that lives in system.
  3. System package provides a default tagger. Maybe it can be overridden, or maybe it is always used.
  4. Additional taggers can be discovered via entry-points.
  5. We will export Overture-flavored tagger(s):
    • OPTION 1. In the core package. This tagger understands OvertureFeature, and uses the theme property of a model to add a theme-based tag.
    • OPTION 2. In each theme package. Each such tagger just knows what the package exports and slaps the theme-based tag onto them.
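To make the tl;dr concrete, here is a rough sketch of how a default tagger and a theme tagger (OPTION 2) might compose. Everything named here is an assumption for illustration only: the stand-in `Feature` class, `OVERTURE_PACKAGES`, `default_tagger`, `buildings_tagger`, and `collect_tags` are hypothetical and are not the API proposed in this PR. Matching on module prefix is just one way to approximate "designated Overture packages", and entry-point discovery of additional taggers is left out so the snippet runs without any installed packages.

```python
from typing import Callable, Iterable


class Feature:
    """Hypothetical stand-in for the Feature base class in system."""


# Hypothetical hard-coded list of designated Overture package prefixes.
OVERTURE_PACKAGES = ("overture.schema.",)

# A tagger maps a discovered class to the set of tags it contributes.
Tagger = Callable[[type], set[str]]


def default_tagger(model: type) -> set[str]:
    """Assign the reserved tags that system would own in this sketch."""
    tags: set[str] = set()
    if issubclass(model, Feature):
        tags.add("feature")
    if model.__module__.startswith(OVERTURE_PACKAGES):
        tags.add("overture")
    return tags


def buildings_tagger(model: type) -> set[str]:
    """OPTION 2 style: a theme package tags only what it exports."""
    if model.__module__.startswith("overture.schema.buildings"):
        return {"buildings"}
    return set()


def collect_tags(model: type, taggers: Iterable[Tagger]) -> set[str]:
    """Union the tags contributed by every tagger."""
    tags: set[str] = set()
    for tagger in taggers:
        tags |= tagger(model)
    return tags
```

In a real implementation the list of taggers would presumably come from entry-point discovery rather than being passed in by hand.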

CLI interface

Short-form list (default). Basic FQCN.

$ overture-schema list
overture.schema.buildings.Building
overture.schema.buildings.BuildingPart
overture.schema.divisions.Division
overture.schema.divisions.DivisionArea
...
overture.schema.ext.places.OpeningHours
overture.schema.ext.transportation.Lanes
...
foo.overture.bar.GeographicalArea

Note that if the entry-points specify a "friendly name", e.g. building, then the output becomes nicer.

$ overture-schema list
address
connector
building
building_part
division
division_area
place
segment

Long-form list. One-liner table in a nicer format with tags.

$ overture-schema list -l
MODULE                             CLASS             TAGS
overture.schema.buildings          Building          [overture] [feature] [buildings]
overture.schema.buildings          BuildingPart      [overture] [feature] [buildings]
overture.schema.divisions          Division          [overture] [feature] [divisions]
overture.schema.divisions          DivisionArea      [overture] [feature] [divisions]
...
overture.schema.ext.places         OpeningHours      [overture] [extension] [attributes] [places]
overture.schema.ext.transportation Lanes             [overture] [extension] [attributes] [transportation]
...
foo.overture.bar                   GeographicalArea  [community] [extension] [feature] [divisions]

Again, if we have the friendly name available from the entry-point maybe there's something nicer here:

$ overture-schema list -l
MODEL              TAGS                                                  DESCRIPTION
building           [overture] [feature] [buildings]                      Man-made structure with a roof existing permanently ...
building_part      [overture] [feature] [buildings]
division           [overture] [feature] [divisions]
division_area      [overture] [feature] [divisions]
...
opening_hours      [overture] [extension] [attributes] [places]
lanes              [overture] [extension] [attributes] [transportation]
...
geographical_area  [community] [extension] [feature] [divisions]

Long-form list filtering on tags.

$ overture-schema list -l --tag community --tag extension
MODULE                             CLASS             TAGS
foo.overture.bar                   GeographicalArea  [community] [extension] [feature] [divisions]
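The example output suggests that repeated `--tag` flags combine with AND semantics, since only the model carrying both `community` and `extension` is listed. A minimal sketch under that assumption follows; `filter_by_tags` and the in-memory `catalog` are hypothetical names, not part of the proposed CLI.

```python
def filter_by_tags(catalog: dict[str, set[str]], required: set[str]) -> list[str]:
    """Return the names of models whose tags include every required tag."""
    return sorted(name for name, tags in catalog.items() if required <= tags)


# Illustrative catalog built from the rows shown in the long-form listing.
catalog = {
    "overture.schema.buildings.Building": {"overture", "feature", "buildings"},
    "overture.schema.ext.places.OpeningHours": {
        "overture", "extension", "attributes", "places",
    },
    "foo.overture.bar.GeographicalArea": {
        "community", "extension", "feature", "divisions",
    },
}

# Equivalent of: overture-schema list -l --tag community --tag extension
matches = filter_by_tags(catalog, {"community", "extension"})
```

If OR semantics were wanted instead, the subset test would become a non-empty intersection (`required & tags`).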

Full detail, filtering on tags.

$ overture-schema document --tag overture --tag buildings
BUILDING
========
overture.schema.buildings.Building                                       [overture] [feature] [buildings]

Buildings are man-made structures with roofs that exist permanently in one place.

A building's geometry represents the two-dimensional footprint of the building as viewed from
directly above, looking down. Fields such as `height` and `num_floors` allow the three-dimensional
shape to be approximated. Some buildings, identified by the `has_parts` field, have associated
`BuildingPart` features which can be used to generate a more representative 3D model of the building.

<Maybe down here you list the feature geometry type and properties also?>

BUILDING PART
==============
overture.schema.buildings.BuildingPart                                   [overture] [feature] [buildings]

...

Open issues

Union models

Do we really need or want to create union models, as is currently done in the overture-schema package and perhaps other places as well?

This is needed to supply a "blindly validate a blob" CLI command, like:

$ overture-schema validate-anything <anything.json

Personally, I think that the complexity we're taking on by doing this is high and the demonstrated necessity of this use case is low. Users can instead just specify the model to validate against:

$ overture-schema validate --type building <building.json

IMO we should remove the tagged union and associated behavior for now and tackle this issue in the future, when it bubbles up from P2+ to P0.

Note: `approved` is an unprefixed tag (not `system:approved`) since this
provider lives outside `system`.
Contributor

@danabauer Dana Bauer (danabauer) Feb 12, 2026

Why does `approved` live outside of `system`?

Collaborator Author

It needs to be in overture-schema-core or downstream because it's an Overture concern (how we deem extensions or schemas as vetted) rather than overture-schema-system ("George", the parts of the schema system that we believe aren't specific to Overture).

ModelKey loses theme and type fields; gains name (from entry
point key) and class_name (entry point value). Theme becomes
overture:theme=buildings declared in [project].keywords rather
than parsed from entry point naming.
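As a rough illustration of the reshaped `ModelKey`, the sketch below builds one from an entry point. The field names `name` and `class_name` come from the commit message; the dataclass shape, the `model_key_from_entry_point` helper, the group name `overture.models`, and the example entry point value are all assumptions.

```python
from dataclasses import dataclass
from importlib.metadata import EntryPoint


@dataclass(frozen=True)
class ModelKey:
    name: str        # entry point key, e.g. "building"
    class_name: str  # entry point value, e.g. "overture.schema.buildings:Building"


def model_key_from_entry_point(ep: EntryPoint) -> ModelKey:
    """Derive the key purely from the entry point, with no theme parsing."""
    return ModelKey(name=ep.name, class_name=ep.value)


# Hypothetical entry point; the group name is made up for this example.
ep = EntryPoint(
    name="building",
    value="overture.schema.buildings:Building",
    group="overture.models",
)
key = model_key_from_entry_point(ep)
```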

Introduces machine tag format [namespace:]key[=value] with
tags_by_key/tags_by_namespace helpers in system. CLI replaces
--theme with generic --group-by <key> that works for any
structured tag dimension.

Adds codegen path generation appendix: feature models use
tags_by_key for theme directory, supplementary types use
schema_root tag + module prefix stripping.
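A minimal sketch of the machine tag format `[namespace:]key[=value]` and the grouping it enables. The helper names `tags_by_key` and `tags_by_namespace` come from the commit message, but their signatures and return types here are guesses; `MachineTag`, `parse_tag`, and `group_by` are hypothetical. The parser also assumes the namespace is everything before the last colon in the part preceding `=`.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: Optional[str]
    key: str
    value: Optional[str]


def parse_tag(raw: str) -> MachineTag:
    """Parse `[namespace:]key[=value]` into its three parts."""
    head, eq, value = raw.partition("=")
    namespace, colon, key = head.rpartition(":")
    return MachineTag(namespace if colon else None, key, value if eq else None)


def tags_by_key(tags: Iterable[str], key: str) -> list[str]:
    """Values of tags whose `[namespace:]key` part matches `key`."""
    wanted = parse_tag(key)
    return [
        tag.value
        for tag in map(parse_tag, tags)
        if (tag.namespace, tag.key) == (wanted.namespace, wanted.key)
        and tag.value is not None
    ]


def tags_by_namespace(tags: Iterable[str], namespace: str) -> list[str]:
    """Raw tags that belong to the given namespace."""
    return [raw for raw in tags if parse_tag(raw).namespace == namespace]


def group_by(models: dict[str, list[str]], key: str) -> dict[str, list[str]]:
    """Rough equivalent of `--group-by <key>`: bucket models by tag value."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, tags in models.items():
        for value in tags_by_key(tags, key):
            groups[value].append(name)
    return dict(groups)
```

For example, `group_by(models, "overture:theme")` would bucket models by theme value, and models without an `overture:theme` tag would simply not appear in any group.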
Restructure around meeting outcomes: tags generalize the
ModelKind classifier from pydantic-extensions, with tag
providers as the mechanism from Phase 1 (not deferred).

- Add Purpose, Generalizing the Classifier, Privileged
  Packages, Security Roadmap, and Deferred Keywords sections
- Merge old Phase 1 (static tags) and Phase 2 (tag providers)
  into a single phase; tag providers are the only tag source
- Add privilege table for prefix reservation (overture:* owned
  by core, system:* owned by system) over flat reservation,
  with rationale: structured tags encode relationships, prevent
  multi-ecosystem collision, and avoid tag-pairing ambiguity
- Add feature provider (Feature in system) separate from
  overture provider (OvertureFeature in core)
- Move extension provider into Phase 2 (Extension Support)
- Defer package keywords to a future tag provider; drop #tag
  syntax from entry point format (names are just identifiers)
- Renumber phases: extensions become Phase 2, manifest-driven
  approval becomes Phase 3
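The prefix reservation idea could be enforced with a check along these lines. This is a sketch only: `PREFIX_OWNERS` and `is_allowed` are hypothetical names, and treating unprefixed or unowned prefixes as open to anyone is an assumption rather than something the design states.

```python
# Hypothetical privilege table: tag namespace -> package allowed to emit it.
PREFIX_OWNERS = {
    "overture": "overture-schema-core",
    "system": "overture-schema-system",
}


def is_allowed(tag: str, provider_package: str) -> bool:
    """Check whether a provider package may emit the given tag."""
    # Drop any `=value` part first so a colon inside a value is ignored.
    head = tag.partition("=")[0]
    namespace, colon, _key = head.partition(":")
    if not colon:
        return True  # unprefixed tags are not reserved in this sketch
    owner = PREFIX_OWNERS.get(namespace)
    return owner is None or owner == provider_package
```

Under this sketch a third-party package can emit `acme:theme=industrial` but not `overture:theme=buildings`, and core cannot emit `system:*` tags.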
Comment on lines +152 to +153
This is the less-preferred mechanism. Reserving individual strings does not
scale -- every new reserved word requires an update to the table. Prefix
Collaborator

Reserving individual strings does not scale

Implicit question here: how much reserved-ness do we need?

Is reserved a thin bootstrapping slice or potentially a very big group?


**Structured tags encode relationships that flat tags cannot.** With
`overture:theme=buildings`, `--group-by overture:theme` extracts the
dimension and groups by its values. With flat tags `theme` and `buildings`,
Collaborator

I am arguing that there's no tag for theme. There's just buildings. Anyone can apply buildings, but what makes it the Overture buildings theme is that there's an overture tag on it.

Comment on lines +170 to +173
`theme` and `buildings`?" -- and the pairing is implicit. A model tagged
`theme`, `buildings`, `residential` is ambiguous: are `buildings` and
`residential` both theme values, or is `residential` a separate
classification?
Collaborator

This is inherent in any tagging situation. Anytime multiple tags can be added, you can get ambiguity.

Unless you introduce a rigorous set of requirements of what tags you're allowed to add, which then becomes something other than a tagging system.

Comment on lines +177 to +178
buildings" (taxonomy) or "this package is the endorsed buildings schema"
(endorsement)? `overture:theme=buildings` unambiguously signals taxonomy;
Collaborator

@vcschapp Victor Schappert (vcschapp) Feb 23, 2026

The same is true of overture (reserved tag) + buildings.

As well, why shouldn't a 3P extension be allowed to say "this extension is part of the buildings theme", where the only difference is that with the 3P extension, it's not overture since that official tag is reserved?

Comment on lines +183 to +184
`overture:theme=buildings` and `acme:theme=industrial` coexist cleanly.
Flat `buildings` and `industrial` need external context to distinguish
Collaborator

The difference would be the overture tag is on the first but not second.

Flat `buildings` and `industrial` need external context to distinguish
their purpose.

**`system` benefits from prefixed tags internally.** `system:extension`
Collaborator

Totally subjective, but having to say "give me system:extension" to figure out if something is an extension, instead of just "extension", seems clunky to me.

3 participants