Merged
59 commits
5c7d118
Add Metadata class and integration tests for parameter metadata handling
tobiasploetz Jun 27, 2025
324867e
Add Metadata to parameters module exports
tobiasploetz Jun 27, 2025
945d0d0
Add changelog entry for metadata feature
tobiasploetz Jun 27, 2025
df13f7c
Make changelog entry more concise
tobiasploetz Jun 27, 2025
61338b4
Add missing validators
AdrianSosic Jun 27, 2025
ca03d18
Handle None path via optional conversion
AdrianSosic Jun 27, 2025
a85ed36
Make field separation type-safe
AdrianSosic Jun 27, 2025
f170cc5
Move class definition to right place
AdrianSosic Jun 27, 2025
f3dee22
Adjust converter name and signature
AdrianSosic Jun 27, 2025
d528bf8
Fix converter docstring
AdrianSosic Jun 27, 2025
06452b7
Drop unnecessary override
AdrianSosic Jun 27, 2025
486dfdd
Activate slots for Metadata class
AdrianSosic Jun 30, 2025
b8eaedc
Refactor tests for metadata
tobiasploetz Jun 30, 2025
1430754
Refactor Metadata as a generic class, not specific to Parameter
tobiasploetz Jun 30, 2025
8eddb61
Format changelog entry
AdrianSosic Jul 1, 2025
e7e0f23
Format error message
AdrianSosic Jul 1, 2025
d57d322
Fix class references in docstrings
AdrianSosic Jul 1, 2025
e3c7b05
Reorganize metadata tests
tobiasploetz Jul 2, 2025
1625769
Add serialization tests and hypothesis strategies for Metadata class
tobiasploetz Jul 2, 2025
5b2def4
Add optional metadata generation for parameter hypothesis strategies
tobiasploetz Jul 3, 2025
edc1a08
Polish docstring, comments and changelog
AdrianSosic Jul 4, 2025
3c0675e
Clean up metadata core tests
AdrianSosic Jul 4, 2025
268fdd6
Refactor integration tests
AdrianSosic Jul 4, 2025
07c38f6
Move validation test to correct folder
AdrianSosic Jul 4, 2025
e6035f4
Add validation test for Metadata class
AdrianSosic Jul 4, 2025
d2fe159
Add missing copy operation
AdrianSosic Jul 4, 2025
56c7b45
Add serialization test for field separation
AdrianSosic Jul 4, 2025
9219a96
Fix metadata serialization
AdrianSosic Jul 4, 2025
66e4ac1
Fix field query for Metadata subclassing
AdrianSosic Jul 8, 2025
ceec322
Separate logic into Metadata and ParameterMetadata classes
AdrianSosic Jul 8, 2025
1306e47
Split hypothesis strategy
AdrianSosic Jul 8, 2025
d7b238a
Generalize serialization to Metadata subclasses
AdrianSosic Jul 8, 2025
a90c8a5
Generalize field separation test
AdrianSosic Jul 8, 2025
133e0c2
Switch to ParameterMetadata for existing tests
AdrianSosic Jul 8, 2025
641bd40
Fix type hints for `_explicit_fields`
Scienfitz Jul 10, 2025
8f7a9ce
Update to metadata strategy and tests , CHANGELOG
tobiasploetz Jul 11, 2025
df90616
Typos in doc string
tobiasploetz Jul 11, 2025
40f1c6f
Make TypeVar _TMetaData private to module
tobiasploetz Jul 15, 2025
c506399
Add entry for docs/_build to .gitignore
tobiasploetz Jul 16, 2025
41330ba
Rename ParameterMetadata to MeasurableMetadata and update references …
tobiasploetz Jul 16, 2025
1a5370b
Add Metadata to Objective and MeasurableMetadata to Target
tobiasploetz Jul 18, 2025
e21d7ef
Refactor metadata handling in Objective, Parameter, and Target classe…
tobiasploetz Jul 21, 2025
43aeb20
Update CHANGELOG.md
tobiasploetz Jul 22, 2025
f3d758d
Small refactoring of metadata tests
tobiasploetz Jul 25, 2025
dfc2380
Fix mypy failure
tobiasploetz Jul 25, 2025
38b4fcf
Update CONTRIBUTERS.md
AdrianSosic Jul 25, 2025
157d251
Add metadata class to parameter and target package namespaces
AdrianSosic Jul 25, 2025
c5da6be
Fix changelog entries
AdrianSosic Jul 25, 2025
4c5c4b9
Reuse parent implement of is_empty
AdrianSosic Jul 25, 2025
ec150d1
Extend ranges of metadata component strategies
AdrianSosic Jul 25, 2025
82009eb
Harmonize formatting
AdrianSosic Jul 25, 2025
031ab64
Drop unnecessary test for non-None metadata
AdrianSosic Jul 25, 2025
331af4e
Drop unnecessary serialization tests
AdrianSosic Jul 25, 2025
57ab9c9
Drop unnecessary test for independent metadata objects
AdrianSosic Jul 25, 2025
59b45d6
Remove unnecessary boilerplate by parametrizing integration tests
AdrianSosic Jul 25, 2025
e261749
Parametrize is_empty tests
AdrianSosic Jul 28, 2025
e7e4134
Drop unnecessary target and objective validation tests
AdrianSosic Jul 28, 2025
50f7ff0
Add virtual environment directories to .gitignore
AdrianSosic Jul 28, 2025
f3760d9
Update scikit-fingerprints URL
AdrianSosic Jul 28, 2025
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -19,6 +19,10 @@ build
# VSCode
.vscode

# Virtual environments
.venv
.env

# Testing
.tox
.coverage
@@ -34,5 +38,6 @@ htmlcov

# Folders that are temporarily created when building the documentation
docs/_autosummary
docs/_build
Review comment (Collaborator):
@AdrianSosic could you please add ".env" as another exception as well? Did some cleaning and setup of my local IDE and that would be super helpful :)

docs/examples
docs/sdk
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Added
- API diagram in user guide
- `Metadata` and `MeasurableMetadata` classes providing optional information for BayBE
  objects
- `Objective` now has a `metadata` attribute as well as a `description` property
- `Target` and `Parameter` now have a `metadata` attribute as well as `description` and
  `unit` properties

### Fixed
- `Campaign` no longer allows overlapping names between parameters and targets
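
As a rough illustration of the changelog entries above, the following sketch mimics how `description` and `unit` properties can forward to an attached metadata object. It uses plain standard-library dataclasses as stand-ins: the names `SketchMetadata` and `SketchParameter` are hypothetical and are not the BayBE classes introduced in this pull request.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class SketchMetadata:
    """Hypothetical stand-in for a metadata container (not BayBE's class)."""

    description: str | None = None
    unit: str | None = None
    misc: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class SketchParameter:
    """Hypothetical stand-in for an object that carries metadata."""

    name: str
    metadata: SketchMetadata = field(default_factory=SketchMetadata)

    @property
    def description(self) -> str | None:
        # Convenience accessor that simply forwards to the metadata object
        return self.metadata.description

    @property
    def unit(self) -> str | None:
        # Same forwarding pattern for the unit of measurement
        return self.metadata.unit


temperature = SketchParameter(
    name="Temperature",
    metadata=SketchMetadata(description="Reactor temperature", unit="°C"),
)
plain = SketchParameter(name="Pressure")  # falls back to empty metadata
```

With this layout, an object created without metadata still exposes the properties; they simply return `None`.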
4 changes: 3 additions & 1 deletion CONTRIBUTORS.md
@@ -31,4 +31,6 @@
- Fabian Liebig (Merck KGaA, Darmstadt, Germany):\
Benchmarking structure and persistence capabilities for benchmarking results
- Alexander Wieczorek (Swiss Federal Institute for Materials Science and Technology, Dübendorf, Switzerland):\
SHAP explainers for insights
SHAP explainers for insights
- Tobias Plötz (Merck KGaA, Darmstadt, Germany):\
Metadata system
15 changes: 14 additions & 1 deletion baybe/objectives/base.py
@@ -6,7 +6,7 @@

import cattrs
import pandas as pd
-from attrs import define
+from attrs import define, field

from baybe.serialization.core import (
    converter,
@@ -15,6 +15,7 @@
)
from baybe.serialization.mixin import SerialMixin
from baybe.targets.base import Target
from baybe.utils.metadata import Metadata, to_metadata

# TODO: Reactive slots in all classes once cached_property is supported:
# https://github.com/python-attrs/attrs/issues/164
@@ -27,6 +28,18 @@ class Objective(ABC, SerialMixin):
    is_multi_output: ClassVar[bool]
    """Class variable indicating if the objective produces multiple outputs."""

    metadata: Metadata = field(
        factory=Metadata,
        converter=lambda x: to_metadata(x, Metadata),
        kw_only=True,
    )
    """Optional metadata containing description and other information."""

    @property
    def description(self) -> str | None:
        """The description of the objective."""
        return self.metadata.description

    @property
    @abstractmethod
    def targets(self) -> tuple[Target, ...]:
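
The `metadata` field above combines a default factory with a converter, so callers can pass either a ready-made metadata object or a plain dictionary. A minimal sketch of that pattern follows, assuming standard-library dataclasses instead of `attrs`. Dataclasses have no converter hook, so the conversion happens in `__post_init__`. The names `SketchMetadata`, `to_sketch_metadata`, and `SketchObjective` are hypothetical and only illustrate the idea.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class SketchMetadata:
    """Hypothetical stand-in for a metadata container (not BayBE's class)."""

    description: str | None = None
    misc: dict[str, Any] = field(default_factory=dict)


def to_sketch_metadata(value: dict[str, Any] | SketchMetadata) -> SketchMetadata:
    """Convert a dictionary to metadata, passing existing instances through."""
    if isinstance(value, SketchMetadata):
        return value
    if not isinstance(value, dict):
        raise TypeError(
            f"The input must be a dictionary or a 'SketchMetadata' instance. "
            f"Got: {type(value)}"
        )
    rest = dict(value)  # copy so that the caller's dictionary is not mutated
    description = rest.pop("description", None)
    return SketchMetadata(description=description, misc=rest)


@dataclass(frozen=True)
class SketchObjective:
    """Hypothetical stand-in for an object with a converted metadata field."""

    metadata: SketchMetadata = field(default_factory=SketchMetadata)

    def __post_init__(self) -> None:
        # Frozen dataclasses forbid normal assignment, hence object.__setattr__
        object.__setattr__(self, "metadata", to_sketch_metadata(self.metadata))


from_dict = SketchObjective(metadata={"description": "Maximize yield", "owner": "lab"})
existing = SketchMetadata(description="Minimize cost")
from_instance = SketchObjective(metadata=existing)
```

Converting at construction time means the rest of the class can rely on `metadata` always being a metadata object, whichever input form the caller chose.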
2 changes: 2 additions & 0 deletions baybe/parameters/__init__.py
@@ -12,12 +12,14 @@
    NumericalDiscreteParameter,
)
from baybe.parameters.substance import SubstanceParameter
from baybe.utils.metadata import MeasurableMetadata

__all__ = [
    "CategoricalEncoding",
    "CategoricalParameter",
    "CustomDiscreteParameter",
    "CustomEncoding",
    "MeasurableMetadata",
    "NumericalContinuousParameter",
    "NumericalDiscreteParameter",
    "SubstanceEncoding",
18 changes: 18 additions & 0 deletions baybe/parameters/base.py
@@ -22,6 +22,7 @@
    unstructure_base,
)
from baybe.utils.basic import to_tuple
from baybe.utils.metadata import MeasurableMetadata, to_metadata

if TYPE_CHECKING:
    from baybe.searchspace.continuous import SubspaceContinuous
@@ -48,6 +49,13 @@ class Parameter(ABC, SerialMixin):
    name: str = field(validator=(instance_of(str), min_len(1)))
    """The name of the parameter"""

    metadata: MeasurableMetadata = field(
        factory=MeasurableMetadata,
        converter=lambda x: to_metadata(x, MeasurableMetadata),
        kw_only=True,
    )
    """Optional metadata containing description, unit, and other information."""

    @abstractmethod
    def is_in_range(self, item: Any) -> bool:
        """Return whether an item is within the parameter range.
@@ -88,6 +96,16 @@ def to_searchspace(self) -> SearchSpace:
    def summary(self) -> dict:
        """Return a custom summarization of the parameter."""

    @property
    def description(self) -> str | None:
        """The description of the parameter."""
        return self.metadata.description

    @property
    def unit(self) -> str | None:
        """The unit of measurement for the parameter."""
        return self.metadata.unit


@define(frozen=True, slots=False)
class DiscreteParameter(Parameter, ABC):
2 changes: 1 addition & 1 deletion baybe/parameters/enum.py
@@ -27,7 +27,7 @@ class CustomEncoding(ParameterEncoding):
class SubstanceEncoding(ParameterEncoding):
    """Available encodings for substance parameters from `scikit-fingerprints`_ package.

-    .. _scikit-fingerprints: https://scikit-fingerprints.github.io/scikit-fingerprints/
+    .. _scikit-fingerprints: https://scikit-fingerprints.readthedocs.io/
    """

    ATOMPAIR = "ATOMPAIR"
2 changes: 2 additions & 0 deletions baybe/targets/__init__.py
@@ -3,9 +3,11 @@
from baybe.targets.binary import BinaryTarget
from baybe.targets.enum import TargetMode, TargetTransformation
from baybe.targets.numerical import NumericalTarget
from baybe.utils.metadata import MeasurableMetadata

__all__ = [
    "BinaryTarget",
    "MeasurableMetadata",
    "NumericalTarget",
    "TargetMode",
    "TargetTransformation",
18 changes: 18 additions & 0 deletions baybe/targets/base.py
@@ -16,6 +16,7 @@
    get_base_structure_hook,
    unstructure_base,
)
from baybe.utils.metadata import MeasurableMetadata, to_metadata

if TYPE_CHECKING:
    from baybe.objectives import SingleTargetObjective
@@ -31,6 +32,23 @@ class Target(ABC, SerialMixin):
    name: str = field()
    """The name of the target."""

    metadata: MeasurableMetadata = field(
        factory=MeasurableMetadata,
        converter=lambda x: to_metadata(x, MeasurableMetadata),
        kw_only=True,
    )
    """Optional metadata containing description, unit, and other information."""

    @property
    def description(self) -> str | None:
        """The description of the target."""
        return self.metadata.description

    @property
    def unit(self) -> str | None:
        """The unit of measurement for the target."""
        return self.metadata.unit

    def to_objective(self) -> SingleTargetObjective:
        """Create a single-task objective from the target."""
        from baybe.objectives.single import SingleTargetObjective
118 changes: 118 additions & 0 deletions baybe/utils/metadata.py
@@ -0,0 +1,118 @@
"""Generic metadata system for BayBE objects."""

from __future__ import annotations

from typing import Any, TypeVar

import cattrs
from attrs import AttrsInstance, define, field, fields
from attrs.validators import deep_mapping, instance_of
from attrs.validators import optional as optional_v
from typing_extensions import override

from baybe.serialization import SerialMixin, converter
from baybe.utils.basic import classproperty

_TMetaData = TypeVar("_TMetaData", bound="Metadata")


@define(frozen=True)
class Metadata(SerialMixin):
    """Metadata class providing basic information for BayBE objects."""

    description: str | None = field(
        default=None, validator=optional_v(instance_of(str))
    )
    """A description of the object."""

    misc: dict[str, Any] = field(
        factory=dict,
        validator=deep_mapping(
            mapping_validator=instance_of(dict),
            key_validator=instance_of(str),
            # FIXME: https://github.com/python-attrs/attrs/issues/1246
            value_validator=lambda *x: None,
        ),
        kw_only=True,
    )
    """Additional user-defined metadata."""

    @misc.validator
    def _validate_misc(self, _, value: dict[str, Any]) -> None:
        if inv := set(value).intersection(self._explicit_fields):
            raise ValueError(
                f"Miscellaneous metadata cannot contain the following fields: {inv}. "
                f"Use the corresponding attributes instead."
            )

    @classproperty
    def _explicit_fields(cls: type[AttrsInstance]) -> set[str]:
        """The explicit metadata fields."""  # noqa: D401
        flds = fields(cls)
        return {fld.name for fld in flds if fld.name != flds.misc.name}

    @property
    def is_empty(self) -> bool:
        """Check if metadata contains any meaningful information."""
        return self.description is None and not self.misc

@define(frozen=True)
class MeasurableMetadata(Metadata):
    """Class providing metadata for BayBE :class:`Parameter` objects."""

    unit: str | None = field(default=None, validator=optional_v(instance_of(str)))
    """The unit of measurement for the parameter."""

    @override
    @property
    def is_empty(self) -> bool:
        """Check if metadata contains any meaningful information."""
        return super().is_empty and self.unit is None


def to_metadata(
    value: dict[str, Any] | _TMetaData, cls: type[_TMetaData], /
) -> _TMetaData:
    """Convert a dictionary to :class:`Metadata` (with :class:`Metadata` passthrough).

    Args:
        value: The metadata input.
        cls: The specific :class:`Metadata` subclass to convert to.

    Returns:
        The created metadata instance of the requested :class:`Metadata` subclass.

    Raises:
        TypeError: If the input is not a dictionary or of the specified
            :class:`Metadata` type.
    """
    if isinstance(value, cls):
        return value

    if not isinstance(value, dict):
        raise TypeError(
            f"The input must be a dictionary or a '{cls.__name__}' instance. "
            f"Got: {type(value)}"
        )

    # Separate known fields from unknown ones
    return converter.structure(value, cls)


@converter.register_structure_hook
def _separate_metadata_fields(dct: dict[str, Any], cls: type[Metadata]) -> Metadata:
    """Separate known fields from miscellaneous metadata."""
    dct = dct.copy()
    explicit = {fld: dct.pop(fld, None) for fld in cls._explicit_fields}
    return cls(**explicit, misc=dct)


@converter.register_unstructure_hook
def _flatten_misc_metadata(metadata: Metadata) -> dict[str, Any]:
    """Flatten the metadata for serialization."""
    cls = type(metadata)
    fn = cattrs.gen.make_dict_unstructure_fn(cls, converter)
    dct = fn(metadata)
    dct = dct | dct.pop(fields(Metadata).misc.name)
    return dct
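
The two hooks above can be read as a round trip: structuring splits a flat dictionary into explicit fields plus leftover `misc` entries, and unstructuring merges them back into one flat dictionary. The following sketch reproduces only that idea with plain dictionaries and no `cattrs`. The set `EXPLICIT_FIELDS` and the functions `separate_fields` and `flatten_fields` are hypothetical names chosen for illustration, and the clash check stands in for the `misc` validator.

```python
from __future__ import annotations

from typing import Any

# Assumed explicit field names for this sketch (hypothetical, for illustration)
EXPLICIT_FIELDS = {"description", "unit"}


def separate_fields(dct: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split a flat dictionary into explicit fields and leftover misc entries."""
    rest = dict(dct)  # copy so that the caller's dictionary is not mutated
    # Explicit fields missing from the input default to None
    explicit = {name: rest.pop(name, None) for name in sorted(EXPLICIT_FIELDS)}
    return explicit, rest


def flatten_fields(explicit: dict[str, Any], misc: dict[str, Any]) -> dict[str, Any]:
    """Merge explicit fields and misc entries back into one flat dictionary."""
    clash = EXPLICIT_FIELDS.intersection(misc)
    if clash:
        # Without this check, a misc entry would silently overwrite a field
        raise ValueError(f"Misc entries must not shadow explicit fields: {clash}")
    return {**explicit, **misc}


flat = {"description": "Yield of the reaction", "unit": "%", "lab": "B12"}
explicit, misc = separate_fields(flat)
roundtrip = flatten_fields(explicit, misc)
```

In this sketch, a flat dictionary that omits an explicit field comes back with that field set to `None`, because the separation step fills in the default.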
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -263,7 +263,7 @@
    "python": ("https://docs.python.org/3", None),
    "pandas": ("https://pandas.pydata.org/docs/", None),
    "polars": ("https://docs.pola.rs/api/python/stable/", None),
-    "skfp": ("https://scikit-fingerprints.github.io/scikit-fingerprints/", None),
+    "skfp": ("https://scikit-fingerprints.readthedocs.io/latest/", None),
    "sklearn": ("https://scikit-learn.org/stable/", None),
    "numpy": ("https://numpy.org/doc/stable/", None),
    "torch": ("https://pytorch.org/docs/main/", None),
2 changes: 1 addition & 1 deletion docs/userguide/parameters.md
@@ -5,7 +5,7 @@
[`TaskParameter`]: baybe.parameters.categorical.TaskParameter
[`CustomDiscreteParameter`]: baybe.parameters.custom.CustomDiscreteParameter
[`SubstanceEncoding`]: baybe.parameters.enum.SubstanceEncoding
-[scikit-fingerprints]: https://scikit-fingerprints.github.io/scikit-fingerprints/
+[scikit-fingerprints]: https://scikit-fingerprints.readthedocs.io

# Parameters

40 changes: 40 additions & 0 deletions tests/hypothesis_strategies/metadata.py
@@ -0,0 +1,40 @@
"""Hypothesis strategies for metadata."""

import hypothesis.strategies as st
from hypothesis import assume

from baybe.utils.metadata import MeasurableMetadata, Metadata

_descriptions = st.one_of(st.none(), st.text(min_size=0))
"""A strategy generating metadata descriptions."""


@st.composite
def _miscs(draw: st.DrawFn, cls: type[Metadata]):
    """Generates miscellaneous metadata for various metadata classes."""
    misc = draw(
        st.dictionaries(
            st.text(min_size=0),
            st.one_of(st.text(), st.integers(), st.floats(allow_nan=False)),
            max_size=5,
        )
    )
    assume(not cls._explicit_fields.intersection(misc))
    return misc


@st.composite
def metadata(draw: st.DrawFn):
    """Generate :class:`baybe.utils.metadata.Metadata`."""
    description = draw(_descriptions)
    misc = draw(_miscs(Metadata))
    return Metadata(description=description, misc=misc)


@st.composite
def measurable_metadata(draw: st.DrawFn):
    """Generate :class:`baybe.parameters.base.MeasurableMetadata`."""
    description = draw(_descriptions)
    unit = draw(st.one_of(st.none(), st.text(min_size=0)))
    misc = draw(_miscs(MeasurableMetadata))
    return MeasurableMetadata(description=description, unit=unit, misc=misc)