Description
A new inference environment is computed whenever the inference config is updated, because the inference config itself is used to generate the second short hash inside the run_id wildcard. As a result, a new unique run_id is generated whenever an inference config is updated, which in turn means a new inference environment is generated. This should not happen: there is no functional dependency between the two, and rebuilding the environment is time consuming and hard on the distributed filesystem.
As a solution, we might consider splitting the run_id in two. Right now, a run_id looks like:
```
├── <model-identity(+hash)>-<config-hash>
│   ├── requirements.txt
│   ├── anemoi.json
│   ├── venv.squashfs
│   ├── 2020010100
│   └── ...
├── <model-identity(+hash)>-<another-config-hash>
│   ├── requirements.txt
│   ├── anemoi.json
│   ├── venv.squashfs
│   ├── 2020010100
│   └── ...
```
where `<model-identity(+hash)>` is the part that uniquely identifies a model (optionally including a hash, e.g. derived from the MLflow run ID), and `<config-hash>` is a short hash of how the model is configured to run.
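The coupling described above can be illustrated with a minimal sketch. The names `short_hash`, `model_identity`, and the config keys are hypothetical, not actual anemoi identifiers; the point is only that hashing the config into the run_id makes any config change produce a new run_id:

```python
import hashlib
import json

def short_hash(obj, length=8):
    """Deterministic short hash of a JSON-serialisable object (illustrative)."""
    payload = json.dumps(obj, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:length]

# Hypothetical model identity and inference config.
model_identity = "model-abc123"
config = {"date": "2020010100", "lead_time": 240}

# Current scheme: the config hash is baked into the run_id, so any
# edit to the config yields a new run_id, and with it a fresh
# environment build (requirements.txt, venv.squashfs, ...).
run_id = f"{model_identity}-{short_hash(config)}"
```

Changing a single config value (say, `lead_time`) changes `short_hash(config)` and therefore `run_id`, even though the environment needed to run the model is identical.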
We could, instead, split the two:
```
├── <model-identity(+hash)>
│   ├── requirements.txt
│   ├── anemoi.json
│   ├── venv.squashfs
│   ├── <config-hash>
│   │   ├── 2020010100
│   │   └── ...
│   ├── <another-config-hash>
│   │   ├── 2020010100
│   │   └── ...
```
This way, even if the inference configuration file is updated, the same environment is reused; only a new `<config-hash>` subdirectory is created for the updated configuration.
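The proposed split could be sketched as two path helpers, one keyed on the model identity alone and one nested under it per config. The function names and the root path are illustrative, not part of the existing code:

```python
from pathlib import Path

def env_dir(root, model_identity):
    # Environment directory keyed on model identity only:
    # requirements.txt, anemoi.json, venv.squashfs live here
    # and are shared across all configs of this model.
    return Path(root) / model_identity

def run_dir(root, model_identity, config_hash):
    # Per-config outputs (e.g. 2020010100/...) live in a
    # subdirectory of the shared environment directory.
    return env_dir(root, model_identity) / config_hash

# Two different inference configs of the same model...
base = "/runs"
run_a = run_dir(base, "model-abc123", "cfg11111")
run_b = run_dir(base, "model-abc123", "cfg22222")

# ...share one parent, so the environment is built once and reused.
shared_env = env_dir(base, "model-abc123")
```

Under this layout, updating the inference config only changes the `<config-hash>` leaf, while the expensive environment build stays keyed on the model identity.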