Slow input of tree formats with many individual trajectories #4

@Reshief

What happened?

Loading the alkene datasets of shnitsel data stored in the hierarchical tree format (using xr.DataTree for storage) is extremely slow, to the point that the program appears stuck to the user. Loading takes about 30 minutes for a 50 MB dataset, whereas 500 MB datasets load in less than a second.
The only apparent difference is that the smaller dataset contains more trajectories, each of shorter length.

What did you expect to happen?

Input should be fairly fast. Neither the conversion from xr.DataTree to ShnitselDB nor reading a NetCDF file should take this long.

Other things we should know?

This issue tracks the input speed problem.
It occurs with st version 2026.1.1, but we have traced the slowdown to the xarray.open_datatree() function, which seems to slow to a crawl at a seemingly arbitrary number of datasets.

Related to xarray issue #9511.
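
A minimal timing sketch that may help reproduce the behaviour outside of shnitsel. It is not the actual loading code: the group names (`traj_00000`, ...), the `energy` variable and the `time` dimension are hypothetical placeholders, and it assumes an xarray version that provides `xr.DataTree` plus an installed NetCDF backend (netCDF4 or h5netcdf). It writes a synthetic tree with `n_traj` groups of `n_steps` entries each and measures only the `xr.open_datatree()` call.

```python
import time
from pathlib import Path


def group_paths(n_traj: int) -> list[str]:
    """Hypothetical group names, one per synthetic trajectory."""
    return [f"/traj_{i:05d}" for i in range(n_traj)]


def make_groups(n_traj: int, n_steps: int) -> dict:
    """Build {group path: Dataset} with placeholder data (not real shnitsel data)."""
    # Imported here so the pure helpers above work without xarray installed.
    import numpy as np
    import xarray as xr

    return {
        path: xr.Dataset(
            {"energy": ("time", np.zeros(n_steps))},
            coords={"time": np.arange(n_steps)},
        )
        for path in group_paths(n_traj)
    }


def time_open(path: Path, n_traj: int, n_steps: int) -> float:
    """Write a synthetic tree to `path` and return the seconds spent in open_datatree."""
    import xarray as xr

    tree = xr.DataTree.from_dict(make_groups(n_traj, n_steps))
    tree.to_netcdf(path)

    start = time.perf_counter()
    xr.open_datatree(path)
    return time.perf_counter() - start
```

Comparing, for example, `time_open(Path("many.nc"), n_traj=2000, n_steps=10)` against `time_open(Path("few.nc"), n_traj=20, n_steps=1000)` keeps the total number of values equal while varying only the number of groups, which is the difference suspected above.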

Metadata

Labels

bug: Something isn't working
