On my contiguous dataset, any read (even of a single element) results in all 35GB being read into memory.
use hidefix::{prelude::Index, reader::ReaderExt};
use std::sync::Arc;

const PATH: &str = "/home/fm208/Downloads/pubmed/benchmark-dev-pubmed23.h5";

fn main() {
    // Index the file once and share the index with the reader.
    let i = Arc::new(Index::index(PATH).unwrap());
    let mut r = i.reader("train").unwrap();

    // Request a single element; this still pulls the whole dataset into memory.
    let values = r.values::<f32, _>([0..1, 0..1]).unwrap();
    panic!("{values:?}");
}
In read_to on CacheReader, self.ds.chunk_slices returns one giant chunk covering the whole dataset, which, when passed into read_chunk, loads the entire file. hdf5-metno does not have this behavior.
I'm not sure what changes hidefix would need to improve this. Do you have any suggestions?
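To make the behavior I'm hoping for concrete, here is a minimal sketch of a partial read from a contiguous dataset. It is not hidefix code: `read_f32_window` is a hypothetical helper, and it assumes a 2-D, row-major, little-endian `f32` dataset whose raw data starts at a known byte offset (`data_offset`) with `ncols` columns. It seeks to each requested row and reads only the bytes for the requested columns, instead of treating the whole dataset as one chunk.

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Hypothetical helper (not part of hidefix): read the window
/// `rows` x `cols` from a contiguous, row-major, little-endian f32 dataset.
///
/// Assumptions: `data_offset` is the byte offset of the first element in the
/// file and `ncols` is the full width of the dataset.
fn read_f32_window<R: Read + Seek>(
    src: &mut R,
    data_offset: u64,
    ncols: u64,
    rows: Range<u64>,
    cols: Range<u64>,
) -> io::Result<Vec<f32>> {
    const ELEM: u64 = 4; // size of one f32 in bytes

    let width = (cols.end - cols.start) as usize;
    let height = (rows.end - rows.start) as usize;
    let mut out = Vec::with_capacity(width * height);

    // One buffer sized for a single row of the window, reused for every row.
    let mut buf = vec![0u8; width * ELEM as usize];

    for row in rows {
        // Byte position of element (row, cols.start) in the file.
        let pos = data_offset + (row * ncols + cols.start) * ELEM;
        src.seek(SeekFrom::Start(pos))?;
        src.read_exact(&mut buf)?;

        // Decode the row segment that was just read.
        out.extend(
            buf.chunks_exact(4)
                .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]])),
        );
    }

    Ok(out)
}
```

With something along these lines, a request like `[0..1, 0..1]` would touch 4 bytes of the file rather than all 35GB. How this would map onto chunk_slices and read_chunk is something I have not worked out.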