Fix possible UB in reader (#186)#203
Conversation
Performance hitHi! Could I ask if you've run any tests to see what the performance hit is like? I've run some quick tests (code further below) and it seems like:
With different opt-levels:
Test CodeJammed rather haphazardly into {
const ITER_COUNT: u126 = 100000;
for _ in 0..100 {
let start = Instant::now();
for _ in 0..ITER_COUNT {
let mut reader = Reader::from_reader(&mut &*data, data.len()).unwrap();
}
let time_taken = (Instant::now() - start).as_nanos() / ITER_COUNT;
let start2 = Instant::now();
for _ in 0..ITER_COUNT {
let mut reader2 = Reader::alt_from_reader(&mut &*data, data.len()).unwrap();
}
let time_taken2 = (Instant::now() - start).as_nanos() / ITER_COUNT;
println!("{} vs {}: the new method takes {}% more time", time_taken, time_taken2, ((time_taken2 as f64/time_taken as f64)-1.0)*100.0);
}
}Will the performance hit matter?It seems that the purpose of the function is to create the buffer from which we wish to read. Hence, regardless of how many times the function is called, the time spent zero-initializing would scale with the amount of data being read (whether that be streaming and calling this function multiple times, or as one big chunk, calling this function on one large piece of data). Depending on the particular use case of the user, I worry that such a performance hit might indeed be impactful. (Do correct me if I'm wrong, though.) Possible resolutionsI've discussed within my org and we're thinking of a few possible ways to proceed:
What do you / anyone else think? |
This fixes the possible undefined behaviour in
Reader::from_reader()(See #186 ).This causes a small runtime overhead due to the explicit initialization, but as far as I understood this method is only called by the user explicitly, so it may not matter so much.