Dear authors,
Thank you for contributing this dataset and suite of benchmark tasks!
I wanted to reach out regarding the pressure forecasting task. I implemented the following baseline: for each future time step (13 to 24), I predict at test time the average pressure computed on the train split (split randomly; I can share the exact ids if needed). When I compare the results to the numbers presented in Table 5 of your paper, this baseline seems to compare favorably.
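To make the baseline concrete, here is a minimal sketch of the idea. It is an illustration rather than my exact evaluation code: the function name, the array names, and the layout (one row per sample, one column per future time step 13 to 24) are assumptions, as is computing the RMSE per time step across test samples.

```python
import numpy as np


def per_timestep_mean_baseline(train_future, test_future):
    """Predict, for every test sample, the train-split mean pressure at each step.

    Both inputs are assumed to have shape (n_samples, n_future_steps), where
    column j holds the pressure at the j-th future time step (hypothetical layout).
    Returns the per-step prediction and the per-step RMSE on the test split.
    """
    train_future = np.asarray(train_future, dtype=float)
    test_future = np.asarray(test_future, dtype=float)

    # One constant prediction per future time step, estimated on train only.
    mean_per_step = train_future.mean(axis=0)

    # Broadcast the same prediction to every test sample.
    errors = test_future - mean_per_step

    # RMSE per time step, averaged over test samples (assumed metric definition).
    rmse_per_step = np.sqrt((errors ** 2).mean(axis=0))
    return mean_per_step, rmse_per_step


if __name__ == "__main__":
    # Toy values for illustration only, not data from the benchmark.
    toy_train = [[1.0, 2.0], [3.0, 4.0]]
    toy_test = [[2.0, 3.0], [4.0, 5.0]]
    prediction, rmse = per_timestep_mean_baseline(toy_train, toy_test)
    print(prediction, rmse)
```

The prediction uses no input from the test sample at all, so any learned model would be expected to match or beat it.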
I obtain the following results, where the mean and std from your Table 5 are shown in red, and the per-timestep average pressure baseline, which has the lower RMSE at every time step, is shown in blue.

I am fairly confident my evaluation code is correct. Is this something you can also reproduce on your end?
All the best,
Yana