Hi, thank you for your great work and for sharing the code. I have a couple of questions regarding the implementation and evaluation:
Q1. The paper mentions that a small Decision module and a large Reaction module are used. However, in the current codebase, it seems that only a single large model with a silent head is used. Where can I find the implementation of the small Decision module?
Q2. I am trying to reproduce the results reported in the paper for the PO task of StreamingBench, but I am not getting consistent performance. Would it be possible for you to release the evaluation code used for StreamingBench, especially for the PO task?
Thank you in advance for your time and support.