Questions regarding implementation and evaluation code for StreamingBench #8

@aiclaudev

Hi, thank you for your great work and for sharing the code. I have a couple of questions regarding the implementation and evaluation:

Q1. The paper describes a small Decision module and a large Reaction module. However, the current codebase appears to use only a single large model with a silent head. Where can I find the implementation of the small Decision module?

Q2. I am trying to reproduce the results reported in the paper for the PO task of StreamingBench, but my results are not consistent with the reported numbers. Would it be possible for you to release the evaluation code used for StreamingBench, especially for the PO task?

Thank you in advance for your time and support.
