Eval step reward is logging something strange #97

@TheEimer

The eval step reward is currently computed as "np.mean(rewards)/steps", which looks like it is supposed to return the mean reward per step. Because steps is itself an array, however, numpy broadcasts the division, so the result is an array equivalent to [np.mean(rewards)/s for s in steps], which is probably not what we want to log. Maybe we should just log the steps and the rewards and be done with it?
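To make the behaviour concrete, here is a minimal sketch with made-up numbers. The array values and the names broadcast_result, reward_per_step and mean_episode_rate are hypothetical and not taken from the codebase; the sketch assumes rewards and steps each hold one entry per evaluation episode. It shows what the current expression produces, plus two possible scalar alternatives in case a single number is still wanted.

```python
import numpy as np

# Hypothetical evaluation results (assumption: one entry per episode).
rewards = np.array([10.0, 20.0, 30.0])  # total reward per episode
steps = np.array([5, 10, 20])           # number of steps per episode

# Current expression: a scalar mean divided by an array broadcasts
# elementwise, so this is an array with one value per episode.
broadcast_result = np.mean(rewards) / steps

# Possible alternative 1: overall reward per step across all episodes.
reward_per_step = np.sum(rewards) / np.sum(steps)

# Possible alternative 2: average of each episode's own reward per step.
mean_episode_rate = np.mean(rewards / steps)

print(broadcast_result.shape)  # same shape as steps, not a scalar
print(reward_per_step, mean_episode_rate)
```

The two alternatives generally give different numbers, since the first weights long episodes more heavily. Logging the raw steps and rewards, as suggested above, sidesteps that choice.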

Metadata

Assignees

Labels

documentation, good first issue, question

Type

No type

Projects

Status

Todo

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
