Reward list coordinates are mapped incorrectly in TD-Linear #1

@dalton-ly

Description

image

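The reward list in TD-Linear is initialized incorrectly: its ordering is inconsistent with the ordering of the reward list used when GridEnv initializes its PSA matrix:
image
As a result, the policy_evaluation function in TD-Linear cannot compute the correct state values.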
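
The screenshots are not available here, so the following is only a minimal sketch of how this kind of ordering mismatch can arise, not the repository's actual code. The grid size, `reward_grid`, and both index helpers are hypothetical; the sketch assumes one side flattens the grid row by row while the other flattens it column by column.

```python
# Hypothetical 2 x 3 grid (rows x cols); the real GridEnv layout may differ.
N_ROWS, N_COLS = 2, 3
reward_grid = [
    [0, -1, 0],   # cell (0, 1) is a penalized cell
    [0,  0, 1],   # cell (1, 2) is the target cell
]

def state_index_row_major(row, col):
    """State index when cells are numbered row by row."""
    return row * N_COLS + col

def state_index_col_major(row, col):
    """State index when cells are numbered column by column."""
    return col * N_ROWS + row

# The same rewards flattened in the two different orders.
rewards_row_major = [reward_grid[r][c] for r in range(N_ROWS) for c in range(N_COLS)]
rewards_col_major = [reward_grid[r][c] for c in range(N_COLS) for r in range(N_ROWS)]

# Look up the penalized cell (0, 1) with a row-major state index.
s = state_index_row_major(0, 1)
consistent = rewards_row_major[s]   # list and index agree on the ordering
mismatched = rewards_col_major[s]   # list was built in the other ordering

print(consistent, mismatched)
```

With a consistent ordering the lookup returns the penalty for cell (0, 1); with the mismatched list it returns the reward of a different cell, so any evaluation that consumes that list would be solving for the wrong reward function. If the real bug has this shape, one possible fix is to build both reward lists through the same index helper.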
