Hi,
Thanks for this work. It's really useful.
Why did you define the score function (line 66) as follows?
self.logp / float(self.leng - 1 + 1e-6) + alpha * reward
Can you explain this definition?
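To make the question concrete, here is how I currently read that line, as a minimal standalone sketch. I am assuming that `logp` is the summed log-probability of a hypothesis, that `leng` is its length counting the start token (hence the `- 1`), and that `1e-6` only guards against division by zero. The function name `score` and the default values are mine, not taken from the repo.

```python
def score(logp, leng, reward=0.0, alpha=1.0):
    """Length-normalized log-probability plus a weighted reward term.

    Assumptions (not confirmed by the repo): logp is a summed log-probability,
    leng counts the start token, and 1e-6 only avoids dividing by zero.
    """
    return logp / float(leng - 1 + 1e-6) + alpha * reward


# Two hypotheses with the same average log-probability per generated token
# but different lengths. The raw sums differ (-2.0 vs -4.0), while the
# normalized scores are almost identical.
short_hyp = score(logp=-2.0, leng=3)
long_hyp = score(logp=-4.0, leng=5)
print(short_hyp, long_hyp)
```

If this reading is right, the division stops longer hypotheses from being penalized just for having more terms in the sum, and `alpha` sets how much the reward counts relative to that average. Is that the intention?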
Thanks,
Best