d3f12d1622f736e649fcec853044b05fe68e05ba,catalyst/rl/onpolicy/algorithms/ppo.py,PPO,get_rollout,#PPO#,186

Before Change


        # advantage and returns
        # len x num_heads x num_atoms
        advantages = np.stack([
            utils.geometric_cumsum(gamma, deltas[:, i])
            for i, gamma in enumerate(self._gammas)
        ], axis=1)
        # len x num_heads
        returns = np.stack([
            utils.geometric_cumsum(gamma * self.gae_lambda, rewards)[0]
            for gamma in self._gammas
        ], axis=1)

        # final rollout
        rollout = {
            "action_logprob": logprobs,
            "advantage": advantages,
            "done": dones,

After Change



        # len x num_heads
        returns = np.stack([
            utils.geometric_cumsum(gamma, rewards[:, None])[:, 0]
            for gamma in self._gammas
        ], axis=1)

        # final rollout
        rollout = {
            "action_logprob": logprobs,
            "advantage": advantages,
            "done": dones,
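
The two snippets differ only in the returns computation: the discount changes from `gamma * self.gae_lambda` to `gamma`, the rewards gain a trailing axis via `rewards[:, None]`, and the result is indexed with `[:, 0]` instead of `[0]`. The sketch below illustrates that indexing pattern under stated assumptions. `discounted_cumsum` is a hypothetical stand-in, not the library's `utils.geometric_cumsum`; it is assumed here to accumulate along the first (time) axis, and the discount values are made up for illustration.

```python
import numpy as np


def discounted_cumsum(gamma, x):
    """Hypothetical stand-in for utils.geometric_cumsum.

    Assumption: accumulation runs backwards along axis 0 (time), so that
    y[t] = x[t] + gamma * y[t + 1].
    """
    x = np.asarray(x, dtype=np.float64)
    out = np.zeros_like(x)
    running = np.zeros(x.shape[1:], dtype=np.float64)
    for t in range(x.shape[0] - 1, -1, -1):
        running = x[t] + gamma * running
        out[t] = running
    return out


gammas = [0.9, 0.99]                 # one discount per head (illustrative values)
rewards = np.array([1.0, 1.0, 1.0])  # rollout of length 3

# Same shape handling as the "After Change" snippet: add a trailing axis,
# accumulate over time, then drop that axis again with [:, 0].
returns = np.stack(
    [discounted_cumsum(gamma, rewards[:, None])[:, 0] for gamma in gammas],
    axis=1,
)

print(returns.shape)  # (3, 2), i.e. len x num_heads
```

With this stand-in, `[:, 0]` keeps one discounted return per time step for each head, which matches the `len x num_heads` comment. Indexing with `[0]` instead would keep only the first row, so stacking would no longer yield one value per time step.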
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 2

Instances


Project Name: catalyst-team/catalyst
Commit Name: d3f12d1622f736e649fcec853044b05fe68e05ba
Time: 2019-07-07
Author: scitator@gmail.com
File Name: catalyst/rl/onpolicy/algorithms/ppo.py
Class Name: PPO
Method Name: get_rollout


Project Name: catalyst-team/catalyst
Commit Name: 86df3d0466bb72a566fca457b108ef35a4ff6b14
Time: 2019-07-03
Author: scitator@gmail.com
File Name: catalyst/rl/onpolicy/algorithms/ppo.py
Class Name: PPO
Method Name: get_rollout