a776728ed39a5ee9b3f3ad84fa2e76c1c5717277,examples/reinforcement_learning/tutorial_DPPO.py,Worker,work,#Worker#,292

Before Change


            s = self.env.reset()
            ep_r = 0
            buffer_s, buffer_a, buffer_r = [], [], []
            t0 = time.time()
            for t in range(EP_LEN):
                if not ROLLING_EVENT.is_set():  # while global PPO is updating
                    ROLLING_EVENT.wait()  # wait until PPO is updated
                    buffer_s, buffer_a, buffer_r = [], [], []  # clear history buffer, use new policy to collect data

After Change


                        discounted_r.append(v_s_)
                    discounted_r.reverse()
                    buffer_r = np.array(discounted_r)[:, np.newaxis]
                    QUEUE.put([buffer_s, buffer_a, buffer_r])  # put data in the queue
                    buffer_s, buffer_a, buffer_r = [], [], []

                    # update
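The After Change snippet builds discounted returns backwards (bootstrapping from the value of the final state), reverses them into chronological order, and reshapes to a column vector before queueing. A self-contained sketch of that computation (the helper name, `GAMMA` value, and sample rewards are illustrative, not from the source):

```python
import numpy as np

GAMMA = 0.9  # discount factor (assumed value for illustration)

def discount_rewards(buffer_r, v_s_):
    """Backward-accumulate discounted returns, as in the snippet above."""
    discounted_r = []
    for r in reversed(buffer_r):  # walk rewards from last step to first
        v_s_ = r + GAMMA * v_s_   # bootstrap from the next state's value
        discounted_r.append(v_s_)
    discounted_r.reverse()        # restore chronological order
    return np.array(discounted_r)[:, np.newaxis]  # column vector for training
```

For example, rewards `[1.0, 1.0]` with terminal value `0.0` yield returns `1 + 0.9 * 1 = 1.9` for the first step and `1.0` for the last.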
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: tensorlayer/tensorlayer
Commit Name: a776728ed39a5ee9b3f3ad84fa2e76c1c5717277
Time: 2020-02-07
Author: 34995488+Tokarev-TT-33@users.noreply.github.com
File Name: examples/reinforcement_learning/tutorial_DPPO.py
Class Name: Worker
Method Name: work


Project Name: tensorlayer/tensorlayer
Commit Name: 35b2c4917344f338eda67c78673cf4064b3b4265
Time: 2020-02-07
Author: 34995488+Tokarev-TT-33@users.noreply.github.com
File Name: examples/reinforcement_learning/tutorial_DQN.py
Class Name:
Method Name:


Project Name: google/deepvariant
Commit Name: 7ed8c6bbcfb2dc0da9b1011ba21d12791239de79
Time: 2019-10-21
Author: gunjanbaid@google.com
File Name: deepvariant/postprocess_variants.py
Class Name:
Method Name: main