c395079717340c8a92b1635f8b40a5ba39c513e5,contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG.py,Actor,learn,#Actor#,72

Before Change



        # instead of the above method, I use a hard replacement here
        if self.t_replace_counter % self.t_replace_iter == 0:
            self.sess.run([tf.assign(t, e) for t, e in zip(self.t_params, self.e_params)])
        self.t_replace_counter += 1

    def choose_action(self, s):

After Change


    def learn(self, s):   # batch update
        self.sess.run(self.train_op, feed_dict={S: s})

        if self.replacement["name"] == "soft":
            self.sess.run(self.soft_replace)
        else:
            if self.t_replace_counter % self.replacement["rep_iter_a"] == 0:
                self.sess.run(self.hard_replace)
            self.t_replace_counter += 1

    def choose_action(self, s):
        s = s[np.newaxis, :]    # single state
        return self.sess.run(self.a, feed_dict={S: s})[0]  # single action
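The change above replaces a hard-coded periodic copy with a choice between a "soft" and a "hard" target update, selected by `self.replacement["name"]`. The definitions of `self.soft_replace` and `self.hard_replace` are not shown in this excerpt, so the following is only a minimal dependency-free sketch of what such a switch could look like. It assumes that the soft update is the usual Polyak average `(1 - tau) * t + tau * e`, and that the replacement dict carries a `tau` key; `tau`, `TargetUpdater`, and the two helper functions are hypothetical names introduced for illustration. Parameters are plain Python floats here rather than TensorFlow variables.

```python
def soft_replace(t_params, e_params, tau):
    """Assumed soft update: move each target value a fraction tau toward eval."""
    return [(1.0 - tau) * t + tau * e for t, e in zip(t_params, e_params)]


def hard_replace(t_params, e_params):
    """Hard update: overwrite the target values with a copy of the eval values."""
    return list(e_params)


class TargetUpdater:
    """Hypothetical helper mirroring the branch structure of the new learn()."""

    def __init__(self, replacement):
        # e.g. {"name": "soft", "tau": 0.01} or {"name": "hard", "rep_iter_a": 600}
        self.replacement = replacement
        self.t_replace_counter = 0

    def step(self, t_params, e_params):
        if self.replacement["name"] == "soft":
            # Soft mode: blend on every call.
            return soft_replace(t_params, e_params, self.replacement["tau"])
        # Hard mode: copy only every rep_iter_a calls, counting every call.
        if self.t_replace_counter % self.replacement["rep_iter_a"] == 0:
            t_params = hard_replace(t_params, e_params)
        self.t_replace_counter += 1
        return t_params


# Usage: the target tracks the eval values gradually in soft mode,
# and jumps to them on calls 0, 2, 4, ... in hard mode with rep_iter_a=2.
soft = TargetUpdater({"name": "soft", "tau": 0.5})
hard = TargetUpdater({"name": "hard", "rep_iter_a": 2})
soft_target = soft.step([0.0, 0.0], [1.0, 1.0])
hard_target = hard.step([0.0, 0.0], [1.0, 1.0])
```

As in the code after the change, the counter is only advanced in hard mode, and the first call always performs a copy because the counter starts at zero.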
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: MorvanZhou/Reinforcement-learning-with-tensorflow
Commit Name: c395079717340c8a92b1635f8b40a5ba39c513e5
Time: 2017-08-09
Author: morvanzhou@gmail.com
File Name: contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG.py
Class Name: Actor
Method Name: learn


Project Name: shaypal5/pdpipe
Commit Name: 99095d5412483ec623278bdb1a0c9e24b18bfc85
Time: 2017-03-16
Author: shaypal5@gmail.com
File Name: pdpipe/basic_stages.py
Class Name: Bin
Method Name: _op


Project Name: MorvanZhou/Reinforcement-learning-with-tensorflow
Commit Name: c395079717340c8a92b1635f8b40a5ba39c513e5
Time: 2017-08-09
Author: morvanzhou@gmail.com
File Name: contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG.py
Class Name: Critic
Method Name: learn