c395079717340c8a92b1635f8b40a5ba39c513e5,contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG.py,Actor,learn,#Actor#,72
Before Change
        # instead of above method, I use a hard replacement here
        if self.t_replace_counter % self.t_replace_iter == 0:
            self.sess.run([tf.assign(t, e) for t, e in zip(self.t_params, self.e_params)])
        self.t_replace_counter += 1
    def choose_action(self, s):
After Change
    def learn(self, s):  # batch update
        self.sess.run(self.train_op, feed_dict={S: s})
        if self.replacement["name"] == "soft":
            self.sess.run(self.soft_replace)
        else:
            if self.t_replace_counter % self.replacement["rep_iter_a"] == 0:
                self.sess.run(self.hard_replace)
            self.t_replace_counter += 1
    def choose_action(self, s):
        s = s[np.newaxis, :]  # single state
        return self.sess.run(self.a, feed_dict={S: s})[0]  # single action
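The After Change code switches between a "soft" and a "hard" target-network replacement. The record does not show how `soft_replace` and `hard_replace` are built, so the following is only a minimal sketch of the two strategies on plain Python lists, not the commit's TensorFlow ops. The names `TargetUpdater`, `step`, and the config keys `tau` and `rep_iter` are hypothetical, and the soft update is assumed to be the usual Polyak average `t <- (1 - tau) * t + tau * e`.

```python
def soft_replace(target, online, tau):
    """Assumed soft update: blend a fraction tau of the online parameters into the target."""
    return [(1.0 - tau) * t + tau * e for t, e in zip(target, online)]


def hard_replace(online):
    """Hard update: copy the online parameters into the target wholesale."""
    return list(online)


class TargetUpdater:
    """Hypothetical helper mirroring the soft/hard branch in the After Change code."""

    def __init__(self, replacement):
        # replacement is a dict such as {"name": "soft", "tau": 0.01}
        # or {"name": "hard", "rep_iter": 600}; both key sets are assumptions.
        self.replacement = replacement
        self.t_replace_counter = 0

    def step(self, target, online):
        """Return the target parameters after one learning step."""
        if self.replacement["name"] == "soft":
            # Soft mode updates the target on every step.
            return soft_replace(target, online, self.replacement["tau"])
        # Hard mode copies only every rep_iter steps, counting from step 0.
        if self.t_replace_counter % self.replacement["rep_iter"] == 0:
            target = hard_replace(online)
        self.t_replace_counter += 1
        return target


soft = TargetUpdater({"name": "soft", "tau": 0.5})
print(soft.step([0.0, 0.0], [1.0, 2.0]))  # [0.5, 1.0]
```

In this sketch the counter is only advanced in hard mode, since soft mode never consults it.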
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 3
Instances Project Name: MorvanZhou/Reinforcement-learning-with-tensorflow
Commit Name: c395079717340c8a92b1635f8b40a5ba39c513e5
Time: 2017-08-09
Author: morvanzhou@gmail.com
File Name: contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG.py
Class Name: Actor
Method Name: learn
Project Name: shaypal5/pdpipe
Commit Name: 99095d5412483ec623278bdb1a0c9e24b18bfc85
Time: 2017-03-16
Author: shaypal5@gmail.com
File Name: pdpipe/basic_stages.py
Class Name: Bin
Method Name: _op
Project Name: MorvanZhou/Reinforcement-learning-with-tensorflow
Commit Name: c395079717340c8a92b1635f8b40a5ba39c513e5
Time: 2017-08-09
Author: morvanzhou@gmail.com
File Name: contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG.py
Class Name: Critic
Method Name: learn