78f46822aec6cea5e216e1f7404d351835c54346,yarll/agents/sac.py,SAC,train,#SAC#,92
Before Change
values = self.value_network(state0_batch)
value_loss = tf.reduce_mean(tf.square(values - value_target))
advantage = tf.stop_gradient(action_logprob - new_softq + values)
with tf.GradientTape() as actor_tape:
    _, action_logprob = self.actor_network(state0_batch)
    actor_loss = tf.reduce_mean(action_logprob * advantage)
After Change
with tf.GradientTape() as softq_tape:
    softq = self.softq_network(state0_batch, action_batch)
    softq_loss = tf.reduce_mean(tf.square(softq - softq_targets))
with tf.GradientTape() as value_tape:
    values = self.value_network(state0_batch)
with tf.GradientTape() as actor_tape:
    actions, action_logprob = self.actor_network(state0_batch)
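The After Change restructures the SAC train step so that each network's forward pass runs under its own `tf.GradientTape`, keeping the gradients of the soft-Q, value, and actor losses isolated from one another. The following is a minimal, self-contained sketch of that pattern, not the yarll implementation: the `Dense` stand-ins and the placeholder value/actor losses are assumptions for illustration (the record truncates the real loss terms for `value_tape` and `actor_tape`).

```python
import tensorflow as tf

tf.random.set_seed(0)
state_dim, action_dim, batch = 3, 2, 4

# Tiny stand-ins for the three networks in the diff (hypothetical shapes).
softq_network = tf.keras.layers.Dense(1)           # Q(s, a)
value_network = tf.keras.layers.Dense(1)           # V(s)
actor_network = tf.keras.layers.Dense(action_dim)  # policy stand-in

state0_batch = tf.random.normal((batch, state_dim))
action_batch = tf.random.normal((batch, action_dim))
softq_targets = tf.random.normal((batch, 1))

# One tape per network: each loss's gradients touch only that network.
with tf.GradientTape() as softq_tape:
    softq = softq_network(tf.concat([state0_batch, action_batch], axis=-1))
    softq_loss = tf.reduce_mean(tf.square(softq - softq_targets))
softq_grads = softq_tape.gradient(softq_loss, softq_network.trainable_variables)

with tf.GradientTape() as value_tape:
    values = value_network(state0_batch)
    value_loss = tf.reduce_mean(tf.square(values))  # placeholder target
value_grads = value_tape.gradient(value_loss, value_network.trainable_variables)

with tf.GradientTape() as actor_tape:
    actions = actor_network(state0_batch)
    actor_loss = tf.reduce_mean(tf.square(actions))  # placeholder objective
actor_grads = actor_tape.gradient(actor_loss, actor_network.trainable_variables)

all_grads = softq_grads + value_grads + actor_grads
```

Compared with the Before Change, where `values` was computed outside any tape and only the actor had a tape, this layout lets each optimizer apply its gradients independently without `stop_gradient` plumbing between the losses.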
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 2
Instances
Project Name: arnomoonens/yarll
Commit Name: 78f46822aec6cea5e216e1f7404d351835c54346
Time: 2019-06-10
Author: arno.moonens@gmail.com
File Name: yarll/agents/sac.py
Class Name: SAC
Method Name: train
Project Name: arnomoonens/yarll
Commit Name: 78f46822aec6cea5e216e1f7404d351835c54346
Time: 2019-06-10
Author: arno.moonens@gmail.com
File Name: yarll/agents/sac.py
Class Name: SAC
Method Name: train
Project Name: tensorflow/models
Commit Name: 85f529286a472d3be2aff4e70551829252dd6ac4
Time: 2020-10-06
Author: dan.anghel@gmail.com
File Name: research/delf/delf/python/training/model/delf_model_test.py
Class Name: DelfTest
Method Name: test_train_step