7327bc3aa7a0e66168a84380edbd1e74a5a16355,ch04/02_frozenlake_naive.py,,iterate_batches,#,47
Before Change
obs = env.reset()
sm = nn.Softmax(dim=1)
while True:
    obs_v = Variable(torch.from_numpy(np.array([obs])))
    act_probs_v = sm(net(obs_v))
    act_probs = act_probs_v.data.numpy()[0]
    action = np.random.choice(len(act_probs), p=act_probs)
    next_obs, reward, is_done, _ = env.step(action)
    episode_reward += reward
    episode_steps.append(EpisodeStep(observation=obs, action=action))
After Change
obs = env.reset()
sm = nn.Softmax(dim=1)
while True:
    obs_v = torch.FloatTensor([obs])
    act_probs_v = sm(net(obs_v))
    act_probs = act_probs_v.data.numpy()[0]
    action = np.random.choice(len(act_probs), p=act_probs)
    next_obs, reward, is_done, _ = env.step(action)
    episode_reward += reward
    episode_steps.append(EpisodeStep(observation=obs, action=action))
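Both versions of the loop pick the action the same way: the network output is passed through a softmax, and an action index is drawn according to the resulting probabilities. The following is a minimal sketch of that sampling step only. It is an illustration, not the code from the commit: it swaps in the Python standard library for torch and numpy so that it runs on its own, and the names softmax, sample_action, and the example logits are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng=random):
    """Draw one action index with probability given by softmax(logits).

    Plays the role of np.random.choice(len(act_probs), p=act_probs)
    in the loop above, using random.choices from the standard library.
    """
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical network output for a 4-action environment.
logits = [0.1, 2.0, -1.0, 0.5]
probs = softmax(logits)
action = sample_action(logits, random.Random(0))
```

Sampling from the distribution, rather than always taking the most probable action, keeps some exploration in the collected episodes.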
In pattern: SUPERPATTERN
Frequency: 4
Non-data size: 4
Instances
Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 7327bc3aa7a0e66168a84380edbd1e74a5a16355
Time: 2018-04-25
Author: max.lapan@gmail.com
File Name: ch04/02_frozenlake_naive.py
Class Name:
Method Name: iterate_batches
Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 373ae159f7ae1cabaf87228d1ae0fb6acd1c6363
Time: 2018-04-29
Author: max.lapan@gmail.com
File Name: ch14/05_play_ddpg.py
Class Name:
Method Name:
Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 7327bc3aa7a0e66168a84380edbd1e74a5a16355
Time: 2018-04-25
Author: max.lapan@gmail.com
File Name: ch04/04_frozenlake_nonslippery.py
Class Name:
Method Name: iterate_batches
Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 7327bc3aa7a0e66168a84380edbd1e74a5a16355
Time: 2018-04-25
Author: max.lapan@gmail.com
File Name: ch04/03_frozenlake_tweaked.py
Class Name:
Method Name: iterate_batches