0e10ab0f8d509a7371e59873d96b81792160983c,ch17/03_i2a.py,,,#,15
Before Change
act_n = envs[0].action_space.n
net_policy = common.AtariA2C(obs_shape, act_n)
net_policy.load_state_dict(torch.load(args.policy, map_location=lambda storage, loc: storage))
net_em = i2a.EnvironmentModel(obs_shape, act_n)
net_em.load_state_dict(torch.load(args.em, map_location=lambda storage, loc: storage))
After Change
# policy distillation
probs_v = torch.from_numpy(mb_probs)
if args.cuda:
    probs_v = probs_v.cuda()
policy_opt.zero_grad()
logits_v, _ = net_policy(obs_v)
policy_loss_v = -F.log_softmax(logits_v, dim=1) * probs_v
policy_loss_v = policy_loss_v.sum(dim=1).mean()
policy_loss_v.backward()
policy_opt.step()
tb_tracker.track("loss_distill", policy_loss_v, step_idx)
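The distillation step above computes a cross-entropy between the recorded action probabilities and the policy network's softmax output, summed over actions and averaged over the batch. A dependency-free sketch of that arithmetic may help when checking the loss by hand. The helper names `log_softmax` and `distill_loss` and the sample batch are hypothetical, not taken from the commit, and plain lists stand in for the tensors.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def distill_loss(logits, target_probs):
    """Batch mean of -sum_a p(a) * log q(a), mirroring sum(dim=1).mean()."""
    total = 0.0
    for row, probs in zip(logits, target_probs):
        log_q = log_softmax(row)
        total += -sum(p * lq for p, lq in zip(probs, log_q))
    return total / len(logits)


# Hypothetical batch: two samples, two actions each (illustrative values only).
logits = [[0.0, 0.0], [0.0, 0.0]]
target_probs = [[0.5, 0.5], [1.0, 0.0]]
loss = distill_loss(logits, target_probs)
```

With uniform logits every action gets log-probability -ln 2, so each row contributes ln 2 regardless of the target distribution, which makes this a convenient sanity check.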
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 4
Instances
Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 0e10ab0f8d509a7371e59873d96b81792160983c
Time: 2018-03-04
Author: max.lapan@gmail.com
File Name: ch17/03_i2a.py
Class Name:
Method Name:
Project Name: PacktPublishing/Deep-Reinforcement-Learning-Hands-On
Commit Name: 155e770cb912f0ac89f862d29ae14b720ceef589
Time: 2018-02-28
Author: max.lapan@gmail.com
File Name: ch17/02_imag.py
Class Name:
Method Name:
Project Name: jwyang/faster-rcnn.pytorch
Commit Name: 5bed3f48f0e37d9731b28a6d2505215419d670c8
Time: 2017-08-24
Author: jyang375@vicki.cc.gatech.edu
File Name: test_net.py
Class Name:
Method Name: