ca907342507c1139696f542de0a3351d7a382eee,reinforcement_learning/reinforce.py,,,#,77

Before Change


running_reward = 10
for i_episode in count(1):
    state = env.reset()
    for t in range(10000):  # Don't infinite loop while learning
        action = select_action(state)
        state, reward, done, _ = env.step(action)
        if args.render:
            env.render()
        policy.rewards.append(reward)
        if done:
            break

    running_reward = running_reward * 0.99 + t * 0.01
    finish_episode()
    if i_episode % args.log_interval == 0:
        print("Episode {}\tLast length: {:5d}\tAverage length: {:.2f}".format(

After Change


            break


if __name__ == "__main__":
    main()
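
The After Change snippet ends with a call to main() behind a `__name__` guard, which suggests the module-level training loop was moved into a function. The following is a minimal sketch of that pattern, not the commit's actual code. The helper `update_running_reward` and the `episode_lengths` parameter are hypothetical stand-ins for the environment loop; only the moving-average formula (`running_reward * 0.99 + t * 0.01`) is taken from the Before Change snippet.

```python
def update_running_reward(running_reward, episode_length, decay=0.99):
    """Exponential moving average of episode length.

    Mirrors `running_reward * 0.99 + t * 0.01` from the snippet above;
    the function name and `decay` parameter are assumptions.
    """
    return running_reward * decay + episode_length * (1 - decay)


def main(episode_lengths=(20, 30, 40)):
    # Hypothetical stand-in for the training loop: episode lengths are
    # passed in directly instead of being produced by env.step().
    running_reward = 10
    for t in episode_lengths:
        running_reward = update_running_reward(running_reward, t)
    return running_reward


# The guard keeps the loop from running when the module is imported,
# for example by a test or by another script.
if __name__ == "__main__":
    print(main())
```

With the loop inside main(), importing the module no longer starts training as a side effect, and the helper can be exercised in isolation.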
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 3

Instances


Project Name: pytorch/examples
Commit Name: ca907342507c1139696f542de0a3351d7a382eee
Time: 2017-12-04
Author: sgross@fb.com
File Name: reinforcement_learning/reinforce.py
Class Name:
Method Name:


Project Name: openai/gym
Commit Name: 2e8141b00ddd3a76238abe95b44d56a39bc90885
Time: 2019-03-08
Author: christopherhesse@users.noreply.github.com
File Name: gym/envs/box2d/car_racing.py
Class Name: CarRacing
Method Name: render


Project Name: pytorch/examples
Commit Name: ca907342507c1139696f542de0a3351d7a382eee
Time: 2017-12-04
Author: sgross@fb.com
File Name: reinforcement_learning/actor_critic.py
Class Name:
Method Name: