Variables in step()

Par1hsharma · October 24, 2021, 6:34am

Where are we using these variables to calculate the score?

observation,reward,done,other_info = env.step(action)

How does the line below calculate the score with the e and t variables?

print("Game Episode :{}/{} High Score :{}".format(e,20,t))

princeyyadav178 · December 4, 2021, 1:38pm

Hi @Par1hsharma
We are not using the variables below to calculate the score.

observation,reward,done,other_info = env.step(action)

This line just demonstrates the values returned by the step function. observation represents the state of the environment, reward is the numeric value the environment gives us for the random action we took, done is a boolean that is True once the game is over, and other_info holds extra diagnostic information.
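As a rough sketch of that unpacking, the snippet below uses a hypothetical stand-in class, `ToyEnv`, instead of a real Gym environment so that it runs without any install. The class, its `max_steps` parameter, and the made-up observation values are assumptions for illustration; only the four-value return shape comes from the line quoted above. Note that newer Gym and Gymnasium releases return five values from step (terminated and truncated replace done).

```python
class ToyEnv:
    """Hypothetical stand-in for an environment with the 4-value step() API."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps  # assumed episode length, illustration only
        self.steps_taken = 0

    def reset(self):
        self.steps_taken = 0
        return [0.0, 0.0, 0.0, 0.0]  # made-up initial observation

    def step(self, action):
        self.steps_taken += 1
        observation = [0.0, 0.0, 0.0, float(self.steps_taken)]  # made-up state
        reward = 1.0  # one point for each step survived
        done = self.steps_taken >= self.max_steps  # True once the game is over
        other_info = {}  # extra diagnostic data, empty in this toy
        return observation, reward, done, other_info


env = ToyEnv()
env.reset()
observation, reward, done, other_info = env.step(0)
print(observation, reward, done, other_info)
```

None of the four names feeds into the printed score directly; they only describe what happened on that single step.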

print("Game Episode :{}/{} High Score :{}".format(e,20,t))

In total we run 20 game episodes. In each episode we take up to 50 random steps (actions). The print statement above executes only when a random action ends the game (i.e., done=True). Here t is the step at which episode e ended.

Game Episode :16/20 High Score :14

The above statement means that the 16th episode of the total 20 game episodes was over in 14 steps. That is, the cart was able to prevent the pole from falling for 14 time steps. Hence score=14.
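The loop behind that output probably looks something like the sketch below. This is a guess at the structure, not the course code: `CoinFlipEnv`, `sample_action`, `fail_prob`, and `run_episodes` are hypothetical names, and the toy environment simply ends each episode at random instead of simulating a cart and pole. It also assumes both loops use `range`, so e and t are zero-based loop indices.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: every step ends the game with probability fail_prob."""

    def __init__(self, fail_prob=0.1, seed=0):
        self.rng = random.Random(seed)
        self.fail_prob = fail_prob

    def reset(self):
        return 0  # made-up observation

    def sample_action(self):
        return self.rng.choice([0, 1])  # random action, e.g. push left or right

    def step(self, action):
        done = self.rng.random() < self.fail_prob
        return 0, 1.0, done, {}  # observation, reward, done, other_info


def run_episodes(env, episodes=20, max_steps=50):
    scores = []
    for e in range(episodes):          # e: index of the current episode
        env.reset()
        for t in range(max_steps):     # t: index of the current step
            action = env.sample_action()
            observation, reward, done, other_info = env.step(action)
            if done:
                # Printed only when the game is over; t is reported as the score.
                print("Game Episode :{}/{} High Score :{}".format(e, episodes, t))
                scores.append(t)
                break
    return scores


scores = run_episodes(CoinFlipEnv())
```

In this sketch an episode that survives all 50 steps prints nothing, which matches the point above that the print statement runs only when done is True.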