Value of reward returned by the step() method

In the code, I tried printing the value of reward returned by the .step() method, and it gave the value 1.0. Why? What does that mean?

Hey @saksham_thukral, first of all, we are extremely sorry for the late reply. Now, coming to your question, let me explain:
In the CartPole game, the agent has to keep the pole balanced, meaning the pole's angle from vertical (and the cart's position on the track) must stay within a given threshold. For every time step in which the agent keeps the pole within that threshold, it gets a reward of +1. The value you printed was the reward for one particular time step, which is why you saw 1.0. The score for a whole episode is the sum of these per-step rewards, and since an episode is capped at 200 time steps (as in CartPole-v0), the maximum score we can get from this game is +200.
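
To make the difference between the per-step reward and the episode score concrete, here is a minimal sketch. It does not use the real Gym environment: `ToyCartPole`, `falls_at`, and `run_episode` are hypothetical names made up for illustration, and the class models only the reward scheme (no physics). It assumes the classic 4-value return of `step()` (observation, reward, done, info); newer Gymnasium releases return 5 values instead.

```python
class ToyCartPole:
    """Hypothetical stand-in for CartPole: reward scheme only, no physics."""

    def __init__(self, falls_at=None, max_steps=200):
        self.falls_at = falls_at    # assumed step at which the pole "falls" (None = never)
        self.max_steps = max_steps  # episode cap, 200 as in CartPole-v0
        self.t = 0

    def reset(self):
        self.t = 0
        return None  # a real environment would return an observation here

    def step(self, action):
        self.t += 1
        reward = 1.0  # +1 for every time step taken
        fell = self.falls_at is not None and self.t >= self.falls_at
        done = fell or self.t >= self.max_steps
        return None, reward, done, {}


def run_episode(env):
    """Run one episode and return the sum of the per-step rewards."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(0)  # the action is ignored in this toy model
        total += reward
    return total


print(run_episode(ToyCartPole(falls_at=37)))  # 37.0: pole fell after 37 steps
print(run_episode(ToyCartPole()))             # 200.0: survived until the step cap
```

With the real environment the loop looks the same: each call to `step()` hands back 1.0, and you only see a number like 200 if you add the rewards up yourself.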

Hope this gives you a better insight.
Happy Learning :slight_smile:
