In the theory video, sir taught us the importance of the agent's memory and why we should discard very old experiences from the deque, so the agent trains on recent experience. However, there is no such implementation (removing very old experiences from the deque) in the code. We are simply appending experiences and randomly drawing a number of them equal to the batch size.
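For reference, here is a minimal sketch of the append-and-sample pattern described above (class and method names are illustrative, not from the course code). Note that if the deque happens to be constructed with a `maxlen`, Python evicts the oldest entries automatically once it is full:

```python
import random
from collections import deque


class ReplayBuffer:
    """Illustrative sketch of the pattern described above (names hypothetical)."""

    def __init__(self, maxlen=None):
        # With maxlen=None the deque grows without bound;
        # with maxlen=N the deque silently drops the oldest item once full.
        self.memory = deque(maxlen=maxlen)

    def append(self, experience):
        self.memory.append(experience)

    def sample(self, batch_size):
        # Uniform random draw of batch_size experiences, old and new alike.
        return random.sample(self.memory, batch_size)


buf = ReplayBuffer(maxlen=3)
for i in range(5):
    buf.append(i)
print(list(buf.memory))  # oldest items 0 and 1 were evicted -> [2, 3, 4]
```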
Wouldn't this affect the agent's performance?