It is used to update the models from experiences stored in the experience replay object. It boosts learning in reinforcement learning by focusing on important experiences, improving efficiency compared to regular (uniform) experience replay.
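The alpha, beta and epsilon parameters documented below follow the standard prioritized experience replay formulation (Schaul et al., 2015). The following is a conceptual sketch of that math only, not this library's internal code; all names in it are illustrative:

```lua
-- Conceptual sketch (not this library's API): given the temporal
-- difference errors of the stored experiences, compute the probability
-- of sampling each one.
local function getSamplingProbabilities(temporalDifferenceErrorArray, alpha, epsilon)
	local priorityArray = {}
	local prioritySum = 0
	for i, temporalDifferenceError in ipairs(temporalDifferenceErrorArray) do
		-- Larger errors get larger priorities; epsilon keeps every
		-- priority above zero so each experience can still be sampled.
		priorityArray[i] = (math.abs(temporalDifferenceError) + epsilon) ^ alpha
		prioritySum = prioritySum + priorityArray[i]
	end
	local probabilityArray = {}
	for i, priority in ipairs(priorityArray) do
		probabilityArray[i] = priority / prioritySum
	end
	return probabilityArray
end

-- beta corrects the bias introduced by this non-uniform sampling via
-- importance sampling weights; beta = 1 fully compensates for it.
local function getImportanceSamplingWeight(probability, bufferSize, beta)
	return (bufferSize * probability) ^ (-beta)
end
```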
Creates a new experience replay object.
PrioritizedExperienceReplay.new(batchSize: number, numberOfRunsToUpdate: number, maxBufferSize: number, alpha: number, beta: number, aggregateFunction: string, epsilon: number)
batchSize: The number of experiences to sample from the buffer for training.
numberOfRunsToUpdate: The number of run() function calls needed to trigger a single experience replay event.
maxBufferSize: The maximum number of experiences that can be kept inside the object.
alpha: Controls the degree of prioritization when sampling from the replay buffer. The value must be between 0 and 1, where 0 gives uniform sampling and 1 gives full prioritization.
beta: Corrects the bias introduced by prioritization by adjusting the importance sampling weights. The value must be between 0 and 1, where 1 gives full compensation.
aggregateFunction: The function used to reduce the temporal difference error to a single value when it is a vector. The options are:
Maximum (Default)
Minimum
Sum
Average
epsilon: A number added to each priority to prevent it from being 0. Recommended to be set to a very small value.
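A minimal construction sketch; the variable name and argument values are illustrative assumptions, not library defaults:

```lua
local experienceReplay = PrioritizedExperienceReplay.new(
	32,        -- batchSize: sample 32 experiences per replay event.
	1,         -- numberOfRunsToUpdate: replay after every run() call.
	100,       -- maxBufferSize: keep at most 100 experiences.
	0.6,       -- alpha: mostly prioritized, partly uniform sampling.
	0.4,       -- beta: partial importance sampling correction.
	"Maximum", -- aggregateFunction: reduce a vector temporal difference
	           -- error to its largest element.
	1e-6       -- epsilon: keeps every priority above zero.
)
```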
Changes the parameters of an experience replay object.
PrioritizedExperienceReplay:setParameters(batchSize: number, numberOfRunsToUpdate: number, maxBufferSize: number, alpha: number, beta: number, aggregateFunction: string, epsilon: number)
batchSize: The number of experiences to sample from the buffer for training.
numberOfRunsToUpdate: The number of run() function calls needed to trigger a single experience replay event.
maxBufferSize: The maximum number of experiences that can be kept inside the object.
alpha: Controls the degree of prioritization when sampling from the replay buffer. The value must be between 0 and 1, where 0 gives uniform sampling and 1 gives full prioritization.
beta: Corrects the bias introduced by prioritization by adjusting the importance sampling weights. The value must be between 0 and 1, where 1 gives full compensation.
aggregateFunction: The function used to reduce the temporal difference error to a single value when it is a vector. The options are:
Maximum
Minimum
Sum
Average
epsilon: A number added to each priority to prevent it from being 0. Recommended to be set to a very small value.
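For example, beta is commonly annealed toward 1 over the course of training so that the importance sampling bias is fully compensated by the end. A sketch of that use, with the other arguments (assumed values from the construction example above) repeated unchanged:

```lua
-- Later in training: raise beta from 0.4 to 1 for full bias correction.
experienceReplay:setParameters(32, 1, 100, 0.6, 1, "Maximum", 1e-6)
```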
Adds a model to the experience replay object so that it can be updated from the stored experiences.
PrioritizedExperienceReplay:addModel()