DataPredict

API Reference - Models - AsynchronousAdvantageActorCritic (A3C)

AsynchronousAdvantageActorCritic (A3C) is a base class for reinforcement learning.

Notes

Constructors

new()

Creates a new model object. If any argument is nil, the default value for that argument is used.

AsynchronousAdvantageCritic.new(learningRate: integer, numberOfReinforcementsPerEpisode: integer, epsilon: number, epsilonDecayFactor: number, discountFactor: number, totalNumberOfReinforcementsToUpdateMainModel: number, actionSelectionFunction: string): ModelObject

Parameters:

Returns:

Functions

setParameters()

Sets the model’s parameters. If any argument is nil, the previous value for that argument is kept.

AsynchronousAdvantageCritic:setParameters(learningRate: integer, numberOfReinforcementsPerEpisode: integer, epsilon: number, epsilonDecayFactor: number, discountFactor: number, totalNumberOfReinforcementsToUpdateMainModel: number, actionSelectionFunction: string)
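
The nil handling described for new() and setParameters() can be illustrated with a small stand-in class. This is a sketch, not DataPredict code: `ExampleModel`, `valueOrFallback`, and the two default values are hypothetical and chosen only for the demonstration, and only two of the seven arguments are shown.

```lua
-- Hypothetical stand-in, not DataPredict code: nil arguments fall back to
-- defaults in new() and to the previously stored values in setParameters().
local ExampleModel = {}
ExampleModel.__index = ExampleModel

-- Placeholder defaults chosen for this sketch only.
local defaultLearningRate = 0.1
local defaultDiscountFactor = 0.95

-- Returns value unless it is nil. An explicit nil check is used instead of
-- `value or fallback` so that a legitimate `false` would not be overwritten.
local function valueOrFallback(value, fallback)
	if value == nil then return fallback end
	return value
end

function ExampleModel.new(learningRate, discountFactor)
	local self = setmetatable({}, ExampleModel)
	self.learningRate = valueOrFallback(learningRate, defaultLearningRate)
	self.discountFactor = valueOrFallback(discountFactor, defaultDiscountFactor)
	return self
end

function ExampleModel:setParameters(learningRate, discountFactor)
	self.learningRate = valueOrFallback(learningRate, self.learningRate)
	self.discountFactor = valueOrFallback(discountFactor, self.discountFactor)
end

local Model = ExampleModel.new(nil, 0.9) -- learningRate falls back to the default
Model:setParameters(0.01, nil)           -- discountFactor keeps its previous value
print(Model.learningRate, Model.discountFactor)
```

With this pattern a caller can change a single setting by passing nil for every other argument.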

Parameters:

addActorCriticModel()

AsynchronousAdvantageCritic:addActorCriticModel(ActorModel: ModelObject, CriticModel: ModelObject, ExperienceReplay: ExperienceReplayObject)

Parameters:

setClassesList()

AsynchronousAdvantageCritic:setClassesList(classesList: [])

Parameters:

setActorCriticMainModelParameters()

AsynchronousAdvantageCritic:setActorCriticMainModelParameters(ActorMainModelParameters: [], CriticMainModelParameters: [], applyToAllChildModels: boolean)

Parameters:

getActorCriticMainModelParameters()

AsynchronousAdvantageCritic:getActorCriticMainModelParameters(): [], []

Returns:

reinforce()

Rewards or punishes the model based on the current state of the environment.

AsynchronousAdvantageCritic:reinforce(currentFeatureVector: matrix, actionStandardDeviationVector: matrix, rewardValue: number, returnOriginalOutput: boolean, actorCriticModelNumber: number): integer, number -OR- Matrix
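
The constructor's epsilon and epsilonDecayFactor arguments suggest epsilon-greedy exploration during reinforcement, although this page does not state the exact rule. The sketch below shows one common form, assuming a multiplicative decay after every step. `selectAction`, its arguments, and the flat value array are hypothetical and are not part of DataPredict.

```lua
-- Sketch only: epsilon-greedy selection with multiplicative decay.
-- selectAction and its arguments are hypothetical, not DataPredict functions.
local function selectAction(actionValues, classesList, epsilon, randomFunction)
	randomFunction = randomFunction or math.random
	if randomFunction() < epsilon then
		-- Explore: pick a uniformly random class.
		local index = math.floor(randomFunction() * #classesList) + 1
		return classesList[math.min(index, #classesList)]
	end
	-- Exploit: pick the class with the highest value.
	local bestIndex = 1
	for index = 2, #actionValues do
		if actionValues[index] > actionValues[bestIndex] then bestIndex = index end
	end
	return classesList[bestIndex]
end

local classesList = {"left", "right", "jump"}
local epsilon, epsilonDecayFactor = 0.5, 0.999

for step = 1, 3 do
	local action = selectAction({0.2, 0.7, 0.1}, classesList, epsilon)
	print(step, action, epsilon)
	epsilon = epsilon * epsilonDecayFactor -- assumed decay rule
end
```

Passing the random source as an argument keeps the selection deterministic under test; in normal use it defaults to `math.random`.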

Parameters:

Returns:

-OR-

categoricalUpdate()

Updates the model parameters based on a categorical distribution, for discrete action spaces.

AsynchronousAdvantageCritic:categoricalUpdate(previousFeatureVector: featureVector, action: number/string, rewardValue: number, currentFeatureVector: featureVector, actorCriticModelNumber: number)
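
For a discrete action space, a textbook advantage actor-critic update combines a one-step advantage with the log-probability of the chosen action under a categorical (softmax) policy. The sketch below shows that arithmetic on plain Lua arrays. It is an illustration of the general technique, not the library's internal code, and every function name in it is hypothetical.

```lua
-- Sketch only: one-step advantage and categorical log-probability.
-- All function names here are hypothetical helpers, not DataPredict API.

-- Softmax over a plain Lua array of action scores (max-subtracted for stability).
local function softmax(scores)
	local maximum = scores[1]
	for i = 2, #scores do maximum = math.max(maximum, scores[i]) end
	local probabilities, sum = {}, 0
	for i = 1, #scores do
		probabilities[i] = math.exp(scores[i] - maximum)
		sum = sum + probabilities[i]
	end
	for i = 1, #scores do probabilities[i] = probabilities[i] / sum end
	return probabilities
end

-- advantage = reward + discountFactor * V(current state) - V(previous state)
local function calculateAdvantage(rewardValue, discountFactor, previousCriticValue, currentCriticValue)
	return rewardValue + (discountFactor * currentCriticValue) - previousCriticValue
end

-- Actor loss for the chosen action: -log(pi(action)) * advantage
local function calculateActorLoss(actionScores, actionIndex, advantage)
	local probabilities = softmax(actionScores)
	return -math.log(probabilities[actionIndex]) * advantage
end

local advantage = calculateAdvantage(1, 0.9, 0.5, 1.0) -- 1 + 0.9 * 1.0 - 0.5
local actorLoss = calculateActorLoss({0, 0}, 1, advantage)
print(advantage, actorLoss)
```

A positive advantage means the action did better than the critic expected, so minimizing this loss raises the probability of that action.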

Parameters:

diagonalGaussianUpdate()

Updates the model parameters based on a diagonal Gaussian distribution, for continuous action spaces.

AsynchronousAdvantageCritic:diagonalGaussianUpdate(previousFeatureVector: featureVector, actionVector: vector, rewardValue: number, currentFeatureVector: featureVector, actorCriticModelNumber: number)
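
For continuous actions, a diagonal Gaussian policy treats each action dimension as an independent normal distribution, so the log-probability of an action vector is the sum of the per-dimension log-densities. The sketch below computes that quantity. It uses flat Lua arrays as a simplification, and `calculateDiagonalGaussianLogProbability` is a hypothetical helper rather than a DataPredict function.

```lua
-- Sketch only: log-probability of an action vector under a diagonal Gaussian.
-- Flat arrays are used here for brevity; this is not DataPredict code.
local function calculateDiagonalGaussianLogProbability(actionVector, meanVector, standardDeviationVector)
	local logProbability = 0
	for i = 1, #actionVector do
		local standardDeviation = standardDeviationVector[i]
		local zScore = (actionVector[i] - meanVector[i]) / standardDeviation
		-- Per-dimension normal log-density:
		-- -0.5 * z^2 - log(sigma) - 0.5 * log(2 * pi)
		logProbability = logProbability
			- (0.5 * zScore * zScore)
			- math.log(standardDeviation)
			- (0.5 * math.log(2 * math.pi))
	end
	return logProbability
end

local logProbability = calculateDiagonalGaussianLogProbability({0.2, -0.1}, {0, 0}, {1, 0.5})
print(logProbability)
```

In an actor-critic update this log-probability would typically be multiplied by the advantage, in the same way as the categorical case.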

Parameters:

getCurrentNumberOfEpisodes()

AsynchronousAdvantageCritic:getCurrentNumberOfEpisodes(actorCriticModelNumber: number): number

Parameters:

Returns:

getCurrentNumberOfReinforcements()

AsynchronousAdvantageCritic:getCurrentNumberOfReinforcements(actorCriticModelNumber: number): number

Parameters:

Returns:

getCurrentEpsilon()

AsynchronousAdvantageCritic:getCurrentEpsilon(actorCriticModelNumber: number): number

Parameters:

Returns:

getCurrentTotalNumberOfReinforcementsToUpdateMainModel()

AsynchronousAdvantageCritic:getCurrentTotalNumberOfReinforcementsToUpdateMainModel(): number
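
The name totalNumberOfReinforcementsToUpdateMainModel suggests that reinforcements from all child models are counted together and that the main model is updated once the count reaches the threshold. The page does not spell this out, so the sketch below is one plausible reading, including the assumption that the counter resets after each update. All names in it are hypothetical.

```lua
-- Sketch only: a shared reinforcement counter that triggers a main-model update.
-- The threshold and the reset-after-update behaviour are assumptions.
local totalNumberOfReinforcementsToUpdateMainModel = 3 -- placeholder threshold
local currentTotalNumberOfReinforcements = 0
local numberOfMainModelUpdates = 0

-- Hypothetical callback invoked once per child-model reinforcement.
local function onChildModelReinforced()
	currentTotalNumberOfReinforcements = currentTotalNumberOfReinforcements + 1
	if currentTotalNumberOfReinforcements >= totalNumberOfReinforcementsToUpdateMainModel then
		numberOfMainModelUpdates = numberOfMainModelUpdates + 1 -- stand-in for the real update
		currentTotalNumberOfReinforcements = 0
	end
end

for reinforcement = 1, 7 do
	onChildModelReinforced()
end

print(numberOfMainModelUpdates, currentTotalNumberOfReinforcements)
```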

Returns:

reset()

Reset a single child model’s stored values (excluding the parameters).

AsynchronousAdvantageCritic:reset(actorCriticModelNumber: number)

Parameters:

resetAll()

Reset the main model’s and child models’ stored values (excluding the parameters).

AsynchronousAdvantageCritic:resetAll()

destroy()

Destroys the model object.

AsynchronousAdvantageCritic:destroy()
