ReinforcementLearningActorCriticBaseModel is a base class for reinforcement learning neural network models.
Creates a new base model object. If an argument is nil, its default value is used.
ReinforcementLearningActorCriticBaseModel.new(discountFactor: number): ModelObject
Sets the model’s parameters. If an argument is nil, its previous value is kept.
ReinforcementLearningActorCriticBaseModel:setParameters(discountFactor: number)
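The nil handling described for new() and setParameters() could be sketched as follows. This is a minimal illustration, not the library's implementation: StubModel is a hypothetical stand-in that only mimics the documented signatures, and the default discount factor of 0.95 is an assumption.

```lua
-- Hypothetical stand-in that mimics the documented signatures.
local StubModel = {}
StubModel.__index = StubModel

local DEFAULT_DISCOUNT_FACTOR = 0.95 -- assumed default, for illustration only

function StubModel.new(discountFactor)
    local self = setmetatable({}, StubModel)
    -- A nil argument falls back to the default value.
    self.discountFactor = discountFactor or DEFAULT_DISCOUNT_FACTOR
    return self
end

function StubModel:setParameters(discountFactor)
    -- A nil argument keeps the previously stored value.
    self.discountFactor = discountFactor or self.discountFactor
end

local model = StubModel.new()  -- uses the assumed default
model:setParameters(0.9)       -- overrides the discount factor
model:setParameters(nil)       -- leaves 0.9 unchanged
print(model.discountFactor)
```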
Sets the actor model. The outputs of the actor model must be in normal distribution format.
ReinforcementLearningActorCriticBaseModel:setActorModel(Model: ModelObject)
Sets the critic model.
ReinforcementLearningActorCriticBaseModel:setCriticModel(Model: ModelObject)
Gets the actor model.
ReinforcementLearningActorCriticBaseModel:getActorModel(): ModelObject
Gets the critic model.
ReinforcementLearningActorCriticBaseModel:getCriticModel(): ModelObject
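A rough sketch of how the actor and critic setters and getters pair up is shown below. StubModel is a hypothetical stand-in for the base class, and the plain tables used as ActorModel and CriticModel are placeholders for real ModelObject instances.

```lua
-- Hypothetical stand-in that mimics the documented signatures.
local StubModel = {}
StubModel.__index = StubModel

function StubModel.new()
    return setmetatable({}, StubModel)
end

function StubModel:setActorModel(Model) self.ActorModel = Model end
function StubModel:setCriticModel(Model) self.CriticModel = Model end
function StubModel:getActorModel() return self.ActorModel end
function StubModel:getCriticModel() return self.CriticModel end

-- Plain tables stand in for real neural network model objects.
local ActorModel = { name = "actor" }
local CriticModel = { name = "critic" }

local model = StubModel.new()
model:setActorModel(ActorModel)
model:setCriticModel(CriticModel)

-- The getters return the same objects that were passed to the setters.
print(model:getActorModel().name, model:getCriticModel().name)
```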
Sets the model’s categorical policy update function.
ReinforcementLearningActorCriticBaseModel:setCategoricalUpdateFunction(categoricalUpdateFunction)
Sets the model’s diagonal Gaussian policy update function.
ReinforcementLearningActorCriticBaseModel:setDiagonalGaussianUpdateFunction(diagonalGaussianUpdateFunction)
Sets the model’s episode update function.
ReinforcementLearningActorCriticBaseModel:setEpisodeUpdateFunction(episodeUpdateFunction)
Updates the model parameters using categoricalUpdateFunction().
ReinforcementLearningActorCriticBaseModel:categoricalUpdate(previousFeatureVector: featureVector, action: number/string, rewardValue: number, currentFeatureVector: featureVector)
previousFeatureVector: The previous state of the environment.
action: The action selected.
rewardValue: The reward gained at the current state.
currentFeatureVector: The current state of the environment.
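The relationship between setCategoricalUpdateFunction() and categoricalUpdate() might look like the sketch below, where the update call simply delegates to the registered function. Everything here is assumed for illustration: StubModel is a hypothetical stand-in, criticValue is a made-up critic, feature vectors are written as 1×n nested tables, and the registered function returns a one-step temporal difference error only so the result can be inspected.

```lua
-- Hypothetical stand-in that mimics the documented signatures.
local StubModel = {}
StubModel.__index = StubModel

function StubModel.new(discountFactor)
    return setmetatable({ discountFactor = discountFactor or 0.95 }, StubModel)
end

function StubModel:setCategoricalUpdateFunction(categoricalUpdateFunction)
    self.categoricalUpdateFunction = categoricalUpdateFunction
end

function StubModel:categoricalUpdate(previousFeatureVector, action, rewardValue, currentFeatureVector)
    -- Delegates to whatever function was registered.
    return self.categoricalUpdateFunction(previousFeatureVector, action, rewardValue, currentFeatureVector)
end

-- Made-up critic: the state value is the sum of the features.
local function criticValue(featureVector)
    local sum = 0
    for _, value in ipairs(featureVector[1]) do
        sum = sum + value
    end
    return sum
end

local model = StubModel.new(0.5)

model:setCategoricalUpdateFunction(function(previousFeatureVector, action, rewardValue, currentFeatureVector)
    -- One-step temporal difference error, a common actor-critic learning signal.
    local target = rewardValue + model.discountFactor * criticValue(currentFeatureVector)
    return target - criticValue(previousFeatureVector)
end)

local temporalDifferenceError = model:categoricalUpdate({{1, 2}}, "Left", 2, {{2, 2}})
print(temporalDifferenceError)
```

A real update function would use such a signal to adjust the actor and critic parameters rather than return it.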
Updates the model parameters using diagonalGaussianUpdateFunction().
ReinforcementLearningActorCriticBaseModel:diagonalGaussianUpdate(previousFeatureVector: featureVector, actionMeanVector: vector, actionStandardDeviationVector: vector, rewardValue: number, currentFeatureVector: featureVector)
previousFeatureVector: The previous state of the environment.
actionMeanVector: The vector containing mean values for all actions.
actionStandardDeviationVector: The vector containing standard deviation values for all actions.
rewardValue: The reward gained at the current state.
currentFeatureVector: The current state of the environment.
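As a hedged illustration of the argument shapes, the sketch below registers a diagonal Gaussian update function and calls it through diagonalGaussianUpdate(). StubModel is a hypothetical stand-in, vectors are assumed to be 1×n nested tables, and the body of the registered function is a placeholder: it only computes the action mean + standardDeviation * noise that a diagonal Gaussian policy would take, using fixed noise so the result is reproducible.

```lua
-- Hypothetical stand-in that mimics the documented signatures.
local StubModel = {}
StubModel.__index = StubModel

function StubModel.new()
    return setmetatable({}, StubModel)
end

function StubModel:setDiagonalGaussianUpdateFunction(diagonalGaussianUpdateFunction)
    self.diagonalGaussianUpdateFunction = diagonalGaussianUpdateFunction
end

function StubModel:diagonalGaussianUpdate(previousFeatureVector, actionMeanVector, actionStandardDeviationVector, rewardValue, currentFeatureVector)
    -- Delegates to whatever function was registered.
    return self.diagonalGaussianUpdateFunction(previousFeatureVector, actionMeanVector, actionStandardDeviationVector, rewardValue, currentFeatureVector)
end

-- Fixed noise in place of random normal samples, for reproducibility.
local fixedNoise = {0.5, -1}

local model = StubModel.new()

model:setDiagonalGaussianUpdateFunction(function(previousFeatureVector, actionMeanVector, actionStandardDeviationVector, rewardValue, currentFeatureVector)
    local actionVector = {}
    for i, mean in ipairs(actionMeanVector[1]) do
        -- Each action dimension is perturbed independently (diagonal covariance).
        actionVector[i] = mean + actionStandardDeviationVector[1][i] * fixedNoise[i]
    end
    return {actionVector}
end)

local actionVector = model:diagonalGaussianUpdate({{0, 1}}, {{1, 2}}, {{2, 0.5}}, 1, {{1, 1}})
print(actionVector[1][1], actionVector[1][2])
```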
Updates the model parameters using episodeUpdateFunction().
ReinforcementLearningActorCriticBaseModel:episodeUpdate()
Sets a function that runs on reset alongside the current reset() function.
ReinforcementLearningActorCriticBaseModel:setResetFunction(resetFunction)
Resets the model’s stored values (excluding the parameters).
ReinforcementLearningActorCriticBaseModel:reset()
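The episode update and reset hooks could fit together roughly as sketched below. This is an assumption-laden illustration: StubModel is a hypothetical stand-in, rewardHistory is a made-up stored value, and the order in which the built-in reset and the custom reset function run is assumed, not confirmed by the source.

```lua
-- Hypothetical stand-in that mimics the documented signatures.
local StubModel = {}
StubModel.__index = StubModel

function StubModel.new(discountFactor)
    local self = setmetatable({}, StubModel)
    self.discountFactor = discountFactor or 0.95 -- assumed default
    self.rewardHistory = {}                      -- made-up stored value
    return self
end

function StubModel:setEpisodeUpdateFunction(episodeUpdateFunction)
    self.episodeUpdateFunction = episodeUpdateFunction
end

function StubModel:episodeUpdate()
    return self.episodeUpdateFunction()
end

function StubModel:setResetFunction(resetFunction)
    self.resetFunction = resetFunction
end

function StubModel:reset()
    -- Clears stored values but leaves parameters such as discountFactor alone.
    self.rewardHistory = {}
    -- The custom function runs alongside the built-in reset (order assumed).
    if self.resetFunction then
        self.resetFunction()
    end
end

local model = StubModel.new(0.5)
local resetCount = 0

model:setEpisodeUpdateFunction(function()
    -- Discounted return over the episode, accumulated from the last reward backwards.
    local discountedReturn = 0
    for i = #model.rewardHistory, 1, -1 do
        discountedReturn = model.rewardHistory[i] + model.discountFactor * discountedReturn
    end
    return discountedReturn
end)

model:setResetFunction(function()
    resetCount = resetCount + 1
end)

model.rewardHistory = {1, 2, 4}
local discountedReturn = model:episodeUpdate()
model:reset()
print(discountedReturn, #model.rewardHistory, resetCount, model.discountFactor)
```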