CategoricalPolicy is a base class for setting up reinforcement learning functions.
Creates a new model object. If any argument is nil, the default value for that argument is used.
CategoricalPolicy.new(numberOfReinforcementsPerEpisode: integer, epsilon: number, actionSelectionFunction: string): CategoricalPolicyObject
numberOfReinforcementsPerEpisode: The number of reinforcements that must occur before the epsilon value is decayed.
epsilon: The higher the value, the more likely the model is to explore rather than exploit. The value must be between 0 and 1. Exploration means choosing a random action in an attempt to find better overall average performance, while exploitation means choosing the action the model currently expects to perform best.
actionSelectionFunction: The function used to choose an action. Available options are:
Maximum (Default)
Sample
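The interaction between epsilon and the two action selection options can be sketched as follows. This is a minimal illustration, not the library's implementation: the selectAction helper, its randomFunction parameter, and the flat array of probabilities are all hypothetical, and it assumes that "Sample" draws an action in proportion to its predicted probability.

```lua
-- Hypothetical helper: not part of the CategoricalPolicy API.
-- probabilities: array of non-negative numbers, one per action.
-- randomFunction: returns a number in [0, 1); defaults to math.random.
local function selectAction(probabilities, epsilon, actionSelectionFunction, randomFunction)
	randomFunction = randomFunction or math.random
	local numberOfActions = #probabilities

	-- Exploration: with probability epsilon, pick a uniformly random action.
	if randomFunction() < epsilon then
		return math.floor(randomFunction() * numberOfActions) + 1
	end

	-- Exploitation with "Sample": draw an action in proportion to its probability.
	if actionSelectionFunction == "Sample" then
		local total = 0
		for _, probability in ipairs(probabilities) do
			total = total + probability
		end
		local threshold = randomFunction() * total
		local cumulative = 0
		for index, probability in ipairs(probabilities) do
			cumulative = cumulative + probability
			if threshold < cumulative then
				return index
			end
		end
		return numberOfActions -- guard against floating-point round-off
	end

	-- Exploitation with "Maximum" (the default): pick the highest-scoring action.
	local bestIndex = 1
	for index = 2, numberOfActions do
		if probabilities[index] > probabilities[bestIndex] then
			bestIndex = index
		end
	end
	return bestIndex
end
```

With epsilon set to 0 the sketch never explores, and with epsilon set to 1 it always picks a random action, which matches the description of epsilon above.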
Sets the model’s parameters. If any argument is nil, the previous value for that argument is kept.
CategoricalPolicy:setParameters(numberOfReinforcementsPerEpisode: integer, epsilon: number, actionSelectionFunction: string)
numberOfReinforcementsPerEpisode: The number of reinforcements that must occur before the epsilon value is decayed.
epsilon: The higher the value, the more likely the model is to explore rather than exploit. The value must be between 0 and 1. Exploration means choosing a random action in an attempt to find better overall average performance, while exploitation means choosing the action the model currently expects to perform best.
actionSelectionFunction: The function used to choose an action. Available options are:
Maximum
Sample
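The difference between the constructor (nil falls back to a default) and setParameters() (nil keeps the previous value) can be sketched with a stand-in object. The Policy table, its field names, and the numeric defaults below are hypothetical placeholders; only the "Maximum" default comes from the text above.

```lua
-- Hypothetical stand-in object; not the library's CategoricalPolicy.
local Policy = {}
Policy.__index = Policy

function Policy.new(numberOfReinforcementsPerEpisode, epsilon, actionSelectionFunction)
	local self = setmetatable({}, Policy)
	-- The numeric defaults are placeholders chosen for this sketch.
	self.numberOfReinforcementsPerEpisode = numberOfReinforcementsPerEpisode or 500
	self.epsilon = epsilon or 0.5
	self.actionSelectionFunction = actionSelectionFunction or "Maximum"
	return self
end

function Policy:setParameters(numberOfReinforcementsPerEpisode, epsilon, actionSelectionFunction)
	-- `argument or previous` keeps the previous value when the argument is nil.
	self.numberOfReinforcementsPerEpisode = numberOfReinforcementsPerEpisode or self.numberOfReinforcementsPerEpisode
	self.epsilon = epsilon or self.epsilon
	self.actionSelectionFunction = actionSelectionFunction or self.actionSelectionFunction
end

-- Usage: only the epsilon is given at construction; later only the
-- action selection function is changed, and the other values are kept.
local policy = Policy.new(nil, 0.3, nil)
policy:setParameters(nil, nil, "Sample")
```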
CategoricalPolicy:setModel(Model: ModelObject)
CategoricalPolicy:getModel(): ModelObject
CategoricalPolicy:setExperienceReplay(ExperienceReplay: ExperienceReplayObject)
CategoricalPolicy:getExperienceReplay(): ExperienceReplayObject
CategoricalPolicy:setEpsilonValueScheduler(EpsilonValueScheduler: ValueSchedulerObject)
CategoricalPolicy:getEpsilonValueScheduler(): ValueSchedulerObject
CategoricalPolicy:setClassesList(classesList: [])
Gets all the classes stored in the NeuralNetwork model.
CategoricalPolicy:getClassesList(): []
Sets a new function that runs on update alongside the current model’s update() function.
CategoricalPolicy:extendUpdateFunction(updateFunction)
Sets a new function that runs on episode update alongside the current model’s episodeUpdate() function.
CategoricalPolicy:extendEpisodeUpdateFunction(episodeUpdateFunction)
Rewards or punishes the model based on the current state of the environment.
CategoricalPolicy:reinforce(currentFeatureVector: Matrix, rewardValue: number, returnOriginalOutput: boolean): integer, number -OR- Matrix
currentFeatureVector: Matrix containing data from the current state.
rewardValue: The reward value added/subtracted from the current state (recommended value between -1 and 1, but can be larger than these values).
returnOriginalOutput: Set whether to return the predicted vector instead of the label with the highest value.
predictedLabel: The label predicted by the model.
value: The value of the predicted label.
-OR-
A Matrix containing the predicted vector, returned when returnOriginalOutput is true.
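The two return shapes of reinforce() can be sketched with a small helper. This is an illustration under stated assumptions, not the library's code: formatOutput is a hypothetical name, the predicted vector is simplified to a flat array rather than a Matrix, and the label is chosen by taking the maximum value.

```lua
-- Hypothetical helper illustrating the two return shapes.
-- predictedVector: array of values, one per class.
-- classesList: array of class labels in the same order as predictedVector.
local function formatOutput(predictedVector, classesList, returnOriginalOutput)
	-- When requested, hand back the whole vector unchanged.
	if returnOriginalOutput then
		return predictedVector
	end

	-- Otherwise return the label with the highest value, plus that value.
	local bestIndex = 1
	for index = 2, #predictedVector do
		if predictedVector[index] > predictedVector[bestIndex] then
			bestIndex = index
		end
	end
	return classesList[bestIndex], predictedVector[bestIndex]
end

-- Usage: two return values in the first case, a single table in the second.
local label, value = formatOutput({0.2, 0.5, 0.3}, {10, 20, 30}, false)
local vector = formatOutput({0.2, 0.5, 0.3}, {10, 20, 30}, true)
```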
Resets the current parameter values.
CategoricalPolicy:reset()
Sets whether to print the current number of episodes and the current epsilon.
CategoricalPolicy:setPrintOutput(option: boolean)
CategoricalPolicy:getCurrentNumberOfEpisodes(): integer
CategoricalPolicy:getCurrentNumberOfReinforcements(): integer
CategoricalPolicy:getCurrentEpsilon(): number