Beta Version 2.11.0

Added

  • Added these value schedulers under the “ValueSchedulers” section (a sketch of one of the schedules follows the list):

    • Chained

    • Constant

    • CosineAnnealing

    • Exponential

    • InverseSquareRoot

    • InverseTime

    • Linear

    • MultipleStep

    • Multiplicative

    • Polynomial

    • Sequential

    • Step
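
  As a rough indication of what these schedulers compute, here is a minimal sketch of the usual cosine-annealing schedule, one of the entries above. This is a conceptual illustration only; the function and parameter names are assumptions and do not come from the library.

    -- Conceptual sketch of cosine annealing, not the library's scheduler code.
    local function cosineAnnealedValue(initialValue, minimumValue, timeValue, maxTimeValue)
        -- Half-cosine factor that falls from 1 at timeValue = 0 to 0 at timeValue = maxTimeValue.
        local cosineFactor = (1 + math.cos(math.pi * timeValue / maxTimeValue)) / 2
        return minimumValue + (initialValue - minimumValue) * cosineFactor
    end

    print(cosineAnnealedValue(0.1, 0.001, 50, 100)) -- 0.0505, halfway along the decay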

  • Added these optimizers under the “Optimizers” section (a sketch of the decoupled weight-decay idea follows the list):

    • ResilientBackwardPropagation

    • AdaptiveFactor

    • AdaptiveMomentEstimationWeightDecay

    • RectifiedAdaptiveMomentEstimation
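
  AdaptiveMomentEstimationWeightDecay is presumably the AdamW-style optimizer, whose defining trait is that weight decay acts on the parameter directly rather than being folded into the gradient. The sketch below only illustrates that decoupled step for a single scalar parameter; the names are illustrative, not the library’s actual fields.

    -- Conceptual sketch of a decoupled weight-decay step, not the library's optimizer code.
    local function decoupledWeightDecayStep(parameterValue, adamUpdateValue, learningRate, weightDecayRate)
        -- Decay acts on the parameter itself, separately from the adaptive gradient update.
        return parameterValue - learningRate * (adamUpdateValue + weightDecayRate * parameterValue)
    end
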
  • Added these tabular reinforcement learning models under the “Models” section (a sketch of the classic double Q-learning update follows the list):

    • TabularClippedDoubleQLearning

    • TabularDoubleQLearningV1

    • TabularDoubleQLearningV2

    • TabularDoubleStateActionRewardStateActionV1

    • TabularDoubleStateActionRewardStateActionV2

    • TabularDoubleExpectedStateActionRewardStateActionV1

    • TabularDoubleExpectedStateActionRewardStateActionV2
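
  The double variants above are presumably named after the classic double Q-learning scheme, in which two value tables are maintained so that one selects the greedy next action and the other evaluates it, reducing the overestimation bias of plain Q-learning. The sketch below shows that textbook tabular update, not the library’s actual code; in the full algorithm the roles of the two tables are also swapped at random on each step.

    -- Conceptual sketch of the textbook double Q-learning update, not the library's model code.
    local function doubleQLearningUpdate(qA, qB, state, action, reward, nextState, learningRate, discountFactor)
        -- Table A picks the greedy next action...
        local bestNextAction, bestNextValue = nil, -math.huge
        for candidateAction, value in pairs(qA[nextState]) do
            if value > bestNextValue then bestNextAction, bestNextValue = candidateAction, value end
        end
        -- ...and table B evaluates it; table A's estimate then moves toward that target.
        local targetValue = reward + discountFactor * qB[nextState][bestNextAction]
        qA[state][action] = qA[state][action] + learningRate * (targetValue - qA[state][action])
    end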

  • Added OneVsOne under the “Others” section.

  • Added a “stateIndex” parameter as the first parameter of BaseEligibilityTrace’s incrementFunction() under the “EligibilityTraces” section.

  • Added “StableSoftmaxSampling” and “StableBoltzmannSampling” options to the CategoricalPolicy under the “QuickSetups” section.
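
  The new “Stable” options most likely refer to the numerically stable form of softmax, which subtracts the maximum action value before exponentiating so that exp() cannot overflow for large values. The sketch below is a conceptual illustration of that technique, not the library’s internal sampling code, and the function name is made up.

    -- Conceptual sketch of numerically stable softmax, not the library's code.
    local function stableSoftmax(actionValues)
        local maximumValue = -math.huge
        for _, value in ipairs(actionValues) do
            if value > maximumValue then maximumValue = value end
        end
        local exponentials, exponentialSum = {}, 0
        for index, value in ipairs(actionValues) do
            exponentials[index] = math.exp(value - maximumValue)
            exponentialSum = exponentialSum + exponentials[index]
        end
        local probabilities = {}
        for index, exponential in ipairs(exponentials) do
            probabilities[index] = exponential / exponentialSum
        end
        return probabilities
    end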

Changes

  • ValueSchedulers can now be used in place of Optimizers for scheduling the learning rate.

  • All tabular reinforcement learning models can now be used inside the CategoricalPolicy quick setup under the “Models” section.

  • TabularReinforcementLearningBaseModel’s new() constructor now accepts “Optimizer” as one of its parameters under the “Models” section.
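
  A hypothetical sketch of what passing the new “Optimizer” parameter might look like, assuming the constructor accepts a table of named parameters; the require path, module paths and constructor shape shown here are assumptions rather than the library’s documented API.

    local DataPredict = require(game.ServerScriptService.DataPredict) -- the require path is an assumption

    local Optimizer = DataPredict.Optimizers.AdaptiveMomentEstimation.new()

    local Model = DataPredict.Models.TabularQLearning.new({
        Optimizer = Optimizer, -- the newly accepted constructor parameter
    })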

  • BaseEligibilityTrace’s new() constructor function now accepts “mode” as one of its parameters under the “EligibilityTraces” section.

  • TabularQLearning, TabularStateActionRewardStateAction and TabularExpectedStateActionRewardStateAction now use an EligibilityTrace object instead of a “lambda” numerical value under the “Models” section.
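
  For context on what the EligibilityTrace object replaces, the sketch below shows the textbook accumulating-trace update that a bare “lambda” value used to parameterise: the trace of the visited state-action pair is incremented, every table entry is nudged by the temporal-difference error in proportion to its trace, and all traces then decay by discountFactor times lambda. This is a conceptual illustration, not the library’s code.

    -- Conceptual sketch of an accumulating eligibility trace, not the library's code.
    local function updateWithEligibilityTrace(qTable, traceTable, temporalDifferenceError, learningRate, discountFactor, lambda, visitedState, visitedAction)
        traceTable[visitedState][visitedAction] = traceTable[visitedState][visitedAction] + 1 -- accumulate on the visited pair
        for state, actionValues in pairs(qTable) do
            for action in pairs(actionValues) do
                qTable[state][action] = qTable[state][action] + learningRate * temporalDifferenceError * traceTable[state][action]
                traceTable[state][action] = discountFactor * lambda * traceTable[state][action] -- decay every pair
            end
        end
    end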

  • The calculate() function now accepts ModelParameters as the third parameter for all the optimizers under the “Optimizers” section.

  • Renamed AdaptiveGradientDelta to AdaptiveDelta under the “Optimizers” section.

  • Optimized the NeuralNetwork code so that the activation function is no longer applied to bias values under the “Models” section.
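
  To illustrate why this matters, here is a conceptual sketch only, under the assumed layout where the first entry of a layer output holds the constant bias value: applying, say, a sigmoid to that entry would silently turn a bias of 1 into roughly 0.731, so the bias entry has to be skipped.

    -- Conceptual sketch with an assumed layout (bias stored at index 1), not the library's code.
    local function applyActivationSkippingBias(layerOutput, activationFunction)
        local activatedOutput = { layerOutput[1] } -- bias entry passes through unchanged
        for index = 2, #layerOutput do
            activatedOutput[index] = activationFunction(layerOutput[index])
        end
        return activatedOutput
    end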

  • The Optimizers’ “internalParameterArray” value is now set to nil instead of an empty table when calling the new() constructor and the reset() function under the “Optimizers” section.

  • Made some internal code changes to the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models under the “Models” section.

  • Changed the default value of the “averagingRate” parameter for the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models from 0.995 to 0.01 under the “Models” section.
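
  Assuming “averagingRate” plays the role of the Polyak averaging coefficient for these models’ target parameters (the exact convention used by the library is not stated here), the change from 0.995 to 0.01 shifts the default from near-instant copying of the online parameters to a slow, stable drift of the target toward them.

    -- Conceptual sketch of the common soft-update convention, not the library's code.
    local function polyakAverage(targetParameter, onlineParameter, averagingRate)
        return averagingRate * onlineParameter + (1 - averagingRate) * targetParameter
    end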

Removed

  • Removed TimeDecay and StepDecay from the “ValueSchedulers” section.

  • Removed LearningRateStepDecay and LearningRateTimeDecay from the “Optimizers” section.

  • Removed StringSplitter and Tokenizer from the “Others” section.

Fixes

  • Fixed some bugs in AdaptiveMomentEstimation and NesterovAcceleratedAdaptiveMomentEstimation under the “Optimizers” section.

  • Fixed some bugs where BaseRegularizer and BaseEligibilityTrace returned an “Unknown” class name under the “Regularizers” and “EligibilityTraces” sections.

  • Fixed some bugs in CategoricalPolicy under the “QuickSetups” section.