Release Version 2.11
ValueSchedulers
- Added these value schedulers:
  - Chained
  - Constant
  - CosineAnnealing
  - Exponential
  - InverseSquareRoot
  - InverseTime
  - Linear
  - MultipleStep
  - Multiplicative
  - Polynomial
  - Sequential
  - Step
- ValueSchedulers can now be used in place of Optimizers to schedule the learning rate.
- Removed TimeDecay and StepDecay.
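For readers unfamiliar with value scheduling, the sketch below shows what three of the schedule types listed above commonly compute. It is written in Python for illustration only and uses the usual textbook formulas; the function names, parameter names and default values are hypothetical and are not taken from this library, whose implementation may differ.

```python
import math

# Illustrative only: generic schedule formulas, not this library's API.
# All names and default values below are assumptions made for the example.

def step_schedule(initial_value, time_step, step_size=10, decay_rate=0.5):
    """Multiply the value by decay_rate once every step_size time steps."""
    return initial_value * decay_rate ** (time_step // step_size)

def exponential_schedule(initial_value, time_step, decay_rate=0.9):
    """Multiply the value by decay_rate at every time step."""
    return initial_value * decay_rate ** time_step

def cosine_annealing_schedule(initial_value, time_step,
                              maximum_time_step=100, minimum_value=0.0):
    """Follow half a cosine wave from initial_value down to minimum_value."""
    progress = min(time_step, maximum_time_step) / maximum_time_step
    cosine_factor = 0.5 * (1 + math.cos(math.pi * progress))
    return minimum_value + (initial_value - minimum_value) * cosine_factor

# Print the scheduled learning rate at a few time steps.
for t in (0, 10, 50, 100):
    print(t,
          step_schedule(0.1, t),
          exponential_schedule(0.1, t),
          cosine_annealing_schedule(0.1, t))
```

In each case the scheduler takes a starting value and a time step and returns the value to use at that step, which is why the same object can drive a learning rate or any other numeric hyperparameter.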
Optimizers
- Added these optimizers:
  - ResilientBackwardPropagation
  - AdaptiveMomentEstimationWeightDecay
- The calculate() function now accepts ModelParameters as its third parameter for all optimizers.
- Renamed AdaptiveGradientDelta to AdaptiveDelta.
- The Optimizers’ “internalParameterArray” value is now set to nil instead of an empty table when calling the new() constructor and the reset() function.
- Removed LearningRateStepDecay and LearningRateTimeDecay.
- Fixed some bugs in AdaptiveMomentEstimation and NesterovAcceleratedAdaptiveMomentEstimation.
Models
- Added these tabular reinforcement learning models:
  - TabularClippedDoubleQLearning
  - TabularDoubleQLearningV1
  - TabularDoubleQLearningV2
  - TabularDoubleStateActionRewardStateActionV1
  - TabularDoubleStateActionRewardStateActionV2
  - TabularDoubleExpectedStateActionRewardStateActionV1
  - TabularDoubleExpectedStateActionRewardStateActionV2
- All tabular reinforcement learning models can now be used inside the CategoricalPolicy quick setup.
- TabularReinforcementLearningBaseModel’s new() constructor now accepts “Optimizer” as one of its parameters.
- TabularQLearning, TabularStateActionRewardStateAction and TabularExpectedStateActionRewardStateAction now use an EligibilityTrace object instead of a numerical “lambda” value.
- Added OneVsOne.
- Made some internal code changes to the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models.
- Changed the default value of the “averagingRate” parameter for the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models from 0.995 to 0.01.
- Optimized the NeuralNetwork code so that the activation function is no longer calculated for bias values.
EligibilityTraces
- BaseEligibilityTrace’s new() constructor now accepts “mode” as one of its parameters.
- Added “stateIndex” as the first parameter of BaseEligibilityTrace’s incrementFunction().
- Fixed some bugs where BaseEligibilityTrace returned an “Unknown” class name.
QuickSetup
- Added “StableSoftmaxSampling” and “StableBoltzmannSampling” options to the CategoricalPolicy.
- Fixed some bugs in CategoricalPolicy.
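These notes do not describe how the “stable” sampling options are implemented. A common way to make softmax-based action sampling numerically stable is to subtract the largest value before exponentiating, and the Python sketch below illustrates that general technique. The function names and the temperature parameter are hypothetical and are not part of the CategoricalPolicy API.

```python
import math
import random

# Illustrative only: a generic numerically stable softmax sampler.
# Function names and the temperature parameter are assumptions for this example.

def stable_softmax(values, temperature=1.0):
    """Softmax that subtracts the maximum before exponentiating to avoid overflow."""
    scaled = [value / temperature for value in values]
    maximum = max(scaled)
    # Every exponent is <= 0, so math.exp cannot overflow.
    exponentials = [math.exp(s - maximum) for s in scaled]
    total = sum(exponentials)
    return [e / total for e in exponentials]

def sample_action(values, temperature=1.0):
    """Draw an action index with probability given by the stable softmax."""
    probabilities = stable_softmax(values, temperature)
    return random.choices(range(len(values)), weights=probabilities, k=1)[0]

# A naive softmax would call math.exp(1000.0), which raises OverflowError.
action_values = [1000.0, 999.0, 998.0]
probabilities = stable_softmax(action_values)
action_index = sample_action(action_values)
```

Subtracting the maximum leaves the resulting probabilities unchanged, because the common factor cancels in the normalisation, while keeping every intermediate value within floating-point range.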
Others
- Removed StringSplitter and Tokenizer.
Regularizers
- Fixed some bugs where BaseRegularizer returned an “Unknown” class name.