Beta Version 2.11.0
Added
- Added these value schedulers under the “ValueSchedulers” section:
  - Chained
  - Constant
  - CosineAnnealing
  - Exponential
  - InverseSquareRoot
  - InverseTime
  - Linear
  - MultipleStep
  - Multiplicative
  - Polynomial
  - Sequential
  - Step
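The library’s own scheduler API is not shown in this changelog; as a language-agnostic illustration (in Python, with hypothetical parameter names), a CosineAnnealing-style scheduler typically decays a value from an initial level to a minimum along half a cosine period:

```python
import math

def cosine_annealing(initial_value, minimum_value, max_steps, step):
    """Cosine-annealing schedule: decays from initial_value to minimum_value
    over max_steps, following half a cosine period."""
    progress = min(step, max_steps) / max_steps
    return minimum_value + 0.5 * (initial_value - minimum_value) * (1 + math.cos(math.pi * progress))
```

The value starts at `initial_value` at step 0, passes the midpoint of the range at half the steps, and settles at `minimum_value` from `max_steps` onward.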
- Added these optimizers under the “Optimizers” section:
  - ResilientBackwardPropagation
  - AdaptiveMomentEstimationWeightDecay
- Added these tabular reinforcement learning models under the “Models” section:
  - TabularClippedDoubleQLearning
  - TabularDoubleQLearningV1
  - TabularDoubleQLearningV2
  - TabularDoubleStateActionRewardStateActionV1
  - TabularDoubleStateActionRewardStateActionV2
  - TabularDoubleExpectedStateActionRewardStateActionV1
  - TabularDoubleExpectedStateActionRewardStateActionV2
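The “Double” model names above suggest the standard double Q-learning scheme (van Hasselt, 2010), which maintains two value tables to reduce maximization bias. As a sketch only (Python with dict-based Q-tables and hypothetical names, not the library’s API), one update step looks like:

```python
import random

def double_q_update(Q_A, Q_B, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One double Q-learning step: choose the greedy next action with one
    table, evaluate it with the other, then update the first table."""
    if random.random() < 0.5:
        Q_A, Q_B = Q_B, Q_A  # update the other table half of the time
    best_action = max(Q_A[next_state], key=Q_A[next_state].get)  # greedy by Q_A
    target = reward + gamma * Q_B[next_state][best_action]       # evaluated by Q_B
    Q_A[state][action] += alpha * (target - Q_A[state][action])
```

Decoupling action selection from action evaluation is what distinguishes these models from the single-table TabularQLearning variants.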
- Added OneVsOne under the “Others” section.
- Added “stateIndex” parameter as the first parameter for BaseEligibilityTrace’s incrementFunction() under the “EligibilityTraces” section.
- Added “StableSoftmaxSampling” and “StableBoltzmannSampling” options to the CategoricalPolicy under the “QuickSetups” section.
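“Stable” in these option names most likely refers to the standard max-subtraction trick, which keeps exp() from overflowing when action scores are large. A minimal sketch of stable softmax sampling (Python, hypothetical function name, not the library’s implementation):

```python
import math
import random

def stable_softmax_sample(scores):
    """Sample an index with probability proportional to softmax(scores).
    Subtracting the maximum score first makes every exponent <= 0, so
    exp() cannot overflow even for very large scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    r = random.random() * total
    cumulative = 0.0
    for i, e in enumerate(exps):
        cumulative += e
        if r <= cumulative:
            return i
    return len(scores) - 1  # guard against floating-point rounding
```

Without the subtraction, `math.exp(1000.0)` raises OverflowError; with it, the probabilities are mathematically unchanged.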
Changes
- ValueSchedulers can now be used in place of Optimizers for scheduling the learning rate.
- All tabular reinforcement learning models can now be used inside the CategoricalPolicy quick setup under the “Models” section.
- TabularReinforcementLearningBaseModel’s new() constructor now accepts “Optimizer” as one of its parameters under the “Models” section.
- BaseEligibilityTrace’s new() constructor function now accepts “mode” as one of its parameters under the “EligibilityTraces” section.
- TabularQLearning, TabularStateActionRewardStateAction and TabularExpectedStateActionRewardStateAction now use an EligibilityTrace object instead of a “lambda” numerical value under the “Models” section.
- The calculate() function now accepts ModelParameters as the third parameter for all the optimizers under the “Optimizers” section.
- Renamed AdaptiveGradientDelta to AdaptiveDelta under the “Optimizers” section.
- Optimized NeuralNetwork code so that the activation function calculation is not performed for bias values under the “Models” section.
- The Optimizers’ “internalParameterArray” values are now set to nil instead of an empty table when calling the new() constructor and reset() function under the “Optimizers” section.
- Made some internal code changes to the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models under the “Models” section.
- Changed the default value of the “averagingRate” parameter for the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models from 0.995 to 0.01 under the “Models” section.
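An “averagingRate” on V2 double models usually controls a Polyak (soft) target-network update. The changelog does not state which convention the library uses; this sketch assumes the small-rate convention implied by the new 0.01 default, where the rate is the fraction of the online parameters mixed into the target each step (hypothetical names, not the library’s API):

```python
def polyak_update(target_params, online_params, averaging_rate=0.01):
    """Soft target update: move each target parameter a small step toward
    its online counterpart. With averaging_rate=0.01, the target keeps
    99% of its previous value on every call."""
    return [
        (1.0 - averaging_rate) * t + averaging_rate * o
        for t, o in zip(target_params, online_params)
    ]
```

Under this convention, a default of 0.995 would have made the target track the online network almost instantly, so the move to 0.01 restores a slowly moving target.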
Removed
- Removed TimeDecay and StepDecay from the “ValueSchedulers” section.
- Removed LearningRateStepDecay and LearningRateTimeDecay from the “Optimizers” section.
- Removed StringSplitter and Tokenizer from the “Others” section.
Fixes
- Fixed some bugs in AdaptiveMomentEstimation and NesterovAcceleratedAdaptiveMomentEstimation under the “Optimizers” section.
- Fixed some bugs where BaseRegularizer and BaseEligibilityTrace returned an “Unknown” class name under the “Regularizers” and “EligibilityTraces” sections.
- Fixed some bugs in CategoricalPolicy under the “QuickSetups” section.