Release Version 2.11

ValueSchedulers

Added these value schedulers:
- Chained
- Constant
- CosineAnnealing
- Exponential
- InverseSquareRoot
- InverseTime
- Linear
- MultipleStep
- Multiplicative
- Polynomial
- Sequential
- Step
ValueSchedulers can now be used in place of Optimizers to schedule the learning rate.
Removed TimeDecay and StepDecay.
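
As a rough illustration of what a few of the schedulers listed above compute, here is a minimal Python sketch of the textbook step, exponential and cosine annealing schedules. The function and parameter names are hypothetical and are not the library's API; the library's own formulas and defaults may differ.

```python
import math

def step_value(initial, time_step, step_size=10, decay_rate=0.5):
    # Step schedule: multiply by decay_rate once every step_size time steps.
    return initial * decay_rate ** (time_step // step_size)

def exponential_value(initial, time_step, decay_rate=0.9):
    # Exponential schedule: multiply by decay_rate at every time step.
    return initial * decay_rate ** time_step

def cosine_annealing_value(initial, time_step, maximum_time_step=100, minimum=0.0):
    # Cosine annealing: follow half a cosine wave from initial down to minimum.
    t = min(time_step, maximum_time_step)
    cosine = math.cos(math.pi * t / maximum_time_step)
    return minimum + 0.5 * (initial - minimum) * (1 + cosine)

# Hypothetical usage: a scheduled learning rate in plain gradient descent on x^2.
x = 5.0
for time_step in range(50):
    learning_rate = cosine_annealing_value(0.1, time_step, maximum_time_step=50)
    x -= learning_rate * (2 * x)  # the gradient of x^2 is 2x
```

Because each schedule is a function of the time step only, the same function can drive a learning rate or any other scalar that should change during training.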

Optimizers

- AdaptiveFactor
- AdaptiveMomentEstimationWeightDecay
- RectifiedAdaptiveMomentEstimation

For all optimizers, the calculate() function now accepts ModelParameters as its third parameter.
Renamed AdaptiveGradientDelta to AdaptiveDelta.
The Optimizers’ “internalParameterArray” value is now set to nil instead of an empty table when calling the new() constructor and the reset() function.
Removed LearningRateStepDecay and LearningRateTimeDecay.
Fixed some bugs in AdaptiveMomentEstimation and NesterovAcceleratedAdaptiveMomentEstimation.
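
One plausible reason an optimizer would need the model parameters is decoupled weight decay, where the decay term is computed from the parameters themselves rather than from the gradient. The Python sketch below shows the standard Adam update with decoupled weight decay under that assumption. The function name, argument order and "state" dictionary are hypothetical and do not reflect the library's calculate() signature; the state starting as None merely echoes the nil initial value mentioned above.

```python
import math

def adamw_step(gradient, parameters, state=None, learning_rate=0.001,
               beta1=0.9, beta2=0.999, epsilon=1e-8, weight_decay=0.01):
    # Lazily create the internal state on first use (assumed behaviour).
    if state is None:
        state = {"m": [0.0] * len(parameters),
                 "v": [0.0] * len(parameters),
                 "t": 0}
    state["t"] += 1
    t = state["t"]
    new_parameters = []
    for i, (g, p) in enumerate(zip(gradient, parameters)):
        # Exponential moving averages of the gradient and the squared gradient.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias-corrected moment estimates.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        # Decoupled weight decay: this term depends on the parameter p itself.
        step = m_hat / (math.sqrt(v_hat) + epsilon) + weight_decay * p
        new_parameters.append(p - learning_rate * step)
    return new_parameters, state
```

With a zero gradient, the parameters still shrink toward zero, which is only possible because the update can read them.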

Added these tabular reinforcement learning models:
- TabularClippedDoubleQLearning
- TabularDoubleQLearningV1
- TabularDoubleQLearningV2
- TabularDoubleStateActionRewardStateActionV1
- TabularDoubleStateActionRewardStateActionV2
- TabularDoubleExpectedStateActionRewardStateActionV1
- TabularDoubleExpectedStateActionRewardStateActionV2
All tabular reinforcement learning models can now be used inside the CategoricalPolicy quick setup.
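
For readers unfamiliar with the "double" variants, the sketch below shows the classic tabular double Q-learning update with two Q-tables: one table selects the best next action and the other evaluates it. This is a textbook illustration in Python with hypothetical names. How the library's V1 and V2 models differ is not stated here, so the sketch should not be read as either of them.

```python
import random

def double_q_update(q_a, q_b, state, action, reward, next_state,
                    learning_rate=0.1, discount_factor=0.95, rng=random):
    # Randomly choose which table to update; the other one evaluates.
    if rng.random() < 0.5:
        update, evaluate = q_a, q_b
    else:
        update, evaluate = q_b, q_a
    # Select the greedy next action using the table being updated...
    next_values = update[next_state]
    best_action = max(range(len(next_values)), key=next_values.__getitem__)
    # ...but take its value from the other table to reduce overestimation.
    target = reward + discount_factor * evaluate[next_state][best_action]
    update[state][action] += learning_rate * (target - update[state][action])

# Hypothetical usage: two states, two actions, tables stored as nested lists.
q_a = [[0.0, 0.0], [0.0, 0.0]]
q_b = [[0.0, 0.0], [0.0, 0.0]]
double_q_update(q_a, q_b, state=0, action=1, reward=1.0, next_state=1)
```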
TabularReinforcementLearningBaseModel’s new() constructor now accepts “Optimizer” as one of its parameters.
TabularQLearning, TabularStateActionRewardStateAction and TabularExpectedStateActionRewardStateAction now use an EligibilityTrace object instead of a numerical “lambda” value.
Added OneVsOne.
Made some internal code changes to DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models.
Changed the default value of the “averagingRate” parameter for the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models from 0.995 to 0.01.
Optimized the NeuralNetwork code so that the activation function calculation is no longer performed for bias values.
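
To give a sense of scale for the “averagingRate” default mentioned above, here is a minimal Python sketch of a soft (Polyak-style) target update. It assumes the rate is the weight given to the primary parameters. That convention, and the function name, are assumptions made for illustration and are not confirmed by these notes.

```python
def soft_update(target_parameters, primary_parameters, averaging_rate=0.01):
    # Move each target parameter a small fraction of the way toward the
    # corresponding primary parameter (assumed weighting convention).
    return [(1 - averaging_rate) * t + averaging_rate * p
            for t, p in zip(target_parameters, primary_parameters)]

# Hypothetical usage: the target drifts slowly toward the primary parameters.
target = [0.0, 0.0]
primary = [1.0, -1.0]
for _ in range(100):
    target = soft_update(target, primary)
```

Under this assumption, a rate of 0.01 means the target parameters track the primary ones slowly, which is the usual intent of a soft update.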

BaseEligibilityTrace’s new() constructor function now accepts “mode” as one of its parameters.
Added “stateIndex” parameter as the first parameter for BaseEligibilityTrace’s incrementFunction().
Fixed some bugs where BaseEligibilityTrace returned an “Unknown” class name.
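
As background on what an eligibility trace object does, the Python sketch below shows the two textbook increment rules (accumulating and replacing) together with the usual decay step. The increment functions take the state index first to mirror the parameter order described above, but every name here is hypothetical, and the meaning of the “mode” parameter is not covered.

```python
def decay_trace(trace, discount_factor, lambda_value):
    # Every entry fades by discount_factor * lambda_value at each time step.
    for row in trace:
        for action_index in range(len(row)):
            row[action_index] *= discount_factor * lambda_value

def accumulating_increment(state_index, action_index, trace):
    # Accumulating trace: add 1 to the visited state-action pair.
    trace[state_index][action_index] += 1.0

def replacing_increment(state_index, action_index, trace):
    # Replacing trace: reset the visited state-action pair to exactly 1.
    trace[state_index][action_index] = 1.0

# Hypothetical usage: two states, two actions, pair (0, 1) visited twice.
trace = [[0.0, 0.0], [0.0, 0.0]]
accumulating_increment(0, 1, trace)
decay_trace(trace, discount_factor=0.5, lambda_value=0.5)
accumulating_increment(0, 1, trace)
```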

Added “StableSoftmaxSampling” and “StableBoltzmannSampling” options to the CategoricalPolicy.
Fixed some bugs in CategoricalPolicy.
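
The “Stable” prefix most likely refers to the common trick of subtracting the maximum value before exponentiating, which avoids overflow for large inputs without changing the resulting probabilities. The Python sketch below illustrates that trick and samples an action from the result. It is an assumption about what the options do, and the names are hypothetical.

```python
import math
import random

def stable_softmax(values, temperature=1.0):
    # Dividing by a temperature gives the Boltzmann form; 1.0 is plain softmax.
    scaled = [v / temperature for v in values]
    # Subtracting the maximum keeps every exponent at or below zero.
    maximum = max(scaled)
    exponentials = [math.exp(s - maximum) for s in scaled]
    total = sum(exponentials)
    return [e / total for e in exponentials]

def sample_action(values, temperature=1.0, rng=random):
    # Draw one action index with probability given by the stable softmax.
    probabilities = stable_softmax(values, temperature)
    return rng.choices(range(len(values)), weights=probabilities, k=1)[0]

# Hypothetical usage: a naive exp(1000.0) would overflow, but this does not.
probabilities = stable_softmax([1000.0, 1000.0, 999.0])
action = sample_action([1000.0, 1000.0, 999.0], rng=random.Random(0))
```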