Release Version 2.11

ValueSchedulers

  • Added these value schedulers:

    • Chained

    • Constant

    • CosineAnnealing

    • Exponential

    • InverseSquareRoot

    • InverseTime

    • Linear

    • MultipleStep

    • Multiplicative

    • Polynomial

    • Sequential

    • Step

  • ValueSchedulers can now be used in place of Optimizers for scheduling the learning rate.

  • Removed TimeDecay and StepDecay.
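
As a rough illustration of what a value scheduler computes, the sketch below implements a step schedule and a cosine annealing schedule in Python. It is not the library's implementation or API: the function names, parameter names and formulas are assumptions chosen for illustration, and the library's schedulers may define these quantities differently.

```python
import math


def step_schedule(initial_value, decay_rate, step_size, time_step):
    """Hypothetical step schedule: multiply the value by decay_rate
    once every step_size time steps."""
    return initial_value * decay_rate ** (time_step // step_size)


def cosine_annealing_schedule(max_value, min_value, period, time_step):
    """Hypothetical cosine annealing: move from max_value to min_value
    along half a cosine wave over `period` time steps, then hold."""
    progress = min(time_step, period) / period
    return min_value + 0.5 * (max_value - min_value) * (1 + math.cos(math.pi * progress))


# Learning rate seen by an optimizer at a few time steps (illustrative only).
step_values = [step_schedule(0.1, 0.5, 10, t) for t in (0, 9, 10, 25)]
cosine_values = [cosine_annealing_schedule(0.1, 0.001, 100, t) for t in (0, 50, 100)]
```

The point of the sketch is that the scheduled value depends only on the time step, which is what allows a scheduler to stand in wherever a changing learning rate is needed.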

Optimizers

  • Added these optimizers:

    • ResilientBackwardPropagation

    • AdaptiveFactor

    • AdaptiveMomentEstimationWeightDecay

    • RectifiedAdaptiveMomentEstimation

  • The calculate() function now accepts ModelParameters as the third parameter for all the optimizers.

  • Renamed AdaptiveGradientDelta to AdaptiveDelta.

  • The Optimizers’ “internalParameterArray” value is now set to nil instead of an empty table by the new() constructor and the reset() function.

  • Removed LearningRateStepDecay and LearningRateTimeDecay.

  • Fixed some bugs in AdaptiveMomentEstimation and NesterovAcceleratedAdaptiveMomentEstimation.

Models

  • Added these tabular reinforcement learning models:

    • TabularClippedDoubleQLearning

    • TabularDoubleQLearningV1

    • TabularDoubleQLearningV2

    • TabularDoubleStateActionRewardStateActionV1

    • TabularDoubleStateActionRewardStateActionV2

    • TabularDoubleExpectedStateActionRewardStateActionV1

    • TabularDoubleExpectedStateActionRewardStateActionV2

  • All tabular reinforcement learning models can now be used inside CategoricalPolicy quick setup.

  • TabularReinforcementLearningBaseModel’s new() constructor now accepts “Optimizer” as one of its parameters.

  • TabularQLearning, TabularStateActionRewardStateAction and TabularExpectedStateActionRewardStateAction now use an EligibilityTrace object instead of a numerical “lambda” value.

  • Added OneVsOne.

  • Made some internal code changes to DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models.

  • Changed the default value of the “averagingRate” parameter for the DeepDoubleQLearningV2, DeepDoubleStateActionRewardStateActionV2 and DeepDoubleExpectedStateActionRewardStateActionV2 models from 0.995 to 0.01.

  • Optimized the NeuralNetwork code so that the activation function calculation is no longer performed for bias values.
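
These notes do not state how “averagingRate” enters the target network update. One common convention, assumed here purely for illustration, is a soft update in which the averaging rate weights the primary parameters, so a small value such as 0.01 moves the target parameters slowly. The Python sketch below uses hypothetical names and is not the library's code.

```python
def soft_update(target_parameters, primary_parameters, averaging_rate=0.01):
    """Hypothetical soft target update (assumed convention):
    target <- (1 - averaging_rate) * target + averaging_rate * primary."""
    return [
        (1 - averaging_rate) * target + averaging_rate * primary
        for target, primary in zip(target_parameters, primary_parameters)
    ]


# With the assumed convention, one update moves the target 1% of the way
# towards the primary parameters.
updated_target = soft_update([0.0, 1.0], [1.0, 1.0])
```

Under this assumed convention a rate of 0.995 would nearly copy the primary parameters on every update, so the new default gives a much slower-moving target.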

EligibilityTraces

  • BaseEligibilityTrace’s new() constructor function now accepts “mode” as one of its parameters.

  • Added “stateIndex” parameter as the first parameter for BaseEligibilityTrace’s incrementFunction().

  • Fixed some bugs where BaseEligibilityTrace returned an “Unknown” class name.

QuickSetup

  • Added “StableSoftmaxSampling” and “StableBoltzmannSampling” options to the CategoricalPolicy.

  • Fixed some bugs in CategoricalPolicy.
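
The notes do not describe how the “Stable” sampling options differ from the existing ones. A widely used way to stabilize softmax (Boltzmann) sampling is to subtract the largest scaled value before exponentiating, which avoids overflow without changing the resulting probabilities. The Python sketch below shows that technique under this assumption. The function names and the temperature parameter are hypothetical and are not taken from the library.

```python
import math
import random


def stable_softmax(values, temperature=1.0):
    """Softmax with the maximum subtracted before exponentiation,
    so large action values do not overflow math.exp."""
    scaled = [value / temperature for value in values]
    max_scaled = max(scaled)
    exponentials = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exponentials)
    return [e / total for e in exponentials]


def sample_action(values, temperature=1.0):
    """Draw an action index with probability given by the stable softmax."""
    probabilities = stable_softmax(values, temperature)
    return random.choices(range(len(values)), weights=probabilities, k=1)[0]


# math.exp(1000.0) alone would raise OverflowError; the shifted version does not.
probabilities = stable_softmax([1000.0, 1001.0, 1002.0])
action_index = sample_action([1000.0, 1001.0, 1002.0])
```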

Others

  • Removed StringSplitter and Tokenizer.

Regularizers

  • Fixed some bugs where BaseRegularizer returned an “Unknown” class name.