Beta Version 1.25.0

Added

  • Added ProximalPolicyOptimizationClip model.

  • Added ReinforcementLearningActorCriticNeuralNetworkBaseModel model.

Changes

  • Refactored the code for ProximalPolicyOptimization, VanillaPolicyGradient, ActorCritic and AdvantageActorCritic so that they inherit from ReinforcementLearningActorCriticNeuralNetworkBaseModel.

  • Episode updates now run at the end of the final reinforcement of the same episode. Previously, they ran before the first reinforcement of the next episode.
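
The timing change for episode updates can be illustrated with a small, library-independent sketch. It is not the library's code. The function `run_episodes`, its parameters and the log strings are hypothetical names made up for this illustration, and a fixed number of reinforcements per episode is assumed.

```python
def run_episodes(total_steps, steps_per_episode, update_at_episode_end):
    """Toy simulation (hypothetical, not the library's API) of when an
    episode update fires relative to the reinforcement calls."""
    log = []
    pending_episode = None  # episode whose update is still waiting (old timing)

    for i in range(total_steps):
        episode = i // steps_per_episode + 1
        step = i % steps_per_episode + 1

        # Old timing: the previous episode's update runs just before the
        # first reinforcement of the next episode.
        if pending_episode is not None:
            log.append(f"update ep{pending_episode}")
            pending_episode = None

        log.append(f"reinforce ep{episode} step{step}")

        if step == steps_per_episode:
            if update_at_episode_end:
                # New timing: update at the end of the final reinforcement
                # of the same episode.
                log.append(f"update ep{episode}")
            else:
                pending_episode = episode

    return log


new_log = run_episodes(4, 2, update_at_episode_end=True)
old_log = run_episodes(4, 2, update_at_episode_end=False)
print(new_log)
print(old_log)
```

In this toy model, the update for the last episode only appears with the new timing. With the old timing it would stay pending until another reinforcement is made.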