Beta Version 1.21.0

Added

  • Added ProximalPolicyOptimization and VanillaPolicyGradient for reinforcement learning.