DataPredict

API Reference - AqwamCustomModels - WeightProximalPolicyOptimizationClip (WPPO-Clip)

WeightProximalPolicyOptimizationClip is a base class for reinforcement learning.

It is a modified ProximalPolicyOptimizationClip in which the ratio of the weights is used instead of the action probability vector. The intent is that directly optimizing the weights makes training more sample efficient, since no backpropagation is required.
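To make the clipping idea concrete, here is a minimal sketch of the standard PPO-Clip surrogate term applied to a single scalar ratio. The function name `clippedSurrogate`, the sample weight values, and the use of a plain new-to-old weight quotient as the ratio are assumptions for illustration. This page does not show how the library actually forms or applies the weight ratio.

```lua
-- Standard PPO-Clip surrogate term for one ratio and one advantage value.
-- Hypothetical helper for illustration, not part of DataPredict.
local function clippedSurrogate(ratio, advantage, clipRatio)
    local lowerBound = 1 - clipRatio
    local upperBound = 1 + clipRatio
    -- Keep the ratio inside [1 - clipRatio, 1 + clipRatio].
    local clippedRatio = math.max(lowerBound, math.min(ratio, upperBound))
    -- Take the more pessimistic of the unclipped and clipped terms.
    return math.min(ratio * advantage, clippedRatio * advantage)
end

-- Assumed sample values: a single weight before and after an update.
local oldWeight = 0.50
local newWeight = 0.75
local weightRatio = newWeight / oldWeight

print(clippedSurrogate(weightRatio, 2, 0.2))
```

With a ratio of 1.5 and a clipRatio of 0.2, the clipped term (ratio limited to 1.2) is the smaller of the two, so the update is bounded no matter how far the ratio has moved.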

Notes

Constructors

new()

Creates a new model object. If any argument is nil, the default value for that argument is used.

WeightProximalPolicyOptimizationClip.new(clipRatio: number, discountFactor: number): ModelObject
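The nil-fallback behaviour of the constructor can be sketched with a small stand-in class. `StandInModel` and both default values below are placeholders invented for this example. The library's real defaults are not listed on this page.

```lua
-- Stand-in class for illustration only, NOT the DataPredict implementation.
local StandInModel = {}
StandInModel.__index = StandInModel

-- Placeholder defaults (assumed values, not taken from the library).
local defaultClipRatio = 0.3
local defaultDiscountFactor = 0.95

function StandInModel.new(clipRatio, discountFactor)
    local self = setmetatable({}, StandInModel)
    -- A nil argument falls back to its default value.
    self.clipRatio = clipRatio or defaultClipRatio
    self.discountFactor = discountFactor or defaultDiscountFactor
    return self
end

-- Both arguments supplied.
local modelA = StandInModel.new(0.2, 0.99)
-- clipRatio omitted, so the default is used for it.
local modelB = StandInModel.new(nil, 0.9)

print(modelA.clipRatio, modelA.discountFactor)
print(modelB.clipRatio, modelB.discountFactor)
```

Against the real class, the equivalent call would presumably be `WeightProximalPolicyOptimizationClip.new(nil, 0.9)`, following the signature above.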

Parameters:

Returns:

Functions

setParameters()

Sets the model's parameters. If any argument is nil, the previous value for that argument is kept.

WeightProximalPolicyOptimizationClip:setParameters(clipRatio: number, discountFactor: number)
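The keep-previous-value behaviour can be sketched the same way. `StandInModel` is a hypothetical stand-in written for this example. It only mirrors the documented rule that a nil argument leaves the stored value untouched.

```lua
-- Stand-in class for illustration only, NOT the DataPredict implementation.
local StandInModel = {}
StandInModel.__index = StandInModel

function StandInModel.new(clipRatio, discountFactor)
    return setmetatable({
        clipRatio = clipRatio,
        discountFactor = discountFactor,
    }, StandInModel)
end

function StandInModel:setParameters(clipRatio, discountFactor)
    -- A nil argument leaves the previously stored value unchanged.
    if clipRatio ~= nil then
        self.clipRatio = clipRatio
    end
    if discountFactor ~= nil then
        self.discountFactor = discountFactor
    end
end

local model = StandInModel.new(0.2, 0.99)

-- Update only discountFactor; clipRatio keeps its earlier value.
model:setParameters(nil, 0.9)

print(model.clipRatio, model.discountFactor)
```

Passing nil for the arguments you do not want to change lets you adjust one parameter without restating the other.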

Parameters:

Inherited From

References