mlpack
mlpack::rl::GreedyPolicy< EnvironmentType > Class Template Reference

Implementation for epsilon greedy policy.

#include <greedy_policy.hpp>

Public Types

using ActionType = typename EnvironmentType::Action
 Convenient typedef for action.
 

Public Member Functions

 GreedyPolicy (const double initialEpsilon, const size_t annealInterval, const double minEpsilon, const double decayRate=1.0)
 Constructor for epsilon greedy policy class.
 
ActionType Sample (const arma::colvec &actionValue, bool deterministic=false, const bool isNoisy=false)
 Sample an action based on given action values.
 
void Anneal ()
 Exploration probability will anneal at each step.
 
const double & Epsilon () const
 

Detailed Description

template<typename EnvironmentType>
class mlpack::rl::GreedyPolicy< EnvironmentType >

Implementation for epsilon greedy policy.

In general, the policy selects an action greedily based on the action values. With probability epsilon, it instead selects an action at random to encourage exploration.

Template Parameters
EnvironmentType  The reinforcement learning task.

Constructor & Destructor Documentation

◆ GreedyPolicy()

template<typename EnvironmentType >
mlpack::rl::GreedyPolicy< EnvironmentType >::GreedyPolicy ( const double  initialEpsilon,
const size_t  annealInterval,
const double  minEpsilon,
const double  decayRate = 1.0 
)
inline

Constructor for epsilon greedy policy class.

Parameters
initialEpsilon  The initial probability to explore (select a random action).
annealInterval  The number of steps over which the probability to explore will anneal.
minEpsilon  Epsilon will never be less than this value.
decayRate  Factor that scales how quickly epsilon decreases during annealing (default 1.0).
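
The constructor parameters describe a schedule in which epsilon falls from initialEpsilon towards minEpsilon over annealInterval calls to Anneal(). The following is a minimal sketch of such a schedule, assuming a linear decrease and assuming that decayRate simply scales the per-step decrement. LinearEpsilonSchedule and its members are hypothetical names used for illustration; they are not part of mlpack.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the annealing state of an epsilon greedy policy.
// This is an illustration only, not mlpack's class.
struct LinearEpsilonSchedule
{
  LinearEpsilonSchedule(const double initialEpsilon,
                        const std::size_t annealInterval,
                        const double minEpsilon,
                        const double decayRate = 1.0) :
      epsilon(initialEpsilon),
      lowerBound(minEpsilon),
      delta(0.0)
  {
    assert(annealInterval > 0);
    // Assumed step size: spread the total drop evenly over the interval and
    // scale it by decayRate.
    delta = (initialEpsilon - minEpsilon) * decayRate /
        static_cast<double>(annealInterval);
  }

  // Decrease epsilon by one step, never going below the lower bound.
  void Anneal() { epsilon = std::max(lowerBound, epsilon - delta); }

  double epsilon;     // Current probability to explore.
  double lowerBound;  // Plays the role of minEpsilon.
  double delta;       // Per-step decrement.
};
```

Under these assumptions, a schedule built with (1.0, 10, 0.1) drops by 0.09 per call to Anneal() and stays at 0.1 once the lower bound is reached.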

Member Function Documentation

◆ Epsilon()

template<typename EnvironmentType >
const double& mlpack::rl::GreedyPolicy< EnvironmentType >::Epsilon ( ) const
inline
Returns
Current probability to explore.

◆ Sample()

template<typename EnvironmentType >
ActionType mlpack::rl::GreedyPolicy< EnvironmentType >::Sample ( const arma::colvec &  actionValue,
bool  deterministic = false,
const bool  isNoisy = false 
)
inline

Sample an action based on given action values.

Parameters
actionValue  Values for each action.
deterministic  Always select the action greedily.
isNoisy  Specifies whether the network used is noisy.
Returns
Sampled action.
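
To make the roles of the three parameters concrete, here is a minimal sketch of an epsilon greedy selection rule over a plain vector of action values. SampleIndex is a hypothetical free function, not mlpack's implementation: it returns an index instead of an ActionType, takes epsilon and a random engine explicitly, and assumes that random exploration is skipped when isNoisy is true (on the reasoning that a noisy network supplies its own exploration).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical helper mirroring the documented parameters of Sample().
// Illustration only; not part of mlpack.
inline std::size_t SampleIndex(const std::vector<double>& actionValue,
                               const double epsilon,
                               std::mt19937& rng,
                               const bool deterministic = false,
                               const bool isNoisy = false)
{
  assert(!actionValue.empty());

  std::uniform_real_distribution<double> unit(0.0, 1.0);
  // Assumption: explore only when neither flag is set and the random draw
  // falls below epsilon.
  if (!deterministic && !isNoisy && unit(rng) < epsilon)
  {
    // Explore: choose uniformly among all actions.
    std::uniform_int_distribution<std::size_t> pick(0, actionValue.size() - 1);
    return pick(rng);
  }

  // Exploit: choose the first action with the largest value.
  const auto best = std::max_element(actionValue.begin(), actionValue.end());
  return static_cast<std::size_t>(std::distance(actionValue.begin(), best));
}
```

With deterministic set to true, or with epsilon equal to 0, this sketch always returns the index of the largest action value; otherwise the result depends on the random draw.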
