mlpack
mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > Class Template Reference

One step Q-Learning worker.

#include <async_learning.hpp>

Public Types

using StateType = typename EnvironmentType::State
 
using ActionType = typename EnvironmentType::Action
 
using TransitionType = std::tuple< StateType, ActionType, double, StateType >
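
The three aliases above only require that the environment exposes nested `State` and `Action` types. The following is a minimal sketch of that pattern in plain C++, assuming a hypothetical `ToyEnvironment`; the names `ToyEnvironment`, `StateOf`, `ActionOf`, `TransitionOf`, `MakeToyTransition` and `RewardOf` are illustrative and not part of mlpack.

```cpp
#include <tuple>

// Hypothetical stand-in for an environment. Only the nested State and Action
// type names mirror what the aliases above rely on.
struct ToyEnvironment
{
  struct State { double position; };
  enum class Action { Left, Right };
};

// The same alias pattern, written as free alias templates for illustration.
template<typename EnvironmentType>
using StateOf = typename EnvironmentType::State;

template<typename EnvironmentType>
using ActionOf = typename EnvironmentType::Action;

// (state, action, reward, next state), matching the TransitionType layout.
template<typename EnvironmentType>
using TransitionOf = std::tuple<StateOf<EnvironmentType>,
                                ActionOf<EnvironmentType>,
                                double,
                                StateOf<EnvironmentType>>;

// Build one transition: move right from position 0 to 1 with reward 1.5.
constexpr TransitionOf<ToyEnvironment> MakeToyTransition()
{
  return std::make_tuple(ToyEnvironment::State{0.0},
                         ToyEnvironment::Action::Right,
                         1.5,
                         ToyEnvironment::State{1.0});
}

// Read the reward back out; index 2 is the double in the tuple.
constexpr double RewardOf(const TransitionOf<ToyEnvironment>& transition)
{
  return std::get<2>(transition);
}
```

Elements are accessed by position with `std::get`, so index 0 is the state, 1 the action, 2 the reward and 3 the next state.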
 

Public Member Functions

 OneStepQLearningWorker (const UpdaterType &updater, const EnvironmentType &environment, const TrainingConfig &config, bool deterministic)
 Construct one step Q-Learning worker with the given parameters and environment.
 
 OneStepQLearningWorker (const OneStepQLearningWorker &other)
 Copy another OneStepQLearningWorker.
 
 OneStepQLearningWorker (OneStepQLearningWorker &&other)
 Take ownership of another OneStepQLearningWorker.
 
OneStepQLearningWorker & operator= (const OneStepQLearningWorker &other)
 Copy another OneStepQLearningWorker.
 
OneStepQLearningWorker & operator= (OneStepQLearningWorker &&other)
 Take ownership of another OneStepQLearningWorker.
 
 ~OneStepQLearningWorker ()
 Clean memory.
 
void Initialize (NetworkType &learningNetwork)
 Initialize the worker.
 
bool Step (NetworkType &learningNetwork, NetworkType &targetNetwork, size_t &totalSteps, PolicyType &policy, double &totalReward)
 The agent will execute one step.
 

Detailed Description

template<typename EnvironmentType, typename NetworkType, typename UpdaterType, typename PolicyType>
class mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >

One step Q-Learning worker.

Template Parameters
EnvironmentType    The type of the reinforcement learning task.
NetworkType        The type of the network model.
UpdaterType        The type of the optimizer.
PolicyType         The type of the behavior policy.
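
For orientation, the textbook one-step Q-learning target for a transition is the reward plus the discounted best action value in the next state, or just the reward when the next state is terminal. The sketch below shows only that arithmetic in plain C++; it is not taken from the mlpack source, and `OneStepTarget`, `discount` and `nextQ` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cstddef>

// Textbook one-step Q-learning target (an illustration, not mlpack's code):
//   target = reward                                  if the step is terminal
//   target = reward + discount * max_a nextQ[a]      otherwise
// nextQ holds the action values estimated for the next state.
template<std::size_t NumActions>
constexpr double OneStepTarget(const double reward,
                               const double discount,
                               const std::array<double, NumActions>& nextQ,
                               const bool terminal)
{
  static_assert(NumActions > 0, "at least one action is required");

  if (terminal)
    return reward;

  // Find the largest action value in the next state.
  double best = nextQ[0];
  for (std::size_t a = 1; a < NumActions; ++a)
  {
    if (nextQ[a] > best)
      best = nextQ[a];
  }

  return reward + discount * best;
}
```

With a reward of 1.0, a discount of 0.5 and next-state values {0.0, 4.0, 2.0}, the target is 1.0 + 0.5 * 4.0 = 3.0; for a terminal step it is 1.0.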

Constructor & Destructor Documentation

◆ OneStepQLearningWorker() [1/3]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::OneStepQLearningWorker ( const UpdaterType &  updater,
const EnvironmentType &  environment,
const TrainingConfig &  config,
bool  deterministic 
)
inline

Construct one step Q-Learning worker with the given parameters and environment.

Parameters
updater          The optimizer.
environment      The reinforcement learning task.
config           Hyper-parameters.
deterministic    Whether it should be deterministic.

◆ OneStepQLearningWorker() [2/3]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::OneStepQLearningWorker ( const OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &  other)
inline

Copy another OneStepQLearningWorker.

Parameters
other    OneStepQLearningWorker to copy.

◆ OneStepQLearningWorker() [3/3]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::OneStepQLearningWorker ( OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &&  other)
inline

Take ownership of another OneStepQLearningWorker.

Parameters
other    OneStepQLearningWorker to take ownership of.

Member Function Documentation

◆ Initialize()

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
void mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::Initialize ( NetworkType &  learningNetwork)
inline

Initialize the worker.

Parameters
learningNetwork    The shared network.

◆ operator=() [1/2]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
OneStepQLearningWorker& mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::operator= ( const OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &  other)
inline

Copy another OneStepQLearningWorker.

Parameters
other    OneStepQLearningWorker to copy.

◆ operator=() [2/2]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
OneStepQLearningWorker& mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::operator= ( OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &&  other)
inline

Take ownership of another OneStepQLearningWorker.

Parameters
other    OneStepQLearningWorker to take ownership of.

◆ Step()

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
bool mlpack::rl::OneStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::Step ( NetworkType &  learningNetwork,
NetworkType &  targetNetwork,
size_t &  totalSteps,
PolicyType &  policy,
double &  totalReward 
)
inline

The agent will execute one step.

Parameters
learningNetwork    The shared learning network.
targetNetwork      The shared target network.
totalSteps         The shared counter for total steps.
policy             The shared behavior policy.
totalReward        Set to the episode return if the episode ends after this step; otherwise its value is invalid.
Returns
Whether the current episode ends after this step.
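
The contract above suggests a driver loop that calls `Step()` repeatedly and reads `totalReward` only once the call reports the end of the episode. The sketch below illustrates that pattern with a hypothetical `MockWorker` in plain C++. It assumes a return value of `true` means the episode ended, and its `Step()` omits the network and policy arguments of the real signature; `MockWorker`, `RunOneEpisode` and `DemoReturn` are not part of mlpack.

```cpp
#include <cstddef>

// Hypothetical stand-in for a worker: every step earns a reward of 1 and the
// episode ends after episodeLength steps.
struct MockWorker
{
  std::size_t episodeLength;
  std::size_t stepsInEpisode = 0;
  double episodeReturn = 0.0;

  // Follows the documented contract: returns true when the episode ends, and
  // writes the episode return to totalReward only in that case.
  constexpr bool Step(std::size_t& totalSteps, double& totalReward)
  {
    ++totalSteps;
    ++stepsInEpisode;
    episodeReturn += 1.0;

    if (stepsInEpisode < episodeLength)
      return false;

    totalReward = episodeReturn;
    stepsInEpisode = 0;
    episodeReturn = 0.0;
    return true;
  }
};

// Step until the episode ends, then return the episode return.
constexpr double RunOneEpisode(MockWorker& worker, std::size_t& totalSteps)
{
  double totalReward = 0.0;
  while (!worker.Step(totalSteps, totalReward))
  {
    // totalReward is not meaningful here; the episode is still running.
  }
  return totalReward;
}

// Run a three-step episode from a fresh worker and a zeroed step counter.
constexpr double DemoReturn()
{
  MockWorker worker{3};
  std::size_t totalSteps = 0;
  return RunOneEpisode(worker, totalSteps);
}
```

Because `totalSteps` is passed by reference, the same counter can be handed to successive calls, which is how a counter shared across steps would accumulate.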
