mlpack
mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > Class Template Reference

N-step Q-Learning worker.

#include <async_learning.hpp>

Public Types

using StateType = typename EnvironmentType::State
 
using ActionType = typename EnvironmentType::Action
 
using TransitionType = std::tuple< StateType, ActionType, double, StateType >
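
For orientation, TransitionType bundles one environment step. The fragment below is a minimal sketch that assumes this class's type aliases are in scope; it is illustrative only.

    // Sketch (fragment): one transition is (state, action, reward, nextState).
    StateType state, nextState;     // EnvironmentType::State values
    ActionType action;              // EnvironmentType::Action value
    double reward = 0.0;
    TransitionType transition = std::make_tuple(state, action, reward, nextState);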
 

Public Member Functions

 NStepQLearningWorker (const UpdaterType &updater, const EnvironmentType &environment, const TrainingConfig &config, bool deterministic)
 Construct N-step Q-Learning worker with the given parameters and environment.
 
 NStepQLearningWorker (const NStepQLearningWorker &other)
 Copy another NStepQLearningWorker.
 
 NStepQLearningWorker (NStepQLearningWorker &&other)
 Take ownership of another NStepQLearningWorker.
 
NStepQLearningWorker & operator= (const NStepQLearningWorker &other)
 Copy another NStepQLearningWorker.
 
NStepQLearningWorker & operator= (NStepQLearningWorker &&other)
 Take ownership of another NStepQLearningWorker.
 
 ~NStepQLearningWorker ()
 Clean up memory.
 
void Initialize (NetworkType &learningNetwork)
 Initialize the worker.
 
bool Step (NetworkType &learningNetwork, NetworkType &targetNetwork, size_t &totalSteps, PolicyType &policy, double &totalReward)
 The agent will execute one step.
 

Detailed Description

template<typename EnvironmentType, typename NetworkType, typename UpdaterType, typename PolicyType>
class mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >

N-step Q-Learning worker.

Each worker runs one agent against the environment and contributes updates to the shared learning network used by the asynchronous learning framework (async_learning.hpp).

Template Parameters
    EnvironmentType    The type of the reinforcement learning task.
    NetworkType        The type of the network model.
    UpdaterType        The type of the optimizer.
    PolicyType         The type of the behavior policy.
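
As a rough sketch of how these template parameters might be filled in: the concrete types and header paths below (CartPole, FFN, ens::AdamUpdate, GreedyPolicy, and the mlpack 3.x include layout) are assumptions for illustration, not requirements of this class.

    // Illustrative sketch; concrete types and header paths are assumptions.
    #include <mlpack/methods/reinforcement_learning/async_learning.hpp>
    #include <mlpack/methods/reinforcement_learning/environment/cart_pole.hpp>
    #include <mlpack/methods/reinforcement_learning/policy/greedy_policy.hpp>
    #include <mlpack/methods/ann/ffn.hpp>
    #include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>
    #include <mlpack/methods/ann/init_rules/gaussian_init.hpp>
    #include <ensmallen.hpp>

    using EnvironmentType = mlpack::rl::CartPole;
    using NetworkType     = mlpack::ann::FFN<mlpack::ann::MeanSquaredError<>,
                                             mlpack::ann::GaussianInitialization>;
    using UpdaterType     = ens::AdamUpdate;
    using PolicyType      = mlpack::rl::GreedyPolicy<EnvironmentType>;

    using WorkerType = mlpack::rl::NStepQLearningWorker<
        EnvironmentType, NetworkType, UpdaterType, PolicyType>;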

Constructor & Destructor Documentation

◆ NStepQLearningWorker() [1/3]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::NStepQLearningWorker ( const UpdaterType &  updater,
const EnvironmentType &  environment,
const TrainingConfig &  config,
bool  deterministic 
)
inline

Construct N-step Q-Learning worker with the given parameters and environment.

Parameters
    updater          The optimizer.
    environment      The reinforcement learning task.
    config           Hyper-parameters.
    deterministic    Whether the worker should be deterministic.
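
A minimal, hedged sketch of calling this constructor, continuing the type aliases from the earlier sketch. The TrainingConfig accessors shown (StepLimit(), Discount()) are assumptions drawn from the surrounding mlpack API and should be checked against training_config.hpp.

    // Sketch: build a single worker; names and accessors are illustrative assumptions.
    mlpack::rl::TrainingConfig config;
    config.StepLimit() = 200;    // assumed accessor: cap on steps per episode
    config.Discount()  = 0.99;   // assumed accessor: reward discount factor

    EnvironmentType environment;   // e.g. CartPole from the sketch above
    UpdaterType updater;           // e.g. ens::AdamUpdate
    WorkerType worker(updater, environment, config, /* deterministic */ false);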

◆ NStepQLearningWorker() [2/3]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::NStepQLearningWorker ( const NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &  other)
inline

Copy another NStepQLearningWorker.

Parameters
    other    NStepQLearningWorker to copy.

◆ NStepQLearningWorker() [3/3]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::NStepQLearningWorker ( NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &&  other)
inline

Take ownership of another NStepQLearningWorker.

Parameters
    other    NStepQLearningWorker to take ownership of.

Member Function Documentation

◆ Initialize()

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
void mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::Initialize ( NetworkType &  learningNetwork)
inline

Initialize the worker.

Parameters
    learningNetwork    The shared network.
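
A brief sketch of the expected call, assuming learningNetwork is the model shared across all workers (illustrative; in practice this wiring is handled by the asynchronous learning framework).

    // Sketch: hand the worker the shared learning network before stepping.
    NetworkType learningNetwork;          // the model shared by all workers
    worker.Initialize(learningNetwork);   // typically called once, before Step()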

◆ operator=() [1/2]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
NStepQLearningWorker& mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::operator= ( const NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &  other)
inline

Copy another NStepQLearningWorker.

Parameters
    other    NStepQLearningWorker to copy.

◆ operator=() [2/2]

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
NStepQLearningWorker& mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::operator= ( NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType > &&  other)
inline

Take ownership of another NStepQLearningWorker.

Parameters
    other    NStepQLearningWorker to take ownership of.

◆ Step()

template<typename EnvironmentType , typename NetworkType , typename UpdaterType , typename PolicyType >
bool mlpack::rl::NStepQLearningWorker< EnvironmentType, NetworkType, UpdaterType, PolicyType >::Step ( NetworkType &  learningNetwork,
NetworkType &  targetNetwork,
size_t &  totalSteps,
PolicyType &  policy,
double &  totalReward 
)
inline

The agent will execute one step.

Parameters
    learningNetwork    The shared learning network.
    targetNetwork      The shared target network.
    totalSteps         The shared counter for total steps.
    policy             The shared behavior policy.
    totalReward        Set to the episode return if the episode ends after this step; otherwise its value is not meaningful.
Returns
Whether the current episode ends after this step.
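
A hedged sketch of driving one worker through a single episode, continuing the earlier sketches. The asynchronous learning framework normally owns these shared objects, and the GreedyPolicy constructor arguments shown here are assumptions.

    // Sketch: run one episode; Step() returns true once the episode terminates.
    NetworkType targetNetwork = learningNetwork;   // shared target network
    size_t totalSteps = 0;                         // shared step counter
    double totalReward = 0.0;                      // valid only when an episode ends
    PolicyType policy(1.0, 1000, 0.1);             // assumed (initial eps, anneal interval, min eps)

    bool terminal = false;
    while (!terminal)
    {
      terminal = worker.Step(learningNetwork, targetNetwork, totalSteps, policy,
          totalReward);
    }
    // totalReward now holds the return of the episode that just finished.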
