Implementation of Acrobot game.

#include <acrobot.hpp>

Classes
class	Action

class	State

Public Member Functions
	Acrobot (const size_t maxSteps=500, const double gravity=9.81, const double linkLength1=1.0, const double linkLength2=1.0, const double linkMass1=1.0, const double linkMass2=1.0, const double linkCom1=0.5, const double linkCom2=0.5, const double linkMoi=1.0, const double maxVel1=4 * M_PI, const double maxVel2=9 * M_PI, const double dt=0.2, const double doneReward=0)
	Construct an Acrobot instance using the given constants.

double	Sample (const State &state, const Action &action, State &nextState)
	Dynamics of the Acrobot System.

double	Sample (const State &state, const Action &action)
	Dynamics of the Acrobot System.

State	InitialSample ()
	This function randomly initializes the state.

bool	IsTerminal (const State &state) const
	This function checks if the acrobot has reached the terminal state.

arma::colvec	Dsdt (arma::colvec state, const double torque) const
	These are the ordinary differential equations used to estimate nextState with the RK4 method.

double	Wrap (double value, const double minimum, const double maximum) const
	The Wrap function is required to keep the angle value within -180 to 180 degrees.

double	Torque (const Action &action) const
	This function calculates the torque for a particular action.

arma::colvec	Rk4 (const arma::colvec state, const double torque) const
	This function calls the RK4 iterative method to estimate the next state based on the given ordinary differential equations.

size_t	StepsPerformed () const
	Get the number of steps performed.

size_t	MaxSteps () const
	Get the maximum number of steps allowed.

size_t &	MaxSteps ()
	Set the maximum number of steps allowed.

Detailed Description

Implementation of Acrobot game.

Acrobot is a 2-link pendulum with only the second joint actuated. Initially, both links point downwards. The goal is to swing the end-effector to a height at least the length of one link above the base. Both links can swing freely and can pass by each other, i.e., they don't collide when they have the same angle.

Constructor & Destructor Documentation

◆ Acrobot()

mlpack::rl::Acrobot::Acrobot	(	const size_t	maxSteps = `500`,
		const double	gravity = `9.81`,
		const double	linkLength1 = `1.0`,
		const double	linkLength2 = `1.0`,
		const double	linkMass1 = `1.0`,
		const double	linkMass2 = `1.0`,
		const double	linkCom1 = `0.5`,
		const double	linkCom2 = `0.5`,
		const double	linkMoi = `1.0`,
		const double	maxVel1 = `4 * M_PI`,
		const double	maxVel2 = `9 * M_PI`,
		const double	dt = `0.2`,
		const double	doneReward = `0`
	)

inline

Construct an Acrobot instance using the given constants.

Parameters

maxSteps	The number of steps after which the episode terminates. If the value is 0, there is no limit.
gravity	The gravity parameter.
linkLength1	The length of link 1.
linkLength2	The length of link 2.
linkMass1	The mass of link 1.
linkMass2	The mass of link 2.
linkCom1	The position of the center of mass of link 1.
linkCom2	The position of the center of mass of link 2.
linkMoi	The moments of inertia for both links.
maxVel1	The maximum angular velocity of link 1.
maxVel2	The maximum angular velocity of link 2.
dt	The time step used when integrating the dynamics.
doneReward	The reward received by the agent on success.

Member Function Documentation

◆ Dsdt()

arma::colvec mlpack::rl::Acrobot::Dsdt	(	arma::colvec	state,
		const double	torque
	)		const

inline

These are the ordinary differential equations used to estimate nextState with the RK4 method.

Parameters

state	Current State.
torque	The torque applied.
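
As a rough illustration of what such a derivative function computes, the sketch below uses the textbook Acrobot equations of motion with the constructor defaults documented above. It is an assumption that this class follows the same formulation term by term. `Vec4`, `AcrobotParams`, and `AcrobotDerivative` are hypothetical names, and `std::array` stands in for `arma::colvec` so the example has no dependencies.

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-in for arma::colvec:
// (theta1, theta2, angular velocity 1, angular velocity 2).
using Vec4 = std::array<double, 4>;

// Hypothetical parameter bundle; values are the documented constructor defaults.
struct AcrobotParams
{
  double gravity = 9.81;
  double linkLength1 = 1.0;
  double linkMass1 = 1.0;
  double linkMass2 = 1.0;
  double linkCom1 = 0.5;
  double linkCom2 = 0.5;
  double linkMoi = 1.0;
};

// Time derivative of the state under the textbook equations of motion
// (assumed here, not confirmed against the library source).
Vec4 AcrobotDerivative(const Vec4& state,
                       const double torque,
                       const AcrobotParams& p = AcrobotParams())
{
  const double halfPi = std::acos(-1.0) / 2.0;
  const double m1 = p.linkMass1, m2 = p.linkMass2;
  const double l1 = p.linkLength1;
  const double lc1 = p.linkCom1, lc2 = p.linkCom2;
  const double i1 = p.linkMoi, i2 = p.linkMoi;
  const double g = p.gravity;
  const double theta1 = state[0], theta2 = state[1];
  const double omega1 = state[2], omega2 = state[3];

  // Inertia terms.
  const double d1 = m1 * lc1 * lc1 +
      m2 * (l1 * l1 + lc2 * lc2 + 2.0 * l1 * lc2 * std::cos(theta2)) + i1 + i2;
  const double d2 = m2 * (lc2 * lc2 + l1 * lc2 * std::cos(theta2)) + i2;

  // Gravity and velocity-dependent terms.
  const double phi2 = m2 * lc2 * g * std::cos(theta1 + theta2 - halfPi);
  const double phi1 = -m2 * l1 * lc2 * omega2 * omega2 * std::sin(theta2) -
      2.0 * m2 * l1 * lc2 * omega2 * omega1 * std::sin(theta2) +
      (m1 * lc1 + m2 * l1) * g * std::cos(theta1 - halfPi) + phi2;

  Vec4 derivative;
  derivative[0] = omega1;
  derivative[1] = omega2;
  // Angular acceleration of the actuated second joint.
  derivative[3] = (torque + d2 / d1 * phi1 -
      m2 * l1 * lc2 * omega1 * omega1 * std::sin(theta2) - phi2) /
      (m2 * lc2 * lc2 + i2 - d2 * d2 / d1);
  // Angular acceleration of the first joint.
  derivative[2] = -(d2 * derivative[3] + phi1) / d1;
  return derivative;
}
```

With both links hanging at rest and zero torque, every component of the returned derivative is numerically zero, which is a quick sanity check that the downward pose is an equilibrium.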

◆ IsTerminal()

bool mlpack::rl::Acrobot::IsTerminal ( const State & state ) const

inline

This function checks if the acrobot has reached the terminal state.

Parameters

state The current State.

Returns: true if state is a terminal state, otherwise false.
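
A minimal sketch of the two conditions the documentation describes, assuming unit link lengths, angles measured from the downward vertical, and the second angle measured relative to the first link. `TipHeight`, `ReachedGoal`, and `EpisodeOver` are hypothetical helpers, not members of this class.

```cpp
#include <cmath>
#include <cstddef>

// Height of the end-effector above the base for two unit-length links
// (assumed angle convention: 0 means hanging straight down).
double TipHeight(const double theta1, const double theta2)
{
  return -std::cos(theta1) - std::cos(theta1 + theta2);
}

// Success: the end-effector is more than one link length above the base.
bool ReachedGoal(const double theta1, const double theta2)
{
  return TipHeight(theta1, theta2) > 1.0;
}

// The episode ends on success or when the step limit is hit;
// maxSteps == 0 means there is no limit.
bool EpisodeOver(const double theta1,
                 const double theta2,
                 const std::size_t stepsPerformed,
                 const std::size_t maxSteps)
{
  if (maxSteps != 0 && stepsPerformed >= maxSteps)
    return true;
  return ReachedGoal(theta1, theta2);
}
```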

◆ Rk4()

arma::colvec mlpack::rl::Acrobot::Rk4	(	const arma::colvec	state,
		const double	torque
	)		const

inline

This function calls the RK4 iterative method to estimate the next state based on the given ordinary differential equations.

Parameters

state	The current State.
torque	The torque applied.

◆ Sample() [1/2]

double mlpack::rl::Acrobot::Sample	(	const State &	state,
		const Action &	action,
		State &	nextState
	)

inline

Dynamics of the Acrobot System.

Gets the reward and the next state based on the current state and the current action. Always returns a reward of -1.

Parameters

state	The current State.
action	The action taken.
nextState	The next state.

Returns: the reward, which is always -1.0.

Each angular velocity is bounded between its minimum and maximum value.
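
The bounding step can be sketched as a symmetric clamp. It is an assumption that the bounds are `-maxVel1`/`maxVel1` and `-maxVel2`/`maxVel2`, taken from the constructor parameters; `ClampVelocity` is a hypothetical helper.

```cpp
#include <algorithm>

// Limit an angular velocity to the symmetric interval [-maxVel, maxVel].
double ClampVelocity(const double velocity, const double maxVel)
{
  return std::min(std::max(velocity, -maxVel), maxVel);
}
```

With the default `maxVel1 = 4 * M_PI`, any first-link velocity beyond roughly 12.57 rad/s in either direction would be cut back to that limit, assuming the angles are in radians.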

◆ Sample() [2/2]

double mlpack::rl::Acrobot::Sample	(	const State &	state,
		const Action &	action
	)

inline

Dynamics of the Acrobot System.

Gets the reward based on the current state and the current action. This function calls the other Sample overload to estimate the next state and returns the reward for taking the given action.

Parameters

state	The current State.
action	The action taken.

Returns: the reward for taking the given action.

◆ Torque()

double mlpack::rl::Acrobot::Torque ( const Action & action ) const

inline

This function calculates the torque for a particular action.

0 : negative torque, 1 : zero torque, 2 : positive torque.

Parameters

action Action taken.
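
The mapping above can be sketched as subtracting one from the action index. The unit torque magnitude is an assumption, and `AcrobotAction` and `TorqueFor` are hypothetical names rather than this class's `Action` type.

```cpp
// Hypothetical action type mirroring the documented encoding.
enum class AcrobotAction
{
  NegativeTorque = 0,
  ZeroTorque = 1,
  PositiveTorque = 2
};

// Map action indices 0, 1, 2 to torques -1, 0, +1 (assumed unit magnitude).
double TorqueFor(const AcrobotAction action)
{
  return static_cast<double>(static_cast<int>(action) - 1);
}
```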

◆ Wrap()

double mlpack::rl::Acrobot::Wrap	(	double	value,
		const double	minimum,
		const double	maximum
	)		const

inline

The Wrap function is required to keep the angle value within -180 to 180 degrees.

This function makes sure that value always lies between minimum and maximum.

Parameters

value	Scalar value to wrap.
minimum	Minimum range of wrap.
maximum	Maximum range of wrap.
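
One common way to wrap a value into a range uses `std::fmod`, as in the sketch below. This is an alternative illustration, not necessarily how this class implements it; `WrapValue` is a hypothetical name, and the result lies in the half-open interval from minimum (inclusive) to maximum (exclusive).

```cpp
#include <cmath>

// Wrap `value` into [minimum, maximum) by shifting it by whole periods.
double WrapValue(const double value,
                 const double minimum,
                 const double maximum)
{
  const double range = maximum - minimum;
  // std::fmod keeps the sign of its first argument, so a negative
  // remainder is shifted up by one period.
  double shifted = std::fmod(value - minimum, range);
  if (shifted < 0.0)
    shifted += range;
  return shifted + minimum;
}
```

For example, wrapping 190 into the range -180 to 180 gives -170, and wrapping -190 gives 170.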

The documentation for this class was generated from the following file:

src/mlpack/methods/reinforcement_learning/environment/acrobot.hpp
