mlpack
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > Class Template Reference

This class implements a generic decision tree learner.

#include <decision_tree_regressor.hpp>

Public Types

typedef NumericSplitType< FitnessFunction > NumericSplit
 Allow access to the numeric split type.
 
typedef CategoricalSplitType< FitnessFunction > CategoricalSplit
 Allow access to the categorical split type.
 
typedef DimensionSelectionType DimensionSelection
 Allow access to the dimension selection type.
 

Public Member Functions

 DecisionTreeRegressor ()
 Construct a decision tree without training it.
 
template<typename MatType , typename ResponsesType >
 DecisionTreeRegressor (MatType data, const data::DatasetInfo &datasetInfo, ResponsesType responses, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType())
 Construct the decision tree on the given data and responses, where the data can be both numeric and categorical.
 
template<typename MatType , typename ResponsesType >
 DecisionTreeRegressor (MatType data, ResponsesType responses, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType())
 Construct the decision tree on the given data and responses, assuming that all of the data is numeric.
 
template<typename MatType , typename ResponsesType , typename WeightsType >
 DecisionTreeRegressor (MatType data, const data::DatasetInfo &datasetInfo, ResponsesType responses, WeightsType weights, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType(), const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *=0)
 Construct the decision tree on the given data and responses with weights, where the data can be both numeric and categorical.
 
template<typename MatType , typename ResponsesType , typename WeightsType >
 DecisionTreeRegressor (MatType data, ResponsesType responses, WeightsType weights, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType(), const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *=0)
 Construct the decision tree on the given data and responses with weights, assuming that all of the data is numeric.
 
template<typename MatType , typename ResponsesType , typename WeightsType >
 DecisionTreeRegressor (const DecisionTreeRegressor &other, MatType data, const data::DatasetInfo &datasetInfo, ResponsesType responses, WeightsType weights, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *=0)
 Take ownership of another decision tree and train on the given data and responses with weights, where the data can be both numeric and categorical.
 
template<typename MatType , typename ResponsesType , typename WeightsType >
 DecisionTreeRegressor (const DecisionTreeRegressor &other, MatType data, ResponsesType responses, WeightsType weights, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType(), const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *=0)
 Take ownership of another decision tree and train on the given data and responses with weights, assuming that all of the data is numeric.
 
 DecisionTreeRegressor (const DecisionTreeRegressor &other)
 Copy another tree.
 
 DecisionTreeRegressor (DecisionTreeRegressor &&other)
 Take ownership of another tree.
 
DecisionTreeRegressor & operator= (const DecisionTreeRegressor &other)
 Copy another tree.
 
DecisionTreeRegressor & operator= (DecisionTreeRegressor &&other)
 Take ownership of another tree.
 
 ~DecisionTreeRegressor ()
 Clean up memory.
 
template<typename MatType , typename ResponsesType >
double Train (MatType data, const data::DatasetInfo &datasetInfo, ResponsesType responses, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType())
 Train the decision tree on the given data.
 
template<typename MatType , typename ResponsesType >
double Train (MatType data, ResponsesType responses, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType())
 Train the decision tree on the given data, assuming that all dimensions are numeric.
 
template<typename MatType , typename ResponsesType , typename WeightsType >
double Train (MatType data, const data::DatasetInfo &datasetInfo, ResponsesType responses, WeightsType weights, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType(), const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *=0)
 Train the decision tree on the given weighted data.
 
template<typename MatType , typename ResponsesType , typename WeightsType >
double Train (MatType data, ResponsesType responses, WeightsType weights, const size_t minimumLeafSize=10, const double minimumGainSplit=1e-7, const size_t maximumDepth=0, DimensionSelectionType dimensionSelector=DimensionSelectionType(), const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *=0)
 Train the decision tree on the given weighted data, assuming that all dimensions are numeric.
 
template<typename VecType >
double Predict (const VecType &point) const
 Make a prediction for the given point, using the entire tree.
 
template<typename MatType >
void Predict (const MatType &data, arma::Row< double > &predictions) const
 Make predictions for the given points, using the entire tree.
 
template<typename Archive >
void serialize (Archive &ar, const uint32_t)
 Serialize the tree.
 
size_t NumChildren () const
 Get the number of children.
 
size_t NumLeaves () const
 Get the number of leaves in the tree.
 
const DecisionTreeRegressor & Child (const size_t i) const
 Get the child of the given index.
 
DecisionTreeRegressor & Child (const size_t i)
 Modify the child of the given index (be careful!).
 
size_t SplitDimension () const
 Get the split dimension (only meaningful if this is a non-leaf in a trained tree).
 
template<typename VecType >
size_t CalculateDirection (const VecType &point) const
 Given a point, and assuming that this node is not a leaf, calculate the index of the child node that the point would go towards.
 

Detailed Description

template<typename FitnessFunction = MSEGain, template< typename > class NumericSplitType = BestBinaryNumericSplit, template< typename > class CategoricalSplitType = AllCategoricalSplit, typename DimensionSelectionType = AllDimensionSelect, bool NoRecursion = false>
class mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >

This class implements a generic decision tree learner.

Its behavior can be controlled via its template arguments.

The class inherits from the auxiliary split information types rather than storing them as members, so that an empty auxiliary split information struct does not add to the size of each node.
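The size argument above can be illustrated with a minimal sketch of the empty base optimization. The type names below (EmptyAux, NodeWithMember, NodeWithBase) are hypothetical and are not mlpack types; the sketch only shows why inheriting from an empty struct can be cheaper than holding it as a member.

```cpp
#include <cstddef>

// Hypothetical stand-in for a split type's auxiliary information that
// happens to carry no data.
struct EmptyAux {};

// Storing the empty struct as a member still costs at least one byte,
// which padding typically rounds up to the alignment of the other members.
struct NodeWithMember
{
  EmptyAux aux;
  double splitValue;
};

// Inheriting from the empty struct lets the compiler give the base
// subobject zero size (the empty base optimization).
struct NodeWithBase : EmptyAux
{
  double splitValue;
};

// Bytes saved per node by using inheritance instead of a member.
constexpr std::size_t BytesSaved()
{
  return sizeof(NodeWithMember) - sizeof(NodeWithBase);
}
```

Because every node in a tree is an instance of the class, a saving of a few bytes per node adds up on large trees.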

Constructor & Destructor Documentation

◆ DecisionTreeRegressor() [1/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( )

Construct a decision tree without training it.

The resulting tree is a single leaf node.

◆ DecisionTreeRegressor() [2/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType >
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( MatType  data,
const data::DatasetInfo &  datasetInfo,
ResponsesType  responses,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType() 
)

Construct the decision tree on the given data and responses, where the data can be both numeric and categorical.

Training is unweighted. Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data or responses if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
datasetInfo        Type information for each dimension of the dataset.
responses          Responses for each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.

◆ DecisionTreeRegressor() [3/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType >
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( MatType  data,
ResponsesType  responses,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType() 
)

Construct the decision tree on the given data and responses, assuming that all of the data is numeric.

Training is unweighted. Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data or responses if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
responses          Responses for each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.
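The advice about std::move follows from the data and responses parameters being taken by value. The sketch below shows the effect with a hypothetical Model class and std::vector standing in for a matrix type; it is an illustration of by-value parameters in general, not of mlpack's internals.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model that takes its training data by value, in the same
// style as the MatType data parameter described above.
class Model
{
 public:
  explicit Model(std::vector<double> data) : stored(std::move(data)) {}

  // Address of the buffer the model ended up owning.
  const double* Storage() const { return stored.data(); }

 private:
  std::vector<double> stored;
};

// Passing an rvalue: the caller's buffer is handed over, nothing is copied.
bool MovedWithoutCopy()
{
  std::vector<double> data(1000, 1.0);
  const double* original = data.data();
  Model m(std::move(data));
  return m.Storage() == original;
}

// Passing an lvalue: the parameter is a copy and the caller keeps its data.
bool CopiedWhenNotMoved()
{
  std::vector<double> data(1000, 1.0);
  const double* original = data.data();
  Model m(data);
  return m.Storage() != original && data.size() == 1000;
}
```

After std::move, the caller's object should be treated as emptied, so this is only appropriate when the data is not needed again.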

◆ DecisionTreeRegressor() [4/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType , typename WeightsType >
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( MatType  data,
const data::DatasetInfo &  datasetInfo,
ResponsesType  responses,
WeightsType  weights,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType(),
const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *  = 0 
)

Construct the decision tree on the given data and responses with weights, where the data can be both numeric and categorical.

Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data, responses, or weights if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
datasetInfo        Type information for each dimension of the dataset.
responses          Responses for each training point.
weights            Weight of each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.

◆ DecisionTreeRegressor() [5/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType , typename WeightsType >
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( MatType  data,
ResponsesType  responses,
WeightsType  weights,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType(),
const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *  = 0 
)

Construct the decision tree on the given data and responses with weights, assuming that all of the data is numeric.

Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data, responses, or weights if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
responses          Responses for each training point.
weights            Weight of each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.

◆ DecisionTreeRegressor() [6/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType , typename WeightsType >
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( const DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > &  other,
MatType  data,
const data::DatasetInfo &  datasetInfo,
ResponsesType  responses,
WeightsType  weights,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *  = 0 
)

Take ownership of another decision tree and train on the given data and responses with weights, where the data can be both numeric and categorical.

Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data, responses, or weights if they are no longer needed, to avoid copies.

Parameters
other              Tree to take ownership of.
data               Dataset to train on.
datasetInfo        Type information for each dimension of the dataset.
responses          Responses for each training point.
weights            Weight of each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.

◆ DecisionTreeRegressor() [7/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType , typename WeightsType >
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( const DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > &  other,
MatType  data,
ResponsesType  responses,
WeightsType  weights,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType(),
const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *  = 0 
)

Take ownership of another decision tree and train on the given data and responses with weights, assuming that all of the data is numeric.

Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data, responses, or weights if they are no longer needed, to avoid copies.

Parameters
other              Tree to take ownership of.
data               Dataset to train on.
responses          Responses for each training point.
weights            Weight of each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.

◆ DecisionTreeRegressor() [8/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( const DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > &  other)

Copy another tree.

This may use a lot of memory; be sure that it is what you want to do.

Parameters
other  Tree to copy.

◆ DecisionTreeRegressor() [9/9]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::DecisionTreeRegressor ( DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > &&  other)

Take ownership of another tree.

Parameters
other  Tree to take ownership of.
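The difference in cost between copying a tree and taking ownership of one can be sketched with a small hypothetical node type. The Node struct and MakeStump helper below are illustrations only, and the use of std::unique_ptr for the children is an assumption of the sketch rather than a description of how this class stores its children.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Node
{
  double prediction = 0.0;
  std::vector<std::unique_ptr<Node>> children;

  Node() = default;

  // Copy: recursively clone every child, so the whole tree is duplicated.
  Node(const Node& other) : prediction(other.prediction)
  {
    for (const auto& child : other.children)
      children.push_back(std::make_unique<Node>(*child));
  }

  // Move: take over the child pointers; no new nodes are allocated.
  Node(Node&& other) noexcept
      : prediction(other.prediction), children(std::move(other.children))
  {
    other.children.clear();  // Leave the source as a childless node.
  }
};

// Build a root with two leaf children.
Node MakeStump()
{
  Node root;
  root.children.push_back(std::make_unique<Node>());
  root.children.push_back(std::make_unique<Node>());
  return root;
}
```

Copying allocates one new node per node in the source tree, whereas moving is constant-time in the size of the tree and leaves the source without children.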

Member Function Documentation

◆ CalculateDirection()

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename VecType >
size_t mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::CalculateDirection ( const VecType &  point) const

Given a point, and assuming that this node is not a leaf, calculate the index of the child node that the point would go towards.

This method is primarily used by the Predict() function, but it can also be used on its own.

Parameters
point  Point to predict.
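A minimal sketch of what a direction calculation can look like at a single node is given below. The SplitNode struct, its categorical flag, and both conventions (values less than or equal to the threshold go to child 0 for numeric dimensions; the category value is the child index for categorical dimensions) are assumptions made for illustration, not a statement of how the split types used by this class behave.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of the test stored at a non-leaf node.
struct SplitNode
{
  std::size_t splitDimension;  // Dimension of the point that is tested.
  bool categorical;            // Whether that dimension is categorical.
  double threshold;            // Used only for numeric dimensions.
};

// Return the index of the child that the point would be sent to.
std::size_t CalculateDirectionSketch(const SplitNode& node,
                                     const std::vector<double>& point)
{
  const double value = point[node.splitDimension];
  if (node.categorical)
    return static_cast<std::size_t>(value);  // One child per category (assumed).
  return (value <= node.threshold) ? 0 : 1;  // Binary numeric split (assumed).
}
```

Only the value in the split dimension is inspected, which is why SplitDimension() is meaningful only for non-leaf nodes of a trained tree.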

◆ NumLeaves()

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
size_t mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::NumLeaves ( ) const

Get the number of leaves in the tree.
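Counting leaves is naturally recursive: a node without children is one leaf, and any other node has as many leaves as its children combined. The sketch below uses a hypothetical TreeNode type and is not taken from the class's implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tree node; a node with no children is a leaf.
struct TreeNode
{
  std::vector<TreeNode> children;
};

// Count the leaves in the subtree rooted at node.
std::size_t CountLeaves(const TreeNode& node)
{
  if (node.children.empty())
    return 1;
  std::size_t leaves = 0;
  for (const TreeNode& child : node.children)
    leaves += CountLeaves(child);
  return leaves;
}
```

Under this definition an untrained tree, being a single leaf node, would report one leaf.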

◆ operator=() [1/2]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > & mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::operator= ( const DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > &  other)

Copy another tree.

This may use a lot of memory; be sure that it is what you want to do.

Parameters
other  Tree to copy.

◆ operator=() [2/2]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > & mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::operator= ( DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion > &&  other)

Take ownership of another tree.

Parameters
other  Tree to take ownership of.

◆ Predict() [1/2]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename VecType >
double mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::Predict ( const VecType &  point) const

Make a prediction for the given point, using the entire tree.

The predicted response is returned.

Parameters
point  Point to predict.

◆ Predict() [2/2]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType >
void mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::Predict ( const MatType &  data,
arma::Row< double > &  predictions 
) const

Make predictions for the given points, using the entire tree.

The predicted response for each point is stored in the given vector.

Parameters
data         Set of points to predict.
predictions  This will be filled with predictions for each point.
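Prediction with the entire tree amounts to walking from the root to a leaf for each point and reading off the leaf's value. The sketch below shows that traversal for a batch of points; RegNode, MakeStump, PredictPoint and PredictAll are hypothetical names, the tree is restricted to binary numeric splits, and a std::vector of points stands in for the matrix argument.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical regression tree node with binary numeric splits only.
struct RegNode
{
  double prediction = 0.0;         // Value returned if this node is a leaf.
  std::size_t splitDimension = 0;  // Dimension tested at a non-leaf node.
  double threshold = 0.0;          // Values <= threshold go to child 0.
  std::vector<RegNode> children;   // Empty for a leaf.
};

// Build a root with one split and two leaves.
RegNode MakeStump(std::size_t dimension, double threshold,
                  double leftValue, double rightValue)
{
  RegNode root;
  root.splitDimension = dimension;
  root.threshold = threshold;
  root.children.resize(2);
  root.children[0].prediction = leftValue;
  root.children[1].prediction = rightValue;
  return root;
}

// Walk from the root to a leaf and return the leaf's value.
double PredictPoint(const RegNode& root, const std::vector<double>& point)
{
  const RegNode* node = &root;
  while (!node->children.empty())
  {
    const bool goLeft = point[node->splitDimension] <= node->threshold;
    node = &node->children[goLeft ? 0 : 1];
  }
  return node->prediction;
}

// Fill predictions with one value per point, resizing it as needed.
void PredictAll(const RegNode& root,
                const std::vector<std::vector<double>>& points,
                std::vector<double>& predictions)
{
  predictions.resize(points.size());
  for (std::size_t i = 0; i < points.size(); ++i)
    predictions[i] = PredictPoint(root, points[i]);
}
```

As with the predictions parameter described above, the output container is passed in by reference and overwritten rather than returned.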

◆ SplitDimension()

template<typename FitnessFunction = MSEGain, template< typename > class NumericSplitType = BestBinaryNumericSplit, template< typename > class CategoricalSplitType = AllCategoricalSplit, typename DimensionSelectionType = AllDimensionSelect, bool NoRecursion = false>
size_t mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::SplitDimension ( ) const
inline

Get the split dimension (only meaningful if this is a non-leaf in a trained tree).

◆ Train() [1/4]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType >
double mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::Train ( MatType  data,
const data::DatasetInfo &  datasetInfo,
ResponsesType  responses,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType() 
)

Train the decision tree on the given data.

This will overwrite the existing model. The data may have numeric and categorical types, specified by the datasetInfo parameter. Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data or responses if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
datasetInfo        Type information for each dimension.
responses          Responses for each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.
Returns
The final entropy of the decision tree.

◆ Train() [2/4]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType >
double mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::Train ( MatType  data,
ResponsesType  responses,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType() 
)

Train the decision tree on the given data, assuming that all dimensions are numeric.

This will overwrite the existing model. Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data or responses if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
responses          Responses for each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.
Returns
The final entropy of the decision tree.

◆ Train() [3/4]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType , typename WeightsType >
double mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::Train ( MatType  data,
const data::DatasetInfo &  datasetInfo,
ResponsesType  responses,
WeightsType  weights,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType(),
const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *  = 0 
)

Train the decision tree on the given weighted data.

This will overwrite the existing model. The data may have numeric and categorical types, specified by the datasetInfo parameter. Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data, responses, or weights if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
datasetInfo        Type information for each dimension.
responses          Responses for each training point.
weights            Weight of each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.
Returns
The final entropy of the decision tree.

◆ Train() [4/4]

template<typename FitnessFunction , template< typename > class NumericSplitType, template< typename > class CategoricalSplitType, typename DimensionSelectionType , bool NoRecursion>
template<typename MatType , typename ResponsesType , typename WeightsType >
double mlpack::tree::DecisionTreeRegressor< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectionType, NoRecursion >::Train ( MatType  data,
ResponsesType  responses,
WeightsType  weights,
const size_t  minimumLeafSize = 10,
const double  minimumGainSplit = 1e-7,
const size_t  maximumDepth = 0,
DimensionSelectionType  dimensionSelector = DimensionSelectionType(),
const std::enable_if_t< arma::is_arma_type< typename std::remove_reference< WeightsType >::type >::value > *  = 0 
)

Train the decision tree on the given weighted data, assuming that all dimensions are numeric.

This will overwrite the existing model. Setting minimumLeafSize and minimumGainSplit too small may cause the tree to overfit, but setting them too large may cause it to underfit.

Use std::move on data, responses, or weights if they are no longer needed, to avoid copies.

Parameters
data               Dataset to train on.
responses          Responses for each training point.
weights            Weight of each training point.
minimumLeafSize    Minimum number of points in each leaf node.
minimumGainSplit   Minimum gain for the node to split.
maximumDepth       Maximum depth for the tree.
dimensionSelector  Instantiated dimension selection policy.
Returns
The final entropy of the decision tree.

The documentation for this class was generated from the following files: