mlpack
Classes | Typedefs | Functions | Variables
mlpack::tree Namespace Reference

Trees and tree-building procedures. More...

Classes

class  AllCategoricalSplit
 The AllCategoricalSplit is a splitting function that will split categorical features into many children: one child for each category. More...
 
class  AllDimensionSelect
 This dimension selection policy allows any dimension to be selected for splitting. More...
 
class  AxisParallelProjVector
 AxisParallelProjVector defines an axis-parallel projection vector. More...
 
class  BestBinaryNumericSplit
 The BestBinaryNumericSplit is a splitting function for decision trees that will exhaustively search a numeric dimension for the best binary split. More...
 
class  BinaryNumericSplit
 The BinaryNumericSplit class implements the numeric feature splitting strategy devised by Gama, Rocha, and Medas in the following paper: More...
 
class  BinaryNumericSplitInfo
 
class  BinarySpaceTree
 A binary space partitioning tree, such as a KD-tree or a ball tree. More...
 
class  CategoricalSplitInfo
 
class  CompareCosineNode
 
class  CosineTree
 
class  CoverTree
 A cover tree is a tree specifically designed to speed up nearest-neighbor computation in high-dimensional spaces. More...
 
struct  CoverTreeMapEntry
 This is the structure the cover tree map will use for traversal. More...
 
class  DecisionTree
 This class implements a generic decision tree learner. More...
 
class  DecisionTreeRegressor
 This class implements a generic decision tree learner. More...
 
class  DiscreteHilbertValue
 The DiscreteHilbertValue class stores Hilbert values for all of the points in a RectangleTree node, and calculates Hilbert values for new points. More...
 
class  EmptyStatistic
 Empty statistic if you are not interested in storing statistics in your tree. More...
 
class  ExampleTree
 This is not an actual space tree but instead an example tree that exists to show and document all the functions that mlpack trees must implement. More...
 
class  FirstPointIsRoot
 This class is meant to be used as a choice for the policy class RootPointPolicy of the CoverTree class. More...
 
class  GiniGain
 The Gini gain, a measure of set purity usable as a fitness function (FitnessFunction) for decision trees. More...
 
class  GiniImpurity
 
class  GreedySingleTreeTraverser
 
struct  HasOptimizedBinarySplitForms
 
class  HilbertRTreeAuxiliaryInformation
 
class  HilbertRTreeDescentHeuristic
 This class chooses the best child of a node in a Hilbert R tree when inserting a new point. More...
 
class  HilbertRTreeSplit
 The splitting procedure for the Hilbert R tree. More...
 
class  HoeffdingCategoricalSplit
 This is the standard Hoeffding-bound categorical feature proposed in the paper below: More...
 
class  HoeffdingInformationGain
 
class  HoeffdingNumericSplit
 The HoeffdingNumericSplit class implements the numeric feature splitting strategy alluded to by Domingos and Hulten in the following paper: More...
 
class  HoeffdingTree
 The HoeffdingTree object represents all of the necessary information for a Hoeffding-bound-based decision tree. More...
 
class  HoeffdingTreeModel
 This class is a serializable Hoeffding tree model that can hold four different types of Hoeffding trees. More...
 
class  HyperplaneBase
 HyperplaneBase defines a splitting hyperplane based on a projection vector and projection value. More...
 
class  InformationGain
 The standard information gain criterion, used for calculating gain in decision trees. More...
 
struct  IsSpillTree
 
struct  IsSpillTree< tree::SpillTree< MetricType, StatisticType, MatType, HyperplaneType, SplitType > >
 
class  MADGain
 The MAD (Mean absolute deviation) gain, is a measure of set purity based on the deviation of dependent values present in the node. More...
 
class  MeanSpaceSplit
 
class  MeanSplit
 A binary space partitioning tree node is split into its left and right child. More...
 
class  MidpointSpaceSplit
 
class  MidpointSplit
 A binary space partitioning tree node is split into its left and right child. More...
 
class  MinimalCoverageSweep
 The MinimalCoverageSweep class finds a partition along which we can split a node according to the coverage of two resulting nodes. More...
 
class  MinimalSplitsNumberSweep
 The MinimalSplitsNumberSweep class finds a partition along which we can split a node according to the number of required splits of the node. More...
 
class  MSEGain
 The MSE (Mean squared error) gain, is a measure of set purity based on the variance of response values present in the node. More...
 
class  MultipleRandomDimensionSelect
 This dimension selection policy allows the selection from a few random dimensions. More...
 
class  NoAuxiliaryInformation
 
class  NumericSplitInfo
 
class  Octree
 
class  ProjVector
 ProjVector defines a general projection vector (not necessarily axis-parallel). More...
 
struct  QueueFrame
 
class  RandomBinaryNumericSplit
 The RandomBinaryNumericSplit is a splitting function for decision trees that will split based on a randomly selected point between the minimum and maximum value of the numerical dimension. More...
 
class  RandomDimensionSelect
 This dimension selection policy only selects one single random dimension. More...
 
class  RandomForest
 The RandomForest class provides an implementation of random forests, described in Breiman's seminal paper: More...
 
class  RectangleTree
 A rectangle type tree tree, such as an R-tree or X-tree. More...
 
class  RPlusPlusTreeAuxiliaryInformation
 
class  RPlusPlusTreeDescentHeuristic
 
class  RPlusPlusTreeSplitPolicy
 The RPlusPlusTreeSplitPolicy helps to determine the subtree into which we should insert a child of an intermediate node that is being split. More...
 
class  RPlusTreeDescentHeuristic
 
class  RPlusTreeSplit
 The RPlusTreeSplit class performs the split process of a node on overflow. More...
 
class  RPlusTreeSplitPolicy
 The RPlusPlusTreeSplitPolicy helps to determine the subtree into which we should insert a child of an intermediate node that is being split. More...
 
class  RPTreeMaxSplit
 This class splits a node by a random hyperplane. More...
 
class  RPTreeMeanSplit
 This class splits a binary space tree. More...
 
class  RStarTreeDescentHeuristic
 When descending a RectangleTree to insert a point, we need to have a way to choose a child node when the point isn't enclosed by any of them. More...
 
class  RStarTreeSplit
 A Rectangle Tree has new points inserted at the bottom. More...
 
class  RTreeDescentHeuristic
 When descending a RectangleTree to insert a point, we need to have a way to choose a child node when the point isn't enclosed by any of them. More...
 
class  RTreeSplit
 A Rectangle Tree has new points inserted at the bottom. More...
 
class  SpaceSplit
 
class  SpillTree
 A hybrid spill tree is a variant of binary space trees in which the children of a node can "spill over" each other, and contain shared datapoints. More...
 
class  TraversalInfo
 The TraversalInfo class holds traversal information which is used in dual-tree (and single-tree) traversals. More...
 
class  TreeTraits
 The TreeTraits class provides compile-time information on the characteristics of a given tree type. More...
 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, bound::BallBound, SplitType > >
 This is a specialization of the TreeType class to the BallTree tree type. More...
 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, bound::CellBound, SplitType > >
 This is a specialization of the TreeType class to the UBTree tree type. More...
 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, bound::HollowBallBound, SplitType > >
 This is a specialization of the TreeType class to an arbitrary tree with HollowBallBound (currently only the vantage point tree is supported). More...
 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, BoundType, RPTreeMaxSplit > >
 This is a specialization of the TreeType class to the max-split random projection tree. More...
 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, BoundType, RPTreeMeanSplit > >
 This is a specialization of the TreeType class to the mean-split random projection tree. More...
 
class  TreeTraits< BinarySpaceTree< MetricType, StatisticType, MatType, BoundType, SplitType > >
 This is a specialization of the TreeTraits class to the BinarySpaceTree tree type. More...
 
class  TreeTraits< CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > >
 The specialization of the TreeTraits class for the CoverTree tree type. More...
 
class  TreeTraits< Octree< MetricType, StatisticType, MatType > >
 This is a specialization of the TreeTraits class to the Octree tree type. More...
 
class  TreeTraits< RectangleTree< MetricType, StatisticType, MatType, RPlusTreeSplit< SplitPolicyType, SweepType >, DescentType, AuxiliaryInformationType > >
 Since the R+/R++ tree can not have overlapping children, we should define traits for the R+/R++ tree. More...
 
class  TreeTraits< RectangleTree< MetricType, StatisticType, MatType, SplitType, DescentType, AuxiliaryInformationType > >
 This is a specialization of the TreeType class to the RectangleTree tree type. More...
 
class  TreeTraits< SpillTree< MetricType, StatisticType, MatType, HyperplaneType, SplitType > >
 This is a specialization of the TreeType class to the SpillTree tree type. More...
 
class  UBTreeSplit
 Split a node into two parts according to the median address of points contained in the node. More...
 
class  VantagePointSplit
 The class splits a binary space partitioning tree node according to the median distance to the vantage point. More...
 
class  XTreeAuxiliaryInformation
 The XTreeAuxiliaryInformation class provides information specific to X trees for each node in a RectangleTree. More...
 
class  XTreeSplit
 A Rectangle Tree has new points inserted at the bottom. More...
 

Typedefs

template<typename MetricType , typename StatisticType , typename MatType >
using KDTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, MidpointSplit >
 The standard midpoint-split kd-tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using MeanSplitKDTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, MeanSplit >
 A mean-split kd-tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using BallTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::BallBound, MidpointSplit >
 A midpoint-split ball tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using MeanSplitBallTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::BallBound, MeanSplit >
 A mean-split ball tree. More...
 
template<typename BoundType , typename MatType = arma::mat>
using VPTreeSplit = VantagePointSplit< BoundType, MatType, 100 >
 The vantage point tree (which is also called the metric tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using VPTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HollowBallBound, VPTreeSplit >
 
template<typename MetricType , typename StatisticType , typename MatType >
using MaxRPTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMaxSplit >
 A max-split random projection tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using RPTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMeanSplit >
 A mean-split random projection tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using UBTree = BinarySpaceTree< MetricType, StatisticType, MatType, bound::CellBound, UBTreeSplit >
 The Universal B-tree. More...
 
typedef boost::heap::priority_queue< CosineTree *, boost::heap::compare< CompareCosineNode > > CosineNodeQueue
 
template<typename MetricType , typename StatisticType , typename MatType >
using StandardCoverTree = CoverTree< MetricType, StatisticType, MatType, FirstPointIsRoot >
 The standard cover tree, as detailed in the original cover tree paper: More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using RTree = RectangleTree< MetricType, StatisticType, MatType, RTreeSplit, RTreeDescentHeuristic, NoAuxiliaryInformation >
 An implementation of the R tree that satisfies the TreeType policy API. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using RStarTree = RectangleTree< MetricType, StatisticType, MatType, RStarTreeSplit, RStarTreeDescentHeuristic, NoAuxiliaryInformation >
 The R*-tree, a more recent variant of the R tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using XTree = RectangleTree< MetricType, StatisticType, MatType, XTreeSplit, RTreeDescentHeuristic, XTreeAuxiliaryInformation >
 The X-tree, a variant of the R tree with supernodes. More...
 
template<typename TreeType >
using DiscreteHilbertRTreeAuxiliaryInformation = HilbertRTreeAuxiliaryInformation< TreeType, DiscreteHilbertValue >
 The Hilbert R-tree, a variant of the R tree with an ordering along the Hilbert curve. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using HilbertRTree = RectangleTree< MetricType, StatisticType, MatType, HilbertRTreeSplit< 2 >, HilbertRTreeDescentHeuristic, DiscreteHilbertRTreeAuxiliaryInformation >
 
template<typename MetricType , typename StatisticType , typename MatType >
using RPlusTree = RectangleTree< MetricType, StatisticType, MatType, RPlusTreeSplit< RPlusTreeSplitPolicy, MinimalCoverageSweep >, RPlusTreeDescentHeuristic, NoAuxiliaryInformation >
 The R+ tree, a variant of the R tree that avoids overlapping rectangles. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using RPlusPlusTree = RectangleTree< MetricType, StatisticType, MatType, RPlusTreeSplit< RPlusPlusTreeSplitPolicy, MinimalSplitsNumberSweep >, RPlusPlusTreeDescentHeuristic, RPlusPlusTreeAuxiliaryInformation >
 The R++ tree, a variant of the R+ tree with maximum buonding rectangles. More...
 
template<typename MetricType >
using AxisOrthogonalHyperplane = HyperplaneBase< bound::HRectBound< MetricType >, AxisParallelProjVector >
 AxisOrthogonalHyperplane represents a hyperplane orthogonal to an axis.
 
template<typename MetricType >
using Hyperplane = HyperplaneBase< bound::BallBound< MetricType >, ProjVector >
 Hyperplane represents a general hyperplane (not necessarily axis-orthogonal).
 
template<typename MetricType , typename StatisticType , typename MatType >
using SPTree = SpillTree< MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MidpointSpaceSplit >
 The hybrid spill tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using MeanSPTree = SpillTree< MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MeanSpaceSplit >
 A mean-split hybrid spill tree. More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using NonOrtSPTree = SpillTree< MetricType, StatisticType, MatType, Hyperplane, MidpointSpaceSplit >
 A hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal). More...
 
template<typename MetricType , typename StatisticType , typename MatType >
using NonOrtMeanSPTree = SpillTree< MetricType, StatisticType, MatType, Hyperplane, MeanSpaceSplit >
 A mean-split hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal). More...
 
template<typename FitnessFunction = GiniGain, template< typename > class NumericSplitType = BestBinaryNumericSplit, template< typename > class CategoricalSplitType = AllCategoricalSplit, typename DimensionSelectType = AllDimensionSelect>
using DecisionStump = DecisionTree< FitnessFunction, NumericSplitType, CategoricalSplitType, DimensionSelectType, false >
 Convenience typedef for decision stumps (single level decision trees).
 
typedef DecisionTree< InformationGain, BestBinaryNumericSplit, AllCategoricalSplit, AllDimensionSelect, true > ID3DecisionStump
 Convenience typedef for ID3 decision stumps (single level decision trees made with the ID3 algorithm).
 
template<typename FitnessFunction >
using BinaryDoubleNumericSplit = BinaryNumericSplit< FitnessFunction, double >
 
template<typename FitnessFunction >
using HoeffdingDoubleNumericSplit = HoeffdingNumericSplit< FitnessFunction, double >
 Convenience typedef.
 
typedef StreamingDecisionTree< HoeffdingTree<> > HoeffdingTreeType
 
template<typename FitnessFunction = GiniGain, typename DimensionSelectionType = MultipleRandomDimensionSelect, template< typename > class CategoricalSplitType = AllCategoricalSplit>
using ExtraTrees = RandomForest< FitnessFunction, DimensionSelectionType, RandomBinaryNumericSplit, CategoricalSplitType, false >
 Convenience typedef for Extra Trees. More...
 

Functions

template<typename TreeType , typename TraversalInfoType >
bool operator< (const QueueFrame< TreeType, TraversalInfoType > &a, const QueueFrame< TreeType, TraversalInfoType > &b)
 
template<typename TreeType , typename StatisticType >
void BuildStatistics (TreeType *node)
 
template<class TreeType , class Walker >
void EnumerateTree (TreeType *tree, Walker &walker)
 Traverses all nodes of the tree, including the inner ones. More...
 
 HAS_MEM_FUNC (BinaryGains, HasBinaryGains)
 
template<bool UseWeights, typename MatType , typename LabelsType , typename WeightsType >
void Bootstrap (const MatType &dataset, const LabelsType &labels, const WeightsType &weights, MatType &bootstrapDataset, LabelsType &bootstrapLabels, WeightsType &bootstrapWeights)
 Given a dataset, create another dataset via bootstrap sampling, with labels.
 

Variables

const double MAX_OVERLAP = 0.2
 The X-tree paper says that a maximum allowable overlap of 20% works well. More...
 

Detailed Description

Trees and tree-building procedures.

Typedef Documentation

◆ BallTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::BallTree = typedef BinarySpaceTree<MetricType, StatisticType, MatType, bound::BallBound, MidpointSplit>

A midpoint-split ball tree.

This tree holds its points only in the leaves, similar to the KDTree and MeanSplitKDTree. However, the bounding shape of each node is a ball, not a hyper-rectangle. This can make the ball tree advantageous in some higher-dimensional situations and for some datasets. The tree construction algorithm here is the same as Omohundro's 'K-d construction algorithm', except the splitting value is the midpoint, not the median. This can result in trees that better reflect the data, although they may be unbalanced.

@techreport{omohundro1989five,
author={S.M. Omohundro},
title={Five balltree construction algorithms},
year={1989},
institution={University of California, Berkeley International Computer
Science Institute Technical Reports},
number={TR-89-063}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, KDTree, MeanSplitBallTree

◆ DiscreteHilbertRTreeAuxiliaryInformation

The Hilbert R-tree, a variant of the R tree with an ordering along the Hilbert curve.

This template typedef satisfies the TreeType policy API.

@inproceedings{kamel1994r,
author = {Kamel, Ibrahim and Faloutsos, Christos},
title = {Hilbert R-tree: An Improved R-tree Using Fractals},
booktitle = {Proceedings of the 20th International Conference on Very Large Data Bases},
series = {VLDB '94},
year = {1994},
isbn = {1-55860-153-8},
pages = {500--509},
numpages = {10},
url = {http://dl.acm.org/citation.cfm?id=645920.673001},
acmid = {673001},
publisher = {Morgan Kaufmann Publishers Inc.},
address = {San Francisco, CA, USA}
}
See also
The TreeType policy in mlpack, RTree, DiscreteHilbertRTree

◆ ExtraTrees

template<typename FitnessFunction = GiniGain, typename DimensionSelectionType = MultipleRandomDimensionSelect, template< typename > class CategoricalSplitType = AllCategoricalSplit>
using mlpack::tree::ExtraTrees = typedef RandomForest<FitnessFunction, DimensionSelectionType, RandomBinaryNumericSplit, CategoricalSplitType, false>

Convenience typedef for Extra Trees.

(Extremely Randomized Trees Forest)

@article{10.1007/s10994-006-6226-1,
author = {Geurts, Pierre and Ernst, Damien and Wehenkel, Louis},
title = {Extremely Randomized Trees},
year = {2006},
issue_date = {April 2006},
publisher = {Kluwer Academic Publishers},
address = {USA},
volume = {63},
number = {1},
issn = {0885-6125},
url = {https://doi.org/10.1007/s10994-006-6226-1},
doi = {10.1007/s10994-006-6226-1},
journal = {Mach. Learn.},
month = apr,
pages = {3–42},
numpages = {40},
}

◆ KDTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::KDTree = typedef BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, MidpointSplit>

The standard midpoint-split kd-tree.

This is not the original formulation by Bentley but instead the later formulation by Deng and Moore, which only holds points in the leaves of the tree. When recursively splitting nodes, the KDTree class select the dimension with maximum variance to split on, and picks the midpoint of the range in that dimension as the value on which to split nodes.

For more information, see the following papers.

@article{bentley1975multidimensional,
title={Multidimensional binary search trees used for associative searching},
author={Bentley, J.L.},
journal={Communications of the ACM},
volume={18},
number={9},
pages={509--517},
year={1975},
publisher={ACM}
}
@inproceedings{deng1995multiresolution,
title={Multiresolution instance-based learning},
author={Deng, K. and Moore, A.W.},
booktitle={Proceedings of the 1995 International Joint Conference on AI
(IJCAI-95)},
pages={1233--1239},
year={1995}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, MeanSplitKDTree

◆ MaxRPTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::MaxRPTree = typedef BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMaxSplit>

A max-split random projection tree.

When recursively splitting nodes, the MaxSplitRPTree class selects a random hyperplane and splits a node by the hyperplane. The tree holds points in leaf nodes. In contrast to the k-d tree, children of a MaxSplitRPTree node may overlap.

@inproceedings{dasgupta2008,
author = {Dasgupta, Sanjoy and Freund, Yoav},
title = {Random Projection Trees and Low Dimensional Manifolds},
booktitle = {Proceedings of the Fortieth Annual ACM Symposium on Theory of
Computing},
series = {STOC '08},
year = {2008},
pages = {537--546},
numpages = {10},
publisher = {ACM},
address = {New York, NY, USA},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

◆ MeanSplitBallTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::MeanSplitBallTree = typedef BinarySpaceTree<MetricType, StatisticType, MatType, bound::BallBound, MeanSplit>

A mean-split ball tree.

This tree, like the BallTree, holds its points only in the leaves. The tree construction algorithm here is the same as Omohundro's 'K-dc onstruction algorithm', except the splitting value is the mean, not the median. This can result in trees that better reflect the data, although they may be unbalanced.

@techreport{omohundro1989five,
author={S.M. Omohundro},
title={Five balltree construction algorithms},
year={1989},
institution={University of California, Berkeley International Computer
Science Institute Technical Reports},
number={TR-89-063}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

◆ MeanSplitKDTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::MeanSplitKDTree = typedef BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, MeanSplit>

A mean-split kd-tree.

This is the same as the KDTree, but this particular implementation will use the mean of the data in the split dimension as the value on which to split, instead of the midpoint. This can sometimes give better performance, but it is not always clear which type of tree is best.

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, KDTree

◆ MeanSPTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::MeanSPTree = typedef SpillTree<MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MeanSpaceSplit>

A mean-split hybrid spill tree.

This is the same as the SPTree, but this particular implementation will use the mean of the data in the split dimension as the value on which to split, instead of the midpoint. This can sometimes give better performance, but it is not always clear which type of tree is best.

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, SPTree

◆ NonOrtMeanSPTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::NonOrtMeanSPTree = typedef SpillTree<MetricType, StatisticType, MatType, Hyperplane, MeanSpaceSplit>

A mean-split hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal).

This is the same as the NonOrtSPTree, but this particular implementation will use the mean of the data in the split projection as the value on which to split, instead of the midpoint. This can sometimes give better performance, but it is not always clear which type of tree is best.

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, MeanSPTree, NonOrtSPTree

◆ NonOrtSPTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::NonOrtSPTree = typedef SpillTree<MetricType, StatisticType, MatType, Hyperplane, MidpointSpaceSplit>

A hybrid spill tree considering general splitting hyperplanes (not necessarily axis-orthogonal).

This particular implementation will consider the midpoint of the projection of the data in the vector determined by the farthest pair of points. This can sometimes give better performance, but generally it doesn't because it takes O(d) to calculate the projection of the query point when deciding which node to traverse, while when using a axis-orthogonal hyperplane, as SPTree does, we can do it in O(1).

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, SPTree

◆ RPlusPlusTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::RPlusPlusTree = typedef RectangleTree<MetricType, StatisticType, MatType, RPlusTreeSplit<RPlusPlusTreeSplitPolicy, MinimalSplitsNumberSweep>, RPlusPlusTreeDescentHeuristic, RPlusPlusTreeAuxiliaryInformation>

The R++ tree, a variant of the R+ tree with maximum buonding rectangles.

This template typedef satisfies the TreeType policy API.

@inproceedings{sumak2014r,
author = {{\v{S}}um{\'a}k, Martin and Gursk{\'y}, Peter},
title = {R++-Tree: An Efficient Spatial Access Method for Highly Redundant
Point Data},
booktitle = {New Trends in Databases and Information Systems: 17th East
European Conference on Advances in Databases and Information Systems},
year = {2014},
isbn = {978-3-319-01863-8},
pages = {37--44},
publisher = {Springer International Publishing},
}
See also
The TreeType policy in mlpack, RTree, RTree, RPlusTree, RPlusPlusTree

◆ RPlusTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::RPlusTree = typedef RectangleTree<MetricType, StatisticType, MatType, RPlusTreeSplit<RPlusTreeSplitPolicy, MinimalCoverageSweep>, RPlusTreeDescentHeuristic, NoAuxiliaryInformation>

The R+ tree, a variant of the R tree that avoids overlapping rectangles.

The implementation is modified from the original paper implementation. This template typedef satisfies the TreeType policy API.

@inproceedings{sellis1987r,
author = {Sellis, Timos K. and Roussopoulos, Nick and Faloutsos, Christos},
title = {The R+-Tree: A Dynamic Index for Multi-Dimensional Objects},
booktitle = {Proceedings of the 13th International Conference on Very
Large Data Bases},
series = {VLDB '87},
year = {1987},
isbn = {0-934613-46-X},
pages = {507--518},
numpages = {12},
publisher = {Morgan Kaufmann Publishers Inc.},
address = {San Francisco, CA, USA},
}
See also
The TreeType policy in mlpack, RTree, RTree, RPlusTree

◆ RPTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::RPTree = typedef BinarySpaceTree<MetricType, StatisticType, MatType, bound::HRectBound, RPTreeMeanSplit>

A mean-split random projection tree.

When recursively splitting nodes, the RPTree class may perform one of two different kinds of split. Depending on the diameter and the average distance between points, the node may be split by a random hyperplane or according to the distance from the mean point. The tree holds points in leaf nodes. In contrast to the k-d tree, children of a MaxSplitRPTree node may overlap.

@inproceedings{dasgupta2008,
author = {Dasgupta, Sanjoy and Freund, Yoav},
title = {Random Projection Trees and Low Dimensional Manifolds},
booktitle = {Proceedings of the Fortieth Annual ACM Symposium on Theory of
Computing},
series = {STOC '08},
year = {2008},
pages = {537--546},
numpages = {10},
publisher = {ACM},
address = {New York, NY, USA},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

◆ RStarTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::RStarTree = typedef RectangleTree<MetricType, StatisticType, MatType, RStarTreeSplit, RStarTreeDescentHeuristic, NoAuxiliaryInformation>

The R*-tree, a more recent variant of the R tree.

This template typedef satisfies the TreeType policy API.

@inproceedings{beckmann1990r,
title={The R*-tree: an efficient and robust access method for points and
rectangles},
author={Beckmann, N. and Kriegel, H.-P. and Schneider, R. and Seeger, B.},
booktitle={Proceedings of the 1990 ACM SIGMOD International Conference on
Management of Data (SIGMOD '90)},
volume={19},
number={2},
year={1990},
publisher={ACM}
}
See also
The TreeType policy in mlpack, RTree

◆ RTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::RTree = typedef RectangleTree<MetricType, StatisticType, MatType, RTreeSplit, RTreeDescentHeuristic, NoAuxiliaryInformation>

An implementation of the R tree that satisfies the TreeType policy API.

This is the same R-tree structure as proposed by Guttman:

@inproceedings{guttman1984r,
title={R-trees: a dynamic index structure for spatial searching},
author={Guttman, A.},
booktitle={Proceedings of the 1984 ACM SIGMOD International Conference on
Management of Data (SIGMOD '84)},
volume={14},
number={2},
year={1984},
publisher={ACM}
}
See also
The TreeType policy in mlpack, RStarTree

◆ SPTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::SPTree = typedef SpillTree<MetricType, StatisticType, MatType, AxisOrthogonalHyperplane, MidpointSpaceSplit>

The hybrid spill tree.

It is a variant of metric-trees in which the children of a node can "spill over" onto each other, and contain shared datapoints.

When recursively splitting nodes, the SPTree class select the dimension with maximum width to split on, and picks the midpoint of the range in that dimension as the value on which to split nodes.

In each case a "overlapping buffer" is defined, included points at a distance less than tau from the decision boundary defined by the midpoint.

For each node, we first split the points considering the overlapping buffer. If either of its children contains more than rho fraction of the total points we undo the overlapping splitting. Instead a conventional partition is used. In this way, we can ensure that each split reduces the number of points of a node by at least a constant factor.

For more information, see the following paper.

@inproceedings{
author = {Ting Liu, Andrew W. Moore, Alexander Gray and Ke Yang},
title = {An Investigation of Practical Approximate Nearest Neighbor
Algorithms},
booktitle = {Advances in Neural Information Processing Systems 17},
year = {2005},
pages = {825--832}
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, SpillTree, MeanSPTree

◆ StandardCoverTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::StandardCoverTree = typedef CoverTree<MetricType, StatisticType, MatType, FirstPointIsRoot>

The standard cover tree, as detailed in the original cover tree paper:

@inproceedings{
author={Beygelzimer, A. and Kakade, S. and Langford, J.},
title={Cover trees for nearest neighbor},
booktitle={Proceedings of the 23rd International Conference on Machine
Learning (ICML 2006)},
pages={97--104},
year={2006}
}

This template typedef satisfies the requirements of the TreeType API.

See also
The TreeType policy in mlpack, CoverTree

◆ UBTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::UBTree = typedef BinarySpaceTree<MetricType, StatisticType, MatType, bound::CellBound, UBTreeSplit>

The Universal B-tree.

When recursively splitting nodes, the class calculates addresses of all points and splits each node according to the median address. Children may overlap since the implementation of a tighter bound requires a lot of arithmetic operations. In order to get a tighter bound increase the CellBound::maxNumBounds constant.

@inproceedings{bayer1997,
author = {Bayer, Rudolf},
title = {The Universal B-Tree for Multidimensional Indexing: General
Concepts},
booktitle = {Proceedings of the International Conference on Worldwide
Computing and Its Applications},
series = {WWCA '97},
year = {1997},
isbn = {3-540-63343-X},
pages = {198--209},
numpages = {12},
publisher = {Springer-Verlag},
address = {London, UK, UK},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, BallTree, MeanSplitKDTree

◆ VPTreeSplit

template<typename BoundType , typename MatType = arma::mat>
using mlpack::tree::VPTreeSplit = typedef VantagePointSplit<BoundType, MatType, 100>

The vantage point tree (which is also called the metric tree.

Vantage point trees and metric trees were invented independently by Yianilos an Uhlmann) is a kind of the binary space tree. When recursively splitting nodes, the VPTree class selects the vantage point and splits the node according to the distance to this point. Thus, points that are closer to the vantage point form the inner subtree. Other points form the outer subtree. The vantage point is contained in the first (inner) node.

This implementation differs from the original algorithms. Namely, vantage points are not contained in intermediate nodes. The tree has points only in the leaves of the tree.

For more information, see the following papers.

@inproceedings{yianilos1993vptrees,
author = {Yianilos, Peter N.},
title = {Data Structures and Algorithms for Nearest Neighbor Search in
General Metric Spaces},
booktitle = {Proceedings of the Fourth Annual ACM-SIAM Symposium on
Discrete Algorithms},
series = {SODA '93},
year = {1993},
isbn = {0-89871-313-7},
pages = {311--321},
numpages = {11},
publisher = {Society for Industrial and Applied Mathematics},
address = {Philadelphia, PA, USA}
}
@article{uhlmann1991metrictrees,
author = {Jeffrey K. Uhlmann},
title = {Satisfying general proximity / similarity queries with metric
trees},
journal = {Information Processing Letters},
volume = {40},
number = {4},
pages = {175 - 179},
year = {1991},
}

This template typedef satisfies the TreeType policy API.

See also
The TreeType policy in mlpack, BinarySpaceTree, VantagePointTree, VPTree

◆ XTree

template<typename MetricType , typename StatisticType , typename MatType >
using mlpack::tree::XTree = typedef RectangleTree<MetricType, StatisticType, MatType, XTreeSplit, RTreeDescentHeuristic, XTreeAuxiliaryInformation>

The X-tree, a variant of the R tree with supernodes.

This template typedef satisfies the TreeType policy API.

@inproceedings{berchtold1996r,
title = {The X-Tree: An Index Structure for High--Dimensional Data},
author = {Berchtold, Stefan and Keim, Daniel A. and Kriegel, Hans-Peter},
booktitle = {Proc. 22th Int. Conf. on Very Large Databases (VLDB'96), Bombay, India},
editor = {Vijayaraman, T. and Buchmann, Alex and Mohan, C. and Sarda, N.},
pages = {28--39},
year = {1996},
publisher = {Morgan Kaufmann}
}
See also
The TreeType policy in mlpack, RTree, RStarTree

Function Documentation

◆ EnumerateTree()

template<class TreeType , class Walker >
void mlpack::tree::EnumerateTree ( TreeType *  tree,
Walker &  walker 
)
inline

Traverses all nodes of the tree, including the inner ones.

On each node two methods of the enumer are called:

Enter(TreeType* node, TreeType* parent); Leave(TreeType* node, TreeType* parent);

Parameters
treeThe tree to traverse.
walkerAn instance of custom class, receiver of the enumeration.

Variable Documentation

◆ MAX_OVERLAP

const double mlpack::tree::MAX_OVERLAP = 0.2

The X-tree paper says that a maximum allowable overlap of 20% works well.

This code should eventually be refactored so as to avoid polluting mlpack::tree with this random double.