mlpack
Classes | Public Types | Public Member Functions | Protected Member Functions | Friends | List of all members
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > Class Template Reference

A cover tree is a tree specifically designed to speed up nearest-neighbor computation in high-dimensional spaces. More...

#include <cover_tree.hpp>

Classes

class  DualTreeTraverser
 A dual-tree cover tree traverser; see dual_tree_traverser.hpp. More...
 
class  SingleTreeTraverser
 A single-tree cover tree traverser; see single_tree_traverser.hpp for implementation. More...
 

Public Types

typedef MatType Mat
 So that other classes can access the matrix type.
 
typedef MatType::elem_type ElemType
 The type held by the matrix type.
 
template<typename RuleType >
using BreadthFirstDualTreeTraverser = DualTreeTraverser< RuleType >
 

Public Member Functions

 CoverTree (const MatType &dataset, const ElemType base=2.0, MetricType *metric=NULL)
 Create the cover tree with the given dataset and given base. More...
 
 CoverTree (const MatType &dataset, MetricType &metric, const ElemType base=2.0)
 Create the cover tree with the given dataset and the given instantiated metric. More...
 
 CoverTree (MatType &&dataset, const ElemType base=2.0)
 Create the cover tree with the given dataset, taking ownership of the dataset. More...
 
 CoverTree (MatType &&dataset, MetricType &metric, const ElemType base=2.0)
 Create the cover tree with the given dataset and the given instantiated metric, taking ownership of the dataset. More...
 
 CoverTree (const MatType &dataset, const ElemType base, const size_t pointIndex, const int scale, CoverTree *parent, const ElemType parentDistance, arma::Col< size_t > &indices, arma::vec &distances, size_t nearSetSize, size_t &farSetSize, size_t &usedSetSize, MetricType &metric=NULL)
 Construct a child cover tree node. More...
 
 CoverTree (const MatType &dataset, const ElemType base, const size_t pointIndex, const int scale, CoverTree *parent, const ElemType parentDistance, const ElemType furthestDescendantDistance, MetricType *metric=NULL)
 Manually construct a cover tree node; no tree assembly is done in this constructor, and children must be added manually (use Children()). More...
 
 CoverTree (const CoverTree &other)
 Create a cover tree from another tree. More...
 
 CoverTree (CoverTree &&other)
 Move constructor for a Cover Tree, possess all the members of the given tree. More...
 
CoverTreeoperator= (const CoverTree &other)
 Copy the given Cover Tree. More...
 
CoverTreeoperator= (CoverTree &&other)
 Take ownership of the given Cover Tree. More...
 
template<typename Archive >
 CoverTree (Archive &ar, const typename std::enable_if_t< cereal::is_loading< Archive >()> *=0)
 Create a cover tree from a cereal archive.
 
 ~CoverTree ()
 Delete this cover tree node and its children.
 
const MatType & Dataset () const
 Get a reference to the dataset.
 
size_t Point () const
 Get the index of the point which this node represents.
 
size_t Point (const size_t) const
 For compatibility with other trees; the argument is ignored.
 
bool IsLeaf () const
 
size_t NumPoints () const
 
const CoverTreeChild (const size_t index) const
 Get a particular child node.
 
CoverTreeChild (const size_t index)
 Modify a particular child node.
 
CoverTree *& ChildPtr (const size_t index)
 
size_t NumChildren () const
 Get the number of children.
 
const std::vector< CoverTree * > & Children () const
 Get the children.
 
std::vector< CoverTree * > & Children ()
 Modify the children manually (maybe not a great idea).
 
size_t NumDescendants () const
 Get the number of descendant points. More...
 
size_t Descendant (const size_t index) const
 Get the index of a particular descendant point. More...
 
int Scale () const
 Get the scale of this node.
 
int & Scale ()
 Modify the scale of this node. Be careful...
 
ElemType Base () const
 Get the base.
 
ElemTypeBase ()
 Modify the base; don't do this, you'll break everything.
 
const StatisticType & Stat () const
 Get the statistic for this node.
 
StatisticType & Stat ()
 Modify the statistic for this node.
 
template<typename VecType >
size_t GetNearestChild (const VecType &point, typename std::enable_if_t< IsVector< VecType >::value > *=0)
 Return the index of the nearest child node to the given query point. More...
 
template<typename VecType >
size_t GetFurthestChild (const VecType &point, typename std::enable_if_t< IsVector< VecType >::value > *=0)
 Return the index of the furthest child node to the given query point. More...
 
size_t GetNearestChild (const CoverTree &queryNode)
 Return the index of the nearest child node to the given query node. More...
 
size_t GetFurthestChild (const CoverTree &queryNode)
 Return the index of the furthest child node to the given query node. More...
 
ElemType MinDistance (const CoverTree &other) const
 Return the minimum distance to another node.
 
ElemType MinDistance (const CoverTree &other, const ElemType distance) const
 Return the minimum distance to another node given that the point-to-point distance has already been calculated. More...
 
ElemType MinDistance (const arma::vec &other) const
 Return the minimum distance to another point.
 
ElemType MinDistance (const arma::vec &other, const ElemType distance) const
 Return the minimum distance to another point given that the distance from the center to the point has already been calculated. More...
 
ElemType MaxDistance (const CoverTree &other) const
 Return the maximum distance to another node.
 
ElemType MaxDistance (const CoverTree &other, const ElemType distance) const
 Return the maximum distance to another node given that the point-to-point distance has already been calculated. More...
 
ElemType MaxDistance (const arma::vec &other) const
 Return the maximum distance to another point.
 
ElemType MaxDistance (const arma::vec &other, const ElemType distance) const
 Return the maximum distance to another point given that the distance from the center to the point has already been calculated. More...
 
math::RangeType< ElemTypeRangeDistance (const CoverTree &other) const
 Return the minimum and maximum distance to another node.
 
math::RangeType< ElemTypeRangeDistance (const CoverTree &other, const ElemType distance) const
 Return the minimum and maximum distance to another node given that the point-to-point distance has already been calculated. More...
 
math::RangeType< ElemTypeRangeDistance (const arma::vec &other) const
 Return the minimum and maximum distance to another point.
 
math::RangeType< ElemTypeRangeDistance (const arma::vec &other, const ElemType distance) const
 Return the minimum and maximum distance to another point given that the point-to-point distance has already been calculated. More...
 
CoverTreeParent () const
 Get the parent node.
 
CoverTree *& Parent ()
 Modify the parent node.
 
ElemType ParentDistance () const
 Get the distance to the parent.
 
ElemTypeParentDistance ()
 Modify the distance to the parent.
 
ElemType FurthestPointDistance () const
 Get the distance to the furthest point. This is always 0 for cover trees.
 
ElemType FurthestDescendantDistance () const
 Get the distance from the center of the node to the furthest descendant.
 
ElemTypeFurthestDescendantDistance ()
 Modify the distance from the center of the node to the furthest descendant. More...
 
ElemType MinimumBoundDistance () const
 Get the minimum distance from the center to any bound edge (this is the same as furthestDescendantDistance). More...
 
void Center (arma::vec &center) const
 Get the center of the node and store it in the given vector.
 
MetricType & Metric () const
 Get the instantiated metric.
 
template<typename Archive >
void serialize (Archive &ar, const uint32_t)
 Serialize the tree. More...
 
size_t DistanceComps () const
 
size_t & DistanceComps ()
 

Protected Member Functions

 CoverTree ()
 A default constructor. More...
 

Friends

class cereal::access
 Friend access is given for the default constructor.
 

Detailed Description

template<typename MetricType = metric::LMetric<2, true>, typename StatisticType = EmptyStatistic, typename MatType = arma::mat, typename RootPointPolicy = FirstPointIsRoot>
class mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >

A cover tree is a tree specifically designed to speed up nearest-neighbor computation in high-dimensional spaces.

Each non-leaf node references a point and has a nonzero number of children, including a "self-child" which references the same point. A leaf node represents only one point.

The tree can be thought of as a hierarchy with the root node at the top level and the leaf nodes at the bottom level. Each level in the tree has an assigned 'scale' i. The tree follows these two invariants:

Note that in the cover tree paper, there is a third invariant (the 'separation invariant'), but that does not apply to our implementation, because we have relaxed the invariant.

The value 'b' refers to the base, which is a parameter of the tree. These three properties make the cover tree very good for fast, high-dimensional nearest-neighbor search.

The theoretical structure of the tree contains many 'implicit' nodes which only have a "self-child" (a child referencing the same point, but at a lower scale level). This practical implementation only constructs explicit nodes – non-leaf nodes with more than one child. A leaf node has no children, and its scale level is INT_MIN.

For more information on cover trees, see

@inproceedings{
author = {Beygelzimer, Alina and Kakade, Sham and Langford, John},
title = {Cover trees for nearest neighbor},
booktitle = {Proceedings of the 23rd International Conference on Machine
Learning},
series = {ICML '06},
year = {2006},
pages = {97--104]
}

For information on runtime bounds of the nearest-neighbor computation using cover trees, see the following paper, presented at NIPS 2009:

@inproceedings{
author = {Ram, P., and Lee, D., and March, W.B., and Gray, A.G.},
title = {Linear-time Algorithms for Pairwise Statistical Problems},
booktitle = {Advances in Neural Information Processing Systems 22},
editor = {Y. Bengio and D. Schuurmans and J. Lafferty and C.K.I. Williams
and A. Culotta},
pages = {1527--1535},
year = {2009}
}

The CoverTree class offers three template parameters; a custom metric type can be used with MetricType (this class defaults to the L2-squared metric). The root node's point can be chosen with the RootPointPolicy; by default, the FirstPointIsRoot policy is used, meaning the first point in the dataset is used. The StatisticType policy allows you to define statistics which can be gathered during the creation of the tree.

Template Parameters
MetricTypeMetric type to use during tree construction.
RootPointPolicyDetermines which point to use as the root node.
StatisticTypeStatistic to be used during tree creation.
MatTypeType of matrix to build the tree on (generally mat or sp_mat).

Constructor & Destructor Documentation

◆ CoverTree() [1/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( const MatType &  dataset,
const ElemType  base = 2.0,
MetricType *  metric = NULL 
)

Create the cover tree with the given dataset and given base.

The dataset will not be modified during the building procedure (unlike BinarySpaceTree).

The last argument will be removed in mlpack 1.1.0 (see #274 and #273).

Parameters
datasetReference to the dataset to build a tree on.
baseBase to use during tree building (default 2.0).
metricMetric to use (default NULL).

◆ CoverTree() [2/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( const MatType &  dataset,
MetricType &  metric,
const ElemType  base = 2.0 
)

Create the cover tree with the given dataset and the given instantiated metric.

Optionally, set the base. The dataset will not be modified during the building procedure (unlike BinarySpaceTree).

Parameters
datasetReference to the dataset to build a tree on.
metricInstantiated metric to use during tree building.
baseBase to use during tree building (default 2.0).

◆ CoverTree() [3/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( MatType &&  dataset,
const ElemType  base = 2.0 
)

Create the cover tree with the given dataset, taking ownership of the dataset.

Optionally, set the base.

Parameters
datasetReference to the dataset to build a tree on.
baseBase to use during tree building (default 2.0).

◆ CoverTree() [4/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( MatType &&  dataset,
MetricType &  metric,
const ElemType  base = 2.0 
)

Create the cover tree with the given dataset and the given instantiated metric, taking ownership of the dataset.

Optionally, set the base.

Parameters
datasetReference to the dataset to build a tree on.
metricInstantiated metric to use during tree building.
baseBase to use during tree building (default 2.0).

◆ CoverTree() [5/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( const MatType &  dataset,
const ElemType  base,
const size_t  pointIndex,
const int  scale,
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > *  parent,
const ElemType  parentDistance,
arma::Col< size_t > &  indices,
arma::vec &  distances,
size_t  nearSetSize,
size_t &  farSetSize,
size_t &  usedSetSize,
MetricType &  metric = NULL 
)

Construct a child cover tree node.

This constructor is not meant to be used externally, but it could be used to insert another node into a tree. This procedure uses only one vector for the near set, the far set, and the used set (this is to prevent unnecessary memory allocation in recursive calls to this constructor). Therefore, the size of the near set, far set, and used set must be passed in. The near set will be entirely used up, and some of the far set may be used. The value of usedSetSize will be set to the number of points used in the construction of this node, and the value of farSetSize will be modified to reflect the number of points in the far set after the construction of this node.

If you are calling this manually, be careful that the given scale is as small as possible, or you may be creating an implicit node in your tree.

Parameters
datasetReference to the dataset to build a tree on.
baseBase to use during tree building.
pointIndexIndex of the point this node references.
scaleScale of this level in the tree.
parentParent of this node (NULL indicates no parent).
parentDistanceDistance to the parent node.
indicesArray of indices, ordered [ nearSet | farSet | usedSet ]; will be modified to [ farSet | usedSet ].
distancesArray of distances, ordered the same way as the indices. These represent the distances between the point specified by pointIndex and each point in the indices array.
nearSetSizeSize of the near set; if 0, this will be a leaf.
farSetSizeSize of the far set; may be modified (if this node uses any points in the far set).
usedSetSizeThe number of points used will be added to this number.
metricMetric to use (default NULL).

◆ CoverTree() [6/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( const MatType &  dataset,
const ElemType  base,
const size_t  pointIndex,
const int  scale,
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > *  parent,
const ElemType  parentDistance,
const ElemType  furthestDescendantDistance,
MetricType *  metric = NULL 
)

Manually construct a cover tree node; no tree assembly is done in this constructor, and children must be added manually (use Children()).

This constructor is useful when the tree is being "imported" into the CoverTree class after being created in some other manner.

Parameters
datasetReference to the dataset this node is a part of.
baseBase that was used for tree building.
pointIndexIndex of the point in the dataset which this node refers to.
scaleScale of this node's level in the tree.
parentParent node (NULL indicates no parent).
parentDistanceDistance to parent node point.
furthestDescendantDistanceDistance to furthest descendant point.
metricInstantiated metric (optional).

◆ CoverTree() [7/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other)

Create a cover tree from another tree.

Be careful! This may use a lot of memory and take a lot of time. This will also make a copy of the dataset.

Parameters
otherCover tree to copy from.

◆ CoverTree() [8/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &&  other)

Move constructor for a Cover Tree, possess all the members of the given tree.

Parameters
otherCover Tree to move.

◆ CoverTree() [9/9]

template<typename MetricType, typename StatisticType , typename MatType, typename RootPointPolicy >
mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree ( )
protected

A default constructor.

Default constructor, only for use with cereal.

This is meant to only be used with cereal, which is allowed with the friend declaration below. This does not return a valid tree! This method must be protected, so that the serialization shim can work with the default constructor.

Member Function Documentation

◆ Descendant()

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
size_t mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Descendant ( const size_t  index) const
inline

Get the index of a particular descendant point.

Return the index of a particular descendant point.

◆ FurthestDescendantDistance()

template<typename MetricType = metric::LMetric<2, true>, typename StatisticType = EmptyStatistic, typename MatType = arma::mat, typename RootPointPolicy = FirstPointIsRoot>
ElemType& mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::FurthestDescendantDistance ( )
inline

Modify the distance from the center of the node to the furthest descendant.

◆ GetFurthestChild() [1/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
template<typename VecType >
size_t mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::GetFurthestChild ( const VecType &  point,
typename std::enable_if_t< IsVector< VecType >::value > *  = 0 
)

Return the index of the furthest child node to the given query point.

If this is a leaf node, it will return NumChildren() (invalid index).

◆ GetFurthestChild() [2/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
size_t mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::GetFurthestChild ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  queryNode)

Return the index of the furthest child node to the given query node.

If it can't decide, it will return NumChildren() (invalid index).

◆ GetNearestChild() [1/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
template<typename VecType >
size_t mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::GetNearestChild ( const VecType &  point,
typename std::enable_if_t< IsVector< VecType >::value > *  = 0 
)

Return the index of the nearest child node to the given query point.

If this is a leaf node, it will return NumChildren() (invalid index).

◆ GetNearestChild() [2/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
size_t mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::GetNearestChild ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  queryNode)

Return the index of the nearest child node to the given query node.

If it can't decide, it will return NumChildren() (invalid index).

◆ MaxDistance() [1/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::ElemType mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::MaxDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other,
const ElemType  distance 
) const

Return the maximum distance to another node given that the point-to-point distance has already been calculated.

◆ MaxDistance() [2/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::ElemType mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::MaxDistance ( const arma::vec &  other,
const ElemType  distance 
) const

Return the maximum distance to another point given that the distance from the center to the point has already been calculated.

◆ MinDistance() [1/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::ElemType mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::MinDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other,
const ElemType  distance 
) const

Return the minimum distance to another node given that the point-to-point distance has already been calculated.

◆ MinDistance() [2/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::ElemType mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::MinDistance ( const arma::vec &  other,
const ElemType  distance 
) const

Return the minimum distance to another point given that the distance from the center to the point has already been calculated.

◆ MinimumBoundDistance()

template<typename MetricType = metric::LMetric<2, true>, typename StatisticType = EmptyStatistic, typename MatType = arma::mat, typename RootPointPolicy = FirstPointIsRoot>
ElemType mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::MinimumBoundDistance ( ) const
inline

Get the minimum distance from the center to any bound edge (this is the same as furthestDescendantDistance).

◆ NumDescendants()

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
size_t mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::NumDescendants ( ) const
inline

Get the number of descendant points.

Return the number of descendant points.

◆ operator=() [1/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > & mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::operator= ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other)

Copy the given Cover Tree.

Parameters
otherThe tree to be copied.

◆ operator=() [2/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > & mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::operator= ( CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &&  other)

Take ownership of the given Cover Tree.

Parameters
otherThe tree to take ownership of.

◆ RangeDistance() [1/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
math::RangeType< typename CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::ElemType > mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::RangeDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other,
const ElemType  distance 
) const

Return the minimum and maximum distance to another node given that the point-to-point distance has already been calculated.

◆ RangeDistance() [2/2]

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
math::RangeType< typename CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::ElemType > mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::RangeDistance ( const arma::vec &  other,
const ElemType  distance 
) const

Return the minimum and maximum distance to another point given that the point-to-point distance has already been calculated.

◆ serialize()

template<typename MetricType , typename StatisticType , typename MatType , typename RootPointPolicy >
template<typename Archive >
void mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::serialize ( Archive &  ar,
const uint32_t   
)

Serialize the tree.

Serialize to/from a cereal archive.


The documentation for this class was generated from the following files: