mlpack
Classes | Public Types | Public Member Functions | List of all members
mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType > Class Template Reference

The RASearchRules class is a template helper class used by RASearch class when performing rank-approximate search via random-sampling. More...

#include <ra_search_rules.hpp>

Public Types

typedef tree::TraversalInfo< TreeType > TraversalInfoType
 

Public Member Functions

 RASearchRules (const arma::mat &referenceSet, const arma::mat &querySet, const size_t k, MetricType &metric, const double tau=5, const double alpha=0.95, const bool naive=false, const bool sampleAtLeaves=false, const bool firstLeafExact=false, const size_t singleSampleLimit=20, const bool sameSet=false)
 Construct the RASearchRules object. More...
 
void GetResults (arma::Mat< size_t > &neighbors, arma::mat &distances)
 Store the list of candidates for each query point in the given matrices. More...
 
double BaseCase (const size_t queryIndex, const size_t referenceIndex)
 Get the distance from the query point to the reference point. More...
 
double Score (const size_t queryIndex, TreeType &referenceNode)
 Get the score for recursion order. More...
 
double Score (const size_t queryIndex, TreeType &referenceNode, const double baseCaseResult)
 Get the score for recursion order. More...
 
double Rescore (const size_t queryIndex, TreeType &referenceNode, const double oldScore)
 Re-evaluate the score for recursion order. More...
 
double Score (TreeType &queryNode, TreeType &referenceNode)
 Get the score for recursion order. More...
 
double Score (TreeType &queryNode, TreeType &referenceNode, const double baseCaseResult)
 Get the score for recursion order, passing the base case result (in the situation where it may be needed to calculate the recursion order). More...
 
double Rescore (TreeType &queryNode, TreeType &referenceNode, const double oldScore)
 Re-evaluate the score for recursion order. More...
 
size_t NumDistComputations ()
 
size_t NumEffectiveSamples ()
 
const TraversalInfoTypeTraversalInfo () const
 
TraversalInfoTypeTraversalInfo ()
 
size_t MinimumBaseCases () const
 Get the minimum number of base cases that must be performed for each query point for an acceptable result. More...
 

Detailed Description

template<typename SortPolicy, typename MetricType, typename TreeType>
class mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >

The RASearchRules class is a template helper class used by RASearch class when performing rank-approximate search via random-sampling.

Template Parameters
SortPolicyThe sort policy for distances.
MetricTypeThe metric to use for computation.
TreeTypeThe tree type to use; must adhere to the TreeType API.

Constructor & Destructor Documentation

◆ RASearchRules()

template<typename SortPolicy , typename MetricType , typename TreeType >
mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::RASearchRules ( const arma::mat &  referenceSet,
const arma::mat &  querySet,
const size_t  k,
MetricType &  metric,
const double  tau = 5,
const double  alpha = 0.95,
const bool  naive = false,
const bool  sampleAtLeaves = false,
const bool  firstLeafExact = false,
const size_t  singleSampleLimit = 20,
const bool  sameSet = false 
)

Construct the RASearchRules object.

This is usually done from within the RASearch class at search time.

Parameters
referenceSetSet of reference data.
querySetSet of query data.
kNumber of neighbors to search for.
metricInstantiated metric.
tauThe rank-approximation in percentile of the data.
alphaThe desired success probability.
naiveIf true, the rank-approximate search will be performed by directly sampling the whole set instead of using the stratified sampling on the tree.
sampleAtLeavesSample at leaves for faster but less accurate computation.
firstLeafExactTraverse to the first leaf without approximation.
singleSampleLimitThe limit on the largest node that can be approximated by sampling.
sameSetIf true, the query and reference set are taken to be the same, and a query point will not return itself in the results.

Member Function Documentation

◆ BaseCase()

template<typename SortPolicy , typename MetricType , typename TreeType >
force_inline double mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::BaseCase ( const size_t  queryIndex,
const size_t  referenceIndex 
)
inline

Get the distance from the query point to the reference point.

This will update the list of candidates with the new point if appropriate.

Parameters
queryIndexIndex of query point.
referenceIndexIndex of reference point.

◆ GetResults()

template<typename SortPolicy , typename MetricType , typename TreeType >
void mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::GetResults ( arma::Mat< size_t > &  neighbors,
arma::mat &  distances 
)

Store the list of candidates for each query point in the given matrices.

Parameters
neighborsMatrix storing lists of neighbors for each query point.
distancesMatrix storing distances of neighbors for each query point.

◆ MinimumBaseCases()

template<typename SortPolicy , typename MetricType , typename TreeType >
size_t mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::MinimumBaseCases ( ) const
inline

Get the minimum number of base cases that must be performed for each query point for an acceptable result.

This is only needed in defeatist search mode.

◆ Rescore() [1/2]

template<typename SortPolicy , typename MetricType , typename TreeType >
double mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::Rescore ( const size_t  queryIndex,
TreeType &  referenceNode,
const double  oldScore 
)
inline

Re-evaluate the score for recursion order.

A low score indicates priority for recursion, while DBL_MAX indicates that the node should not be recursed into at all (it should be pruned). This is used when the score has already been calculated, but another recursion may have modified the bounds for pruning. So the old score is checked against the new pruning bound.

For rank-approximation, it also checks if the number of samples left for a query to satisfy the rank constraint is small enough at this point of the algorithm, then this node is approximated by sampling and given a new score of 'DBL_MAX'.

Parameters
queryIndexIndex of query point.
referenceNodeCandidate node to be recursed into.
oldScoreOld score produced by Score() (or Rescore()).

◆ Rescore() [2/2]

template<typename SortPolicy , typename MetricType , typename TreeType >
double mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::Rescore ( TreeType &  queryNode,
TreeType &  referenceNode,
const double  oldScore 
)
inline

Re-evaluate the score for recursion order.

A low score indicates priority for recursion, while DBL_MAX indicates that the node should not be recursed into at all (it should be pruned). This is used when the score has already been calculated, but another recursion may have modified the bounds for pruning. So the old score is checked against the new pruning bound.

For the rank-approximation, we check if the referenceNode can be approximated by sampling. If it can be, enough samples are made for every query in the queryNode. No further query-tree traversal is performed.

The 'NumSamplesMade' query stat is propagated up the tree. And then if pruning occurs (by distance or by sampling), the 'NumSamplesMade' stat is not propagated down the tree. If no pruning occurs, the stat is propagated down the tree.

Parameters
queryNodeCandidate query node to recurse into.
referenceNodeCandidate reference node to recurse into.
oldScoreOld score produced by Socre() (or Rescore()).

◆ Score() [1/4]

template<typename SortPolicy , typename MetricType , typename TreeType >
double mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::Score ( const size_t  queryIndex,
TreeType &  referenceNode 
)
inline

Get the score for recursion order.

A low score indicates priority for recursion, while DBL_MAX indicates that the node should not be recursed into at all (it should be pruned).

For rank-approximation, the scoring function first checks if pruning by distance is possible. If yes, then the node is given the score of 'DBL_MAX' and the expected number of samples from that node are added to the number of samples made for the query.

If no, then the function tries to see if the node can be pruned by approximation. If number of samples required from this node is small enough, then that number of samples are acquired from this node and the score is set to be 'DBL_MAX'.

If the pruning by approximation is not possible either, the algorithm continues with the usual tree-traversal.

Parameters
queryIndexIndex of query point.
referenceNodeCandidate node to be recursed into.

◆ Score() [2/4]

template<typename SortPolicy , typename MetricType , typename TreeType >
double mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::Score ( const size_t  queryIndex,
TreeType &  referenceNode,
const double  baseCaseResult 
)
inline

Get the score for recursion order.

A low score indicates priority for recursion, while DBL_MAX indicates that the node should not be recursed into at all (it should be pruned).

For rank-approximation, the scoring function first checks if pruning by distance is possible. If yes, then the node is given the score of 'DBL_MAX' and the expected number of samples from that node are added to the number of samples made for the query.

If no, then the function tries to see if the node can be pruned by approximation. If number of samples required from this node is small enough, then that number of samples are acquired from this node and the score is set to be 'DBL_MAX'.

If the pruning by approximation is not possible either, the algorithm continues with the usual tree-traversal.

Parameters
queryIndexIndex of query point.
referenceNodeCandidate node to be recursed into.
baseCaseResultResult of BaseCase(queryIndex, referenceNode).

◆ Score() [3/4]

template<typename SortPolicy , typename MetricType , typename TreeType >
double mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::Score ( TreeType &  queryNode,
TreeType &  referenceNode 
)
inline

Get the score for recursion order.

A low score indicates priority for recursionm while DBL_MAX indicates that the node should not be recursed into at all (it should be pruned).

For the rank-approximation, we check if the referenceNode can be approximated by sampling. If it can be, enough samples are made for every query in the queryNode. No further query-tree traversal is performed.

The 'NumSamplesMade' query stat is propagated up the tree. And then if pruning occurs (by distance or by sampling), the 'NumSamplesMade' stat is not propagated down the tree. If no pruning occurs, the stat is propagated down the tree.

Parameters
queryNodeCandidate query node to recurse into.
referenceNodeCandidate reference node to recurse into.

◆ Score() [4/4]

template<typename SortPolicy , typename MetricType , typename TreeType >
double mlpack::neighbor::RASearchRules< SortPolicy, MetricType, TreeType >::Score ( TreeType &  queryNode,
TreeType &  referenceNode,
const double  baseCaseResult 
)
inline

Get the score for recursion order, passing the base case result (in the situation where it may be needed to calculate the recursion order).

A low score indicates priority for recursion, while DBL_MAX indicates that the node should not be recursed into at all (it should be pruned).

For the rank-approximation, we check if the referenceNode can be approximated by sampling. If it can be, enough samples are made for every query in the queryNode. No further query-tree traversal is performed.

The 'NumSamplesMade' query stat is propagated up the tree. And then if pruning occurs (by distance or by sampling), the 'NumSamplesMade' stat is not propagated down the tree. If no pruning occurs, the stat is propagated down the tree.

Parameters
queryNodeCandidate query node to recurse into.
referenceNodeCandidate reference node to recurse into.
baseCaseResultResult of BaseCase(queryIndex, referenceNode).

The documentation for this class was generated from the following files: