|
| DBSCAN (const double epsilon, const size_t minPoints, const bool batchMode=true, RangeSearchType rangeSearch=RangeSearchType(), PointSelectionPolicy pointSelector=PointSelectionPolicy()) |
| Construct the DBSCAN object with the given parameters.
|
|
template<typename MatType > |
size_t | Cluster (const MatType &data, arma::mat &centroids) |
| Performs DBSCAN clustering on the data, returning the number of clusters and the centroid of each cluster.
|
|
template<typename MatType > |
size_t | Cluster (const MatType &data, arma::Row< size_t > &assignments) |
| Performs DBSCAN clustering on the data, returning the number of clusters and the list of cluster assignments.
|
|
template<typename MatType > |
size_t | Cluster (const MatType &data, arma::Row< size_t > &assignments, arma::mat &centroids) |
| Performs DBSCAN clustering on the data, returning the number of clusters, the centroid of each cluster, and the list of cluster assignments.
|
|
template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = OrderedPointSelection>
class mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering technique described in the following paper:
@inproceedings{ester1996density,
title={A density-based algorithm for discovering clusters in large spatial databases with noise.},
author={Ester, M. and Kriegel, H.-P. and Sander, J. and Xu, X.},
booktitle={Proceedings of the Second International Conference on Knowledge
Discovery and Data Mining (KDD '96)},
pages={226--231},
year={1996}
}
The DBSCAN algorithm iteratively clusters points using range searches with a specified radius parameter. This implementation allows configuration of the range search technique used and the point selection strategy by means of template parameters.
- Template Parameters
-
RangeSearchType | Class to use for range searching. |
PointSelectionPolicy | Strategy for selecting next point to cluster with. |
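To make the description above concrete, here is a minimal sketch of DBSCAN in plain C++. It is not mlpack's implementation: `Point2`, `RangeQuery`, and `SimpleDbscan` are hypothetical names introduced for illustration, the range search is brute force rather than a configurable RangeSearchType, and points are visited in index order, which is assumed to be roughly what an ordered selection policy does. The sketch reuses the convention documented on this page that a label of SIZE_MAX marks noise.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-D point type, used only for this illustration.
struct Point2 { double x, y; };

// Brute-force range search: indices of all points within epsilon of
// points[i], including i itself.
std::vector<std::size_t> RangeQuery(const std::vector<Point2>& points,
                                    std::size_t i, double epsilon)
{
  std::vector<std::size_t> neighbors;
  for (std::size_t j = 0; j < points.size(); ++j)
  {
    const double dx = points[i].x - points[j].x;
    const double dy = points[i].y - points[j].y;
    if (std::sqrt(dx * dx + dy * dy) <= epsilon)
      neighbors.push_back(j);
  }
  return neighbors;
}

// Returns the number of clusters found; labels[i] == SIZE_MAX marks noise.
std::size_t SimpleDbscan(const std::vector<Point2>& points, double epsilon,
                         std::size_t minPoints,
                         std::vector<std::size_t>& labels)
{
  const std::size_t noise = SIZE_MAX;
  labels.assign(points.size(), noise);
  std::vector<bool> visited(points.size(), false);
  std::size_t numClusters = 0;

  // Select the next unvisited point in index order.
  for (std::size_t i = 0; i < points.size(); ++i)
  {
    if (visited[i]) continue;
    visited[i] = true;

    std::vector<std::size_t> frontier = RangeQuery(points, i, epsilon);
    if (frontier.size() < minPoints) continue;  // Not a core point.

    // Start a new cluster and grow it through its core points.
    const std::size_t cluster = numClusters++;
    labels[i] = cluster;
    for (std::size_t k = 0; k < frontier.size(); ++k)
    {
      const std::size_t j = frontier[k];
      if (labels[j] == noise) labels[j] = cluster;  // Point joins the cluster.
      if (visited[j]) continue;
      visited[j] = true;

      const std::vector<std::size_t> more = RangeQuery(points, j, epsilon);
      if (more.size() >= minPoints)  // j is a core point: keep expanding.
        frontier.insert(frontier.end(), more.begin(), more.end());
    }
  }
  return numClusters;
}
```

In this sketch a point that is first labelled noise can later be absorbed into a cluster as a border point, so the noise label is only final once every point has been visited.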
template<typename RangeSearchType , typename PointSelectionPolicy >
mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::DBSCAN (const double epsilon, const size_t minPoints, const bool batchMode = true, RangeSearchType rangeSearch = RangeSearchType(), PointSelectionPolicy pointSelector = PointSelectionPolicy())
Construct the DBSCAN object with the given parameters.
Set batchMode to false when memory use is a concern, for example when the dataset is very large or epsilon is large. When batchMode is false, points are searched one at a time, which may be slower but uses less memory.
- Parameters
-
epsilon | Size of range query. |
minPoints | Minimum number of points for each cluster. |
batchMode | If true, all points are searched in batch. |
rangeSearch | Optional instantiated RangeSearch object. |
pointSelector | Optional instantiated PointSelectionPolicy object. |
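The memory trade-off behind batchMode can be illustrated with a small plain C++ sketch. This is an assumption about the general idea, not a description of mlpack's internals: `Neighbors`, `BatchStoredIndices`, and `IterativePeakIndices` are hypothetical helpers, and the data is one-dimensional for brevity. Searching all points in one batch means every neighbor list is held at once, while searching one point at a time only ever holds a single list.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Brute-force 1-D range search: indices of all values within epsilon of
// values[i], including i itself.
std::vector<std::size_t> Neighbors(const std::vector<double>& values,
                                   std::size_t i, double epsilon)
{
  std::vector<std::size_t> result;
  for (std::size_t j = 0; j < values.size(); ++j)
    if (std::fabs(values[i] - values[j]) <= epsilon)
      result.push_back(j);
  return result;
}

// Batch style: all neighbor lists are alive at the same time.
// Returns the total number of neighbor indices held in memory.
std::size_t BatchStoredIndices(const std::vector<double>& values,
                               double epsilon)
{
  std::vector<std::vector<std::size_t>> all;
  for (std::size_t i = 0; i < values.size(); ++i)
    all.push_back(Neighbors(values, i, epsilon));

  std::size_t total = 0;
  for (const auto& list : all)
    total += list.size();
  return total;
}

// One-at-a-time style: only one neighbor list is alive at once.
// Returns the size of the largest single list.
std::size_t IterativePeakIndices(const std::vector<double>& values,
                                 double epsilon)
{
  std::size_t peak = 0;
  for (std::size_t i = 0; i < values.size(); ++i)
  {
    const std::vector<std::size_t> list = Neighbors(values, i, epsilon);
    peak = std::max(peak, list.size());
  }
  return peak;
}
```

The gap between the two counts grows with both the number of points and epsilon, which matches the guidance above on when to disable batch mode.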
template<typename RangeSearchType , typename PointSelectionPolicy >
template<typename MatType >
size_t mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::Cluster (const MatType &data, arma::Row< size_t > &assignments)
Performs DBSCAN clustering on the data, returning the number of clusters and also the list of cluster assignments.
If assignments[i] == SIZE_MAX, then the point is considered "noise".
- Template Parameters
-
MatType | Type of matrix (arma::mat or arma::sp_mat). |
- Parameters
-
data | Dataset to cluster. |
assignments | Vector to store cluster assignments. |
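A caller typically needs to separate noise from real clusters after the call returns. The sketch below shows one way to do that in plain C++, assuming the SIZE_MAX noise convention stated above. `AssignmentSummary` and `Summarize` are hypothetical names, and a `std::vector<std::size_t>` stands in for the `arma::Row< size_t >` that Cluster fills.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of a clustering result.
struct AssignmentSummary
{
  std::vector<std::size_t> clusterSizes;  // clusterSizes[c] = points in cluster c
  std::size_t noiseCount = 0;             // points labelled SIZE_MAX
};

// Counts the points in each cluster and the points marked as noise.
// numClusters is the value returned by the clustering call.
AssignmentSummary Summarize(const std::vector<std::size_t>& assignments,
                            std::size_t numClusters)
{
  AssignmentSummary summary;
  summary.clusterSizes.assign(numClusters, 0);
  for (const std::size_t label : assignments)
  {
    if (label == SIZE_MAX)
      ++summary.noiseCount;           // Noise: belongs to no cluster.
    else if (label < numClusters)
      ++summary.clusterSizes[label];  // Regular cluster member.
  }
  return summary;
}
```

Checking for SIZE_MAX before indexing matters: using a noise label as an array index would read far out of bounds.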
template<typename RangeSearchType , typename PointSelectionPolicy >
template<typename MatType >
size_t mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::Cluster (const MatType &data, arma::Row< size_t > &assignments, arma::mat &centroids)
Performs DBSCAN clustering on the data, returning the number of clusters, the centroid of each cluster, and the list of cluster assignments.
If assignments[i] == SIZE_MAX, then the point is considered "noise".
- Template Parameters
-
MatType | Type of matrix (arma::mat or arma::sp_mat). |
- Parameters
-
data | Dataset to cluster. |
assignments | Vector to store cluster assignments. |
centroids | Matrix in which centroids are stored. |
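The relationship between assignments and centroids can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not mlpack's code: it assumes each centroid is the mean of the points assigned to that cluster and that noise points (SIZE_MAX) contribute to no centroid. `ComputeCentroids` is a hypothetical helper, and nested `std::vector`s stand in for the Armadillo matrix types.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns one centroid per cluster, each the mean of the points assigned to
// that cluster. points[i] is the i-th point; all points share one dimension.
std::vector<std::vector<double>> ComputeCentroids(
    const std::vector<std::vector<double>>& points,
    const std::vector<std::size_t>& assignments,
    std::size_t numClusters)
{
  const std::size_t dim = points.empty() ? 0 : points[0].size();
  std::vector<std::vector<double>> centroids(
      numClusters, std::vector<double>(dim, 0.0));
  std::vector<std::size_t> counts(numClusters, 0);

  // Accumulate coordinate sums per cluster, skipping noise points.
  for (std::size_t i = 0; i < points.size(); ++i)
  {
    const std::size_t label = assignments[i];
    if (label == SIZE_MAX || label >= numClusters) continue;
    for (std::size_t d = 0; d < dim; ++d)
      centroids[label][d] += points[i][d];
    ++counts[label];
  }

  // Turn sums into means; clusters with no points keep a zero centroid.
  for (std::size_t c = 0; c < numClusters; ++c)
    if (counts[c] > 0)
      for (std::size_t d = 0; d < dim; ++d)
        centroids[c][d] /= static_cast<double>(counts[c]);

  return centroids;
}
```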