mlpack
mlpack::cv::SilhouetteScore Class Reference

The Silhouette Score is a clustering performance metric that measures the quality of the resulting clusters.

#include <silhouette_score.hpp>

Static Public Member Functions

template<typename DataType , typename Metric >
static double Overall (const DataType &X, const arma::Row< size_t > &labels, const Metric &metric)
 Find the overall silhouette score.
 
template<typename DataType >
static arma::rowvec SamplesScore (const DataType &distances, const arma::Row< size_t > &labels)
 Find the individual silhouette scores for precomputed dissimilarities.
 
template<typename DataType , typename Metric >
static arma::rowvec SamplesScore (const DataType &X, const arma::Row< size_t > &labels, const Metric &metric)
 Find the silhouette score of each individual element.
 
static double MeanDistanceFromCluster (const arma::colvec &distances, const arma::Row< size_t > &labels, const size_t &label, const bool &sameCluster=false)
 Find the mean distance of an element from a given cluster.
 

Static Public Attributes

static const bool NeedsMinimization = false
 Information for hyper-parameter tuning code.
 

Detailed Description

The Silhouette Score is a clustering performance metric that measures the quality of the resulting clusters.

It indicates how well each element fits its assigned cluster, taking into account both intra-cluster and inter-cluster dissimilarities. The Silhouette Score depends on the metric used to calculate the dissimilarities. Scores lie between -1 and 1, and the best possible score is \( s(i) = 1.0 \). Smaller values indicate poorer clustering. Values near zero indicate overlapping clusters. Negative values occur when an element is, on average, closer to another cluster than to its own, which suggests it was given the wrong label.

For an element i, \( a(i) \) is the average dissimilarity to the other elements of its own cluster, and \( b(i) \) is the minimum, over all other clusters, of the average dissimilarity to the elements of that cluster. The Silhouette Score \( s(i) \) of a sample is calculated by

\begin{eqnarray*} s(i) &=& \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \end{eqnarray*}

The overall Silhouette Score is the mean of the individual silhouette scores.
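
As a worked illustration with made-up values (not taken from any dataset), suppose an element i has \( a(i) = 1 \) and \( b(i) = 4 \), and a second element j has \( a(j) = 3 \) and \( b(j) = 2 \). Then

\begin{eqnarray*} s(i) &=& \frac{4 - 1}{\max\{1, 4\}} = 0.75 \\ s(j) &=& \frac{2 - 3}{\max\{3, 2\}} = -\frac{1}{3} \end{eqnarray*}

Element i sits well inside its cluster, while element j is closer on average to another cluster than to its own, hence the negative score.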

Member Function Documentation

◆ MeanDistanceFromCluster()

double mlpack::cv::SilhouetteScore::MeanDistanceFromCluster ( const arma::colvec &  distances,
const arma::Row< size_t > &  labels,
const size_t &  label,
const bool &  sameCluster = false 
)
static

Find the mean distance of an element from a given cluster.

Parameters
distances    colvec containing distances from other elements.
labels       Labels assigned to data by clustering.
label        Label of the target cluster.
sameCluster  true if calculating mean distance from same cluster.
Returns
(double) mean distance from the cluster.
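
The averaging described above can be sketched with the standard library alone. This is an illustrative analogue, not mlpack's implementation: the name `MeanDistanceFromClusterSketch` is hypothetical, `std::vector` stands in for the Armadillo types, and the handling of `sameCluster` (excluding the element itself from the count, and returning 0 for a singleton cluster) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: a standard-library analogue of MeanDistanceFromCluster().
// distances[j] is the dissimilarity between the element of interest and
// element j; labels[j] is the cluster assigned to element j.
// Assumption: when sameCluster is true, the element itself is among the
// entries carrying `label` (with distance 0) and is excluded from the count.
double MeanDistanceFromClusterSketch(const std::vector<double>& distances,
                                     const std::vector<std::size_t>& labels,
                                     const std::size_t label,
                                     const bool sameCluster = false)
{
  assert(distances.size() == labels.size());

  double sum = 0.0;
  std::size_t count = 0;
  for (std::size_t j = 0; j < labels.size(); ++j)
  {
    if (labels[j] == label)
    {
      sum += distances[j];
      ++count;
    }
  }

  // Do not count the element itself when averaging over its own cluster.
  if (sameCluster && count > 0)
    --count;

  // Assumption: an empty cluster or a singleton own-cluster yields 0.
  return (count == 0) ? 0.0 : sum / static_cast<double>(count);
}
```

Calling it once with `sameCluster = true` for the element's own label gives a(i); calling it for every other label and taking the minimum gives b(i).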

◆ Overall()

template<typename DataType , typename Metric >
double mlpack::cv::SilhouetteScore::Overall ( const DataType &  X,
const arma::Row< size_t > &  labels,
const Metric &  metric 
)
static

Find the overall silhouette score.

Parameters
X       Column-major data used for clustering.
labels  Labels assigned to data by clustering.
metric  Metric to be used to calculate dissimilarity.
Returns
(double) silhouette score.
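
Since the overall score is the mean of the per-element scores, the final reduction step can be sketched as follows. `OverallSketch` is a hypothetical name, and the per-element scores are assumed to have been computed already.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Sketch only: reduce per-element silhouette scores to a single overall
// score by taking their arithmetic mean.
double OverallSketch(const std::vector<double>& sampleScores)
{
  assert(!sampleScores.empty());
  const double sum =
      std::accumulate(sampleScores.begin(), sampleScores.end(), 0.0);
  return sum / static_cast<double>(sampleScores.size());
}
```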

◆ SamplesScore() [1/2]

template<typename DataType >
arma::rowvec mlpack::cv::SilhouetteScore::SamplesScore ( const DataType &  distances,
const arma::Row< size_t > &  labels 
)
static

Find the individual silhouette scores for precomputed dissimilarities.

Parameters
distances  Square matrix containing distances between data points.
labels     Labels assigned to data by clustering.
Returns
(arma::rowvec) element-wise silhouette score.
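
A minimal sketch of how per-element scores could be derived from a precomputed distance matrix, using only the standard library. It is not mlpack's code: `SamplesScoreSketch` and `Matrix` are hypothetical names, labels are assumed to be 0, 1, ..., k - 1, and scoring an element that is alone in its cluster as 0 is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Sketch only: per-element silhouette scores from a square matrix of
// pairwise dissimilarities (distances[i][j] between elements i and j).
std::vector<double> SamplesScoreSketch(const Matrix& distances,
                                       const std::vector<std::size_t>& labels)
{
  const std::size_t n = labels.size();
  assert(distances.size() == n);

  // Number of clusters, assuming labels are 0, 1, ..., k - 1.
  std::size_t numClusters = 0;
  for (const std::size_t l : labels)
    numClusters = std::max(numClusters, l + 1);

  const double inf = std::numeric_limits<double>::infinity();
  std::vector<double> scores(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
  {
    assert(distances[i].size() == n);

    // Per-cluster sums and counts of distances from element i to the others.
    std::vector<double> sum(numClusters, 0.0);
    std::vector<std::size_t> count(numClusters, 0);
    for (std::size_t j = 0; j < n; ++j)
    {
      if (j == i)
        continue;
      sum[labels[j]] += distances[i][j];
      ++count[labels[j]];
    }

    // Assumption: an element alone in its cluster keeps a score of 0.
    const std::size_t own = labels[i];
    if (count[own] == 0)
      continue;
    const double a = sum[own] / static_cast<double>(count[own]);

    // b(i): smallest mean distance to any other non-empty cluster.
    double b = inf;
    for (std::size_t c = 0; c < numClusters; ++c)
      if (c != own && count[c] > 0)
        b = std::min(b, sum[c] / static_cast<double>(count[c]));
    if (b == inf)
      continue;  // Only one cluster: b(i) is undefined, keep 0.

    const double denom = std::max(a, b);
    scores[i] = (denom > 0.0) ? (b - a) / denom : 0.0;
  }
  return scores;
}
```

With two tight pairs whose within-pair distance is 1 and cross-pair distance is 4, every element has a(i) = 1 and b(i) = 4, so each score is 0.75.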

◆ SamplesScore() [2/2]

template<typename DataType , typename Metric >
arma::rowvec mlpack::cv::SilhouetteScore::SamplesScore ( const DataType &  X,
const arma::Row< size_t > &  labels,
const Metric &  metric 
)
static

Find the silhouette score of each individual element.

The distances are not precomputed; they are calculated from X using the given metric.

Parameters
X       Column-major data used for clustering.
labels  Labels assigned to data by clustering.
metric  Metric to be used to calculate dissimilarity.
Returns
(arma::rowvec) element-wise silhouette score.
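
When distances are not precomputed, a pairwise distance matrix has to be built from the data and the metric first. The sketch below shows one way to do that with the standard library. All names are hypothetical: `Point` holds one data point (one column of X), `EuclideanSketch` is a stand-in metric, and the `Evaluate(a, b)` interface is an assumption of this sketch rather than a statement about what this overload requires.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in metric: anything exposing Evaluate(a, b) would do.
struct EuclideanSketch
{
  double Evaluate(const Point& a, const Point& b) const
  {
    assert(a.size() == b.size());
    double squared = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d)
      squared += (a[d] - b[d]) * (a[d] - b[d]);
    return std::sqrt(squared);
  }
};

// Sketch only: build the symmetric matrix of pairwise dissimilarities.
// points[i] corresponds to column i of a column-major data matrix.
template<typename Metric>
Matrix PairwiseDistancesSketch(const std::vector<Point>& points,
                               const Metric& metric)
{
  const std::size_t n = points.size();
  Matrix distances(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i)
  {
    for (std::size_t j = i + 1; j < n; ++j)
    {
      // Evaluate each pair once and mirror it across the diagonal.
      const double d = metric.Evaluate(points[i], points[j]);
      distances[i][j] = d;
      distances[j][i] = d;
    }
  }
  return distances;
}
```

The resulting matrix has the shape expected by the precomputed-distances overload, so the two steps can be chained. Because the metric is a template parameter, a different dissimilarity can be swapped in without touching the loop.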

Member Data Documentation

◆ NeedsMinimization

const bool mlpack::cv::SilhouetteScore::NeedsMinimization = false
static

Information for hyper-parameter tuning code.

It indicates that we want to maximize the metric.
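
To illustrate how tuning code might consume such a flag, the sketch below picks the best of several candidate scores, maximizing or minimizing depending on `NeedsMinimization`. The metric structs and `BestIndexSketch` are hypothetical and are not part of mlpack's hyper-parameter tuner.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical metric types; only the NeedsMinimization flag matters here.
struct MaximizedMetricSketch { static const bool NeedsMinimization = false; };
struct MinimizedMetricSketch { static const bool NeedsMinimization = true; };

// Sketch only: return the index of the best score, where "best" means
// smallest if the metric needs minimization and largest otherwise.
template<typename MetricType>
std::size_t BestIndexSketch(const std::vector<double>& scores)
{
  assert(!scores.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < scores.size(); ++i)
  {
    const bool better = MetricType::NeedsMinimization
        ? scores[i] < scores[best]
        : scores[i] > scores[best];
    if (better)
      best = i;
  }
  return best;
}
```

For a silhouette-like metric the flag is false, so the candidate with the highest score wins.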

