The Silhouette Score is a clustering performance metric that measures the quality of the resulting clusters.
template<typename DataType, typename Metric>
static double Overall(const DataType &X, const arma::Row<size_t> &labels, const Metric &metric)
    Find the overall silhouette score.

template<typename DataType>
static arma::rowvec SamplesScore(const DataType &distances, const arma::Row<size_t> &labels)
    Find the individual silhouette scores for precomputed dissimilarities.

template<typename DataType, typename Metric>
static arma::rowvec SamplesScore(const DataType &X, const arma::Row<size_t> &labels, const Metric &metric)
    Find the silhouette score of each individual element.

static double MeanDistanceFromCluster(const arma::colvec &distances, const arma::Row<size_t> &labels, const size_t &label, const bool &sameCluster=false)
    Find the mean distance of an element from a given cluster.

The Silhouette Score provides an indication of goodness of fit, and therefore a measure of how well unseen samples are likely to be predicted by the model, by considering the inter-cluster and intra-cluster dissimilarities. It depends on the metric used to calculate the dissimilarities. Scores lie between -1 and 1. The best possible score is \( s(i) = 1.0 \), and smaller values indicate poorer clustering. Values near zero indicate overlapping clusters, and negative values occur when an element has likely been assigned to the wrong cluster.
For an element \( i \), \( a(i) \) is the average dissimilarity to the other elements of its own cluster, and \( b(i) \) is the minimum, over all other clusters, of the average dissimilarity to the elements of that cluster. The Silhouette Score \( s(i) \) of a sample is calculated by
\begin{eqnarray*} s(i) &=& \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \end{eqnarray*}
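As a quick check of the formula, take two illustrative cases (the numbers are made up for the example, not taken from any dataset). An element that sits much closer to its own cluster than to the nearest other cluster, say \( a(i) = 0.5 \) and \( b(i) = 2.0 \), scores close to 1, while an element that is closer to another cluster, say \( a(i) = 2.0 \) and \( b(i) = 1.0 \), scores below zero:
\begin{eqnarray*} s(i) &=& \frac{2.0 - 0.5}{\max\{0.5, 2.0\}} = 0.75 \\ s(i) &=& \frac{1.0 - 2.0}{\max\{2.0, 1.0\}} = -0.5 \end{eqnarray*}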
The overall Silhouette Score is the mean of the individual silhouette scores.
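The definitions of \( a(i) \), \( b(i) \), \( s(i) \) and the overall score can be sketched in plain C++ as follows. This is a minimal illustration using only the standard library, not this class's implementation: the function names SilhouetteSamples and SilhouetteOverall are hypothetical, the input is assumed to be a symmetric precomputed dissimilarity matrix, labels are assumed to be contiguous integers starting at 0, and giving a score of 0 to an element that is alone in its cluster is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not part of the library): silhouette score of every
// sample, given a symmetric pairwise dissimilarity matrix `dist` and one
// cluster label per sample.
std::vector<double> SilhouetteSamples(
    const std::vector<std::vector<double>>& dist,
    const std::vector<std::size_t>& labels)
{
  const std::size_t n = labels.size();
  assert(dist.size() == n);

  // Count the members of each cluster.
  std::size_t numClusters = 0;
  for (std::size_t l : labels)
    numClusters = std::max(numClusters, l + 1);
  std::vector<std::size_t> clusterSize(numClusters, 0);
  for (std::size_t l : labels)
    ++clusterSize[l];

  std::vector<double> scores(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
  {
    // Sum of dissimilarities from sample i to the members of each cluster.
    std::vector<double> sum(numClusters, 0.0);
    for (std::size_t j = 0; j < n; ++j)
      if (j != i)
        sum[labels[j]] += dist[i][j];

    const std::size_t own = labels[i];
    // Assumption of this sketch: a sample alone in its cluster scores 0.
    if (clusterSize[own] < 2)
      continue;

    // a(i): mean dissimilarity to the other members of i's own cluster.
    const double a = sum[own] / static_cast<double>(clusterSize[own] - 1);

    // b(i): smallest mean dissimilarity to any other non-empty cluster.
    double b = std::numeric_limits<double>::infinity();
    for (std::size_t c = 0; c < numClusters; ++c)
      if (c != own && clusterSize[c] > 0)
        b = std::min(b, sum[c] / static_cast<double>(clusterSize[c]));

    // With a single cluster there is no b(i); the score stays 0.
    if (b == std::numeric_limits<double>::infinity())
      continue;

    // s(i) = (b(i) - a(i)) / max{a(i), b(i)}, guarding against 0 / 0.
    const double denom = std::max(a, b);
    scores[i] = denom > 0.0 ? (b - a) / denom : 0.0;
  }
  return scores;
}

// Hypothetical helper: overall score as the mean of the individual scores.
double SilhouetteOverall(
    const std::vector<std::vector<double>>& dist,
    const std::vector<std::size_t>& labels)
{
  const std::vector<double> scores = SilhouetteSamples(dist, labels);
  if (scores.empty())
    return 0.0;
  double total = 0.0;
  for (double s : scores)
    total += s;
  return total / static_cast<double>(scores.size());
}
```

For four points on a line at 0, 1, 10 and 11 with labels {0, 0, 1, 1} and absolute difference as the dissimilarity, the two outer points have \( a(i) = 1 \) and \( b(i) = 10.5 \), giving \( s(i) = 9.5 / 10.5 \approx 0.905 \). The two inner points have \( a(i) = 1 \) and \( b(i) = 9.5 \), giving \( s(i) = 8.5 / 9.5 \approx 0.895 \). The overall score is therefore about 0.900, which reflects two well-separated clusters.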