mlpack
A Gaussian Mixture Model (GMM).
#include <gmm.hpp>
Public Member Functions

GMM ()
Create an empty Gaussian Mixture Model, with zero Gaussians.

GMM (const size_t gaussians, const size_t dimensionality)
Create a GMM with the given number of Gaussians, each of which has the specified dimensionality.

GMM (const std::vector< distribution::GaussianDistribution > &dists, const arma::vec &weights)
Create a GMM with the given dists and weights.

GMM (const GMM &other)
Copy constructor for GMMs.

GMM & operator= (const GMM &other)
Copy assignment operator for GMMs.

size_t Gaussians () const
Return the number of Gaussians in the model.

size_t Dimensionality () const
Return the dimensionality of the model.

const distribution::GaussianDistribution & Component (size_t i) const
Return a const reference to a component distribution.

distribution::GaussianDistribution & Component (size_t i)
Return a reference to a component distribution.

const arma::vec & Weights () const
Return a const reference to the a priori weights of each Gaussian.

arma::vec & Weights ()
Return a reference to the a priori weights of each Gaussian.

double Probability (const arma::vec &observation) const
Return the probability that the given observation came from this distribution.

void Probability (const arma::mat &observation, arma::vec &probs) const
Return the probability of the given observation matrix.

double LogProbability (const arma::vec &observation) const
Return the log probability that the given observation came from this distribution.

void LogProbability (const arma::mat &observation, arma::vec &logProbs) const
Return the log probability of the given observation matrix.

double Probability (const arma::vec &observation, const size_t component) const
Return the probability that the given observation came from the given Gaussian component in this distribution.

double LogProbability (const arma::vec &observation, const size_t component) const
Return the log probability that the given observation came from the given Gaussian component in this distribution.

arma::vec Random () const
Return a randomly generated observation according to the probability distribution defined by this object.

template<typename FittingType = EMFit<>>
double Train (const arma::mat &observations, const size_t trials=1, const bool useExistingModel=false, FittingType fitter=FittingType())
Estimate the probability distribution directly from the given observations, using the given algorithm in the FittingType class to fit the data.

template<typename FittingType = EMFit<>>
double Train (const arma::mat &observations, const arma::vec &probabilities, const size_t trials=1, const bool useExistingModel=false, FittingType fitter=FittingType())
Estimate the probability distribution directly from the given observations, taking into account the probability of each observation actually being from this distribution, and using the given algorithm in the FittingType class to fit the data.

void Classify (const arma::mat &observations, arma::Row< size_t > &labels) const
Classify the given observations as being from an individual component in this GMM.

template<typename Archive >
void serialize (Archive &ar, const uint32_t)
Serialize the GMM.
A Gaussian Mixture Model (GMM).
This class uses maximum likelihood loss functions to estimate the parameters of the GMM on a given dataset via the given fitting mechanism, defined by the FittingType template parameter. The GMM can be trained using normal data, or data with probabilities of being from this GMM (see GMM::Train() for more information).
The Train() method takes a template parameter, FittingType. The FittingType class must provide a way for the GMM to train on data, in the form of two functions: one that fits the model to the observations alone, and one that also takes the probability of each observation.
These functions should produce a trained GMM from the given observations and probabilities. They may change the size of the model (by growing the mean and covariance vectors as well as the weight vector), but they should expect those vectors to already be sized as specified in the GMM constructor.
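The shape of such a fitting class can be sketched without mlpack. Everything in the block below is a stand-in: `SketchGaussian`, `ChunkMeanFit`, and `TrainSketch` are hypothetical names, the data is one-dimensional, and the method name `Estimate` with its parameter order is an assumption modeled on the description above rather than a confirmed signature (check gmm.hpp and the EMFit class for the real one). The toy fitter is not EM; it only shows two overloads filling in already-sized component and weight vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for distribution::GaussianDistribution (1-D only, illustration).
struct SketchGaussian
{
  double mean = 0.0;
  double variance = 1.0;
};

// Hypothetical fitter: each component takes the weighted mean of one
// contiguous chunk of the data. It illustrates the interface, not EM.
class ChunkMeanFit
{
 public:
  // Overload 1: observations only; every observation counts fully.
  void Estimate(const std::vector<double>& observations,
                std::vector<SketchGaussian>& dists,
                std::vector<double>& weights,
                const bool /* useInitialModel */ = false)
  {
    const std::vector<double> ones(observations.size(), 1.0);
    Estimate(observations, ones, dists, weights);
  }

  // Overload 2: observations plus a probability for each observation.
  void Estimate(const std::vector<double>& observations,
                const std::vector<double>& probabilities,
                std::vector<SketchGaussian>& dists,
                std::vector<double>& weights,
                const bool /* useInitialModel */ = false)
  {
    assert(!dists.empty());
    assert(observations.size() == probabilities.size());
    const std::size_t k = dists.size();
    const std::size_t chunk = observations.size() / k;
    weights.assign(k, 1.0 / static_cast<double>(k));
    for (std::size_t c = 0; c < k; ++c)
    {
      // The last chunk absorbs any leftover observations.
      const std::size_t end =
          (c + 1 == k) ? observations.size() : (c + 1) * chunk;
      double sum = 0.0, mass = 0.0;
      for (std::size_t i = c * chunk; i < end; ++i)
      {
        sum += probabilities[i] * observations[i];
        mass += probabilities[i];
      }
      if (mass > 0.0)
        dists[c].mean = sum / mass;  // Otherwise keep the existing mean.
    }
  }
};

// Stand-in for GMM::Train(): the model forwards to whichever fitter it gets.
template<typename FittingType = ChunkMeanFit>
void TrainSketch(const std::vector<double>& observations,
                 std::vector<SketchGaussian>& dists,
                 std::vector<double>& weights,
                 FittingType fitter = FittingType())
{
  fitter.Estimate(observations, dists, weights);
}
```

Because the fitter is a template parameter, swapping in a different algorithm only requires a class with the same two overloads; the training entry point itself does not change.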
For a sample implementation, see the EMFit class; this class uses the EM algorithm to train a GMM, and is the default fitting type for the Train() method.
The GMM, once trained, can be used to generate random points from the distribution and estimate the probability of points being from the distribution. The parameters of the GMM can be obtained through the accessors and mutators.
Example use:
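mlpack::gmm::GMM::GMM (const size_t gaussians, const size_t dimensionality)
Create a GMM with the given number of Gaussians, each of which has the specified dimensionality.
gaussians | Number of Gaussians in the model. |
dimensionality | Dimensionality of each Gaussian. |
mlpack::gmm::GMM::GMM (const std::vector< distribution::GaussianDistribution > &dists, const arma::vec &weights)
inline
Create a GMM with the given dists and weights.
dists | Distributions of the model. |
weights | Weights of the model. |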
void mlpack::gmm::GMM::Classify (const arma::mat &observations, arma::Row< size_t > &labels) const
Classify the given observations as being from an individual component in this GMM.
The resultant classifications are stored in the 'labels' object, and each label will be between 0 and (Gaussians() - 1). Supposing that a point was classified with label 2, and that our GMM object was called 'gmm', one could access the relevant Gaussian distribution with gmm.Component(2).
observations | Matrix of observations to classify. |
labels | Object which will be filled with the label of each observation. |
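One common way to assign a point to a single mixture component is to pick the component with the largest weighted density. The sketch below does this for one-dimensional data using only the standard library. It is an illustration of the idea, not mlpack's implementation: `ComponentDensity` and `ClassifySketch` are hypothetical names, and whether Classify() weights the component densities in exactly this way is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// 1-D Gaussian density N(x | mean, variance).
double ComponentDensity(const double x, const double mean,
                        const double variance)
{
  assert(variance > 0.0);
  const double pi = 3.14159265358979323846;
  const double diff = x - mean;
  return std::exp(-0.5 * diff * diff / variance) /
      std::sqrt(2.0 * pi * variance);
}

// Label each observation with the index k that maximizes
// weights[k] * N(x | means[k], variances[k]). Ties go to the lowest index.
std::vector<std::size_t> ClassifySketch(
    const std::vector<double>& observations,
    const std::vector<double>& weights,
    const std::vector<double>& means,
    const std::vector<double>& variances)
{
  assert(!weights.empty());
  assert(weights.size() == means.size());
  assert(means.size() == variances.size());

  std::vector<std::size_t> labels(observations.size(), 0);
  for (std::size_t i = 0; i < observations.size(); ++i)
  {
    double best = -1.0;  // Densities are non-negative, so any score wins.
    for (std::size_t k = 0; k < weights.size(); ++k)
    {
      const double score = weights[k] *
          ComponentDensity(observations[i], means[k], variances[k]);
      if (score > best)
      {
        best = score;
        labels[i] = k;
      }
    }
  }
  return labels;
}
```

As in the description above, every label produced this way lies between 0 and the number of components minus one.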
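const distribution::GaussianDistribution & mlpack::gmm::GMM::Component (size_t i) const
inline
Return a const reference to a component distribution.
i | Index of component. |
distribution::GaussianDistribution & mlpack::gmm::GMM::Component (size_t i)
inline
Return a reference to a component distribution.
i | Index of component. |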
double mlpack::gmm::GMM::LogProbability (const arma::vec &observation) const
Return the log probability that the given observation came from this distribution.
observation | Observation vector to evaluate the log probability of. |
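Working in log space matters because a mixture density can underflow to zero far from every component, at which point taking its logarithm gives negative infinity. A minimal standalone sketch of the usual remedy, the log-sum-exp trick, is shown below for one-dimensional components. The names `LogGaussianPdf` and `MixtureLogProbability` are hypothetical, and this is not necessarily how LogProbability() is computed internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Log of the 1-D Gaussian density, computed without exponentiating.
double LogGaussianPdf(const double x, const double mean,
                      const double variance)
{
  assert(variance > 0.0);
  const double pi = 3.14159265358979323846;
  const double diff = x - mean;
  return -0.5 * (std::log(2.0 * pi * variance) + diff * diff / variance);
}

// log p(x) = log sum_k exp(log w_k + log N(x | mean_k, variance_k)),
// evaluated by factoring out the largest term before exponentiating.
double MixtureLogProbability(const double x,
                             const std::vector<double>& weights,
                             const std::vector<double>& means,
                             const std::vector<double>& variances)
{
  assert(!weights.empty());
  assert(weights.size() == means.size());
  assert(means.size() == variances.size());

  std::vector<double> terms(weights.size());
  for (std::size_t k = 0; k < weights.size(); ++k)
    terms[k] = std::log(weights[k]) + LogGaussianPdf(x, means[k], variances[k]);

  const double maxTerm = *std::max_element(terms.begin(), terms.end());
  if (std::isinf(maxTerm))
    return maxTerm;  // Every weight is zero: the log probability is -inf.

  double sum = 0.0;
  for (const double t : terms)
    sum += std::exp(t - maxTerm);  // Each exponent is <= 0, so no overflow.
  return maxTerm + std::log(sum);
}
```

With two unit-variance components at 0 and 1, the density at x = 100 underflows in double precision, yet the function above still returns a finite value of roughly -4902.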
void mlpack::gmm::GMM::LogProbability (const arma::mat &observation, arma::vec &logProbs) const
Return the log probability of each observation in the given observation matrix.
observation | Observation matrix. |
logProbs | Vector in which the log probability of each observation is stored. |
double mlpack::gmm::GMM::LogProbability (const arma::vec &observation, const size_t component) const
Return the log probability that the given observation came from the given Gaussian component in this distribution.
observation | Observation vector to evaluate the log probability of. |
component | Index of the component of the GMM to be considered. |
double mlpack::gmm::GMM::Probability (const arma::vec &observation) const
Return the probability that the given observation came from this distribution.
observation | Observation vector to evaluate the probability of. |
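For a Gaussian mixture, this quantity is the weighted sum of the component densities, p(x) = sum_k w_k N(x | mean_k, variance_k). The standalone sketch below evaluates that sum for one-dimensional components. `GaussianPdf` and `MixtureProbability` are hypothetical helper names, and the code is meant as an illustration of the formula, not as mlpack's implementation. Note that the result is a density value, so it can exceed 1 when the variances are small.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// 1-D Gaussian density N(x | mean, variance).
double GaussianPdf(const double x, const double mean, const double variance)
{
  assert(variance > 0.0);
  const double pi = 3.14159265358979323846;
  const double diff = x - mean;
  return std::exp(-0.5 * diff * diff / variance) /
      std::sqrt(2.0 * pi * variance);
}

// Mixture density: sum over k of weights[k] * N(x | means[k], variances[k]).
double MixtureProbability(const double x,
                          const std::vector<double>& weights,
                          const std::vector<double>& means,
                          const std::vector<double>& variances)
{
  assert(weights.size() == means.size());
  assert(means.size() == variances.size());
  double sum = 0.0;
  for (std::size_t k = 0; k < weights.size(); ++k)
    sum += weights[k] * GaussianPdf(x, means[k], variances[k]);
  return sum;
}
```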
void mlpack::gmm::GMM::Probability (const arma::mat &observation, arma::vec &probs) const
Return the probability of each observation in the given observation matrix.
observation | Observation matrix. |
probs | Vector in which the probability of each observation is stored. |
double mlpack::gmm::GMM::Probability (const arma::vec &observation, const size_t component) const
Return the probability that the given observation came from the given Gaussian component in this distribution.
observation | Observation vector to evaluate the probability of. |
component | Index of the component of the GMM to be considered. |
arma::vec mlpack::gmm::GMM::Random () const
Return a randomly generated observation according to the probability distribution defined by this object.
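Drawing from a mixture is usually done in two steps: choose a component with probability equal to its weight, then draw from that component's Gaussian. The sketch below shows this for one-dimensional components using the standard `<random>` facilities. `SampleMixture` is a hypothetical name, and passing the random engine explicitly is a choice made for this example; Random() takes no arguments and manages its own randomness.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Draw one value from a 1-D Gaussian mixture.
double SampleMixture(std::mt19937& rng,
                     const std::vector<double>& weights,
                     const std::vector<double>& means,
                     const std::vector<double>& variances)
{
  assert(!weights.empty());
  assert(weights.size() == means.size());
  assert(means.size() == variances.size());

  // Step 1: pick component k with probability proportional to weights[k].
  std::discrete_distribution<std::size_t> pick(weights.begin(), weights.end());
  const std::size_t k = pick(rng);

  // Step 2: draw from the chosen component. std::normal_distribution takes
  // a standard deviation, hence the square root of the variance.
  std::normal_distribution<double> gaussian(means[k], std::sqrt(variances[k]));
  return gaussian(rng);
}
```

Seeding the engine (for example `std::mt19937 rng(42);`) makes a run repeatable on a given standard library, which helps when testing code that consumes the samples.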
template<typename Archive >
void mlpack::gmm::GMM::serialize (Archive &ar, const uint32_t)
Serialize the GMM.
template<typename FittingType = EMFit<>>
double mlpack::gmm::GMM::Train (const arma::mat &observations, const size_t trials=1, const bool useExistingModel=false, FittingType fitter=FittingType())
Estimate the probability distribution directly from the given observations, using the given algorithm in the FittingType class to fit the data.
The fitting will be performed 'trials' times; from these trials, the model with the greatest log-likelihood will be selected. By default, only one trial is performed. The log-likelihood of the best fitting is returned.
Optionally, the existing model can be used as an initial model for the estimation by setting 'useExistingModel' to true. If the fitting procedure is deterministic after the initial position is given, then 'trials' should be set to 1.
FittingType | The type of fitting method which should be used (EMFit<> is suggested). |
observations | Observations of the model. |
trials | Number of trials to perform; the model in these trials with the greatest log-likelihood will be selected. |
useExistingModel | If true, the existing model is used as an initial model for the estimation. |
fitter | The fitter to use, optional. |
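The selection rule for 'trials' (run the fit several times and keep the run with the greatest log-likelihood) can be sketched independently of any fitting algorithm. In the block below, `TrialResult` and `BestOfTrials` are hypothetical names and `fitOnce` stands for one complete fit that returns its log-likelihood. A real implementation would also need to keep the fitted parameters of the best trial, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Outcome of the selection: the winning log-likelihood and its trial index.
struct TrialResult
{
  double bestLogLikelihood;
  std::size_t bestTrial;
};

// Run fitOnce(t) for t = 0 .. trials - 1 and keep the greatest result.
// FitOnce is any callable taking the trial index and returning a double.
template<typename FitOnce>
TrialResult BestOfTrials(const std::size_t trials, FitOnce fitOnce)
{
  assert(trials >= 1);
  TrialResult result{-std::numeric_limits<double>::infinity(), 0};
  for (std::size_t t = 0; t < trials; ++t)
  {
    const double logLikelihood = fitOnce(t);
    if (logLikelihood > result.bestLogLikelihood)
    {
      result.bestLogLikelihood = logLikelihood;
      result.bestTrial = t;
    }
  }
  return result;
}
```

This also shows why a deterministic fitter gains nothing from extra trials: if every call to `fitOnce` starts from the same position and returns the same value, the first trial always wins.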
template<typename FittingType = EMFit<>>
double mlpack::gmm::GMM::Train (const arma::mat &observations, const arma::vec &probabilities, const size_t trials=1, const bool useExistingModel=false, FittingType fitter=FittingType())
Estimate the probability distribution directly from the given observations, taking into account the probability of each observation actually being from this distribution, and using the given algorithm in the FittingType class to fit the data.
The fitting will be performed 'trials' times; from these trials, the model with the greatest log-likelihood will be selected. By default, only one trial is performed. The log-likelihood of the best fitting is returned.
Optionally, the existing model can be used as an initial model for the estimation by setting 'useExistingModel' to true. If the fitting procedure is deterministic after the initial position is given, then 'trials' should be set to 1.
observations | Observations of the model. |
probabilities | Probability of each observation being from this distribution. |
trials | Number of trials to perform; the model in these trials with the greatest log-likelihood will be selected. |
useExistingModel | If true, the existing model is used as an initial model for the estimation. |
fitter | The fitter to use, optional. |
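One way to read the 'probabilities' argument is as a weight on how much each observation contributes to the fit. For the simplest case, a single one-dimensional Gaussian, the weighted maximum likelihood estimates are mean = sum(p_i x_i) / sum(p_i) and variance = sum(p_i (x_i - mean)^2) / sum(p_i). The sketch below computes them. `WeightedEstimate` and `WeightedGaussianFit` are hypothetical names, and how a particular FittingType such as EMFit folds the probabilities into its own updates is not specified here, so treat this as an illustration of the idea only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Weighted maximum likelihood estimate for a single 1-D Gaussian.
struct WeightedEstimate
{
  double mean;
  double variance;
};

WeightedEstimate WeightedGaussianFit(const std::vector<double>& observations,
                                     const std::vector<double>& probabilities)
{
  assert(observations.size() == probabilities.size());

  // Total probability mass and probability-weighted sum of the data.
  double mass = 0.0, sum = 0.0;
  for (std::size_t i = 0; i < observations.size(); ++i)
  {
    mass += probabilities[i];
    sum += probabilities[i] * observations[i];
  }
  assert(mass > 0.0);  // At least one observation must carry weight.
  const double mean = sum / mass;

  // Probability-weighted squared deviations around the weighted mean.
  double squares = 0.0;
  for (std::size_t i = 0; i < observations.size(); ++i)
  {
    const double diff = observations[i] - mean;
    squares += probabilities[i] * diff * diff;
  }
  return WeightedEstimate{mean, squares / mass};
}
```

An observation with probability 0 has no influence on the result, and setting every probability to 1 recovers the ordinary (unweighted) maximum likelihood estimates.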