mlpack
mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType > Class Template Reference

The simple Naive Bayes classifier.

#include <naive_bayes_classifier.hpp>

Public Types

typedef ModelMatType::elem_type ElemType
 

Public Member Functions

template<typename MatType >
 NaiveBayesClassifier (const MatType &data, const arma::Row< size_t > &labels, const size_t numClasses, const bool incrementalVariance=false, const double epsilon=1e-10)
 Initializes the classifier as per the input and then trains it by calculating the sample mean and variances.
 
 NaiveBayesClassifier (const size_t dimensionality=0, const size_t numClasses=0, const double epsilon=1e-10)
 Initialize the Naive Bayes classifier without performing training.
 
template<typename MatType >
void Train (const MatType &data, const arma::Row< size_t > &labels, const size_t numClasses, const bool incremental=true)
 Train the Naive Bayes classifier on the given dataset.
 
template<typename VecType >
void Train (const VecType &point, const size_t label)
 Train the Naive Bayes classifier on the given point.
 
template<typename VecType >
size_t Classify (const VecType &point) const
 Classify the given point, using the trained NaiveBayesClassifier model.
 
template<typename VecType , typename ProbabilitiesVecType >
void Classify (const VecType &point, size_t &prediction, ProbabilitiesVecType &probabilities) const
 Classify the given point using the trained NaiveBayesClassifier model and also return estimates of the probability for each class in the given vector.
 
template<typename MatType >
void Classify (const MatType &data, arma::Row< size_t > &predictions) const
 Classify the given points using the trained NaiveBayesClassifier model.
 
template<typename MatType , typename ProbabilitiesMatType >
void Classify (const MatType &data, arma::Row< size_t > &predictions, ProbabilitiesMatType &probabilities) const
 Classify the given points using the trained NaiveBayesClassifier model and also return estimates of the probabilities for each class in the given matrix.
 
const ModelMatType & Means () const
 Get the sample means for each class.
 
ModelMatType & Means ()
 Modify the sample means for each class.
 
const ModelMatType & Variances () const
 Get the sample variances for each class.
 
ModelMatType & Variances ()
 Modify the sample variances for each class.
 
const ModelMatType & Probabilities () const
 Get the prior probabilities for each class.
 
ModelMatType & Probabilities ()
 Modify the prior probabilities for each class.
 
template<typename Archive >
void serialize (Archive &ar, const uint32_t)
 Serialize the classifier.
 

Detailed Description

template<typename ModelMatType = arma::mat>
class mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >

The simple Naive Bayes classifier.

This class trains on the data by calculating the sample mean and variance of each feature for each class label, along with the class probabilities. The class labels are assumed to be non-negative integers (starting at 0), and are passed to the constructor as a separate labels vector with one entry per data point.

Mathematically, it computes P(X_i = x_i | Y = y_j) for each feature X_i for each of the labels y_j. Along with this, it also computes the class probabilities P(Y = y_j).
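The quantities described above (per-class means, per-class variances, and class probabilities) can be sketched without mlpack or Armadillo. The following is a minimal, standalone illustration and not mlpack's implementation: the names `GaussianNBModel` and `FitGaussianNB` are hypothetical, points are stored as plain vectors rather than matrix columns, and the unbiased (n - 1) variance estimator is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the three quantities the text describes.
struct GaussianNBModel
{
  std::vector<std::vector<double>> means;      // means[c][d]
  std::vector<std::vector<double>> variances;  // variances[c][d]
  std::vector<double> priors;                  // priors[c]
};

// Two-pass estimate of per-class means, variances, and priors.
// points[i] is one data point; labels[i] is its class in [0, numClasses).
GaussianNBModel FitGaussianNB(const std::vector<std::vector<double>>& points,
                              const std::vector<std::size_t>& labels,
                              const std::size_t numClasses)
{
  assert(!points.empty() && points.size() == labels.size());
  const std::size_t dim = points[0].size();

  GaussianNBModel model;
  model.means.assign(numClasses, std::vector<double>(dim, 0.0));
  model.variances.assign(numClasses, std::vector<double>(dim, 0.0));
  model.priors.assign(numClasses, 0.0);
  std::vector<std::size_t> counts(numClasses, 0);

  // First pass: per-class sums, divided by the class counts afterwards.
  for (std::size_t i = 0; i < points.size(); ++i)
  {
    assert(labels[i] < numClasses && points[i].size() == dim);
    ++counts[labels[i]];
    for (std::size_t d = 0; d < dim; ++d)
      model.means[labels[i]][d] += points[i][d];
  }
  for (std::size_t c = 0; c < numClasses; ++c)
  {
    if (counts[c] == 0)
      continue;
    for (std::size_t d = 0; d < dim; ++d)
      model.means[c][d] /= counts[c];
  }

  // Second pass: squared deviations from the class mean.
  for (std::size_t i = 0; i < points.size(); ++i)
  {
    for (std::size_t d = 0; d < dim; ++d)
    {
      const double diff = points[i][d] - model.means[labels[i]][d];
      model.variances[labels[i]][d] += diff * diff;
    }
  }

  for (std::size_t c = 0; c < numClasses; ++c)
  {
    if (counts[c] > 1)
    {
      for (std::size_t d = 0; d < dim; ++d)
        model.variances[c][d] /= (counts[c] - 1);
    }
    // Class probability P(Y = c) as the fraction of points with label c.
    model.priors[c] = static_cast<double>(counts[c]) / points.size();
  }
  return model;
}
```

For example, calling `FitGaussianNB(points, labels, 2)` on a small two-class dataset yields one mean vector, one variance vector, and one prior per class; a class with fewer than two points keeps a variance of zero in this sketch.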

To classify a data point (x_1, x_2, ..., x_n), it computes: arg max_y (P(Y = y) * P(X_1 = x_1 | Y = y) * ... * P(X_n = x_n | Y = y))
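A product of many small probabilities can underflow, so the arg max is commonly evaluated on the sum of logarithms instead, which selects the same class. The sketch below shows that form. It is standalone and illustrative only: `LogGaussian` and `ClassifyPoint` are hypothetical names, and modeling each P(X_i = x_i | Y = y) as a univariate Gaussian with the class's sample mean and variance is an assumption suggested by the quantities the class stores.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Log of the univariate Gaussian density N(x; mean, variance).
double LogGaussian(const double x, const double mean, const double variance)
{
  const double pi = 3.14159265358979323846;
  const double diff = x - mean;
  return -0.5 * (std::log(2.0 * pi * variance) + diff * diff / variance);
}

// Returns arg max_y [ log P(Y = y) + sum_i log P(X_i = x_i | Y = y) ].
// means[c][d], variances[c][d], and priors[c] describe class c.
std::size_t ClassifyPoint(const std::vector<double>& point,
                          const std::vector<std::vector<double>>& means,
                          const std::vector<std::vector<double>>& variances,
                          const std::vector<double>& priors)
{
  assert(!priors.empty());
  assert(means.size() == priors.size() && variances.size() == priors.size());

  std::size_t best = 0;
  double bestScore = 0.0;
  for (std::size_t c = 0; c < priors.size(); ++c)
  {
    assert(means[c].size() == point.size());
    assert(variances[c].size() == point.size());

    // Start from the log prior, then add one log-likelihood per feature.
    double score = std::log(priors[c]);
    for (std::size_t d = 0; d < point.size(); ++d)
      score += LogGaussian(point[d], means[c][d], variances[c][d]);

    if (c == 0 || score > bestScore)
    {
      best = c;
      bestScore = score;
    }
  }
  return best;
}
```

Because the logarithm is monotonic, the class that maximizes the summed log-scores is the same class that maximizes the original product.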

Example use:

extern arma::mat training_data, testing_data;
extern arma::Row<size_t> labels; // one label per column of training_data
NaiveBayesClassifier<> nbc(training_data, labels, 5);
arma::Row<size_t> results;
nbc.Classify(testing_data, results);

The ModelMatType template parameter specifies the internal matrix type that NaiveBayesClassifier uses to hold the means, variances, and class probabilities that make up the Naive Bayes model. This can be arma::mat, arma::fmat, or any other Armadillo (or Armadillo-compatible) type. Because ModelMatType may differ from the type of the data the model is trained on, training is possible with subviews, sparse matrices, or any other compatible type, while the model is still stored internally as a ModelMatType.

Template Parameters
ModelMatType: Internal matrix type to use to store the model.

Constructor & Destructor Documentation

◆ NaiveBayesClassifier() [1/2]

template<typename ModelMatType >
template<typename MatType >
mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::NaiveBayesClassifier ( const MatType &  data,
const arma::Row< size_t > &  labels,
const size_t  numClasses,
const bool  incrementalVariance = false,
const double  epsilon = 1e-10 
)

Initializes the classifier as per the input and then trains it by calculating the sample mean and variances.

Example use:

extern arma::mat training_data, testing_data;
extern arma::Row<size_t> labels;
NaiveBayesClassifier<> nbc(training_data, labels, 5);
Parameters
data: Training data points.
labels: Labels corresponding to training data points.
numClasses: Number of classes in this classifier.
incrementalVariance: If true, an incremental algorithm is used to calculate the variance; this can prevent loss of precision in some cases, but will be somewhat slower to calculate.
epsilon: Small value to prevent log of zero.
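
An incremental variance calculation of the kind the incrementalVariance parameter refers to can be illustrated with Welford's online algorithm, a well-known single-pass method that updates the mean and the sum of squared deviations one value at a time. This is a standalone sketch: `RunningStats` is a hypothetical name, and whether mlpack uses exactly this update is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Running mean and variance of a stream of values (Welford's algorithm).
struct RunningStats
{
  std::size_t count = 0;
  double mean = 0.0;
  double m2 = 0.0;  // Sum of squared deviations from the current mean.

  void Add(const double x)
  {
    ++count;
    const double delta = x - mean;
    mean += delta / count;
    // Multiplies the deviation from the old mean by the deviation from the
    // updated mean, which keeps m2 accurate without storing the values.
    m2 += delta * (x - mean);
  }

  // Unbiased sample variance; requires at least two values.
  double Variance() const
  {
    assert(count > 1);
    return m2 / (count - 1);
  }
};
```

In a classifier, one such accumulator would be kept per class and per feature. The extra division on every update is one reason an incremental scheme can be somewhat slower than summing over the whole dataset at once.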

◆ NaiveBayesClassifier() [2/2]

template<typename ModelMatType >
mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::NaiveBayesClassifier ( const size_t  dimensionality = 0,
const size_t  numClasses = 0,
const double  epsilon = 1e-10 
)

Initialize the Naive Bayes classifier without performing training.

All of the parameters of the model will be initialized to zero. Be sure to call Train() before calling Classify(); otherwise the results may be meaningless.

Member Function Documentation

◆ Classify() [1/4]

template<typename ModelMatType >
template<typename VecType >
size_t mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::Classify ( const VecType &  point) const

Classify the given point, using the trained NaiveBayesClassifier model.

The predicted label is returned.

Parameters
point: Point to classify.

◆ Classify() [2/4]

template<typename ModelMatType >
template<typename VecType , typename ProbabilitiesVecType >
void mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::Classify ( const VecType &  point,
size_t &  prediction,
ProbabilitiesVecType &  probabilities 
) const

Classify the given point using the trained NaiveBayesClassifier model and also return estimates of the probability for each class in the given vector.

Parameters
point: Point to classify.
prediction: This will be set to the predicted class of the point.
probabilities: This will be filled with class probabilities for the point.
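
One common way to turn per-class log-scores into class probabilities like those this overload returns is to normalize them with the log-sum-exp trick, which avoids underflow when every score is a large negative number. The sketch below is standalone; `NormalizeLogScores` is a hypothetical name, and whether mlpack normalizes in exactly this way is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Converts unnormalized log-scores log P(Y = y) + log P(x | Y = y) into
// probabilities that sum to one, without underflowing in exp().
std::vector<double> NormalizeLogScores(const std::vector<double>& logScores)
{
  assert(!logScores.empty());
  const double maxScore =
      *std::max_element(logScores.begin(), logScores.end());

  // Shift by the maximum so the largest term is exp(0) = 1.
  double sum = 0.0;
  for (const double s : logScores)
    sum += std::exp(s - maxScore);
  const double logNormalizer = maxScore + std::log(sum);

  std::vector<double> probabilities;
  probabilities.reserve(logScores.size());
  for (const double s : logScores)
    probabilities.push_back(std::exp(s - logNormalizer));
  return probabilities;
}
```

With scores such as -1000, -1001, and -1002, calling exp() directly would produce zero for every class, while the shifted form still yields a valid distribution in which the highest-scoring class has the largest probability.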

◆ Classify() [3/4]

template<typename ModelMatType >
template<typename MatType >
void mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::Classify ( const MatType &  data,
arma::Row< size_t > &  predictions 
) const

Classify the given points using the trained NaiveBayesClassifier model.

The predicted labels for each point are stored in the given vector.

arma::mat test_data; // each column is a test point
arma::Row<size_t> results;
...
nbc.Classify(test_data, results);
Parameters
data: List of data points.
predictions: Vector that class predictions will be placed into.

◆ Classify() [4/4]

template<typename ModelMatType >
template<typename MatType , typename ProbabilitiesMatType >
void mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::Classify ( const MatType &  data,
arma::Row< size_t > &  predictions,
ProbabilitiesMatType &  probabilities 
) const

Classify the given points using the trained NaiveBayesClassifier model and also return estimates of the probabilities for each class in the given matrix.

The predicted labels for each point are stored in the given vector.

arma::mat test_data; // each column is a test point
arma::Row<size_t> results;
arma::mat resultsProbs;
...
nbc.Classify(test_data, results, resultsProbs);
Parameters
data: Set of points to classify.
predictions: This will be filled with predictions for each point.
probabilities: This will be filled with class probabilities for each point. Each column represents a point, matching the layout of data, and each row corresponds to a class.
Template Parameters
MatType: Type of data to be classified.
ProbabilitiesMatType: Type to store output probabilities in.

◆ Train() [1/2]

template<typename ModelMatType >
template<typename MatType >
void mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::Train ( const MatType &  data,
const arma::Row< size_t > &  labels,
const size_t  numClasses,
const bool  incremental = true 
)

Train the Naive Bayes classifier on the given dataset.

If the incremental algorithm is used, the current model is used as a starting point (this is the default). If the incremental algorithm is not used, then the current model is ignored and the new model will be trained only on the given data. Note that even if the incremental algorithm is not used, the data must have the same dimensionality and number of classes that the model was initialized with. If you want to change the dimensionality or number of classes, either re-initialize or call Means(), Variances(), and Probabilities() individually to set them to the right size.

Parameters
data: The dataset to train on.
labels: The labels for the dataset.
numClasses: The number of classes in the dataset.
incremental: Whether or not to use the incremental algorithm for training.

◆ Train() [2/2]

template<typename ModelMatType >
template<typename VecType >
void mlpack::naive_bayes::NaiveBayesClassifier< ModelMatType >::Train ( const VecType &  point,
const size_t  label 
)

Train the Naive Bayes classifier on the given point.

This will use the incremental algorithm for updating the model parameters. The point must have the same dimensionality as the existing model parameters.

Parameters
point: Data point to train on.
label: Label of data point.
