Class: Rumale::NaiveBayes::ComplementNB

Inherits:
BaseNaiveBayes
Defined in:
rumale-naive_bayes/lib/rumale/naive_bayes/complement_nb.rb

Overview

ComplementNB is a class that implements the Complement Naive Bayes classifier.

Reference

  • Rennie, J. D. M., Shih, L., Teevan, J., and Karger, D. R., “Tackling the Poor Assumptions of Naive Bayes Text Classifiers,” ICML’03, pp. 616–623, 2003.

Examples:

require 'rumale/naive_bayes/complement_nb'

estimator = Rumale::NaiveBayes::ComplementNB.new(smoothing_param: 1.0)
estimator.fit(training_samples, training_labels)
results = estimator.predict(testing_samples)

Instance Attribute Summary

Attributes inherited from Base::Estimator

#params

Instance Method Summary

Methods inherited from BaseNaiveBayes

#predict, #predict_log_proba, #predict_proba

Methods included from Base::Classifier

#predict, #score

Constructor Details

#initialize(smoothing_param: 1.0, norm: false) ⇒ ComplementNB

Create a new classifier with Complement Naive Bayes.

Parameters:

  • smoothing_param (Float) (defaults to: 1.0)

    The smoothing parameter.

  • norm (Boolean) (defaults to: false)

    The flag indicating whether to normalize the weight vectors.



# File 'rumale-naive_bayes/lib/rumale/naive_bayes/complement_nb.rb', line 35

def initialize(smoothing_param: 1.0, norm: false)
  super()
  @params = {
    smoothing_param: smoothing_param,
    norm: norm
  }
end
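
As a rough illustration of what smoothing_param does, the following pure-Ruby sketch adds a constant to every feature count before normalizing, mirroring the `compl_features += @params[:smoothing_param]` step in #fit. The helper `smoothed_probs` and the sample counts are hypothetical and are not part of Rumale; plain arrays stand in for Numo::DFloat.

```ruby
# Hypothetical helper (not part of Rumale): additive smoothing of a
# vector of feature counts, followed by normalization to probabilities.
def smoothed_probs(counts, smoothing_param)
  smoothed = counts.map { |c| c + smoothing_param }
  total = smoothed.sum
  smoothed.map { |c| c / total }
end

counts = [3.0, 0.0, 1.0]

# Without smoothing, the unseen feature gets probability zero.
unsmoothed = smoothed_probs(counts, 0.0)

# With smoothing_param = 1.0, every feature keeps a non-zero probability.
smoothed = smoothed_probs(counts, 1.0)
```

With a zero smoothing parameter, the unseen feature would have probability zero, and its logarithm would be negative infinity, so a positive value is the safer choice when some features never occur.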

Instance Attribute Details

#class_priors ⇒ Numo::DFloat (readonly)

Return the prior probabilities of the classes.

Returns:

  • (Numo::DFloat)

    (shape: [n_classes])



# File 'rumale-naive_bayes/lib/rumale/naive_bayes/complement_nb.rb', line 25

def class_priors
  @class_priors
end
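
The priors are the relative class frequencies in the training labels, as computed in #fit. A minimal pure-Ruby sketch of that calculation, using a made-up label array and plain arrays instead of Numo types:

```ruby
# Made-up training labels for illustration only.
labels = [0, 0, 1, 2, 2, 2]

# Sorted unique labels, analogous to the classes attribute.
classes = labels.uniq.sort

# Fraction of samples belonging to each class.
class_priors = classes.map { |l| labels.count(l).fdiv(labels.size) }
```

The resulting array has one entry per class, in the same order as the sorted class labels, and its entries sum to one.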

#classes ⇒ Numo::Int32 (readonly)

Return the class labels.

Returns:

  • (Numo::Int32)

    (size: n_classes)



# File 'rumale-naive_bayes/lib/rumale/naive_bayes/complement_nb.rb', line 21

def classes
  @classes
end

#feature_probs ⇒ Numo::DFloat (readonly)

Return the conditional probabilities for features of each class.

Returns:

  • (Numo::DFloat)

    (shape: [n_classes, n_features])



# File 'rumale-naive_bayes/lib/rumale/naive_bayes/complement_nb.rb', line 29

def feature_probs
  @feature_probs
end
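
Note that, per the #fit source, each row of these probabilities is estimated from the samples that do not belong to the class (the complement). The sketch below reproduces that idea in pure Ruby on a made-up three-sample dataset; the variable names are illustrative and plain arrays stand in for Numo::DFloat.

```ruby
# Made-up data: three samples with two features each.
samples = [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]]
labels = [0, 0, 1]
classes = labels.uniq.sort
smoothing_param = 1.0

feature_probs = classes.map do |l|
  # Keep only the rows whose label differs from l (the complement of l).
  compl_rows = samples.each_index.reject { |i| labels[i] == l }.map { |i| samples[i] }
  # Sum each feature over the complement rows, then apply smoothing.
  compl_counts = compl_rows.transpose.map(&:sum).map { |c| c + smoothing_param }
  # Normalize so that each row sums to one.
  total = compl_counts.sum
  compl_counts.map { |c| c / total }
end
```

Here the row for class 0 is built only from the single class-1 sample, and the row for class 1 only from the two class-0 samples.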

Instance Method Details

#decision_function(x) ⇒ Numo::DFloat

Calculate confidence scores for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples for which to compute the scores.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_classes]) Confidence scores per sample for each class.



# File 'rumale-naive_bayes/lib/rumale/naive_bayes/complement_nb.rb', line 75

def decision_function(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  @class_log_probs + x.dot(@weights.transpose)
end
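
The expression above adds the log prior of each class to the dot product of a sample with that class's weight vector. A pure-Ruby sketch of the same arithmetic follows; `decision_scores` is a hypothetical helper, the numbers are made up, and the final argmax step assumes that prediction picks the highest-scoring class.

```ruby
# Hypothetical helper (not part of Rumale):
# scores[i][k] = class_log_probs[k] + sum_j x[i][j] * weights[k][j]
def decision_scores(x, class_log_probs, weights)
  x.map do |row|
    weights.each_with_index.map do |w, k|
      class_log_probs[k] + row.zip(w).sum { |xi, wi| xi * wi }
    end
  end
end

# Made-up values: two classes with equal priors, two features.
class_log_probs = [Math.log(0.5), Math.log(0.5)]
weights = [[1.0, 2.0], [2.0, 1.0]]
x = [[3.0, 0.0], [0.0, 3.0]]

scores = decision_scores(x, class_log_probs, weights)

# Assumption: the predicted class is the index of the largest score.
predicted = scores.map { |row| row.each_with_index.max_by { |s, _| s }.last }
```

The result has one row per sample and one column per class, matching the documented [n_samples, n_classes] shape.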

#fit(x, y) ⇒ ComplementNB

Fit the model with given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

  • y (Numo::Int32)

    (shape: [n_samples]) The categorical variables (e.g. labels) to be used for fitting the model.

Returns:

  • (ComplementNB)

    The fitted classifier itself.


# File 'rumale-naive_bayes/lib/rumale/naive_bayes/complement_nb.rb', line 49

def fit(x, y)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  y = ::Rumale::Validation.check_convert_label_array(y)
  ::Rumale::Validation.check_sample_size(x, y)

  n_samples, = x.shape
  @classes = Numo::Int32[*y.to_a.uniq.sort]
  @class_priors = Numo::DFloat[*@classes.to_a.map { |l| y.eq(l).count.fdiv(n_samples) }]
  @class_log_probs = Numo::NMath.log(@class_priors)
  compl_features = Numo::DFloat[*@classes.to_a.map { |l| x[y.ne(l).where, true].sum(axis: 0) }]
  compl_features += @params[:smoothing_param]
  n_classes = @classes.size
  @feature_probs = compl_features / compl_features.sum(axis: 1).reshape(n_classes, 1)
  feature_log_probs = Numo::NMath.log(@feature_probs)
  @weights = if normalize?
               feature_log_probs / feature_log_probs.sum(axis: 1).reshape(n_classes, 1)
             else
               -feature_log_probs
             end
  self
end
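
The last step of #fit turns the complement feature probabilities into weight vectors, either as negated log probabilities or as log probabilities divided by their row sum. The pure-Ruby sketch below mirrors both branches; `weights_from` is a hypothetical helper, the probabilities are made up, and it assumes that the `normalize?` predicate in the source reflects the norm parameter.

```ruby
# Hypothetical helper (not part of Rumale): derive weight vectors from
# per-class complement feature probabilities.
def weights_from(feature_probs, norm)
  feature_probs.map do |probs|
    log_probs = probs.map { |p| Math.log(p) }
    if norm
      # Divide each log probability by the sum over the row.
      total = log_probs.sum
      log_probs.map { |lp| lp / total }
    else
      # Negate the log probabilities.
      log_probs.map { |lp| -lp }
    end
  end
end

# Made-up complement feature probabilities for two classes.
feature_probs = [[0.2, 0.8], [0.5, 0.5]]

plain = weights_from(feature_probs, false)
normalized = weights_from(feature_probs, true)
```

In both branches the weights are positive, and a feature that is common in the complement of a class receives a small weight for that class. With normalization, each weight vector additionally sums to one.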