Class: Rumale::LinearModel::SVC

Inherits:
BaseEstimator
Includes:
Base::Classifier
Defined in:
rumale-linear_model/lib/rumale/linear_model/svc.rb

Overview

Note:

Rumale::SVM provides a linear support vector classifier based on LIBLINEAR. If execution speed matters, use Rumale::SVM::LinearSVC instead. See github.com/yoshoku/rumale-svm

SVC is a class that implements a Support Vector Classifier with the squared hinge loss. For multiclass classification problems, it uses the one-vs-the-rest strategy.
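As a rough illustration of the loss named above, the squared hinge loss for one sample is max(0, 1 - y * f(x))^2, where y is -1 or +1 and f(x) is the decision value. The helper below is a plain-Ruby sketch with a hypothetical name; it is not part of Rumale and says nothing about how the library optimizes the loss.

```ruby
# Squared hinge loss for a single sample (illustrative helper, not a Rumale API).
# label must be -1 or +1; score is the raw decision value f(x).
def squared_hinge_loss(label, score)
  margin = 1.0 - label * score
  # Samples classified with a margin of at least 1 incur no loss.
  margin.positive? ? margin**2 : 0.0
end

confident = squared_hinge_loss(1, 2.0)   # correct side, outside the margin
inside    = squared_hinge_loss(1, 0.5)   # correct side, inside the margin
wrong     = squared_hinge_loss(-1, 0.5)  # wrong side of the boundary
```

Unlike the plain hinge loss, the squared form penalizes margin violations quadratically and is differentiable at the margin.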

Examples:

require 'rumale/linear_model/svc'

estimator = Rumale::LinearModel::SVC.new(reg_param: 1.0)
estimator.fit(training_samples, training_labels)
results = estimator.predict(testing_samples)

Instance Attribute Summary

Attributes inherited from BaseEstimator

#bias_term, #weight_vec

Attributes inherited from Base::Estimator

#params

Instance Method Summary

Methods included from Base::Classifier

#score

Constructor Details

#initialize(reg_param: 1.0, fit_bias: true, bias_scale: 1.0, max_iter: 1000, tol: 1e-4, probability: false, n_jobs: nil, verbose: false) ⇒ SVC

Create a new linear classifier based on a Support Vector Machine with the squared hinge loss.

Parameters:

  • reg_param (Float) (defaults to: 1.0)

    The regularization parameter.

  • fit_bias (Boolean) (defaults to: true)

    The flag indicating whether to fit the bias term.

  • bias_scale (Float) (defaults to: 1.0)

    The scale of the bias term.

  • max_iter (Integer) (defaults to: 1000)

    The maximum number of epochs, that is, how many times the whole dataset is passed to the training process.

  • tol (Float) (defaults to: 1e-4)

    The tolerance of loss for terminating optimization.

  • probability (Boolean) (defaults to: false)

    The flag indicating whether to perform probability estimation.

  • n_jobs (Integer) (defaults to: nil)

    The number of jobs for running the fit and predict methods in parallel. If nil is given, the methods do not execute in parallel. If zero or less is given, it becomes equal to the number of processors. This parameter is ignored if the Parallel gem is not loaded.

  • verbose (Boolean) (defaults to: false)

    The flag indicating whether to output the loss during iteration. An 'iterate.dat' file is generated by lbfgsb.rb.
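The n_jobs rule described in the parameter list can be sketched in plain Ruby as follows. The helper name effective_jobs is hypothetical and only mirrors the documented behavior (nil disables parallelism, zero or less means the processor count); it is not how Rumale resolves the value internally, and it ignores the check for whether the Parallel gem is loaded.

```ruby
require 'etc'

# Hypothetical helper mirroring the documented n_jobs semantics.
# Returns nil when parallel execution is disabled.
def effective_jobs(n_jobs)
  return nil if n_jobs.nil?

  # Zero or a negative value falls back to the number of processors.
  n_jobs < 1 ? Etc.nprocessors : n_jobs
end

disabled = effective_jobs(nil)
explicit = effective_jobs(4)
all_cpus = effective_jobs(0)
```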



# File 'rumale-linear_model/lib/rumale/linear_model/svc.rb', line 50

def initialize(reg_param: 1.0, fit_bias: true, bias_scale: 1.0, max_iter: 1000, tol: 1e-4, probability: false,
               n_jobs: nil, verbose: false)
  super()
  @params = {
    reg_param: reg_param,
    fit_bias: fit_bias,
    bias_scale: bias_scale,
    max_iter: max_iter,
    tol: tol,
    probability: probability,
    n_jobs: n_jobs,
    verbose: verbose
  }
end

Instance Attribute Details

#classes ⇒ Numo::Int32 (readonly)

Return the class labels.

Returns:

  • (Numo::Int32)

    (shape: [n_classes])



# File 'rumale-linear_model/lib/rumale/linear_model/svc.rb', line 33

def classes
  @classes
end

Instance Method Details

#decision_function(x) ⇒ Numo::DFloat

Calculate confidence scores for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to compute the scores.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_classes]) Confidence score per sample.



# File 'rumale-linear_model/lib/rumale/linear_model/svc.rb', line 110

def decision_function(x)
  x = Rumale::Validation.check_convert_sample_array(x)

  x.dot(@weight_vec.transpose) + @bias_term
end
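To make the shapes in the expression x.dot(@weight_vec.transpose) + @bias_term concrete, here is a plain-Ruby sketch of the same arithmetic using nested arrays instead of Numo::DFloat. The function name decision_scores and the sample values are made up for illustration; the sketch assumes the multiclass layout, with one weight row and one bias per class.

```ruby
# x:       n_samples x n_features
# weights: n_classes x n_features
# bias:    n_classes
# Returns an n_samples x n_classes array of scores.
def decision_scores(x, weights, bias)
  x.map do |sample|
    weights.each_with_index.map do |w, k|
      # Dot product of the sample with the k-th weight row, plus the k-th bias.
      sample.zip(w).sum { |xi, wi| xi * wi } + bias[k]
    end
  end
end

x = [[1.0, 2.0], [0.0, -1.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.5, -1.0]
scores = decision_scores(x, weights, bias)
```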

#fit(x, y) ⇒ SVC

Fit the model with given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

  • y (Numo::Int32)

    (shape: [n_samples]) The labels to be used for fitting the model.

Returns:

  • (SVC)

    The learned classifier itself.



# File 'rumale-linear_model/lib/rumale/linear_model/svc.rb', line 70

def fit(x, y)
  x = Rumale::Validation.check_convert_sample_array(x)
  y = Rumale::Validation.check_convert_label_array(y)
  Rumale::Validation.check_sample_size(x, y)

  @classes = Numo::Int32[*y.to_a.uniq.sort]
  x = expand_feature(x) if fit_bias?

  if multiclass_problem?
    n_classes = @classes.size
    n_features = x.shape[1]
    n_features -= 1 if fit_bias?
    @weight_vec = Numo::DFloat.zeros(n_classes, n_features)
    @bias_term = Numo::DFloat.zeros(n_classes)
    @prob_param = Numo::DFloat.zeros(n_classes, 2)
    models = if enable_parallel?
               parallel_map(n_classes) do |n|
                 bin_y = Numo::Int32.cast(y.eq(@classes[n])) * 2 - 1
                 partial_fit(x, bin_y)
               end
             else
               Array.new(n_classes) do |n|
                 bin_y = Numo::Int32.cast(y.eq(@classes[n])) * 2 - 1
                 partial_fit(x, bin_y)
               end
             end
    models.each_with_index { |model, n| @weight_vec[n, true], @bias_term[n], @prob_param[n, true] = model }
  else
    negative_label = @classes[0]
    bin_y = Numo::Int32.cast(y.ne(negative_label)) * 2 - 1
    @weight_vec, @bias_term, @prob_param = partial_fit(x, bin_y)
  end

  self
end
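The label handling in fit can be illustrated without Numo. In the multiclass branch, each class gets its own target vector with +1 for that class and -1 for all others; in the binary branch, the smallest label is treated as the negative class. The two helpers below are hypothetical plain-Ruby equivalents of the y.eq(...) * 2 - 1 and y.ne(...) * 2 - 1 expressions, not library functions.

```ruby
# Multiclass: one +1/-1 target vector per class, in sorted class order.
def one_vs_rest_targets(labels)
  classes = labels.uniq.sort
  classes.map { |klass| labels.map { |l| l == klass ? 1 : -1 } }
end

# Binary: the smallest label becomes -1, the other label becomes +1.
def binary_targets(labels)
  negative_label = labels.uniq.min
  labels.map { |l| l == negative_label ? -1 : 1 }
end

multi = one_vs_rest_targets([0, 2, 1, 2])
binary = binary_targets([3, 7, 7, 3])
```

Each target vector in the multiclass case is then fitted as an independent binary problem, which is why that loop can run in parallel when n_jobs is set.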

#predict(x) ⇒ Numo::Int32

Predict class labels for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to predict the labels.

Returns:

  • (Numo::Int32)

    (shape: [n_samples]) Predicted class label per sample.



# File 'rumale-linear_model/lib/rumale/linear_model/svc.rb', line 120

def predict(x)
  x = Rumale::Validation.check_convert_sample_array(x)

  n_samples = x.shape[0]
  predicted = if multiclass_problem?
                decision_values = decision_function(x)
                if enable_parallel?
                  parallel_map(n_samples) { |n| @classes[decision_values[n, true].max_index] }
                else
                  Array.new(n_samples) { |n| @classes[decision_values[n, true].max_index] }
                end
              else
                decision_values = decision_function(x).ge(0.0).to_a
                Array.new(n_samples) { |n| @classes[decision_values[n]] }
              end
  Numo::Int32.asarray(predicted)
end
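The two prediction rules in the method above reduce to an argmax over per-class scores in the multiclass case and a threshold at zero in the binary case. The sketch below restates them in plain Ruby with hypothetical helper names and invented scores; it operates on ordinary arrays rather than Numo objects.

```ruby
# Multiclass: pick the class whose score is largest in each row.
def predict_multiclass(classes, scores)
  scores.map { |row| classes[row.index(row.max)] }
end

# Binary: a score of zero or more maps to the second (positive) class.
def predict_binary(classes, scores)
  scores.map { |s| s >= 0.0 ? classes[1] : classes[0] }
end

multi = predict_multiclass([0, 1, 2], [[1.0, 2.5, 2.0], [0.0, -0.5, -2.0]])
binary = predict_binary([3, 7], [-0.2, 0.0, 1.3])
```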

#predict_proba(x) ⇒ Numo::DFloat

Predict probability for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to predict the probabilities.


Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_classes]) Predicted probability of each class per sample.



# File 'rumale-linear_model/lib/rumale/linear_model/svc.rb', line 142

def predict_proba(x)
  x = Rumale::Validation.check_convert_sample_array(x)

  if multiclass_problem?
    probs = 1.0 / (Numo::NMath.exp(@prob_param[true, 0] * decision_function(x) + @prob_param[true, 1]) + 1.0)
    (probs.transpose / probs.sum(axis: 1)).transpose.dup
  else
    n_samples = x.shape[0]
    probs = Numo::DFloat.zeros(n_samples, 2)
    probs[true, 1] = 1.0 / (Numo::NMath.exp(@prob_param[0] * decision_function(x) + @prob_param[1]) + 1.0)
    probs[true, 0] = 1.0 - probs[true, 1]
    probs
  end
end
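The probability computation above maps each decision value through 1 / (exp(a * score + b) + 1), a sigmoid of the kind commonly associated with Platt scaling, and in the multiclass case normalizes the per-class values so each row sums to one. The following plain-Ruby sketch mirrors that arithmetic for a single sample. The helper names and the parameter values (a = -1.0, b = 0.0) are assumptions chosen for illustration; in the library the pairs come from @prob_param after fitting.

```ruby
# Sigmoid used in the source: 1 / (exp(a * score + b) + 1).
def sigmoid_proba(score, a, b)
  1.0 / (Math.exp(a * score + b) + 1.0)
end

# Multiclass: one (a, b) pair per class, then normalize to sum to 1.
def multiclass_proba(scores, params)
  probs = scores.each_with_index.map { |s, k| sigmoid_proba(s, *params[k]) }
  total = probs.sum
  probs.map { |prob| prob / total }
end

# Binary: column 1 is the positive class, column 0 is its complement.
def binary_proba(score, a, b)
  positive = sigmoid_proba(score, a, b)
  [1.0 - positive, positive]
end

params = [[-1.0, 0.0]] * 3 # illustrative values, not fitted parameters
multi = multiclass_proba([1.0, 2.5, 2.0], params)
binary = binary_proba(2.0, -1.0, 0.0)
```

With a negative a, a larger decision value yields a larger probability, so the class ranking from predict is preserved.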