Class: Rumale::Decomposition::PCA

Inherits:
Base::Estimator
Includes:
Base::Transformer
Defined in:
rumale-decomposition/lib/rumale/decomposition/pca.rb

Overview

PCA is a class that implements Principal Component Analysis.

Reference

  • Sharma, A., and Paliwal, K. K., “Fast principal component analysis using fixed-point algorithm,” Pattern Recognition Letters, 28, pp. 1151–1155, 2007.

Examples:

require 'rumale/decomposition/pca'

# samples is a Numo::DFloat of shape [n_samples, n_features].
decomposer = Rumale::Decomposition::PCA.new(n_components: 2, solver: 'fpt')
representation = decomposer.fit_transform(samples)

# If Numo::Linalg is installed, you can specify 'evd' for the solver option.
require 'numo/linalg/autoloader'
require 'rumale/decomposition/pca'

decomposer = Rumale::Decomposition::PCA.new(n_components: 2, solver: 'evd')
representation = decomposer.fit_transform(samples)

# If Numo::Linalg is loaded and the solver option is not given,
# the 'evd' solver is chosen automatically.
decomposer = Rumale::Decomposition::PCA.new(n_components: 2)
representation = decomposer.fit_transform(samples)

Instance Attribute Summary

Attributes inherited from Base::Estimator

#params

Constructor Details

#initialize(n_components: 2, solver: 'auto', max_iter: 100, tol: 1.0e-4, random_seed: nil) ⇒ PCA

Create a new transformer with PCA.

Parameters:

  • n_components (Integer) (defaults to: 2)

    The number of principal components.

  • solver (String) (defaults to: 'auto')

    The algorithm used for the optimization (‘auto’, ‘fpt’, or ‘evd’). ‘auto’ chooses the ‘evd’ solver if Numo::Linalg is loaded and the ‘fpt’ solver otherwise. ‘fpt’ uses the fixed-point algorithm. ‘evd’ performs eigenvalue decomposition of the covariance matrix of the samples.

  • max_iter (Integer) (defaults to: 100)

    The maximum number of iterations. If solver = ‘evd’, this parameter is ignored.

  • tol (Float) (defaults to: 1.0e-4)

    The tolerance of the termination criterion. If solver = ‘evd’, this parameter is ignored.

  • random_seed (Integer) (defaults to: nil)

    The seed value used to initialize the random generator.



# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 58

def initialize(n_components: 2, solver: 'auto', max_iter: 100, tol: 1.0e-4, random_seed: nil)
  super()
  @params = {
    n_components: n_components,
    solver: 'fpt',
    max_iter: max_iter,
    tol: tol,
    random_seed: random_seed || srand
  }
  @params[:solver] = 'evd' if (solver == 'auto' && enable_linalg?(warning: false)) || solver == 'evd'
  @rng = Random.new(@params[:random_seed])
end
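
The solver resolution in the constructor can be summarized with a small standalone sketch. The helper names `resolve_solver` and `effective_solver` are hypothetical and not part of Rumale, and the boolean argument stands in for the result of `enable_linalg?`. The second helper reflects what the #fit listing suggests: the eigendecomposition branch is taken only when Numo::Linalg is also available at fit time.

```ruby
# Hypothetical helper (not part of Rumale): mirrors how the constructor
# stores the solver name. linalg_loaded stands in for enable_linalg?.
def resolve_solver(requested, linalg_loaded)
  return 'evd' if requested == 'evd'
  return 'evd' if requested == 'auto' && linalg_loaded

  'fpt'
end

# Hypothetical helper: the branch #fit appears to take, judging from the
# condition `@params[:solver] == 'evd' && enable_linalg?` in its listing.
def effective_solver(stored, linalg_loaded)
  stored == 'evd' && linalg_loaded ? 'evd' : 'fpt'
end

puts resolve_solver('auto', true)
puts resolve_solver('auto', false)
puts effective_solver(resolve_solver('evd', false), false)
```

Under these assumptions, requesting ‘evd’ without Numo::Linalg keeps ‘evd’ in the stored parameters while the fixed-point branch does the actual work.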

Instance Attribute Details

#components ⇒ Numo::DFloat (readonly)

Returns the principal components.

Returns:

  • (Numo::DFloat)

    (shape: [n_components, n_features])



# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 38

def components
  @components
end

#mean ⇒ Numo::DFloat (readonly)

Returns the mean vector.

Returns:

  • (Numo::DFloat)

    (shape: [n_features])



# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 42

def mean
  @mean
end

#rng ⇒ Random (readonly)

Returns the random generator.

Returns:

  • (Random)


# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 46

def rng
  @rng
end

Instance Method Details

#fit(x) ⇒ PCA

Fit the model with given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

Returns:

  • (PCA)

    The learned transformer itself.



# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 76

def fit(x, _y = nil)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  # initialize some variables.
  @components = nil
  n_samples, n_features = x.shape
  sub_rng = @rng.dup
  # centering.
  @mean = x.mean(0)
  centered_x = x - @mean
  # optimization.
  covariance_mat = centered_x.transpose.dot(centered_x) / (n_samples - 1)
  if @params[:solver] == 'evd' && enable_linalg?
    _, evecs = Numo::Linalg.eigh(covariance_mat, vals_range: (n_features - @params[:n_components])...n_features)
    comps = evecs.reverse(1).transpose
    @components = @params[:n_components] == 1 ? comps[0, true].dup : comps.dup
  else
    @params[:n_components].times do
      comp_vec = ::Rumale::Utils.rand_uniform(n_features, sub_rng)
      @params[:max_iter].times do
        updated = orthogonalize(covariance_mat.dot(comp_vec))
        break if (updated.dot(comp_vec) - 1).abs < @params[:tol]

        comp_vec = updated
      end
      @components = @components.nil? ? comp_vec : Numo::NArray.vstack([@components, comp_vec])
    end
  end
  self
end
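
The fixed-point branch above can be illustrated with a pure-Ruby sketch on plain Arrays. This is a minimal sketch under stated assumptions, not Rumale's implementation: the `orthogonalize` helper called in the listing is not shown on this page, so Gram-Schmidt deflation followed by normalization is assumed here; the starting vector is normalized up front; and the latest update is kept when the loop stops. All helper names are hypothetical.

```ruby
# Pure-Ruby sketch of a fixed-point PCA iteration (assumptions noted inline).
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

def mat_vec(m, v)
  m.map { |row| dot(row, v) }
end

def normalize(v)
  norm = Math.sqrt(dot(v, v))
  v.map { |e| e / norm }
end

# Assumed behavior: remove projections onto the components found so far
# (Gram-Schmidt), then rescale to unit length.
def orthogonalize(v, found)
  found.each do |c|
    proj = dot(v, c)
    v = v.zip(c).map { |e, ce| e - proj * ce }
  end
  normalize(v)
end

def fixed_point_components(cov, n_components, max_iter: 100, tol: 1.0e-4, rng: Random.new(1))
  found = []
  n_components.times do
    # Random start, normalized so the dot-product test below is meaningful.
    vec = normalize(Array.new(cov.size) { rng.rand })
    max_iter.times do
      updated = orthogonalize(mat_vec(cov, vec), found)
      # Converged when successive unit vectors are nearly parallel.
      converged = (dot(updated, vec) - 1).abs < tol
      vec = updated # unlike the listing, keep the latest update
      break if converged
    end
    found << vec
  end
  found
end

cov = [[2.0, 0.0], [0.0, 1.0]]
components = fixed_point_components(cov, 2)
components.each { |c| puts c.inspect }
```

For this diagonal covariance matrix the first component should align closely with the first axis (the direction of largest variance) and the second with the remaining axis, up to the precision allowed by `tol`.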

#fit_transform(x) ⇒ Numo::DFloat

Fit the model with training data, and then transform them with the learned model.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components]) The transformed data.



# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 112

def fit_transform(x, _y = nil)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  fit(x).transform(x)
end

#inverse_transform(z) ⇒ Numo::DFloat

Inverse transform the given transformed data with the learned model.

Parameters:

  • z (Numo::DFloat)

    (shape: [n_samples, n_components]) The data to be restored into original space with the learned model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_features]) The restored data.



# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 132

def inverse_transform(z)
  z = ::Rumale::Validation.check_convert_sample_array(z)

  c = @components.shape[1].nil? ? @components.expand_dims(0) : @components
  z.dot(c) + @mean
end

#transform(x) ⇒ Numo::DFloat

Transform the given data with the learned model.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The data to be transformed with the learned model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components]) The transformed data.



# File 'rumale-decomposition/lib/rumale/decomposition/pca.rb', line 122

def transform(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  (x - @mean).dot(@components.transpose)
end
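
The arithmetic behind #transform and #inverse_transform can be sketched with plain Ruby Arrays in place of Numo::DFloat. The mean vector and the single unit-length component below are made-up values for illustration, the components are always treated as a two-dimensional [n_components, n_features] array, and the free functions are hypothetical stand-ins for the instance methods.

```ruby
# Project centered rows onto each component: (x - mean) . components^T
def transform(x, mean, components)
  x.map do |row|
    centered = row.zip(mean).map { |v, m| v - m }
    components.map { |c| c.zip(centered).sum { |ce, v| ce * v } }
  end
end

# Map projections back to feature space: z . components + mean
def inverse_transform(z, mean, components)
  z.map do |row|
    mean.each_index.map do |j|
      mean[j] + row.each_index.sum { |k| row[k] * components[k][j] }
    end
  end
end

mean = [1.0, 2.0]          # illustrative mean vector, shape [n_features]
components = [[0.6, 0.8]]  # one unit-length component, shape [1, n_features]
x = [[4.0, 6.0], [5.0, 2.0]]

z = transform(x, mean, components)
restored = inverse_transform(z, mean, components)
puts z.inspect
puts restored.inspect
```

The first sample lies on the line through the mean along the component, so it is restored essentially exactly. The second sample has a part orthogonal to the component, which is lost when fewer components than features are kept.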