Class: Rumale::Manifold::MDS

Inherits:
Base::Estimator
Includes:
Base::Transformer
Defined in:
rumale-manifold/lib/rumale/manifold/mds.rb

Overview

MDS is a class that implements metric Multidimensional Scaling (MDS) using the SMACOF (Scaling by MAjorizing a COmplicated Function) algorithm.
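
For orientation, the objective that SMACOF minimizes and the update it iterates can be written as below. This is the standard unweighted formulation from the majorization literature, not a transcription of this class: the symbols \delta_{ij} (distance in the original space), d_{ij}(Z) (distance in the representation space Z), and B(Z) are notation assumed here, and the normalization of the stress value used internally may differ.

```latex
\sigma(Z) = \sum_{i<j} \bigl(\delta_{ij} - d_{ij}(Z)\bigr)^2,
\qquad
Z^{(t+1)} = \frac{1}{n}\, B\bigl(Z^{(t)}\bigr)\, Z^{(t)},

b_{ij}(Z) =
\begin{cases}
-\delta_{ij} / d_{ij}(Z) & i \neq j,\ d_{ij}(Z) \neq 0 \\
0 & i \neq j,\ d_{ij}(Z) = 0
\end{cases},
\qquad
b_{ii}(Z) = -\sum_{j \neq i} b_{ij}(Z).
```

The update on the right is the Guttman transform; each application cannot increase \sigma.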

Reference

  • Groenen, P. J. F. and van de Velden, M., “Multidimensional Scaling by Majorization: A Review,” Journal of Statistical Software, Vol. 73 (8), 2016.

Examples:

require 'rumale/manifold/mds'

mds = Rumale::Manifold::MDS.new(init: 'pca', max_iter: 500, random_seed: 1)
representations = mds.fit_transform(samples)

Instance Attribute Summary

Attributes inherited from Base::Estimator

#params

Instance Method Summary

Constructor Details

#initialize(n_components: 2, metric: 'euclidean', init: 'random', max_iter: 300, tol: nil, verbose: false, random_seed: nil) ⇒ MDS

Create a new transformer with MDS.

Parameters:

  • n_components (Integer) (defaults to: 2)

    The number of dimensions of the representation space.

  • metric (String) (defaults to: 'euclidean')

    The metric used to calculate distances in the original space. If metric is ‘euclidean’, distances in the original space are computed as Euclidean distances. If metric is ‘precomputed’, the fit and fit_transform methods expect to be given a distance matrix.

  • init (String) (defaults to: 'random')

    The method used to initialize the representation space. If init is ‘random’, the representation space is initialized with normal random variables. If init is ‘pca’, the result of principal component analysis is used as the initial value of the representation space.

  • max_iter (Integer) (defaults to: 300)

    The maximum number of iterations.

  • tol (Float) (defaults to: nil)

    The tolerance on the stress value for terminating optimization. If tol is nil, the stress value is not used as a termination criterion.

  • verbose (Boolean) (defaults to: false)

    The flag indicating whether to print the stress value during iteration (printed every 100 iterations).

  • random_seed (Integer) (defaults to: nil)

    The seed value used to initialize the random generator.
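
To illustrate what the ‘precomputed’ metric expects, the following is a minimal sketch in plain Ruby that builds a square Euclidean distance matrix from a few points. The helper name pairwise_euclidean and the sample points are hypothetical and not part of Rumale; they only show the shape and properties (square, symmetric, zero diagonal) of a valid input.

```ruby
# Hypothetical helper (not part of Rumale): pairwise Euclidean distances
# between the rows of `points`, an Array of equal-length numeric Arrays.
def pairwise_euclidean(points)
  points.map do |a|
    points.map do |b|
      Math.sqrt(a.zip(b).sum { |p, q| (p - q)**2 })
    end
  end
end

points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
dist = pairwise_euclidean(points)

# dist has shape [n_samples, n_samples]: one row and one column per point.
square = dist.all? { |row| row.size == dist.size }
symmetric = dist == dist.transpose
zero_diagonal = dist.each_with_index.all? { |row, i| row[i].zero? }
```

With Rumale, a matrix like this would typically be converted to a Numo::DFloat (for example with Numo::DFloat.cast(dist)) and passed to fit or fit_transform on an instance created with metric: 'precomputed'.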



# File 'rumale-manifold/lib/rumale/manifold/mds.rb', line 56

def initialize(n_components: 2, metric: 'euclidean', init: 'random',
               max_iter: 300, tol: nil, verbose: false, random_seed: nil)
  super()
  @params = {
    n_components: n_components,
    max_iter: max_iter,
    tol: tol,
    metric: metric,
    init: init,
    verbose: verbose,
    random_seed: random_seed || srand
  }
  @rng = Random.new(@params[:random_seed])
end
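
The constructor stores `random_seed || srand` and builds a Random from it. The short sketch below, using only the Ruby standard library, shows the two facts this relies on: generators created from the same seed produce the same stream, and Kernel#srand called without an argument returns an Integer (the previous global seed), so a usable seed is recorded even when random_seed is nil. The variable names are illustrative only.

```ruby
# Two generators built from the same seed yield identical sequences,
# which is what makes a fixed random_seed reproducible.
seed = 1
a = Random.new(seed)
b = Random.new(seed)
same_stream = Array.new(5) { a.rand } == Array.new(5) { b.rand }

# Kernel#srand with no argument reseeds the global generator and returns
# the previous seed, so the fallback always produces an Integer.
random_seed = nil
fallback_seed = random_seed || srand
rng = Random.new(fallback_seed)
```

Because the fallback seed is kept in the params hash, a run that was started without an explicit seed can in principle be repeated by reading that value back and passing it as random_seed.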

Instance Attribute Details

#embedding ⇒ Numo::DFloat (readonly)

Return the data in representation space.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components])



# File 'rumale-manifold/lib/rumale/manifold/mds.rb', line 28

def embedding
  @embedding
end

#n_iter ⇒ Integer (readonly)

Return the number of iterations run for optimization.

Returns:

  • (Integer)


# File 'rumale-manifold/lib/rumale/manifold/mds.rb', line 36

def n_iter
  @n_iter
end

#rng ⇒ Random (readonly)

Return the random generator.

Returns:

  • (Random)


# File 'rumale-manifold/lib/rumale/manifold/mds.rb', line 40

def rng
  @rng
end

#stress ⇒ Float (readonly)

Return the stress function value after optimization.

Returns:

  • (Float)


# File 'rumale-manifold/lib/rumale/manifold/mds.rb', line 32

def stress
  @stress
end

Instance Method Details

#fit(x) ⇒ MDS

Fit the model with given training data.


Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model. If the metric is ‘precomputed’, x must be a square distance matrix (shape: [n_samples, n_samples]).

Returns:

  • (MDS)

    The learned transformer itself.



# File 'rumale-manifold/lib/rumale/manifold/mds.rb', line 77

def fit(x, _not_used = nil)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  if @params[:metric] == 'precomputed' && x.shape[0] != x.shape[1]
    raise ArgumentError, 'Expect the input distance matrix to be square.'
  end

  # initialize some variables.
  n_samples = x.shape[0]
  hi_distance_mat = @params[:metric] == 'precomputed' ? x : ::Rumale::PairwiseMetric.euclidean_distance(x)
  @embedding = init_embedding(x)
  lo_distance_mat = ::Rumale::PairwiseMetric.euclidean_distance(@embedding)
  @stress = calc_stress(hi_distance_mat, lo_distance_mat)
  @n_iter = 0
  # perform optimization.
  @params[:max_iter].times do |t|
    # Guttman transform.
    ratio = hi_distance_mat / lo_distance_mat
    ratio[ratio.diag_indices] = 0.0
    ratio[lo_distance_mat.eq(0)] = 0.0
    tmp_mat = -ratio
    tmp_mat[tmp_mat.diag_indices] += ratio.sum(axis: 1)
    @embedding = 1.fdiv(n_samples) * tmp_mat.dot(@embedding)
    lo_distance_mat = ::Rumale::PairwiseMetric.euclidean_distance(@embedding)
    # check convergence.
    new_stress = calc_stress(hi_distance_mat, lo_distance_mat)
    if terminate?(@stress, new_stress)
      @stress = new_stress
      break
    end
    # next step.
    @n_iter = t + 1
    @stress = new_stress
    puts "[MDS] stress function after #{@n_iter} iterations: #{@stress}" if @params[:verbose] && (@n_iter % 100).zero?
  end
  self
end
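
The update inside the loop above is a Guttman transform. As a rough illustration, the same step can be sketched with plain Ruby arrays instead of Numo. The helper names distances, raw_stress, and guttman_step are hypothetical, and raw_stress uses an assumed normalization (half the sum of squared differences over all ordered pairs), since calc_stress is not shown on this page.

```ruby
# Pairwise Euclidean distance matrix for the rows of z.
def distances(z)
  z.map { |a| z.map { |b| Math.sqrt(a.zip(b).sum { |p, q| (p - q)**2 }) } }
end

# Assumed stress definition: each unordered pair is counted once.
def raw_stress(hi, lo)
  hi.flatten.zip(lo.flatten).sum { |h, l| (h - l)**2 } / 2.0
end

# One Guttman transform: Z <- (1/n) * B(Z) * Z.
def guttman_step(hi, z)
  n = z.size
  lo = distances(z)
  b = Array.new(n) do |i|
    Array.new(n) do |j|
      # Off-diagonal entries are -hi/lo; coincident points contribute 0.
      (i == j || lo[i][j].zero?) ? 0.0 : -hi[i][j] / lo[i][j]
    end
  end
  # Diagonal entries make every row of B sum to zero.
  n.times { |i| b[i][i] = -b[i].sum }
  Array.new(n) do |i|
    Array.new(z[0].size) { |k| (0...n).sum { |j| b[i][j] * z[j][k] } / n }
  end
end

# Target distances: the corners of a unit square.
hi = distances([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Arbitrary starting configuration in two dimensions.
z = [[0.1, 0.3], [0.9, -0.2], [-0.4, 0.8], [0.7, 1.1]]

stresses = [raw_stress(hi, distances(z))]
20.times do
  z = guttman_step(hi, z)
  stresses << raw_stress(hi, distances(z))
end
```

Each entry of stresses should be no larger than the one before it, which is the property that lets fit compare the old and new stress values to decide when to stop.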

#fit_transform(x) ⇒ Numo::DFloat

Fit the model with training data, and then transform them with the learned model.


Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model. If the metric is ‘precomputed’, x must be a square distance matrix (shape: [n_samples, n_samples]).

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components]) The transformed data.



# File 'rumale-manifold/lib/rumale/manifold/mds.rb', line 120

def fit_transform(x, _not_used = nil)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  fit(x)
  @embedding.dup
end