Class: Rumale::Ensemble::StackingRegressor

Inherits:
Base::Estimator
Includes:
Base::Regressor
Defined in:
rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb

Overview

StackingRegressor is a class that implements a regressor with the stacking method.

Reference

  • Zhou, Z-H., “Ensemble Methods - Foundations and Algorithms,” CRC Press Taylor and Francis Group, Chapman and Hall/CRC, 2012.

Examples:

require 'rumale/ensemble/stacking_regressor'
require 'rumale/ensemble/random_forest_regressor'
require 'rumale/linear_model/lasso'
require 'rumale/linear_model/ridge'
require 'rumale/neural_network/mlp_regressor'

estimators = {
  las: Rumale::LinearModel::Lasso.new(reg_param: 1e-2, random_seed: 1),
  mlp: Rumale::NeuralNetwork::MLPRegressor.new(hidden_units: [256], random_seed: 1),
  rnd: Rumale::Ensemble::RandomForestRegressor.new(random_seed: 1)
}
meta_estimator = Rumale::LinearModel::Ridge.new
regressor = Rumale::Ensemble::StackingRegressor.new(
  estimators: estimators, meta_estimator: meta_estimator, random_seed: 1
)
regressor.fit(training_samples, training_values)
results = regressor.predict(testing_samples)

Instance Attribute Summary

Attributes inherited from Base::Estimator

#params

Instance Method Summary

Methods included from Base::Regressor

#score

Constructor Details

#initialize(estimators:, meta_estimator: nil, n_splits: 5, shuffle: true, passthrough: false, random_seed: nil) ⇒ StackingRegressor

Create a new regressor with the stacking method.

Parameters:

  • estimators (Hash<Symbol,Regressor>)

    The base regressors for extracting meta features.

  • meta_estimator (Regressor/Nil) (defaults to: nil)

    The meta regressor that predicts values. If nil is given, Ridge is used.

  • n_splits (Integer) (defaults to: 5)

    The number of folds of the k-fold cross validation used to extract meta features in the training phase.

  • shuffle (Boolean) (defaults to: true)

    The flag indicating whether to shuffle the dataset on cross validation.

  • passthrough (Boolean) (defaults to: false)

    The flag indicating whether to concatenate the original features and meta features when training the meta regressor.

  • random_seed (Integer/Nil) (defaults to: nil)

    The seed value used to initialize the random generator for cross validation.



# File 'rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb', line 50

def initialize(estimators:, meta_estimator: nil, n_splits: 5, shuffle: true, passthrough: false, random_seed: nil)
  super()
  @estimators = estimators
  @meta_estimator = meta_estimator || ::Rumale::LinearModel::Ridge.new
  @params = {
    n_splits: n_splits,
    shuffle: shuffle,
    passthrough: passthrough,
    random_seed: random_seed || srand
  }
end
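
As an additional usage sketch, the following illustrates the keyword arguments; the choice of base estimators and the hyperparameter values are illustrative assumptions, not defaults.

require 'rumale/ensemble/stacking_regressor'
require 'rumale/ensemble/random_forest_regressor'
require 'rumale/linear_model/ridge'

# The keys of the estimators Hash are arbitrary symbols labeling each base regressor.
# meta_estimator is omitted here, so Rumale::LinearModel::Ridge.new is used as the meta regressor.
regressor = Rumale::Ensemble::StackingRegressor.new(
  estimators: {
    rdg: Rumale::LinearModel::Ridge.new(reg_param: 0.1),
    rnd: Rumale::Ensemble::RandomForestRegressor.new(random_seed: 1)
  },
  n_splits: 3,        # 3-fold cross validation for meta feature extraction
  passthrough: true,  # append the original features to the meta features
  random_seed: 1
)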

Instance Attribute Details

#estimators ⇒ Hash<Symbol,Regressor> (readonly)

Return the base regressors.

Returns:

  • (Hash<Symbol,Regressor>)


# File 'rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb', line 35

def estimators
  @estimators
end

#meta_estimator ⇒ Regressor (readonly)

Return the meta regressor.

Returns:

  • (Regressor)


# File 'rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb', line 39

def meta_estimator
  @meta_estimator
end

Instance Method Details

#fit(x, y) ⇒ StackingRegressor

Fit the model with the given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

  • y (Numo::DFloat)

    (shape: [n_samples, n_outputs]) The target variables to be used for fitting the model.

Returns:

  • (StackingRegressor)

    The learned regressor itself.



# File 'rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb', line 67

def fit(x, y)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  y = ::Rumale::Validation.check_convert_target_value_array(y)
  ::Rumale::Validation.check_sample_size(x, y)

  n_samples, n_features = x.shape
  n_outputs = y.ndim == 1 ? 1 : y.shape[1]

  # training base regressors with all training data.
  @estimators.each_key { |name| @estimators[name].fit(x, y) }

  # detecting size of output for each base regressor.
  @output_size = detect_output_size(n_features)

  # extracting meta features with base regressors.
  n_components = @output_size.values.sum
  z = Numo::DFloat.zeros(n_samples, n_components)

  kf = ::Rumale::ModelSelection::KFold.new(
    n_splits: @params[:n_splits], shuffle: @params[:shuffle], random_seed: @params[:random_seed]
  )

  kf.split(x, y).each do |train_ids, valid_ids|
    x_train = x[train_ids, true]
    y_train = n_outputs == 1 ? y[train_ids] : y[train_ids, true]
    x_valid = x[valid_ids, true]
    f_start = 0
    @estimators.each_key do |name|
      est_fold = Marshal.load(Marshal.dump(@estimators[name]))
      f_last = f_start + @output_size[name]
      f_position = @output_size[name] == 1 ? f_start : f_start...f_last
      z[valid_ids, f_position] = est_fold.fit(x_train, y_train).predict(x_valid)
      f_start = f_last
    end
  end

  # concatenating original features.
  z = Numo::NArray.hstack([z, x]) if @params[:passthrough]

  # training meta regressor.
  @meta_estimator.fit(z, y)

  self
end
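
To make the training flow concrete, here is a hedged sketch with toy data; the arrays x and y, the single base estimator, and the hyperparameter values are assumptions for brevity.

require 'numo/narray'
require 'rumale/ensemble/stacking_regressor'
require 'rumale/linear_model/ridge'

x = Numo::DFloat.new(100, 4).rand   # 100 samples, 4 features
y = x.sum(axis: 1)                  # toy single-output target
reg = Rumale::Ensemble::StackingRegressor.new(
  estimators: { rdg: Rumale::LinearModel::Ridge.new },
  random_seed: 1
)
# Each base regressor is fit on all training data, while the meta features used to
# train the meta regressor come from out-of-fold predictions of k-fold cross validation.
reg.fit(x, y)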

#fit_transform(x, y) ⇒ Numo::DFloat

Fit the model with training data, and then transform them with the learned model.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for fitting the model.

  • y (Numo::DFloat)

    (shape: [n_samples, n_outputs]) The target variables to be used for fitting the model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components]) The meta features for training data.



# File 'rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb', line 149

def fit_transform(x, y)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  y = ::Rumale::Validation.check_convert_target_value_array(y)
  ::Rumale::Validation.check_sample_size(x, y)

  fit(x, y).transform(x)
end
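
Continuing the toy sketch under #fit, the returned meta features have one column per base-regressor output, plus the original features when passthrough is true; the shape below is therefore an assumption tied to that sketch.

meta = reg.fit_transform(x, y)
meta.shape  # => [100, 1]: one single-output base regressor with passthrough: false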

#predict(x) ⇒ Numo::DFloat

Predict values for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to predict the values for.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_outputs]) The predicted values per sample.



# File 'rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb', line 116

def predict(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  z = transform(x)
  @meta_estimator.predict(z)
end
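
Continuing the same toy sketch, prediction first maps the samples to meta features with the fitted base regressors and then delegates to the meta regressor; x_test is a hypothetical array of test samples.

x_test = Numo::DFloat.new(10, 4).rand
predicted = reg.predict(x_test)
predicted.shape  # => [10] for the single-output toy target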

#transform(x) ⇒ Numo::DFloat

Transform the given data with the learned model.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to be transformed with the learned model.

Returns:

  • (Numo::DFloat)

    (shape: [n_samples, n_components]) The meta features for samples.



# File 'rumale-ensemble/lib/rumale/ensemble/stacking_regressor.rb', line 127

def transform(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  n_samples = x.shape[0]
  n_components = @output_size.values.sum
  z = Numo::DFloat.zeros(n_samples, n_components)
  f_start = 0
  @estimators.each_key do |name|
    f_last = f_start + @output_size[name]
    f_position = @output_size[name] == 1 ? f_start : f_start...f_last
    z[true, f_position] = @estimators[name].predict(x)
    f_start = f_last
  end
  z = Numo::NArray.hstack([z, x]) if @params[:passthrough]
  z
end
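
Note on the layout of the returned meta features: the outputs of the base regressors are placed side by side in the registration order of the estimators Hash, so n_components equals the sum of their output sizes (plus n_features when passthrough is true). Continuing the toy sketch:

z = reg.transform(x_test)
z.shape  # => [10, 1]: one column from the single base regressor, no passthrough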