Class: Rumale::Ensemble::GradientBoostingRegressor
- Inherits: Base::Estimator
  - Object
  - Base::Estimator
  - Rumale::Ensemble::GradientBoostingRegressor
- Includes: Base::Regressor
- Defined in: rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb
Overview
GradientBoostingRegressor is a class that implements gradient tree boosting for regression. The class uses the L2 loss as the loss function.
Reference
- Friedman, J. H., “Greedy Function Approximation: A Gradient Boosting Machine,” Annals of Statistics, 29 (5), pp. 1189–1232, 2001.
- Friedman, J. H., “Stochastic Gradient Boosting,” Computational Statistics and Data Analysis, 38 (4), pp. 367–378, 2002.
- Chen, T., and Guestrin, C., “XGBoost: A Scalable Tree Boosting System,” Proc. KDD’16, pp. 785–794, 2016.
Instance Attribute Summary
- #estimators ⇒ Array<GradientTreeRegressor> (readonly)
  Return the set of estimators.
- #feature_importances ⇒ Numo::DFloat (readonly)
  Return the importance for each feature.
- #rng ⇒ Random (readonly)
  Return the random generator for random selection of feature index.
Attributes inherited from Base::Estimator
Instance Method Summary
- #apply(x) ⇒ Numo::Int32
  Return the index of the leaf that each sample reached.
- #fit(x, y) ⇒ GradientBoostingRegressor
  Fit the model with given training data.
- #initialize(n_estimators: 100, learning_rate: 0.1, reg_lambda: 0.0, subsample: 1.0, max_depth: nil, max_leaf_nodes: nil, min_samples_leaf: 1, max_features: nil, n_jobs: nil, random_seed: nil) ⇒ GradientBoostingRegressor (constructor)
  Create a new regressor with gradient tree boosting.
- #predict(x) ⇒ Numo::DFloat
  Predict values for samples.
Methods included from Base::Regressor
Constructor Details
#initialize(n_estimators: 100, learning_rate: 0.1, reg_lambda: 0.0, subsample: 1.0, max_depth: nil, max_leaf_nodes: nil, min_samples_leaf: 1, max_features: nil, n_jobs: nil, random_seed: nil) ⇒ GradientBoostingRegressor
Create a new regressor with gradient tree boosting.
# File 'rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb', line 63

def initialize(n_estimators: 100, learning_rate: 0.1, reg_lambda: 0.0, subsample: 1.0,
               max_depth: nil, max_leaf_nodes: nil, min_samples_leaf: 1,
               max_features: nil, n_jobs: nil, random_seed: nil)
  super()
  @params = {
    n_estimators: n_estimators,
    learning_rate: learning_rate,
    reg_lambda: reg_lambda,
    subsample: subsample,
    max_depth: max_depth,
    max_leaf_nodes: max_leaf_nodes,
    min_samples_leaf: min_samples_leaf,
    max_features: max_features,
    n_jobs: n_jobs,
    random_seed: random_seed || srand
  }
  @rng = Random.new(@params[:random_seed])
end
Instance Attribute Details
#estimators ⇒ Array<GradientTreeRegressor> (readonly)
Return the set of estimators.
# File 'rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb', line 33

def estimators
  @estimators
end
#feature_importances ⇒ Numo::DFloat (readonly)
Return the importance for each feature. The feature importances are calculated based on the numbers of times the feature is used for splitting.
# File 'rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb', line 38

def feature_importances
  @feature_importances
end
#rng ⇒ Random (readonly)
Return the random generator for random selection of feature index.
# File 'rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb', line 42

def rng
  @rng
end
Instance Method Details
#apply(x) ⇒ Numo::Int32
Return the index of the leaf that each sample reached.
# File 'rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb', line 128

def apply(x)
  n_outputs = @estimators.first.is_a?(Array) ? @estimators.size : 1
  leaf_ids = if n_outputs > 1
               Array.new(n_outputs) { |n| @estimators[n].map { |tree| tree.apply(x) } }
             else
               @estimators.map { |tree| tree.apply(x) }
             end
  Numo::Int32[*leaf_ids].transpose.dup
end
#fit(x, y) ⇒ GradientBoostingRegressor
Fit the model with given training data.
# File 'rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb', line 87

def fit(x, y)
  # initialize some variables.
  n_features = x.shape[1]
  @params[:max_features] = n_features if @params[:max_features].nil?
  @params[:max_features] = [[1, @params[:max_features]].max, n_features].min # rubocop:disable Style/ComparableClamp
  n_outputs = y.shape[1].nil? ? 1 : y.shape[1]
  # train regressor.
  @base_predictions = n_outputs > 1 ? y.mean(0) : y.mean
  @estimators = if n_outputs > 1
                  multivar_estimators(x, y)
                else
                  partial_fit(x, y, @base_predictions)
                end
  # calculate feature importances.
  @feature_importances = if n_outputs > 1
                           multivar_feature_importances
                         else
                           @estimators.sum(&:feature_importances)
                         end
  self
end
#predict(x) ⇒ Numo::DFloat
Predict values for samples.
# File 'rumale-ensemble/lib/rumale/ensemble/gradient_boosting_regressor.rb', line 113

def predict(x)
  n_outputs = @estimators.first.is_a?(Array) ? @estimators.size : 1
  if n_outputs > 1
    multivar_predict(x)
  elsif enable_parallel?
    parallel_map(@params[:n_estimators]) { |n| @estimators[n].predict(x) }.sum + @base_predictions
  else
    @estimators.sum { |tree| tree.predict(x) } + @base_predictions
  end
end