Class: Rumale::NearestNeighbors::KNeighborsRegressor

Inherits:
Base::Estimator
Includes:
Base::Regressor
Defined in:
rumale-nearest_neighbors/lib/rumale/nearest_neighbors/k_neighbors_regressor.rb

Overview

KNeighborsRegressor is a class that implements regression with the k-nearest neighbors rule. The current implementation finds the neighbors by Euclidean distance, or from a caller-supplied distance matrix when the metric is ‘precomputed’.

Examples:

require 'rumale/nearest_neighbors/k_neighbors_regressor'

estimator =
  Rumale::NearestNeighbors::KNeighborsRegressor.new(n_neighbors: 5)
estimator.fit(training_samples, training_target_values)
results = estimator.predict(testing_samples)

Instance Attribute Summary

Attributes inherited from Base::Estimator

#params

Instance Method Summary

Methods included from Base::Regressor

#score

Constructor Details

#initialize(n_neighbors: 5, metric: 'euclidean') ⇒ KNeighborsRegressor

Create a new regressor with the nearest neighbor rule.

Parameters:

  • n_neighbors (Integer) (defaults to: 5)

    The number of neighbors.

  • metric (String) (defaults to: 'euclidean')

    The metric used to calculate the distances. If metric is ‘euclidean’, the Euclidean distance between points is calculated. If metric is ‘precomputed’, the fit and predict methods expect to be given a distance matrix. Any other value is treated as ‘euclidean’.



# File 'rumale-nearest_neighbors/lib/rumale/nearest_neighbors/k_neighbors_regressor.rb', line 40

def initialize(n_neighbors: 5, metric: 'euclidean')
  super()
  @params = {
    n_neighbors: n_neighbors,
    metric: (metric == 'precomputed' ? 'precomputed' : 'euclidean')
  }
end
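
The constructor above keeps ‘precomputed’ as is and maps every other metric value to ‘euclidean’. A minimal plain-Ruby sketch of that rule follows; normalize_metric is a hypothetical helper name used only for illustration and is not part of Rumale.

```ruby
# Hypothetical helper mirroring the ternary in #initialize; not a Rumale API.
def normalize_metric(metric)
  metric == 'precomputed' ? 'precomputed' : 'euclidean'
end

normalize_metric('precomputed') # => "precomputed"
normalize_metric('euclidean')   # => "euclidean"
normalize_metric('manhattan')   # => "euclidean" (unrecognized values fall back silently)
```

Because the fallback raises no error, a misspelled metric name would go unnoticed and Euclidean distance would be used.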

Instance Attribute Details

#prototypes ⇒ Numo::DFloat (readonly)

Return the prototypes for the nearest neighbor regressor. If the metric is ‘precomputed’, this returns nil.

Returns:

  • (Numo::DFloat)

    (shape: [n_training_samples, n_features])



# File 'rumale-nearest_neighbors/lib/rumale/nearest_neighbors/k_neighbors_regressor.rb', line 28

def prototypes
  @prototypes
end

#values ⇒ Numo::DFloat (readonly)

Return the target values of the prototypes.

Returns:

  • (Numo::DFloat)

    (shape: [n_training_samples, n_outputs])



# File 'rumale-nearest_neighbors/lib/rumale/nearest_neighbors/k_neighbors_regressor.rb', line 32

def values
  @values
end

Instance Method Details

#fit(x, y) ⇒ KNeighborsRegressor

Fit the model with given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_training_samples, n_features]) The training data to be used for fitting the model. If the metric is ‘precomputed’, x must be a square distance matrix (shape: [n_training_samples, n_training_samples]).

  • y (Numo::DFloat)

    (shape: [n_training_samples, n_outputs]) The target values to be used for fitting the model.

Returns:

  • (KNeighborsRegressor)

    The fitted regressor itself.



# File 'rumale-nearest_neighbors/lib/rumale/nearest_neighbors/k_neighbors_regressor.rb', line 54

def fit(x, y)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  y = ::Rumale::Validation.check_convert_target_value_array(y)
  ::Rumale::Validation.check_sample_size(x, y)
  if @params[:metric] == 'precomputed' && x.shape[0] != x.shape[1]
    raise ArgumentError, 'Expect the input distance matrix to be square.'
  end

  @prototypes = x.dup if @params[:metric] == 'euclidean'
  @values = y.dup
  self
end
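
When the metric is ‘precomputed’, fit expects training-to-training distances and predict expects testing-to-training distances. The plain-Ruby sketch below illustrates the two shapes. It assumes samples stored as nested Arrays, and euclidean_distance_matrix is a hypothetical helper written for this illustration, not a Rumale method. In practice the matrices would be Numo::DFloat arrays, and any distance could be used in place of the Euclidean one.

```ruby
# Hypothetical helper: Euclidean distances between each row of a and each row of b.
# Returns a nested Array of shape [a.size, b.size].
def euclidean_distance_matrix(a, b)
  a.map do |u|
    b.map do |v|
      Math.sqrt(u.zip(v).sum { |p, q| (p - q)**2 })
    end
  end
end

train = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
test  = [[0.0, 0.0], [3.0, 0.0]]

# Square matrix for fit: [n_training_samples, n_training_samples].
train_dist = euclidean_distance_matrix(train, train)
# Rectangular matrix for predict: [n_testing_samples, n_training_samples].
test_dist = euclidean_distance_matrix(test, train)
```

Here train_dist has 3 rows and 3 columns, while test_dist has 2 rows and 3 columns, one column per training sample.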

#predict(x) ⇒ Numo::DFloat

Predict values for samples.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_testing_samples, n_features]) The samples to predict the values for. If the metric is ‘precomputed’, x must be a matrix of distances between the testing samples and the training samples (shape: [n_testing_samples, n_training_samples]).

Returns:

  • (Numo::DFloat)

    (shape: [n_testing_samples, n_outputs]) Predicted values per sample.



# File 'rumale-nearest_neighbors/lib/rumale/nearest_neighbors/k_neighbors_regressor.rb', line 72

def predict(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  if @params[:metric] == 'precomputed' && x.shape[1] != @values.shape[0]
    raise ArgumentError, 'Expect the size input matrix to be n_testing_samples-by-n_training_samples.'
  end

  # Initialize some variables.
  n_samples = x.shape[0]
  n_prototypes, n_outputs = @values.shape
  n_neighbors = [@params[:n_neighbors], n_prototypes].min
  # Predict values for the given samples.
  distance_matrix = @params[:metric] == 'precomputed' ? x : ::Rumale::PairwiseMetric.euclidean_distance(x, @prototypes)
  predicted_values = Array.new(n_samples) do |n|
    neighbor_ids = distance_matrix[n, true].to_a.each_with_index.sort.map(&:last)[0...n_neighbors]
    n_outputs.nil? ? @values[neighbor_ids].mean : @values[neighbor_ids, true].mean(0).to_a
  end
  Numo::DFloat[*predicted_values]
end
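
The prediction rule in the listing above sorts each row of the distance matrix, keeps the indices of the n_neighbors closest prototypes, and averages their target values. The plain-Ruby sketch below illustrates that rule for single-output targets, with nested Arrays standing in for Numo::DFloat. knn_predict is a hypothetical name used for illustration only and is not a Rumale method.

```ruby
# Hypothetical helper illustrating the neighbor-averaging rule; not a Rumale API.
# distance_matrix: nested Array of shape [n_testing_samples, n_training_samples]
# values: Array of n_training_samples target values
def knn_predict(distance_matrix, values, n_neighbors)
  # Never ask for more neighbors than there are prototypes.
  k = [n_neighbors, values.size].min
  distance_matrix.map do |row|
    # Pair each distance with its index, sort by distance, keep the k nearest indices.
    neighbor_ids = row.each_with_index.sort.map(&:last).first(k)
    # The prediction is the mean target value of those neighbors.
    neighbor_ids.sum { |i| values[i] } / k.to_f
  end
end

distances = [[0.0, 5.0, 10.0],
             [9.0, 4.0, 3.0]]
targets = [1.0, 2.0, 6.0]

predictions = knn_predict(distances, targets, 2) # => [1.5, 4.0]
```

Sorting [distance, index] pairs means that ties in distance are broken in favor of the prototype with the lower index.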