Class: Rumale::ModelSelection::GroupKFold

Inherits:

Object

Includes: Base::Splitter

Defined in: rumale-model_selection/lib/rumale/model_selection/group_k_fold.rb

Overview

GroupKFold is a class that generates the sets of data indices for K-fold cross-validation. Data points belonging to the same group are never split across different folds. The number of groups should be greater than or equal to the number of splits.

Examples:

require 'numo/narray'
require 'rumale/model_selection/group_k_fold'

cv = Rumale::ModelSelection::GroupKFold.new(n_splits: 3)
x = Numo::DFloat.new(8, 2).rand
groups = Numo::Int32[1, 1, 1, 2, 2, 3, 3, 3]
cv.split(x, nil, groups).each do |train_ids, test_ids|
  puts '---'
  pp train_ids
  pp test_ids
end

# ---
# [0, 1, 2, 3, 4]
# [5, 6, 7]
# ---
# [3, 4, 5, 6, 7]
# [0, 1, 2]
# ---
# [0, 1, 2, 5, 6, 7]
# [3, 4]
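
To make the grouping constraint concrete, here is a minimal pure-Ruby sketch of one way grouped folds can be formed. It is an illustration only, not Rumale's implementation: the helper name `grouped_folds` and the round-robin assignment of groups to folds are assumptions made for this example, so the fold order may differ from what `GroupKFold#split` returns.

```ruby
# Hypothetical helper, not part of Rumale: assigns whole groups to folds
# round-robin, then derives the train/test indices for each fold.
def grouped_folds(groups, n_splits)
  unique_groups = groups.uniq
  if unique_groups.size < n_splits
    raise ArgumentError, 'the number of groups must be >= n_splits'
  end

  # Map each distinct group label to a fold number.
  fold_of = unique_groups.each_with_index.to_h { |g, i| [g, i % n_splits] }

  Array.new(n_splits) do |fold|
    # Every sample of a group lands on the same side of the split.
    test_ids = groups.each_index.select { |i| fold_of[groups[i]] == fold }
    train_ids = groups.each_index.reject { |i| fold_of[groups[i]] == fold }
    [train_ids, test_ids]
  end
end

groups = [1, 1, 1, 2, 2, 3, 3, 3]
folds = grouped_folds(groups, 3)
folds.each do |train_ids, test_ids|
  # Group labels that appear on both sides of the split (should be none).
  shared = groups.values_at(*train_ids) & groups.values_at(*test_ids)
  puts "train=#{train_ids.inspect} test=#{test_ids.inspect} shared_groups=#{shared.inspect}"
end
```

For every fold the list of shared group labels is empty, which is the property the overview describes. Calling the helper with fewer distinct groups than folds raises an error in this sketch, mirroring the stated requirement on the number of groups.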

Instance Attribute Summary

#n_splits ⇒ Integer readonly

Return the number of folds.

Instance Method Summary

#initialize(n_splits: 5) ⇒ GroupKFold constructor

Create a new data splitter for grouped K-fold cross validation.

#split(x, y, groups) ⇒ Array

Generate data indices for grouped K-fold cross validation.

Constructor Details

#initialize(n_splits: 5) ⇒ `GroupKFold`

Create a new data splitter for grouped K-fold cross validation.

Parameters:

n_splits (Integer) (defaults to: 5): The number of folds.

# File 'rumale-model_selection/lib/rumale/model_selection/group_k_fold.rb', line 44

def initialize(n_splits: 5)
  @n_splits = n_splits
end

Instance Attribute Details

#n_splits ⇒ `Integer` (readonly)

Return the number of folds.

Returns:

(Integer)

# File 'rumale-model_selection/lib/rumale/model_selection/group_k_fold.rb', line 39

def n_splits
  @n_splits
end

Instance Method Details

#split(x, y, groups) ⇒ `Array`

Generate data indices for grouped K-fold cross validation.