Class: Rumale::Clustering::KMedoids
- Inherits: Base::Estimator
- Inheritance chain: Object → Base::Estimator → Rumale::Clustering::KMedoids
- Includes: Base::ClusterAnalyzer
- Defined in: rumale-clustering/lib/rumale/clustering/k_medoids.rb
Overview
KMedoids is a class that implements K-Medoids cluster analysis.
Reference
- Arthur, D., and Vassilvitskii, S., “k-means++: the advantages of careful seeding,” Proc. SODA’07, pp. 1027–1035, 2007.
Instance Attribute Summary
- #medoid_ids ⇒ Numo::Int32 (readonly): Return the indices of medoids.
- #rng ⇒ Random (readonly): Return the random generator.
Attributes inherited from Base::Estimator
Instance Method Summary
- #fit(x) ⇒ KMedoids: Analyze clusters with given training data.
- #fit_predict(x) ⇒ Numo::Int32: Analyze clusters and assign samples to clusters.
- #initialize(n_clusters: 8, metric: 'euclidean', init: 'k-means++', max_iter: 50, tol: 1.0e-4, random_seed: nil) ⇒ KMedoids (constructor): Create a new cluster analyzer with K-Medoids method.
- #predict(x) ⇒ Numo::Int32: Predict cluster labels for samples.
Methods included from Base::ClusterAnalyzer
Constructor Details
#initialize(n_clusters: 8, metric: 'euclidean', init: 'k-means++', max_iter: 50, tol: 1.0e-4, random_seed: nil) ⇒ KMedoids
Create a new cluster analyzer with K-Medoids method.
# File 'rumale-clustering/lib/rumale/clustering/k_medoids.rb', line 40
def initialize(n_clusters: 8, metric: 'euclidean', init: 'k-means++', max_iter: 50, tol: 1.0e-4, random_seed: nil)
  super()
  @params = {
    n_clusters: n_clusters,
    metric: (metric == 'precomputed' ? 'precomputed' : 'euclidean'),
    init: (init == 'random' ? 'random' : 'k-means++'),
    max_iter: max_iter,
    tol: tol,
    random_seed: random_seed || srand
  }
  @rng = Random.new(@params[:random_seed])
end
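The constructor listing shows that unrecognized option values are not rejected: any metric other than 'precomputed' becomes 'euclidean', and any init other than 'random' becomes 'k-means++'. The following plain-Ruby sketch reproduces only that normalization so the fallback can be checked without the library. The helper name normalize_k_medoids_params is hypothetical, and Random.new_seed is substituted here for the srand call in the original to avoid reseeding the global generator.

```ruby
# Illustrative sketch only; normalize_k_medoids_params is a hypothetical helper,
# not part of Rumale. It mirrors the option handling shown in the constructor.
def normalize_k_medoids_params(n_clusters: 8, metric: 'euclidean', init: 'k-means++',
                               max_iter: 50, tol: 1.0e-4, random_seed: nil)
  {
    n_clusters: n_clusters,
    # Anything other than 'precomputed' silently falls back to 'euclidean'.
    metric: (metric == 'precomputed' ? 'precomputed' : 'euclidean'),
    # Anything other than 'random' silently falls back to 'k-means++'.
    init: (init == 'random' ? 'random' : 'k-means++'),
    max_iter: max_iter,
    tol: tol,
    # Assumption: Random.new_seed stands in for the srand call in the original.
    random_seed: random_seed || Random.new_seed
  }
end

params = normalize_k_medoids_params(metric: 'manhattan', init: 'kmeans', random_seed: 1)
puts params[:metric] # the unsupported metric name was replaced
puts params[:init]   # the unsupported init name was replaced

# A fixed seed gives a reproducible generator, as with the rng attribute.
same_draw = Random.new(params[:random_seed]).rand == Random.new(params[:random_seed]).rand
puts same_draw
```

Because a misspelled option is accepted without error, it may be worth checking the resulting parameters after construction when a non-default metric or initialization is intended.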
Instance Attribute Details
#medoid_ids ⇒ Numo::Int32 (readonly)
Return the indices of medoids.
# File 'rumale-clustering/lib/rumale/clustering/k_medoids.rb', line 24
def medoid_ids
  @medoid_ids
end
#rng ⇒ Random (readonly)
Return the random generator.
# File 'rumale-clustering/lib/rumale/clustering/k_medoids.rb', line 28
def rng
  @rng
end
Instance Method Details
#fit(x) ⇒ KMedoids
Analyze clusters with given training data.
# File 'rumale-clustering/lib/rumale/clustering/k_medoids.rb', line 59
def fit(x, _y = nil)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  raise ArgumentError, 'the input distance matrix should be square' if check_invalid_array_shape(x)

  # initialize some variables.
  distance_mat = @params[:metric] == 'precomputed' ? x : ::Rumale::PairwiseMetric.euclidean_distance(x)
  init_cluster_centers(distance_mat)
  error = distance_mat[true, @medoid_ids].mean
  @params[:max_iter].times do |_t|
    cluster_labels = assign_cluster(distance_mat[true, @medoid_ids])
    @params[:n_clusters].times do |n|
      assigned_ids = cluster_labels.eq(n).where
      @medoid_ids[n] = assigned_ids[distance_mat[assigned_ids, assigned_ids].sum(axis: 1).min_index]
    end
    new_error = distance_mat[true, @medoid_ids].mean
    break if (error - new_error).abs <= @params[:tol]

    error = new_error
  end
  @cluster_centers = x[@medoid_ids, true].dup if @params[:metric] == 'euclidean'
  self
end
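The loop in the fit listing alternates two steps: assign each sample to its nearest medoid, then move each medoid to the cluster member with the smallest total distance to the other members, stopping once the mean sample-to-medoid distance changes by no more than tol. A minimal plain-Ruby sketch of that loop on nested arrays follows. It is an illustration under stated assumptions, not Rumale's implementation: the function names are hypothetical, the initial medoids are passed in directly instead of being chosen by 'k-means++' or 'random', and the empty-cluster guard is an addition for the sketch.

```ruby
# Illustrative plain-Ruby sketch of the medoid update loop; all names are hypothetical.
# dist is a symmetric n-by-n distance matrix given as nested arrays.

# Assign each sample (row) to the index of its nearest medoid.
def assign_labels(dist, medoid_ids)
  dist.map { |row| medoid_ids.each_index.min_by { |k| row[medoid_ids[k]] } }
end

# Mean of all sample-to-medoid distances, used as the convergence measure.
def mean_medoid_distance(dist, medoid_ids)
  total = dist.sum { |row| medoid_ids.sum { |m| row[m] } }
  total.to_f / (dist.size * medoid_ids.size)
end

def k_medoids_sketch(dist, initial_medoid_ids, max_iter: 50, tol: 1.0e-4)
  medoid_ids = initial_medoid_ids.dup
  error = mean_medoid_distance(dist, medoid_ids)
  max_iter.times do
    labels = assign_labels(dist, medoid_ids)
    medoid_ids.each_index do |k|
      members = labels.each_index.select { |i| labels[i] == k }
      next if members.empty? # guard added for this sketch only

      # New medoid: the member with the smallest summed distance to its cluster.
      medoid_ids[k] = members.min_by { |i| members.sum { |j| dist[i][j] } }
    end
    new_error = mean_medoid_distance(dist, medoid_ids)
    break if (error - new_error).abs <= tol

    error = new_error
  end
  medoid_ids
end

# Two well-separated groups on a line, with absolute difference as the distance.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = points.map { |a| points.map { |b| (a - b).abs } }
medoids = k_medoids_sketch(dist, [0, 3])
labels = assign_labels(dist, medoids)
p medoids # => [1, 4]
p labels  # => [0, 0, 0, 1, 1, 1]
```

Starting from the end points 0.0 and 10.0, the medoids move to the middle sample of each group after one update, and the second pass leaves them unchanged, which triggers the tolerance check.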
#fit_predict(x) ⇒ Numo::Int32
Analyze clusters and assign samples to clusters.
# File 'rumale-clustering/lib/rumale/clustering/k_medoids.rb', line 103
def fit_predict(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  raise ArgumentError, 'the input distance matrix should be square' if check_invalid_array_shape(x)

  fit(x)
  if @params[:metric] == 'precomputed'
    predict(x[true, @medoid_ids])
  else
    predict(x)
  end
end
#predict(x) ⇒ Numo::Int32
Predict cluster labels for samples.
# File 'rumale-clustering/lib/rumale/clustering/k_medoids.rb', line 87
def predict(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)

  distance_mat = @params[:metric] == 'precomputed' ? x : ::Rumale::PairwiseMetric.euclidean_distance(x, @cluster_centers)
  if @params[:metric] == 'precomputed' && distance_mat.shape[1] != @medoid_ids.size
    raise ArgumentError, 'the shape of input matrix should be n_samples-by-n_clusters'
  end

  assign_cluster(distance_mat)
end
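With the 'precomputed' metric, predict expects an n_samples-by-n_clusters matrix holding each sample's distance to each medoid, which is why fit_predict passes only the medoid columns of the full distance matrix. The sketch below illustrates that shape requirement with nested arrays. Both function names are hypothetical and the code is a plain-Ruby approximation, not the library's own.

```ruby
# Illustrative plain-Ruby sketch; function names are hypothetical, not Rumale API.

# Keep only the columns of the full distance matrix that belong to the medoids.
def slice_medoid_columns(full_dist, medoid_ids)
  full_dist.map { |row| medoid_ids.map { |m| row[m] } }
end

# Label each sample with the index of its nearest medoid, after checking the shape.
def predict_precomputed_sketch(dist_to_medoids, n_clusters)
  unless dist_to_medoids.all? { |row| row.size == n_clusters }
    raise ArgumentError, 'the shape of input matrix should be n_samples-by-n_clusters'
  end

  dist_to_medoids.map { |row| row.each_index.min_by { |k| row[k] } }
end

points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
full_dist = points.map { |a| points.map { |b| (a - b).abs } }
medoid_ids = [1, 4] # assumed to come from an earlier fit

to_medoids = slice_medoid_columns(full_dist, medoid_ids)
labels = predict_precomputed_sketch(to_medoids, medoid_ids.size)
p labels # => [0, 0, 0, 1, 1, 1]

# Passing the full square matrix has the wrong number of columns and is rejected.
rejected = begin
  predict_precomputed_sketch(full_dist, medoid_ids.size)
  false
rescue ArgumentError
  true
end
p rejected # => true
```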