Class: Rumale::Clustering::SingleLinkage

Inherits:
Base::Estimator
Includes:
Base::ClusterAnalyzer
Defined in:
rumale-clustering/lib/rumale/clustering/single_linkage.rb

Overview

SingleLinkage is a class that implements hierarchical cluster analysis with the single linkage method. This class is used internally by HDBSCAN.
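To make the single linkage idea concrete, here is a minimal plain-Ruby sketch. It is not Rumale's implementation: it repeatedly merges the two clusters joined by the closest pair of samples until the requested number of clusters remains. The method name `single_linkage_labels` and the toy data are assumptions made for illustration.

```ruby
# Illustrative single-linkage sketch in plain Ruby (not Rumale's code).
# dist: square Array of Arrays holding pairwise distances.
# n_clusters: number of clusters to stop at.
def single_linkage_labels(dist, n_clusters)
  n = dist.size
  parent = (0...n).to_a

  # Find the representative of the cluster that contains sample i.
  find = lambda do |i|
    while parent[i] != i
      parent[i] = parent[parent[i]] # path halving
      i = parent[i]
    end
    i
  end

  # All unordered sample pairs, closest first.
  pairs = (0...n).to_a.combination(2).sort_by { |i, j| dist[i][j] }

  remaining = n
  pairs.each do |i, j|
    break if remaining <= n_clusters

    ri = find.call(i)
    rj = find.call(j)
    next if ri == rj # already in the same cluster

    parent[rj] = ri
    remaining -= 1
  end

  # Relabel cluster representatives as 0, 1, 2, ... in order of first appearance.
  ids = {}
  (0...n).map { |i| ids[find.call(i)] ||= ids.size }
end

# Four points on a line: two tight pairs that are far apart.
xs = [0.0, 1.0, 10.0, 11.0]
dist = xs.map { |a| xs.map { |b| (a - b).abs } }
labels = single_linkage_labels(dist, 2)
# labels => [0, 0, 1, 1]
```

Sorting every pair costs O(n² log n) time, which is fine for a sketch; the reference below describes more efficient algorithms.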

Reference

  • Mullner, D., “Modern hierarchical, agglomerative clustering algorithms,” arXiv:1109.2378, 2011.

Examples:

require 'rumale/clustering/single_linkage'

analyzer = Rumale::Clustering::SingleLinkage.new(n_clusters: 2)
cluster_labels = analyzer.fit_predict(samples)

Instance Attribute Summary

Attributes inherited from Base::Estimator

#params

Instance Method Summary

Methods included from Base::ClusterAnalyzer

#score

Constructor Details

#initialize(n_clusters: 2, metric: 'euclidean') ⇒ SingleLinkage

Create a new cluster analyzer with the single linkage algorithm.

Parameters:

  • n_clusters (Integer) (defaults to: 2)

    The number of clusters.

  • metric (String) (defaults to: 'euclidean')

    The metric to calculate the distances. If metric is ‘euclidean’, Euclidean distance is calculated for distance between points. If metric is ‘precomputed’, the fit and fit_predict methods expect to be given a distance matrix.
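The ‘precomputed’ option is what allows clustering with a distance other than Euclidean. As a hedged illustration, the plain-Ruby helper below (a hypothetical name, not part of Rumale) builds a Manhattan distance matrix. The result is square and symmetric with a zero diagonal, which is the kind of matrix a precomputed metric expects.

```ruby
# Pairwise Manhattan (L1) distances between the rows of `points`.
# Hypothetical helper for illustration; not part of Rumale.
def manhattan_distance_matrix(points)
  points.map do |u|
    points.map { |v| u.zip(v).sum { |a, b| (a - b).abs } }
  end
end

points = [[0.0, 0.0], [1.0, 2.0], [4.0, 6.0]]
dist = manhattan_distance_matrix(points)
# dist => [[0.0, 3.0, 10.0], [3.0, 0.0, 7.0], [10.0, 7.0, 0.0]]
```

Before it is handed to the analyzer, a nested Array like this would need converting to a Numo::DFloat, for example with `Numo::DFloat.cast(dist)`.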



# File 'rumale-clustering/lib/rumale/clustering/single_linkage.rb', line 38

def initialize(n_clusters: 2, metric: 'euclidean')
  super()
  @params = {
    n_clusters: n_clusters,
    metric: (metric == 'precomputed' ? 'precomputed' : 'euclidean')
  }
end
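One consequence of the constructor above is that any metric other than the exact String 'precomputed' is stored as 'euclidean'. The standalone sketch below copies that ternary into a hypothetical helper, `normalize_metric`, to show the fallback.

```ruby
# Same expression as in #initialize, extracted into a hypothetical helper.
def normalize_metric(metric)
  metric == 'precomputed' ? 'precomputed' : 'euclidean'
end

normalize_metric('precomputed') # => "precomputed"
normalize_metric('euclidean')   # => "euclidean"
normalize_metric('manhattan')   # => "euclidean" (unrecognized names fall back silently)
normalize_metric(:precomputed)  # => "euclidean" (a Symbol is not equal to the String)
```

Because the fallback is silent, a misspelled metric name does not raise an error; inspecting `params[:metric]` after construction shows which metric is actually in effect.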

Instance Attribute Details

#hierarchy ⇒ Array<SingleLinkage::Node> (readonly)

Return the hierarchical structure.

Returns:

  • (Array<SingleLinkage::Node>)

    (shape: [n_samples - 1])



# File 'rumale-clustering/lib/rumale/clustering/single_linkage.rb', line 30

def hierarchy
  @hierarchy
end
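The documented length of n_samples - 1 reflects that joining n_samples singleton clusters into one takes exactly n_samples - 1 merges. The plain-Ruby sketch below records those merges in a hypothetical `Merge` struct; the fields of Rumale's actual SingleLinkage::Node are not shown on this page and may differ.

```ruby
# Hypothetical record of one merge step (not Rumale's SingleLinkage::Node).
Merge = Struct.new(:left, :right, :distance)

# Returns the merges performed by single linkage, closest pairs first.
def merge_history(dist)
  n = dist.size
  cluster_of = (0...n).to_a # current cluster id of each sample
  pairs = (0...n).to_a.combination(2).sort_by { |i, j| dist[i][j] }

  pairs.each_with_object([]) do |(i, j), history|
    a = cluster_of[i]
    b = cluster_of[j]
    next if a == b # both samples already belong to the same cluster

    cluster_of.map! { |c| c == b ? a : c } # absorb cluster b into cluster a
    history << Merge.new(a, b, dist[i][j])
  end
end

xs = [0.0, 1.0, 10.0, 11.0]
dist = xs.map { |a| xs.map { |b| (a - b).abs } }
history = merge_history(dist)
# history.size => 3 (n_samples - 1)
# history.last.distance => 9.0 (the gap between the two pairs)
```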

#labels ⇒ Numo::Int32 (readonly)

Return the cluster labels.

Returns:

  • (Numo::Int32)

    (shape: [n_samples])



# File 'rumale-clustering/lib/rumale/clustering/single_linkage.rb', line 26

def labels
  @labels
end

Instance Method Details

#fit(x) ⇒ SingleLinkage

Analyze clusters with the given training data.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The training data to be used for cluster analysis. If the metric is ‘precomputed’, x must be a square distance matrix (shape: [n_samples, n_samples]).

Returns:

  • (SingleLinkage)

    The learned cluster analyzer itself.

Raises:

  • (ArgumentError)

    Raised if the input distance matrix is not square.


# File 'rumale-clustering/lib/rumale/clustering/single_linkage.rb', line 52

def fit(x, _y = nil)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  raise ArgumentError, 'the input distance matrix should be square' if check_invalid_array_shape(x)

  fit_predict(x)
  self
end
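The private helper `check_invalid_array_shape` is not listed on this page. The sketch below shows one plausible form of the check in plain Ruby. It assumes the check applies only when the metric is 'precomputed', since ordinary sample arrays of shape [n_samples, n_features] are generally not square. Both method names are hypothetical.

```ruby
# Hypothetical stand-in for the shape check; the real helper is not shown here.
def square_matrix?(rows)
  rows.all? { |row| row.size == rows.size }
end

# Assumption: only a precomputed distance matrix has to be square.
def validate_distance_matrix!(rows, metric)
  return rows unless metric == 'precomputed'
  raise ArgumentError, 'the input distance matrix should be square' unless square_matrix?(rows)

  rows
end

validate_distance_matrix!([[0.0, 1.0], [1.0, 0.0]], 'precomputed') # passes
validate_distance_matrix!([[0.0, 1.0, 2.0]], 'euclidean')          # passes: samples need not be square

begin
  validate_distance_matrix!([[0.0, 1.0, 2.0]], 'precomputed')
rescue ArgumentError => e
  warn e.message # the input distance matrix should be square
end
```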

#fit_predict(x) ⇒ Numo::Int32

Analyze clusters and assign samples to clusters.

Parameters:

  • x (Numo::DFloat)

    (shape: [n_samples, n_features]) The samples to be used for cluster analysis. If the metric is ‘precomputed’, x must be a square distance matrix (shape: [n_samples, n_samples]).

Returns:

  • (Numo::Int32)

    (shape: [n_samples]) Predicted cluster label per sample.

Raises:

  • (ArgumentError)

    Raised if the input distance matrix is not square.


# File 'rumale-clustering/lib/rumale/clustering/single_linkage.rb', line 65

def fit_predict(x)
  x = ::Rumale::Validation.check_convert_sample_array(x)
  raise ArgumentError, 'the input distance matrix should be square' if check_invalid_array_shape(x)

  distance_mat = @params[:metric] == 'precomputed' ? x : ::Rumale::PairwiseMetric.euclidean_distance(x)
  @labels = partial_fit(distance_mat)
end