Class: Hnswlib::HnswIndex Deprecated

Inherits:
Object
Defined in:
lib/hnswlib.rb

Overview

Deprecated.

This class was provided to offer an interface similar to Annoy, but it has proven to be of limited use and will be removed in the next version.

HnswIndex is a class that provides functions for k-nearest neighbors search.

Examples:

require 'hnswlib'

index = Hnswlib::HnswIndex.new(n_features: 100, max_item: 10000)

5000.times do |item_id|
  item_vec = Array.new(100) { rand - 0.5 }
  index.add_item(item_id, item_vec)
end

index.get_nns_by_item(0, 100)
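
Since this class is slated for removal, it may help to see which Hnswlib::HierarchicalNSW method each HnswIndex method calls. The lookup table below is a sketch compiled from the method bodies shown on this page. The constant name HNSW_INDEX_DELEGATES is hypothetical and not part of the gem, and methods that keep the same name (such as set_ef) are omitted.

```ruby
# Hypothetical lookup table (not part of the gem): each HnswIndex method
# and the HierarchicalNSW method its body calls, per the sources on this page.
HNSW_INDEX_DELEGATES = {
  add_item: :add_point,            # argument order flips: add_item(i, v) calls add_point(v, i)
  get_item: :get_point,
  remove_item: :mark_deleted,
  get_nns_by_vector: :search_knn,  # get_nns_by_item fetches the vector with get_point first
  save: :save_index,
  load: :load_index,
  n_items: :current_count,
  max_item: :max_elements
}.freeze

# Print the mapping as a simple two-column list.
HNSW_INDEX_DELEGATES.each do |old_name, new_name|
  puts format('%-18s -> %s', old_name, new_name)
end
```

The table only records names. Keyword arguments also differ, for example replace_removed: here versus replace_deleted: on the underlying index.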

Constructor Details

#initialize(n_features:, max_item:, metric: 'l2', m: 16, ef_construction: 200, random_seed: 100, allow_replace_removed: false) ⇒ HnswIndex

Create a new search index.

Parameters:

  • n_features (Integer)

    The number of features (dimensions) of each stored vector.

  • max_item (Integer)

    The maximum number of items.

  • metric (String) (defaults to: 'l2')

    The distance metric between vectors (‘l2’, ‘dot’, or ‘cosine’).

  • m (Integer) (defaults to: 16)

    The maximum number of outgoing connections in the graph.

  • ef_construction (Integer) (defaults to: 200)

    The size of the dynamic list for the nearest neighbors. It controls the index time/accuracy trade-off.

  • random_seed (Integer) (defaults to: 100)

    The seed value used to initialize the random number generator.

  • allow_replace_removed (Boolean) (defaults to: false)

    Whether removed elements may be replaced when new elements are added.



# File 'lib/hnswlib.rb', line 36

def initialize(n_features:, max_item:, metric: 'l2', m: 16, ef_construction: 200,
               random_seed: 100, allow_replace_removed: false)
  @metric = metric
  space = @metric == 'dot' ? 'ip' : 'l2'
  @index = Hnswlib::HierarchicalNSW.new(space: space, dim: n_features)
  @index.init_index(max_elements: max_item, m: m, ef_construction: ef_construction,
                    random_seed: random_seed, allow_replace_deleted: allow_replace_removed)
end
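
The constructor body above picks the underlying space with a single ternary, which is easy to overlook. The sketch below reproduces only that ternary in a standalone helper. The name space_for is hypothetical and not part of the gem.

```ruby
# Hypothetical helper mirroring the ternary in #initialize:
# only 'dot' selects the inner-product space ('ip'); any other
# metric string, including 'cosine', selects 'l2'.
def space_for(metric)
  metric == 'dot' ? 'ip' : 'l2'
end

%w[l2 dot cosine].each do |metric|
  puts "#{metric} -> #{space_for(metric)}"
end
```

As written, 'cosine' is stored in @metric but the index itself is built on the 'l2' space.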

Instance Attribute Details

#metric ⇒ String (readonly)

Returns the metric of the index.

Returns:

  • (String)


# File 'lib/hnswlib.rb', line 25

def metric
  @metric
end

Instance Method Details

#add_item(i, v, replace_removed: false) ⇒ Boolean

Add an item to the index.

Parameters:

  • i (Integer)

    The ID of the item.

  • v (Array)

    The vector of the item.

  • replace_removed (Boolean) (defaults to: false)

    Whether to replace a previously removed element with this item.

Returns:

  • (Boolean)


# File 'lib/hnswlib.rb', line 51

def add_item(i, v, replace_removed: false)
  @index.add_point(v, i, replace_deleted: replace_removed)
end

#get_distance(i, j) ⇒ Float or Integer

Calculate the distance between two items.

Parameters:

  • i (Integer)

    The ID of the first item.

  • j (Integer)

    The ID of the second item.

Returns:

  • (Float or Integer)


# File 'lib/hnswlib.rb', line 129

def get_distance(i, j)
  vi = @index.get_point(i)
  vj = @index.get_point(j)
  @index.space.distance(vi, vj)
end
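
The value returned depends on the space chosen in the constructor. The sketch below shows two reference formulas in plain Ruby, assuming the underlying spaces follow the conventions of the upstream hnswlib C++ library: squared Euclidean distance for 'l2' and one minus the dot product for 'ip'. That assumption is not stated on this page, and the helper names are hypothetical.

```ruby
# Reference formulas, assumed from upstream hnswlib conventions
# (not taken from this page). Helper names are hypothetical.

# 'l2' space: squared Euclidean distance (no square root is taken).
def squared_l2(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# 'ip' space: 1 minus the dot product of the two vectors.
def inner_product_distance(a, b)
  1.0 - a.zip(b).sum { |x, y| x * y }
end

p squared_l2([0.0, 0.0], [3.0, 4.0])              # 25.0 rather than 5.0
p inner_product_distance([1.0, 0.0], [0.5, 0.5])  # 0.5
```

If the assumption holds, distances from an 'l2' index need a square root before they can be compared with ordinary Euclidean distances.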

#get_item(i) ⇒ Array

Return the vector of an item.

Parameters:

  • i (Integer)

    The ID of the item.

Returns:

  • (Array)


# File 'lib/hnswlib.rb', line 59

def get_item(i)
  @index.get_point(i)
end

#get_nns_by_item(i, n, include_distances: false, filter: nil) ⇒ Array<Integer> or Array<Array<Integer>, Array<Float>>

Search for the n items closest to an indexed item.

Parameters:

  • i (Integer)

    The ID of the query item.

  • n (Integer)

    The number of nearest neighbors.

  • include_distances (Boolean) (defaults to: false)

    Whether to also return the corresponding distances.

  • filter (Proc) (defaults to: nil)

    A function that filters elements by their labels.

Returns:

  • (Array<Integer> or Array<Array<Integer>, Array<Float>>)


# File 'lib/hnswlib.rb', line 77

def get_nns_by_item(i, n, include_distances: false, filter: nil)
  v = @index.get_point(i)
  ids, dists = @index.search_knn(v, n, filter: filter)
  include_distances ? [ids, dists] : ids
end

#get_nns_by_vector(v, n, include_distances: false, filter: nil) ⇒ Array<Integer> or Array<Array<Integer>, Array<Float>>

Search for the n items closest to a query vector.

Parameters:

  • v (Array)

    The query vector.

  • n (Integer)

    The number of nearest neighbors.

  • include_distances (Boolean) (defaults to: false)

    Whether to also return the corresponding distances.

  • filter (Proc) (defaults to: nil)

    A function that filters elements by their labels.

Returns:

  • (Array<Integer> or Array<Array<Integer>, Array<Float>>)


# File 'lib/hnswlib.rb', line 90

def get_nns_by_vector(v, n, include_distances: false, filter: nil)
  ids, dists = @index.search_knn(v, n, filter: filter)
  include_distances ? [ids, dists] : ids
end
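
To make the two return shapes and the filter argument concrete, here is a brute-force stand-in written in plain Ruby. It is not the gem's HNSW search. The function brute_force_nns is hypothetical, it assumes the filter proc receives an item ID and returns true for items to keep, and it ranks by squared Euclidean distance with the nearest item first as a choice of this sketch.

```ruby
# Hypothetical brute-force stand-in (not the gem's HNSW search) that returns
# results in the same shape as get_nns_by_vector:
#   ids only, or [ids, dists] when include_distances is true.
def brute_force_nns(items, query, n, include_distances: false, filter: nil)
  # Keep only items accepted by the filter, if one is given.
  candidates = filter ? items.select { |id, _vec| filter.call(id) } : items
  ranked = candidates
           .map { |id, vec| [id, vec.zip(query).sum { |x, y| (x - y)**2 }] }
           .sort_by { |_id, dist| dist }
           .first(n)
  ids = ranked.map(&:first)
  dists = ranked.map(&:last)
  include_distances ? [ids, dists] : ids
end

items = { 0 => [0.0, 0.0], 1 => [1.0, 0.0], 2 => [0.0, 2.0], 3 => [3.0, 3.0] }
even_only = proc { |id| id.even? }

p brute_force_nns(items, [0.0, 0.0], 2)
p brute_force_nns(items, [0.0, 0.0], 2, include_distances: true, filter: even_only)
```

A stand-in like this can also serve as ground truth on small data sets when checking how close the approximate results are to the exact neighbors.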

#load(filename, allow_replace_removed: false) ⇒ Object

Load a search index from disk.

Parameters:

  • filename (String)

    The filename of the search index.

  • allow_replace_removed (Boolean) (defaults to: false)

    Whether removed elements may be replaced when new elements are added.



# File 'lib/hnswlib.rb', line 120

def load(filename, allow_replace_removed: false)
  @index.load_index(filename, allow_replace_deleted: allow_replace_removed)
end

#max_item ⇒ Integer

Return the maximum number of items.

Returns:

  • (Integer)


# File 'lib/hnswlib.rb', line 152

def max_item
  @index.max_elements
end

#n_features ⇒ Integer

Return the number of features of the indexed items.

Returns:

  • (Integer)


# File 'lib/hnswlib.rb', line 145

def n_features
  @index.space.dim
end

#n_items ⇒ Integer

Return the number of items in the search index.

Returns:

  • (Integer)


# File 'lib/hnswlib.rb', line 138

def n_items
  @index.current_count
end

#remove_item(i) ⇒ Object

Remove an item from the index. The item is marked as deleted in the underlying index.

Parameters:

  • i (Integer)

    The ID of the item.



# File 'lib/hnswlib.rb', line 66

def remove_item(i)
  @index.mark_deleted(i)
end

#resize_index(new_max_item) ⇒ Object

Resize the search index.

Parameters:

  • new_max_item (Integer)

    The new maximum number of items.



# File 'lib/hnswlib.rb', line 98

def resize_index(new_max_item)
  @index.resize_index(new_max_item)
end

#save(filename) ⇒ Object

Save the search index to disk.

Parameters:

  • filename (String)

    The filename of the search index.



# File 'lib/hnswlib.rb', line 112

def save(filename)
  @index.save_index(filename)
end

#set_ef(ef) ⇒ Object

Set the size of the dynamic list for the nearest neighbors.

Parameters:

  • ef (Integer)

    The size of the dynamic list.



# File 'lib/hnswlib.rb', line 105

def set_ef(ef)
  @index.set_ef(ef)
end