Class: Rumale::FeatureExtraction::FeatureHasher
- Inherits: Base::Estimator
- Inheritance chain: Object, Base::Estimator, Rumale::FeatureExtraction::FeatureHasher
- Includes: Base::Transformer
- Defined in: rumale-feature_extraction/lib/rumale/feature_extraction/feature_hasher.rb
Overview
Encodes an array of feature-value hashes into vectors using feature hashing (the hashing trick). This encoder turns an array of mappings (Array<Hash>), each pairing feature names with values, into a Numo::NArray. It uses signed 32-bit MurmurHash3 as the hash function.
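The core of the hashing trick described above can be sketched in a few lines: a feature name is hashed to a signed 32-bit integer, the absolute value modulo the vector length picks the column, and the sign of the hash picks the sign of the stored value. This is a minimal illustration, not the library's code. It assumes `Zlib.crc32` as a stand-in for MurmurHash3 so that it runs with the standard library alone, which means the columns it reports differ from those Rumale produces. The helper names `signed_hash32` and `bucket_and_sign` are hypothetical.

```ruby
require 'zlib'

# Stand-in for a signed 32-bit hash. Assumption: Zlib.crc32 replaces
# MurmurHash3 here only so the sketch needs no extra gems.
def signed_hash32(key)
  h = Zlib.crc32(key.to_s)     # unsigned value in 0...2**32
  h >= 2**31 ? h - 2**32 : h   # reinterpret the bits as a signed 32-bit integer
end

# Hypothetical helper: column index and sign for one feature name.
def bucket_and_sign(key, n_features)
  h = signed_hash32(key)
  [h.abs % n_features, h.negative? ? -1 : 1]
end

index, sign = bucket_and_sign('dog', 1024)
puts "dog -> column #{index}, sign #{sign}"
```

Because the mapping depends only on the key and the vector length, no vocabulary has to be stored or learned.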
Instance Attribute Summary
Attributes inherited from Base::Estimator
Instance Method Summary
- #fit(x) ⇒ FeatureHasher
  Does nothing; the encoder requires no training.
- #fit_transform(x) ⇒ Numo::DFloat
  Encodes the given array of feature-value hashes.
- #initialize(n_features: 1024, alternate_sign: true) ⇒ FeatureHasher (constructor)
  Creates a new encoder that converts an array of hashes of feature names and values into vectors with the feature hashing algorithm.
- #transform(x) ⇒ Numo::DFloat
  Encodes the given array of feature-value hashes.
Constructor Details
#initialize(n_features: 1024, alternate_sign: true) ⇒ FeatureHasher
Creates a new encoder that converts an array of hashes of feature names and values into vectors with the feature hashing algorithm.
# File 'rumale-feature_extraction/lib/rumale/feature_extraction/feature_hasher.rb', line 35
def initialize(n_features: 1024, alternate_sign: true)
  super()
  @params = {
    n_features: n_features,
    alternate_sign: alternate_sign
  }
end
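The `n_features` parameter fixes the length of every encoded vector, so it also bounds how many distinct columns the feature names can occupy. The sketch below counts the columns used by 200 made-up feature names for a small and a large vector length. It is only an illustration under stated assumptions: `Zlib.crc32` stands in for MurmurHash3, the sign is ignored, and `distinct_columns` is a hypothetical helper rather than part of Rumale.

```ruby
require 'zlib'

# Hypothetical helper: number of distinct columns that `keys` map to.
# Assumption: Zlib.crc32 is used in place of MurmurHash3.
def distinct_columns(keys, n_features)
  keys.map { |k| Zlib.crc32(k.to_s) % n_features }.uniq.size
end

keys = (1..200).map { |i| "feature_#{i}" }
[16, 1024].each do |n|
  puts "n_features=#{n}: #{distinct_columns(keys, n)} distinct columns for #{keys.size} keys"
end
```

With fewer columns than feature names, some names necessarily share a column; a larger `n_features` leaves more room at the cost of wider vectors.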
Instance Method Details
#fit(x) ⇒ FeatureHasher
This method does nothing because the encoder requires no training. It returns the encoder itself.
# File 'rumale-feature_extraction/lib/rumale/feature_extraction/feature_hasher.rb', line 48
def fit(_x = nil, _y = nil)
  self
end
#fit_transform(x) ⇒ Numo::DFloat
Encodes the given array of feature-value hashes. Because the encoder requires no training, this method returns the same output as the transform method.
# File 'rumale-feature_extraction/lib/rumale/feature_extraction/feature_hasher.rb', line 59
def fit_transform(x, _y = nil)
  fit(x).transform(x)
end
#transform(x) ⇒ Numo::DFloat
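Encodes the given array of feature-value hashes. A single Hash is treated as one sample. A String value is encoded under the key "name=value" with the value 1, and features whose value is zero are skipped.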
# File 'rumale-feature_extraction/lib/rumale/feature_extraction/feature_hasher.rb', line 67
def transform(x)
  x = [x] unless x.is_a?(Array)
  n_samples = x.size

  z = Numo::DFloat.zeros(n_samples, n_features)

  x.each_with_index do |f, i|
    f.each do |k, v|
      k = "#{k}=#{v}" if v.is_a?(String)
      val = v.is_a?(String) ? 1 : v
      next if val.zero?

      h = Mmh3.hash32(k)
      fid = h.abs % n_features
      val *= h >= 0 ? 1 : -1 if alternate_sign?
      z[i, fid] = val
    end
  end

  z
end
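To follow the steps of the listing above without installing Numo or Mmh3, the same loop can be approximated in plain Ruby. The sketch below is an approximation under stated assumptions, not the library's implementation: `Zlib.crc32` stands in for `Mmh3.hash32`, nested Arrays stand in for `Numo::DFloat`, and `signed_hash32` and `hash_transform` are hypothetical names. Column positions therefore differ from what Rumale returns.

```ruby
require 'zlib'

# Assumption: Zlib.crc32 stands in for Mmh3.hash32, so columns differ from Rumale's.
def signed_hash32(key)
  h = Zlib.crc32(key.to_s)
  h >= 2**31 ? h - 2**32 : h   # reinterpret as a signed 32-bit integer
end

# Hypothetical plain-Ruby counterpart of #transform; returns nested Arrays.
def hash_transform(samples, n_features: 8, alternate_sign: true)
  samples = [samples] unless samples.is_a?(Array)   # a single Hash is one sample
  samples.map do |features|
    row = Array.new(n_features, 0.0)
    features.each do |name, value|
      # String values become a "name=value" key with value 1.
      key = value.is_a?(String) ? "#{name}=#{value}" : name
      val = value.is_a?(String) ? 1 : value
      next if val.zero?                             # zero-valued features are skipped

      h = signed_hash32(key)
      val = -val if alternate_sign && h.negative?   # sign follows the hash
      row[h.abs % n_features] = val.to_f            # a later collision overwrites
    end
    row
  end
end

rows = hash_transform([{ dog: 1, cat: 2, color: 'red' }, { dog: 0, run: 5 }])
rows.each { |row| p row }
```

As in the listing, a value is assigned to its column rather than accumulated, so when two features of one sample hash to the same column the later one wins.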