Class: SentencePiece::SentencePieceProcessor
- Inherits: Object
  - Object
  - SentencePiece::SentencePieceProcessor
- Defined in: ext/sentencepiece/dummy.rb
Overview
This class wraps the SentencePieceProcessor class of the SentencePiece library.
Instance Method Summary
-
#bos_id ⇒ Integer
Returns BOS (<s>) id.
-
#decode(ids, out_type: 'int') ⇒ String
Decodes a list of ids or sentence pieces into a text.
-
#decode_ids(ids) ⇒ String
Decodes a list of ids into a text.
-
#decode_ids_as_serialized_proto(ids) ⇒ String
Decodes a list of ids into a serialized proto.
-
#decode_pieces(pieces) ⇒ String
Decodes a list of sentence pieces into a text.
-
#decode_pieces_as_serialized_proto(pieces) ⇒ String
Decodes a list of sentence pieces into a serialized proto.
-
#encode(text, out_type: 'int') ⇒ Array<Integer>/Array<String>
Encodes a text into a list of ids or sentence pieces.
-
#encode_as_ids(text) ⇒ Array<Integer>
Encodes a text into a list of ids.
-
#encode_as_pieces(text) ⇒ Array<String>
Encodes a text into a list of sentence pieces.
-
#encode_as_serialized_proto(text) ⇒ String
Encodes a text into a serialized proto.
-
#eos_id ⇒ Integer
Returns EOS (</s>) id.
-
#id_to_piece(id) ⇒ String
Returns the string representation of vocab id.
-
#initialize(model_file: nil) ⇒ SentencePieceProcessor
constructor
Creates a new SentencePieceProcessor instance.
-
#load(model_file) ⇒ Object
Loads a SentencePiece model.
-
#nbest_encode_as_ids(text, nbest_size:) ⇒ Array<Integer>
Encodes a text into a list of ids with nbest results.
-
#nbest_encode_as_pieces(text, nbest_size:) ⇒ Array<String>
Encodes a text into a list of sentence pieces with nbest results.
-
#nbest_encode_as_serialized_proto(text, nbest_size:) ⇒ String
Encodes a text into a serialized proto with nbest results.
-
#pad_id ⇒ Integer
Returns PAD (<pad>) id.
-
#piece_size ⇒ Integer
Returns the number of sentence pieces.
-
#piece_to_id(piece) ⇒ Integer
Returns the vocab id of the sentence piece.
-
#sample_encode_as_ids(text, nbest_size:, alpha:) ⇒ Array<Integer>
Encodes a text into a list of ids using sampling.
-
#sample_encode_as_pieces(text, nbest_size:, alpha:) ⇒ Array<String>
Encodes a text into a list of sentence pieces using sampling.
-
#sample_encode_as_serialized_proto(text, nbest_size:, alpha:) ⇒ String
Encodes a text into a serialized proto using sampling.
-
#unk_id ⇒ Integer
Returns unknown (<unk>) id.
Constructor Details
#initialize(model_file: nil) ⇒ SentencePieceProcessor
Creates a new SentencePieceProcessor instance.
# File 'ext/sentencepiece/dummy.rb', line 60
def initialize(model_file: nil); end
Instance Method Details
#bos_id ⇒ Integer
Returns BOS (<s>) id.
# File 'ext/sentencepiece/dummy.rb', line 196
def bos_id(); end
#decode(ids, out_type: 'int') ⇒ String
Decodes a list of ids or sentence pieces into a text.
# File 'ext/sentencepiece/dummy.rb', line 145
def decode(ids, out_type: 'int'); end
#decode_ids(ids) ⇒ String
Decodes a list of ids into a text.
# File 'ext/sentencepiece/dummy.rb', line 151
def decode_ids(ids); end
#decode_ids_as_serialized_proto(ids) ⇒ String
Decodes a list of ids into a serialized proto.
# File 'ext/sentencepiece/dummy.rb', line 157
def decode_ids_as_serialized_proto(ids); end
#decode_pieces(pieces) ⇒ String
Decodes a list of sentence pieces into a text.
# File 'ext/sentencepiece/dummy.rb', line 163
def decode_pieces(pieces); end
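The encode and decode methods are designed to be used as pairs. The following is a minimal sketch of a round trip, assuming `processor` is a SentencePieceProcessor with a model already loaded. The helper name `round_trip` and the hash keys are hypothetical and not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): encodes text both ways and
# decodes each result back into text.
# `processor` is assumed to respond to the four documented methods below.
def round_trip(processor, text)
  ids = processor.encode_as_ids(text)        # Array<Integer>
  pieces = processor.encode_as_pieces(text)  # Array<String>
  {
    ids: ids,
    pieces: pieces,
    text_from_ids: processor.decode_ids(ids),
    text_from_pieces: processor.decode_pieces(pieces)
  }
end
```

With a loaded model, `round_trip(processor, 'Hello world')` would return the ids, the pieces, and the two decoded strings in one hash. Whether the decoded text equals the input exactly depends on the model's normalization settings.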
#decode_pieces_as_serialized_proto(pieces) ⇒ String
Decodes a list of sentence pieces into a serialized proto.
# File 'ext/sentencepiece/dummy.rb', line 169
def decode_pieces_as_serialized_proto(pieces); end
#encode(text, out_type: 'int') ⇒ Array<Integer>/Array<String>
Encodes a text into a list of ids or sentence pieces.
# File 'ext/sentencepiece/dummy.rb', line 74
def encode(text, out_type: 'int'); end
#encode_as_ids(text) ⇒ Array<Integer>
Encodes a text into a list of ids.
# File 'ext/sentencepiece/dummy.rb', line 80
def encode_as_ids(text); end
#encode_as_pieces(text) ⇒ Array<String>
Encodes a text into a list of sentence pieces.
# File 'ext/sentencepiece/dummy.rb', line 86
def encode_as_pieces(text); end
#encode_as_serialized_proto(text) ⇒ String
Encodes a text into a serialized proto.
# File 'ext/sentencepiece/dummy.rb', line 92
def encode_as_serialized_proto(text); end
#eos_id ⇒ Integer
Returns EOS (</s>) id.
# File 'ext/sentencepiece/dummy.rb', line 199
def eos_id(); end
#id_to_piece(id) ⇒ String
Returns the string representation of vocab id.
# File 'ext/sentencepiece/dummy.rb', line 175
def id_to_piece(id); end
#load(model_file) ⇒ Object
Loads a SentencePiece model.
# File 'ext/sentencepiece/dummy.rb', line 66
def load(model_file); end
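Because `model_file` defaults to `nil` in the constructor, a processor can be created first and given its model later with `#load`. The sketch below shows that two-step pattern. The helper `load_processor` and its file check are hypothetical additions, and the processor class is passed in as an argument so that the snippet does not depend on the gem being installed.

```ruby
# Hypothetical helper (not part of the gem): creates a processor without a
# model, then loads the model explicitly.
# The File.file? guard is an assumption added here for clearer errors; the
# behaviour of #load on a missing file is not described in this reference.
def load_processor(processor_class, model_file)
  raise ArgumentError, "model file not found: #{model_file}" unless File.file?(model_file)

  processor = processor_class.new  # model_file: defaults to nil
  processor.load(model_file)       # load the SentencePiece model
  processor
end
```

With the gem installed, a call might look like `load_processor(SentencePiece::SentencePieceProcessor, 'path/to/your.model')`, where the path is a placeholder. Passing `model_file:` directly to the constructor is the one-step alternative.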
#nbest_encode_as_ids(text, nbest_size:) ⇒ Array<Integer>
Encodes a text into a list of ids with nbest results.
# File 'ext/sentencepiece/dummy.rb', line 99
def nbest_encode_as_ids(text, nbest_size:); end
#nbest_encode_as_pieces(text, nbest_size:) ⇒ Array<String>
Encodes a text into a list of sentence pieces with nbest results.
# File 'ext/sentencepiece/dummy.rb', line 106
def nbest_encode_as_pieces(text, nbest_size:); end
#nbest_encode_as_serialized_proto(text, nbest_size:) ⇒ String
Encodes a text into a serialized proto with nbest results.
# File 'ext/sentencepiece/dummy.rb', line 113
def nbest_encode_as_serialized_proto(text, nbest_size:); end
#pad_id ⇒ Integer
Returns PAD (<pad>) id.
# File 'ext/sentencepiece/dummy.rb', line 202
def pad_id(); end
#piece_size ⇒ Integer
Returns the number of sentence pieces.
# File 'ext/sentencepiece/dummy.rb', line 186
def piece_size(); end
#piece_to_id(piece) ⇒ Integer
Returns the vocab id of the sentence piece.
# File 'ext/sentencepiece/dummy.rb', line 181
def piece_to_id(piece); end
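Together, `#piece_size`, `#id_to_piece`, and `#piece_to_id` are enough to enumerate a model's vocabulary. The sketch below assumes that vocabulary ids run contiguously from 0 to `piece_size - 1`, which this reference does not state explicitly. Both helper names are hypothetical.

```ruby
# Hypothetical helper (not part of the gem): lists every piece in id order.
# Assumption: valid ids are 0...piece_size.
def vocabulary(processor)
  (0...processor.piece_size).map { |id| processor.id_to_piece(id) }
end

# Hypothetical helper: checks that piece_to_id inverts id_to_piece
# for every id in the vocabulary.
def vocabulary_consistent?(processor)
  vocabulary(processor).each_with_index.all? do |piece, id|
    processor.piece_to_id(piece) == id
  end
end
```

For a loaded processor, `vocabulary(processor).first(10)` would show the pieces with the ten lowest ids.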
#sample_encode_as_ids(text, nbest_size:, alpha:) ⇒ Array<Integer>
Encodes a text into a list of ids using sampling.
# File 'ext/sentencepiece/dummy.rb', line 121
def sample_encode_as_ids(text, nbest_size:, alpha:); end
#sample_encode_as_pieces(text, nbest_size:, alpha:) ⇒ Array<String>
Encodes a text into a list of sentence pieces using sampling.
# File 'ext/sentencepiece/dummy.rb', line 129
def sample_encode_as_pieces(text, nbest_size:, alpha:); end
#sample_encode_as_serialized_proto(text, nbest_size:, alpha:) ⇒ String
Encodes a text into a serialized proto using sampling.
# File 'ext/sentencepiece/dummy.rb', line 137
def sample_encode_as_serialized_proto(text, nbest_size:, alpha:); end
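Sampling can produce a different segmentation on each call, so it is typically invoked several times for the same text. The sketch below collects a few samples. The helper `sample_segmentations` is hypothetical, and the defaults `nbest_size: -1` and `alpha: 0.1` are illustrative values borrowed from common SentencePiece usage, not defaults defined by this class (both keywords are required by the documented signature).

```ruby
# Hypothetical helper (not part of the gem): draws several sampled
# segmentations of the same text.
# The default nbest_size and alpha values are assumptions for illustration.
def sample_segmentations(processor, text, times: 3, nbest_size: -1, alpha: 0.1)
  Array.new(times) do
    processor.sample_encode_as_pieces(text, nbest_size: nbest_size, alpha: alpha)
  end
end
```

The result is an array of `times` segmentations, each an `Array<String>`. Swapping in `sample_encode_as_ids` would yield ids instead of pieces.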
#unk_id ⇒ Integer
Returns unknown (<unk>) id.
# File 'ext/sentencepiece/dummy.rb', line 191
def unk_id(); end
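The four special-token accessors (`#bos_id`, `#eos_id`, `#pad_id`, `#unk_id`) can be gathered into one lookup table. This is a minimal sketch with hypothetical helper names. The second helper assumes that a negative id marks a token the model does not define, which follows the upstream SentencePiece convention but is not stated in this reference.

```ruby
# Hypothetical helper (not part of the gem): collects the special-token ids.
def special_token_ids(processor)
  {
    bos: processor.bos_id,
    eos: processor.eos_id,
    pad: processor.pad_id,
    unk: processor.unk_id
  }
end

# Hypothetical helper: keeps only tokens with a non-negative id.
# Assumption: a negative id means the token is disabled in the model.
def enabled_special_tokens(processor)
  special_token_ids(processor).reject { |_name, id| id.negative? }
end
```

A model trained without a padding token would then simply lack the `:pad` key in the result of `enabled_special_tokens`, under the assumption above.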