Class: LLaMACpp::ContextParams

Inherits:

Object

Defined in: ext/llama_cpp/dummy.rb

Overview

Class for parameters of context.
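
A minimal sketch of how the accessors listed on this page are typically used, assuming plain attribute-style getters and setters. `ContextParamsSketch` is a hypothetical stand-in Struct (not part of the library) so the snippet runs without the native extension, and the values are arbitrary illustrations rather than recommended defaults.

```ruby
# Hypothetical stand-in for LLaMACpp::ContextParams. Only the accessor
# names are taken from this page; the Struct itself is an assumption.
ContextParamsSketch = Struct.new(
  :seed, :n_ctx, :n_batch, :n_ubatch, :embeddings, :flash_attn
)

params = ContextParamsSketch.new
params.seed = 42          # random seed
params.n_ctx = 2048       # text context size
params.n_batch = 512      # logical maximum batch size
params.n_ubatch = 128     # physical maximum batch size
params.embeddings = false # embeddings-only mode off
params.flash_attn = true  # request flash attention

# Each setter has a matching getter that returns the stored value.
puts params.n_ctx
puts params.flash_attn
```

With the real class, the same assignments would presumably be made on an instance of `LLaMACpp::ContextParams` before it is passed on to create a context.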

Instance Method Summary

#attention_type ⇒ Integer

Returns the attention type.
#attention_type=(attention_type) ⇒ Object

Sets the attention type.
#defrag_thold ⇒ Float

Returns the KV cache defragmentation threshold.
#defrag_thold=(defrag_thold) ⇒ Object

Sets the KV cache defragmentation threshold.
#embeddings ⇒ Boolean

Returns the flag for embeddings mode only.
#embeddings=(flag) ⇒ Object

Sets the flag for embeddings mode only.
#flash_attn ⇒ Boolean

Returns the flag whether to use flash attention.
#flash_attn=(flag) ⇒ Object

Sets the flag whether to use flash attention.
#logits_all ⇒ Boolean

Returns the flag to compute all logits.
#logits_all=(flag) ⇒ Object

Sets the flag to compute all logits.
#n_batch ⇒ Integer

Returns the logical maximum batch size.
#n_batch=(n_batch) ⇒ Object

Sets the logical maximum batch size.
#n_ctx ⇒ Integer

Returns the text context size.
#n_ctx=(n_ctx) ⇒ Object

Sets the text context size.
#n_seq_max ⇒ Integer

Returns the max number of sequences.
#n_seq_max=(n_seq_max) ⇒ Object

Sets the max number of sequences.
#n_ubatch ⇒ Integer

Returns the physical maximum batch size.
#n_ubatch=(n_ubatch) ⇒ Object

Sets the physical maximum batch size.
#offload_kqv=(flag) ⇒ Object

Sets the flag whether to offload the KQV ops.
#offload_kwv ⇒ Boolean

Returns the flag whether to offload the KQV ops.
#pooling_type ⇒ Integer

Returns the pooling type.
#pooling_type=(pooling_type) ⇒ Object

Sets the pooling type.
#rope_freq_base ⇒ Float

Returns the RoPE base frequency.
#rope_freq_base=(rope_freq_base) ⇒ Object

Sets the RoPE base frequency.
#rope_freq_scale ⇒ Float

Returns the RoPE frequency scaling factor.
#rope_freq_scale=(rope_freq_scale) ⇒ Object

Sets the RoPE frequency scaling factor.
#rope_scaling_type ⇒ Integer

Returns the RoPE scaling type.
#rope_scaling_type=(scaling_type) ⇒ Object

Sets the RoPE scaling type.
#seed ⇒ Integer

Returns the random seed.
#seed=(seed) ⇒ Object

Sets the random seed.
#type_k ⇒ Integer

Returns the data type for K cache.
#type_k=(type_k) ⇒ Object

Sets the data type for K cache.
#type_v ⇒ Integer

Returns the data type for V cache.
#type_v=(type_v) ⇒ Object

Sets the data type for V cache.
#yarn_attn_factor ⇒ Float

Returns the YaRN magnitude scaling factor.
#yarn_attn_factor=(yarn_attn_factor) ⇒ Object

Sets the YaRN magnitude scaling factor.
#yarn_beta_fast ⇒ Float

Returns the YaRN low correction dim.
#yarn_beta_fast=(yarn_beta_fast) ⇒ Object

Sets the YaRN low correction dim.
#yarn_beta_slow ⇒ Float

Returns the YaRN high correction dim.
#yarn_beta_slow=(yarn_beta_slow) ⇒ Object

Sets the YaRN high correction dim.
#yarn_ext_factor ⇒ Float

Returns the YaRN extrapolation mix factor.
#yarn_ext_factor=(yarn_ext_factor) ⇒ Object

Sets the YaRN extrapolation mix factor.
#yarn_orig_ctx ⇒ Integer

Returns the YaRN original context size.
#yarn_orig_ctx=(yarn_orig_ctx) ⇒ Object

Sets the YaRN original context size.

Instance Method Details

#attention_type ⇒ `Integer`

Returns the attention type.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1267
def attention_type; end

#attention_type=(attention_type) ⇒ `Object`

Sets the attention type.

Parameters:

attention_type (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1263
def attention_type=(attention_type); end

#defrag_thold ⇒ `Float`

Returns the KV cache defragmentation threshold.

Returns:

(Float)

# File 'ext/llama_cpp/dummy.rb', line 1331
def defrag_thold; end

#defrag_thold=(defrag_thold) ⇒ `Object`

Sets the KV cache defragmentation threshold.

Parameters:

defrag_thold (Float)

# File 'ext/llama_cpp/dummy.rb', line 1327
def defrag_thold=(defrag_thold); end
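
As an illustration of what a defragmentation threshold means, the sketch below assumes the convention described in upstream llama.cpp comments: the cache is defragmented when the ratio of holes to cache size exceeds the threshold, and a negative threshold disables defragmentation. The exact rule may differ between versions, and `defrag_needed?` is a hypothetical helper that is not part of this binding.

```ruby
# Hypothetical helper; the holes/size rule is an assumption based on
# upstream llama.cpp comments, not something this binding exposes.
def defrag_needed?(holes, size, thold)
  return false if thold.negative? # negative threshold: disabled
  return false if size.zero?      # empty cache: nothing to defragment

  holes.to_f / size > thold
end

puts defrag_needed?(30, 100, 0.1)  # ratio 0.3 is above the threshold
puts defrag_needed?(5, 100, 0.1)   # ratio 0.05 is below the threshold
puts defrag_needed?(30, 100, -1.0) # defragmentation disabled
```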

#embeddings ⇒ `Boolean`

Returns the flag for embeddings mode only.

Returns:

(Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1363
def embeddings; end

#embeddings=(flag) ⇒ `Object`

Sets the flag for embeddings mode only.

Parameters:

flag (Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1359
def embeddings=(flag); end

#flash_attn ⇒ `Boolean`

Returns the flag whether to use flash attention.

Returns:

(Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1379
def flash_attn; end

#flash_attn=(flag) ⇒ `Object`

Sets the flag whether to use flash attention.

Parameters:

flag (Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1375
def flash_attn=(flag); end

#logits_all ⇒ `Boolean`

Returns the flag to compute all logits.

Returns:

(Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1355
def logits_all; end

#logits_all=(flag) ⇒ `Object`

Sets the flag to compute all logits.

Parameters:

flag (Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1351
def logits_all=(flag); end

#n_batch ⇒ `Integer`

Returns the logical maximum batch size.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1223
def n_batch; end

#n_batch=(n_batch) ⇒ `Object`

Sets the logical maximum batch size.

Parameters:

n_batch (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1218
def n_batch=(n_batch); end

#n_ctx ⇒ `Integer`

Returns the text context size.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1213
def n_ctx; end

#n_ctx=(n_ctx) ⇒ `Object`

Sets the text context size.

Parameters:

n_ctx (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1209
def n_ctx=(n_ctx); end

#n_seq_max ⇒ `Integer`

Returns the max number of sequences.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1243
def n_seq_max; end

#n_seq_max=(n_seq_max) ⇒ `Object`

Sets the max number of sequences.

Parameters:

n_seq_max (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1238
def n_seq_max=(n_seq_max); end

#n_ubatch ⇒ `Integer`

Returns the physical maximum batch size.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1233
def n_ubatch; end

#n_ubatch=(n_ubatch) ⇒ `Object`

Sets the physical maximum batch size.

Parameters:

n_ubatch (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1228
def n_ubatch=(n_ubatch); end
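
One way to picture the two batch sizes, assuming the upstream llama.cpp meaning: up to `n_batch` tokens are submitted per decode call (the logical batch), and each submission is processed in chunks of at most `n_ubatch` tokens (the physical batch). `split_into_batches` is a hypothetical helper for illustration only; the library performs its own splitting internally.

```ruby
# Hypothetical helper: splits token ids into logical batches of at most
# n_batch, each divided into physical micro-batches of at most n_ubatch.
def split_into_batches(tokens, n_batch, n_ubatch)
  tokens.each_slice(n_batch).map do |logical|
    logical.each_slice(n_ubatch).to_a
  end
end

tokens = (1..10).to_a
batches = split_into_batches(tokens, 4, 2)
batches.each_with_index do |logical, i|
  puts "logical batch #{i}: #{logical.inspect}"
end
```

Under this reading, raising `n_ubatch` trades more memory per step for fewer steps, while `n_batch` only bounds how much can be submitted at once.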

#offload_kqv=(flag) ⇒ `Object`

Sets the flag whether to offload the KQV ops.

Parameters:

flag (Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1367
def offload_kqv=(flag); end

#offload_kwv ⇒ `Boolean`

Returns the flag whether to offload the KQV ops.

Returns:

(Boolean)

# File 'ext/llama_cpp/dummy.rb', line 1371
def offload_kwv; end

#pooling_type ⇒ `Integer`

Returns the pooling type.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1259
def pooling_type; end

#pooling_type=(pooling_type) ⇒ `Object`

Sets the pooling type.

Parameters:

pooling_type (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1255
def pooling_type=(pooling_type); end

#rope_freq_base ⇒ `Float`

Returns the RoPE base frequency.

Returns:

(Float)

# File 'ext/llama_cpp/dummy.rb', line 1275
def rope_freq_base; end

#rope_freq_base=(rope_freq_base) ⇒ `Object`

Sets the RoPE base frequency.

Parameters:

rope_freq_base (Float)

# File 'ext/llama_cpp/dummy.rb', line 1271
def rope_freq_base=(rope_freq_base); end

#rope_freq_scale ⇒ `Float`

Returns the RoPE frequency scaling factor.

Returns:

(Float)

# File 'ext/llama_cpp/dummy.rb', line 1283
def rope_freq_scale; end

#rope_freq_scale=(rope_freq_scale) ⇒ `Object`

Sets the RoPE frequency scaling factor.

Parameters:

rope_freq_scale (Float)

# File 'ext/llama_cpp/dummy.rb', line 1279
def rope_freq_scale=(rope_freq_scale); end
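
To show what the base frequency and the scaling factor control, here is a sketch of the textbook RoPE angle formula with linear position scaling: angle_i = pos * freq_scale * freq_base^(-2i / head_dim). This formulation is assumed for illustration and the library's actual kernel may differ in detail. `rope_angles` and `head_dim` are hypothetical names.

```ruby
# Hypothetical helper: rotation angle for each dimension pair at a
# given position, using the textbook RoPE formula with linear scaling.
def rope_angles(pos, head_dim, freq_base, freq_scale)
  (0...head_dim / 2).map do |i|
    pos * freq_scale * freq_base**(-2.0 * i / head_dim)
  end
end

full = rope_angles(100, 8, 10_000.0, 1.0) # unscaled, position 100
half = rope_angles(200, 8, 10_000.0, 0.5) # scale 0.5, position 200

puts full.inspect
puts half.inspect
```

With a scaling factor of 0.5, position 200 yields the same angles as position 100 at scale 1.0, which is why a factor below 1 is used to stretch the usable context.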

#rope_scaling_type ⇒ `Integer`

Returns the RoPE scaling type.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1251
def rope_scaling_type; end

#rope_scaling_type=(scaling_type) ⇒ `Object`

Sets the RoPE scaling type.

Parameters:

scaling_type (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1247
def rope_scaling_type=(scaling_type); end

#seed ⇒ `Integer`

Returns the random seed.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1205
def seed; end

#seed=(seed) ⇒ `Object`

Sets the random seed.

Parameters:

seed (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1201
def seed=(seed); end

#type_k ⇒ `Integer`

Returns the data type for K cache.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1339
def type_k; end

#type_k=(type_k) ⇒ `Object`

Sets the data type for K cache.

Parameters:

type_k (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1335
def type_k=(type_k); end

#type_v ⇒ `Integer`

Returns the data type for V cache.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1347
def type_v; end

#type_v=(type_v) ⇒ `Object`

Sets the data type for V cache.

Parameters:

type_v (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1343
def type_v=(type_v); end

#yarn_attn_factor ⇒ `Float`

Returns the YaRN magnitude scaling factor.

Returns:

(Float)

# File 'ext/llama_cpp/dummy.rb', line 1299
def yarn_attn_factor; end

#yarn_attn_factor=(yarn_attn_factor) ⇒ `Object`

Sets the YaRN magnitude scaling factor.

Parameters:

yarn_attn_factor (Float)

# File 'ext/llama_cpp/dummy.rb', line 1295
def yarn_attn_factor=(yarn_attn_factor); end

#yarn_beta_fast ⇒ `Float`

Returns the YaRN low correction dim.

Returns:

(Float)

# File 'ext/llama_cpp/dummy.rb', line 1307
def yarn_beta_fast; end

#yarn_beta_fast=(yarn_beta_fast) ⇒ `Object`

Sets the YaRN low correction dim.

Parameters:

yarn_beta_fast (Float)

# File 'ext/llama_cpp/dummy.rb', line 1303
def yarn_beta_fast=(yarn_beta_fast); end

#yarn_beta_slow ⇒ `Float`

Returns the YaRN high correction dim.

Returns:

(Float)

# File 'ext/llama_cpp/dummy.rb', line 1315
def yarn_beta_slow; end

#yarn_beta_slow=(yarn_beta_slow) ⇒ `Object`

Sets the YaRN high correction dim.

Parameters:

yarn_beta_slow (Float)

# File 'ext/llama_cpp/dummy.rb', line 1311
def yarn_beta_slow=(yarn_beta_slow); end

#yarn_ext_factor ⇒ `Float`

Returns the YaRN extrapolation mix factor.

Returns:

(Float)

# File 'ext/llama_cpp/dummy.rb', line 1291
def yarn_ext_factor; end

#yarn_ext_factor=(yarn_ext_factor) ⇒ `Object`

Sets the YaRN extrapolation mix factor.

Parameters:

yarn_ext_factor (Float)

# File 'ext/llama_cpp/dummy.rb', line 1287
def yarn_ext_factor=(yarn_ext_factor); end

#yarn_orig_ctx ⇒ `Integer`

Returns the YaRN original context size.

Returns:

(Integer)

# File 'ext/llama_cpp/dummy.rb', line 1323
def yarn_orig_ctx; end

#yarn_orig_ctx=(yarn_orig_ctx) ⇒ `Object`

Sets the YaRN original context size.

Parameters:

yarn_orig_ctx (Integer)

# File 'ext/llama_cpp/dummy.rb', line 1319
def yarn_orig_ctx=(yarn_orig_ctx); end
