Class: LLaMACpp::ModelQuantizeParams

Inherits:
Object
Defined in:
ext/llama_cpp/dummy.rb

Overview

Class that holds the parameters for model quantization.
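
As a rough sketch of how the accessors documented on this page fit together, the following uses a stand-in Struct (the name `QuantizeParamsSketch` is hypothetical) with the same attribute names, so it runs without the native extension. With the gem loaded, the object would presumably be a `LLaMACpp::ModelQuantizeParams` instance instead; the values shown are illustrative only.

```ruby
# Stand-in mirroring the accessor names documented on this page.
# Hypothetical: the real object is LLaMACpp::ModelQuantizeParams.
QuantizeParamsSketch = Struct.new(
  :n_thread, :ftype, :allow_quantization, :quantize_output_tensor,
  :only_copy, :pure, :keep_split
)

params = QuantizeParamsSketch.new
params.n_thread = 4                   # number of threads
params.ftype = 2                      # illustrative Integer file type
params.allow_quantization = false     # allow quantizing non-f32/f16 tensors
params.quantize_output_tensor = true  # quantize output.weight
params.only_copy = false              # only copy tensors
params.pure = false                   # disable k-quant mixtures when true
params.keep_split = false             # keep the same number of shards when true

puts params.to_h
```

Every attribute has a reader and a writer, so the same object can be inspected after configuration.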

Instance Method Details

#allow_quantization ⇒ Boolean

Returns the flag to allow quantizing non-f32/f16 tensors.

Returns:

  • (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1406

def allow_quantization; end

#allow_quantization=(flag) ⇒ Object

Sets the flag to allow quantizing non-f32/f16 tensors.

Parameters:

  • flag (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1402

def allow_quantization=(flag); end

#ftype ⇒ Integer

Returns the file type of the quantized model.

Returns:

  • (Integer)


# File 'ext/llama_cpp/dummy.rb', line 1398

def ftype; end

#ftype=(ftype) ⇒ Object

Sets the file type of the quantized model.

Parameters:

  • ftype (Integer)


# File 'ext/llama_cpp/dummy.rb', line 1394

def ftype=(ftype); end
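
Because `ftype` is a bare Integer, a small lookup table can make call sites easier to read. The sketch below is an assumption-laden illustration: the `FTYPE_SKETCH` table and the `ftype_for` helper are hypothetical names, and the numeric values are an assumed subset of llama.cpp's `llama_ftype` enum that should be checked against the `llama.h` of the build in use.

```ruby
# Assumed subset of llama.cpp's llama_ftype enum values; verify against
# the llama.h shipped with your build before relying on them.
FTYPE_SKETCH = {
  all_f32: 0,
  mostly_f16: 1,
  mostly_q4_0: 2,
  mostly_q4_1: 3,
  mostly_q8_0: 7
}.freeze

# Hypothetical helper: resolve a symbolic name to the Integer for ftype=.
def ftype_for(name)
  FTYPE_SKETCH.fetch(name) do
    raise ArgumentError, "unknown ftype: #{name.inspect}"
  end
end

puts ftype_for(:mostly_q4_0)
```

The result of `ftype_for` could then be passed to `ftype=`; an unknown name raises `ArgumentError` rather than silently passing `nil`.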

#keep_split ⇒ Boolean

Returns the flag to quantize to the same number of shards.

Returns:

  • (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1438

def keep_split; end

#keep_split=(flag) ⇒ Object

Sets the flag to quantize to the same number of shards.

Parameters:

  • flag (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1434

def keep_split=(flag); end

#n_thread ⇒ Integer

Returns the number of threads.

Returns:

  • (Integer)


# File 'ext/llama_cpp/dummy.rb', line 1390

def n_thread; end

#n_thread=(n_thread) ⇒ Object

Sets the number of threads.

Parameters:

  • n_thread (Integer)


# File 'ext/llama_cpp/dummy.rb', line 1386

def n_thread=(n_thread); end
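
One way to choose a value for `n_thread=` is to cap a requested count at the number of processors Ruby reports. This is only a sketch: `quantize_thread_count` is a hypothetical helper, not part of the gem, and it relies on `Etc.nprocessors` from the standard library.

```ruby
require 'etc'

# Hypothetical helper: pick a thread count, capped by the available cores.
# A nil or non-positive request falls back to all reported cores.
def quantize_thread_count(requested = nil)
  cores = Etc.nprocessors
  return cores if requested.nil? || requested <= 0

  [requested, cores].min
end

puts quantize_thread_count
```

The returned Integer could then be assigned with `n_thread=`.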

#only_copy ⇒ Boolean

Returns the flag to only copy tensors.

Returns:

  • (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1422

def only_copy; end

#only_copy=(flag) ⇒ Object

Sets the flag to only copy tensors.

Parameters:

  • flag (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1418

def only_copy=(flag); end

#pure=(flag) ⇒ Object

Sets the flag to disable k-quant mixtures and quantize all tensors to the same type.

Parameters:

  • flag (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1426

def pure=(flag); end

#pure ⇒ Boolean

Returns the flag to disable k-quant mixtures and quantize all tensors to the same type.

Returns:

  • (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1430

def pure; end

#quantize_output_tensor ⇒ Boolean

Returns the flag to quantize output.weight.

Returns:

  • (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1414

def quantize_output_tensor; end

#quantize_output_tensor=(flag) ⇒ Object

Sets the flag to quantize output.weight.

Parameters:

  • flag (Boolean)


# File 'ext/llama_cpp/dummy.rb', line 1410

def quantize_output_tensor=(flag); end