Hi
Using llama.cpp, I converted llama-3.2-3B-instruct to fp16 GGUF format. Now, when I run the command given in the documentation:

```
python3 -m onnxruntime_genai.models.builder -m meta-llama/Llama-3.2-3B-Instruct -i /home/ubuntu/CPU_Serving/metaLlama3B/metaLlama3B-3.2B-F16.gguf -o Llama3B_gguf_onnx -p int4 -e cpu
```

I get the following error:
```
Valid precision + execution provider combinations are: FP32 CPU, FP32 CUDA, FP16 CUDA, FP16 DML, INT4 CPU, INT4 CUDA, INT4 DML
Extra options: {}
GroupQueryAttention (GQA) is used in this model.
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/olmo/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/miniconda3/envs/olmo/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/builder.py", line 3277, in <module>
    create_model(args.model_name, args.input, args.output, args.precision, args.execution_provider, args.cache_dir, **extra_options)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/builder.py", line 3159, in create_model
    onnx_model.make_model(input_path)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/builder.py", line 2019, in make_model
    model = GGUFModel.from_pretrained(self.model_type, input_path, self.head_size, self.hidden_size, self.intermediate_size, self.num_attn_heads, self.num_kv_heads, self.vocab_size)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/gguf_model.py", line 240, in from_pretrained
    model = GGUFModel(input_path, head_size, hidden_size, intermediate_size, num_attn_heads, num_kv_heads, vocab_size)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/gguf_model.py", line 104, in __init__
    curr_layer_id = int(name.split(".")[1])
ValueError: invalid literal for int() with base 10: 'weight'
```
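For what it's worth, the traceback suggests the parser assumes every tensor name carries a numeric layer index as its second dot-separated component. A minimal sketch of that assumption; the tensor names below are illustrative GGUF-style names, not dumped from my actual file:

```python
# Sketch of the parsing logic that fails (gguf_model.py, line 104 in the
# traceback). Tensor names here are illustrative, not read from the file.
tensor_names = [
    "blk.0.attn_q.weight",  # per-layer tensor: second component is a layer id
    "blk.1.ffn_up.weight",
    "output.weight",        # non-layer tensor: second component is "weight"
]

for name in tensor_names:
    try:
        curr_layer_id = int(name.split(".")[1])  # same parse as gguf_model.py
        print(f"{name}: layer {curr_layer_id}")
    except ValueError as err:
        # Reproduces: invalid literal for int() with base 10: 'weight'
        print(f"{name}: {err}")
```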
The GGUF-to-ONNX path needs to be updated: several necessary changes that were made in the PyTorch-to-ONNX path still need to be brought over. Until those updates land, you can try converting with the PyTorch-to-ONNX path instead.
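For reference, a sketch of what that invocation could look like, based on the builder arguments visible in the traceback (model name, output folder, precision, execution provider, cache directory); the output and cache paths here are placeholders:

```
python3 -m onnxruntime_genai.models.builder \
    -m meta-llama/Llama-3.2-3B-Instruct \
    -o Llama3B_onnx \
    -p int4 \
    -e cpu \
    -c ./hf_cache
```

The idea is to drop the `-i` GGUF input so the builder loads the Hugging Face checkpoint directly rather than going through the GGUF parser.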