Error while converting Llama 3B fp16 gguf model #1137

Open
rakshit2020 opened this issue Dec 10, 2024 · 1 comment
Labels
bug Something isn't working
@rakshit2020

Hi,
Using llama.cpp, I converted llama-3.2-3B-instruct to fp16 GGUF format. Now I run the command given in the documentation:
python3 -m onnxruntime_genai.models.builder -m meta-llama/Llama-3.2-3B-Instruct -i /home/ubuntu/CPU_Serving/metaLlama3B/metaLlama3B-3.2B-F16.gguf -o Llama3B_gguf_onnx -p int4 -e cpu

I am getting this error:

Valid precision + execution provider combinations are: FP32 CPU, FP32 CUDA, FP16 CUDA, FP16 DML, INT4 CPU, INT4 CUDA, INT4 DML
Extra options: {}
GroupQueryAttention (GQA) is used in this model.
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/olmo/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/miniconda3/envs/olmo/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/builder.py", line 3277, in <module>
    create_model(args.model_name, args.input, args.output, args.precision, args.execution_provider, args.cache_dir, **extra_options)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/builder.py", line 3159, in create_model
    onnx_model.make_model(input_path)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/builder.py", line 2019, in make_model
    model = GGUFModel.from_pretrained(self.model_type, input_path, self.head_size, self.hidden_size, self.intermediate_size, self.num_attn_heads, self.num_kv_heads, self.vocab_size)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/gguf_model.py", line 240, in from_pretrained
    model = GGUFModel(input_path, head_size, hidden_size, intermediate_size, num_attn_heads, num_kv_heads, vocab_size)
  File "/home/ubuntu/CPU_Serving/CPU_Opti/lib/python3.10/site-packages/onnxruntime_genai/models/gguf_model.py", line 104, in __init__
    curr_layer_id = int(name.split(".")[1])
ValueError: invalid literal for int() with base 10: 'weight'
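For context, the failing line in gguf_model.py assumes every tensor name has a layer index in its second dot-separated field (llama.cpp names per-layer tensors like blk.0.attn_q.weight). A minimal sketch of how that parse breaks on a tensor whose second field is not a number; the rope_freqs.weight name here is an assumption for illustration, the actual offending tensor comes from your GGUF file:

    # Per-layer GGUF tensors look like "blk.<layer_id>.<suffix>";
    # other tensors (e.g. "rope_freqs.weight") do not carry a layer id.
    for name in ["blk.0.attn_q.weight", "rope_freqs.weight"]:
        parts = name.split(".")
        try:
            layer_id = int(parts[1])  # works for "blk.0.attn_q.weight"
            print(name, "-> layer", layer_id)
        except ValueError:
            # int("weight") raises exactly the ValueError in the traceback
            print(name, "-> second field is", repr(parts[1]), "- not a layer index")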

@kunal-vaishnavi
Contributor

The GGUF to ONNX path needs to be updated: several necessary changes that were made in the PyTorch to ONNX path still need to be brought over. Until those updates land, you can try converting with the PyTorch to ONNX path instead.
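For reference, the PyTorch to ONNX path downloads the original Hugging Face weights rather than reading your GGUF file, so it skips the -i input flag entirely. A sketch of the command, assuming the same flags used above (the output directory name is illustrative):

python3 -m onnxruntime_genai.models.builder -m meta-llama/Llama-3.2-3B-Instruct -o Llama3B_onnx -p int4 -e cpu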
