Passing a sequence of 360+ zeros to generatorParams.SetInputIDs(...) on DirectML (I am using C#, but this is not a C#-specific problem) produces the following error in 0.5.2:
OnnxRuntimeGenAIException: D:\a\_work\1\onnxruntime-genai\src\dml\dml_command_recorder.cpp(143)\onnxruntime-genai.dll!00007FFFD2D7AF63: (caller: 00007FFFD2D7CA85) Exception(2) tid(22ac) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.
Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess (System.IntPtr nativeResult) (at D:/a/_work/1/onnxruntime-genai/src/csharp/Result.cs:25)
Microsoft.ML.OnnxRuntimeGenAI.Generator.ComputeLogits () (at D:/a/_work/1/onnxruntime-genai/src/csharp/Generator.cs:25)
Main.Generate () (at Assets/Main.cs:202)
System.Threading.Tasks.Task.InnerInvoke () (at <9d9536d9127f4a489d989c7a566aee1c>:0)
System.Threading.Tasks.Task.Execute () (at <9d9536d9127f4a489d989c7a566aee1c>:0)
The sequence of zeros is just an example; it also crashes on other, seemingly random prompts.
This means that GenAI is not usable in production, since we cannot guarantee that a given prompt will not crash the GPU. My suggestion would be to add tests that run hundreds of long, random initial token sequences on different GPUs. That should catch this kind of bug next time.
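To make the test suggestion concrete, here is a minimal sketch in Python of what such a stress test could look like. It does not call the GenAI API itself: `run_prompt` is a hypothetical callback that would wrap the real "set input IDs, then compute logits" call, and the vocabulary size of 32000 and the 360 to 512 length range are assumptions chosen only for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def make_random_prompts(count: int, min_len: int, max_len: int,
                        vocab_size: int, seed: int = 0) -> List[List[int]]:
    """Build `count` long token-ID prompts, reproducible for a given seed.

    The first prompt is the all-zeros case from this report; the rest are
    uniformly random token IDs in [0, vocab_size).
    """
    rng = random.Random(seed)
    prompts = [[0] * min_len]
    while len(prompts) < count:
        length = rng.randint(min_len, max_len)
        prompts.append([rng.randrange(vocab_size) for _ in range(length)])
    return prompts[:count]


def run_stress_test(prompts: Sequence[Sequence[int]],
                    run_prompt: Callable[[Sequence[int]], None]
                    ) -> List[Tuple[int, str]]:
    """Run every prompt through `run_prompt` and collect the failures.

    `run_prompt` is a hypothetical stand-in for the real inference call.
    Returns (prompt index, error description) for each prompt that raised.
    """
    failures = []
    for index, prompt in enumerate(prompts):
        try:
            run_prompt(prompt)
        except Exception as exc:
            failures.append((index, f"{type(exc).__name__}: {exc}"))
    return failures


if __name__ == "__main__":
    # Fake runner that only fails on all-zero prompts, to show the harness.
    def fake_run(prompt: Sequence[int]) -> None:
        if not any(prompt):
            raise RuntimeError("device hung")

    prompts = make_random_prompts(200, 360, 512, vocab_size=32000, seed=1)
    for index, error in run_stress_test(prompts, fake_run):
        print(f"prompt {index} (length {len(prompts[index])}) failed: {error}")
```

Fixing the seed means a failing prompt can be regenerated by index on another machine. With a real GPU backend, a hung device may also make every later prompt fail, so it would probably be safer to run each prompt in a fresh process.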
I have no idea what could be causing this. My wild guess is that it has something to do with the int4 encoding/decoding, as that's the main recent change.
Windows 10. GPU Quadro P5000.