This issue may be related to #16 and #138, which concern representing N-D block pointers on a flattened 1-D pointer using division and modulo arithmetic on the offsets. It looks like torch-inductor failed to pattern-match and generate a proper `tt.make_tensor_ptr` when a tensor dimension is not a power of two (100 in my case), and there is an `arith.divsi` op in the pointer offset arithmetic, causing `PtrAnalysis` to fail.
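To make the offset pattern concrete, here is a minimal pure-Python sketch of a permute expressed two ways: as flat indices decomposed with division and modulo, and as an N-D (shape, strides) description. The shapes, function names, and the 4 x 100 layout are assumptions for illustration only; this is not the kernel from this issue.

```python
# Hypothetical example: permute a row-major (4, 100) tensor to (100, 4).
ROWS, COLS = 4, 100  # COLS = 100 stands in for the non-power-of-two dim


def flat_permute_offsets(n_out):
    """Source offsets computed from a flat output index with div and mod."""
    offsets = []
    for x in range(n_out):
        i = x // ROWS  # integer division on the flat index
        j = x % ROWS   # remainder on the flat index
        offsets.append(j * COLS + i)
    return offsets


def strided_permute_offsets(shape, strides):
    """The same offsets from an N-D (shape, strides) view, no div or mod."""
    return [
        i * strides[0] + j * strides[1]
        for i in range(shape[0])
        for j in range(shape[1])
    ]


flat = flat_permute_offsets(ROWS * COLS)
strided = strided_permute_offsets((COLS, ROWS), (1, COLS))
assert flat == strided
```

Both forms address the same elements. The second form carries the structure explicitly, which is roughly the information a block pointer needs, while the first hides it behind integer arithmetic on a flat index.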
There is a work-in-progress fallback mode for when PtrAnalysis fails which will help these cases. The mode will help with compilation, but the codegen will not be as efficient because we will end up having to load each individual element into a tensor. The fallback mode should be ready in the coming weeks.
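As a rough illustration of why element-wise loading costs more, the sketch below contrasts a structured load, described by a base, shape, and strides, with a gather that needs one explicit offset per element. It is plain Python under assumed names and shapes, and it does not reflect how the fallback mode is actually implemented.

```python
# Hypothetical 1-D "memory" holding a row-major (4, 100) tensor.
memory = list(range(400))


def structured_load(mem, base, shape, strides):
    """Load a 2-D block described by a handful of integers."""
    return [
        mem[base + i * strides[0] + j * strides[1]]
        for i in range(shape[0])
        for j in range(shape[1])
    ]


def gather_load(mem, offsets):
    """Load one element per precomputed offset."""
    return [mem[o] for o in offsets]


# Offsets for the permuted (100, 4) view, computed with div and mod.
offsets = [(x % 4) * 100 + x // 4 for x in range(400)]

block = structured_load(memory, 0, (100, 4), (1, 100))
gathered = gather_load(memory, offsets)
assert block == gathered

# The structured form needs 5 integers (base, 2 shape, 2 strides);
# the gather form needs one offset per element.
descriptor_size = 1 + 2 + 2
assert len(offsets) == 400 and descriptor_size == 5
```

The two loads return identical data, but the gather form gives the backend no compact description of the access pattern to optimize.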
Triton Python code
Triton IR
Crash log
Additional information
The permute kernel is the codegen result of torch-inductor, from: