[Bug]: cpu backend make ttir or ttir to ttsharedir failed with some tests in triton-lang/kernels #185
Comments
Thanks so much for reporting the issue. torch-inductor generated code is definitely an area where we currently have a lot of trouble, because of its non-structured pointer access patterns. We have a planned feature that will lower all of these non-structured accesses to gather / scatter, which should hopefully solve most of these problems. I'll update the issue as soon as we have any meaningful progress.
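For readers unfamiliar with the distinction, here is a minimal sketch in plain Python of what "structured" versus "non-structured" access means. It is only an illustration and not how triton-shared implements it; the names `as_strided` and `load` are hypothetical. Offsets that form an arithmetic sequence can be described by a (start, stride) pair, while anything else has to be fetched element by element, which is what a gather does.

```python
from typing import List, Optional, Tuple


def as_strided(offsets: List[int]) -> Optional[Tuple[int, int]]:
    """Return (start, stride) if offsets form an arithmetic sequence, else None."""
    if not offsets:
        return None
    if len(offsets) == 1:
        return offsets[0], 0
    stride = offsets[1] - offsets[0]
    for prev, cur in zip(offsets, offsets[1:]):
        if cur - prev != stride:
            return None  # irregular pattern: no single stride describes it
    return offsets[0], stride


def load(memory: List[float], offsets: List[int]) -> List[float]:
    """Load elements, using the strided description when one exists."""
    desc = as_strided(offsets)
    if desc is not None:
        start, stride = desc
        # Structured path: addresses are computed from start and stride only.
        return [memory[start + i * stride] for i in range(len(offsets))]
    # Non-structured path: fall back to a per-element gather.
    return [memory[o] for o in offsets]
```

A pattern such as `[2, 4, 6]` takes the structured path, while `[0, 3, 1]` can only be served by the gather fallback.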
To temporarily work around the MaskAnalysis failure, I added the following to the parseSplat method in MaskAnalysis.cpp:
After adding, the parseSplat method is implemented as follows:
This does fix the error, but a problem remains. Converting ttir to ttsharedir no longer fails, but running python test_inductor.py ends in a segmentation fault with no other output, only the words "segmentation fault". However, if I add the following code, the test runs successfully and produces the correct results:
I don't know why this happens.
Do you have stack traces enabled? If you have your branch published, I can run it on my setup to see if I can get more information.
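If it helps, one way to get at least the Python-side stack on a segfault is the standard library's faulthandler module. This is a minimal sketch and assumes the crash happens while the Python test is running; it shows which Python frame was active, not the native C++ frames.

```python
import faulthandler
import sys

# On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, dump the Python traceback
# of every thread to the original stderr before the process dies.
faulthandler.enable(file=sys.__stderr__, all_threads=True)
```

Putting this at the top of test_inductor.py should have the same effect as running the script with `python -X faulthandler test_inductor.py` or with the environment variable `PYTHONFAULTHANDLER=1` set.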
Your code handles splat i1 and splat !tt.ptr<> well, as shown below:
%10 = tt.splat %5 : i1 -> tensor<1x1024xi1> loc(#loc10)
Triton python code
This is a piece of code from https://github.com/triton-lang/kernels/blob/main/test/test_inductor.py
It seems that almost none of the kernel tests in this repo can run on the cpu backend without crashing, sometimes at ast to ttir and sometimes at ttir to ttsharedir.
import pytest
import torch
import triton
import triton.language as tl
def test_normalization_with_remat(device):
Triton IR
Part of the ttir around tt.load:
%9 = tt.addptr %arg3, %5 : !tt.ptr, i32 loc(#loc9)
%10 = tt.splat %9 : !tt.ptr -> tensor<1x1x!tt.ptr> loc(#loc9)
%11 = tt.load %10, %2 : tensor<1x1x!tt.ptr> loc(#loc10)
%12 = tt.addptr %arg4, %5 : !tt.ptr, i32 loc(#loc11)
Crash log
error: PtrAnalysis: pointer is not replace with tts.make_tptr so loadOp cannot be rewritten
It seems that tt.splat has no rewriteSplatOp implementation that calls createTTSMakeTensorPtrOp.
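To make the expectation concrete, here is a small Python sketch of why a splat of a scalar pointer looks like it should still count as a structured access: every element of the resulting tensor holds the same address, which can be described as a base plus a stride of 0 in each dimension. The `StridedPtr` class and `splat` helper are hypothetical and only illustrate the idea; they are not the PtrAnalysis data structures.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass
class StridedPtr:
    """Hypothetical description of a tensor of pointers: base, sizes, strides."""
    base: int
    sizes: Tuple[int, ...]
    strides: Tuple[int, ...]

    def addresses(self) -> List[int]:
        # Enumerate every index in row-major order and compute its address.
        return [
            self.base + sum(i * s for i, s in zip(idx, self.strides))
            for idx in product(*(range(n) for n in self.sizes))
        ]


def splat(base: int, shape: Tuple[int, ...]) -> StridedPtr:
    """Broadcast one scalar address to `shape` by using stride 0 everywhere."""
    return StridedPtr(base, shape, tuple(0 for _ in shape))
```

For the 1x1 case in the ttir above, `splat(base, (1, 1))` describes a single address, so nothing about the access pattern is irregular.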
Additional information
No response