Question about ReduceConverter #115
-
In ReduceConverter, a reduction along the innermost axis is replaced by a transpose followed by a reduction along the outer axis, as shown in the following code.
Is this an optimization for SIMD-based devices? Do you expect a reduction along the inner axis to be slower than a transpose op followed by a reduction along the outer axis? Or are there other reasons?
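To make the rewrite in question concrete, here is a minimal pure-Python sketch of the equivalence. It is not the ReduceConverter code: the function names `reduce_sum_inner`, `transpose`, and `reduce_sum_outer` are hypothetical, and a sum over a 2-D nested list stands in for whatever reduction op and tensor layout the converter actually handles.

```python
def reduce_sum_inner(x):
    """Sum each row of a 2-D list (reduction along the innermost axis)."""
    return [sum(row) for row in x]


def transpose(x):
    """Swap the two axes of a 2-D list."""
    return [list(col) for col in zip(*x)]


def reduce_sum_outer(x):
    """Sum across rows, column by column (reduction along the outer axis).

    The inner loop walks contiguous elements and accumulates into
    independent output slots, which is the access pattern that
    vectorizes naturally.
    """
    result = [0] * len(x[0])
    for row in x:
        for j, value in enumerate(row):
            result[j] += value
    return result


x = [[1, 2, 3], [4, 5, 6]]
direct = reduce_sum_inner(x)                 # reduce along the inner axis
rewritten = reduce_sum_outer(transpose(x))   # transpose, then reduce the outer axis
assert direct == rewritten
```

Both paths give the same result; the open question is only whether the second form runs faster on the target hardware once the cost of the transpose is included.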
Replies: 2 comments
-
@jingchangshi Hey, thank you for your question, and sorry for the late reply. I don't get notifications from discussions, so I missed this. I will get back to you with some thoughts as soon as possible.
-
@haishanzzzz would be the best person to ask about this. Here's a snippet from our old internal code; hopefully it gives some more context: