-
Notifications
You must be signed in to change notification settings - Fork 270
Rewrite StaticallyIndexedArray to use C-array instead of Tuple #3587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
| // This avoids deep template instantiation while maintaining the same interface | ||
| template <typename T, index_t N> | ||
| struct StaticallyIndexedArrayImpl | ||
| struct StaticallyIndexedArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we are doing here is essentially a vector of a vector, no? Maybe we can refactor this into the vector_type class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current major problem with this class it has to be interface-compatible with a Tuple. Need to be careful with the call sites
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can retire the StaticallyIndexedArray and replace with StaticallyIndexedArray_v2
Replace the recursive template metaprogramming implementation of StaticallyIndexedArray with a simple C-array based struct. This avoids deep template instantiation while maintaining the same interface. Key changes: - StaticallyIndexedArray now stores `T data_[N]` instead of inheriting from Tuple - Added constexpr conversion constructor to convert from any indexed container (Tuple, etc.) - Added arithmetic operators (+, -, *, +=, -=) using C++20 concepts - Added overloads for container_reorder_given_new2old/old2new - Added overloads for get_container_subset and set_container_subset - Specialization for empty array (N=0) Co-Authored-By: Claude <[email protected]>
1b33b98 to
aef254c
Compare
|
Error importing due to merge conflicts – please reopen the PR on ROCm/rocm-libraries |
Standalone PR
This PR is independent and can be merged separately from the main optimization stack.
Related stack: #3585 → #3588 → #3589 → #3590 → #3596
Summary
Build Time Improvement
Tracking issue: #3575
Test plan
example_grouped_conv_fwd_xdl_fp16 1 1 1