forked from iree-org/iree
[LLVMGPU] Add a slowpath if expected to overflow gridDim.
For really large models such as llama_70b, the number of blocks can overflow gridDim because parallel dims are always tiled to 1. Add a heuristic that uses the slowpath if the compute is expected to overflow the max grid dim.

Context length can be set/tuned with:
--iree-codegen-llvmgpu-context-length=512
512 is the default.

This should be a temporary, or at most complementary, fix. The real fix should be implementing better distribution for parallel dims, and/or adding a specialization kernel/dispatch that can pick the slowpath vs. the fast path at runtime depending on the size of the dynamic dims.
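A minimal sketch of the kind of check described above, not the code from this commit. It assumes that each parallel dim tiled to 1 contributes its full extent to the block count, and that dynamic dims are estimated with the configured context length. The names `expectToOverflowGridDim` and `kDynamic`, and the choice of limit passed as `maxGridDim`, are hypothetical; 65535 is used in the comments only because it is the CUDA limit for the y and z grid dimensions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sentinel marking a dimension whose size is only known at runtime.
constexpr int64_t kDynamic = -1;

// Estimates the number of blocks launched when every parallel dim is tiled
// to 1, substituting `contextLength` for dynamic dims, and reports whether
// that estimate exceeds `maxGridDim` (e.g. 65535 for gridDim.y/z on CUDA).
// Assumes static dims and `contextLength` are positive.
bool expectToOverflowGridDim(const std::vector<int64_t> &parallelDims,
                             int64_t contextLength, int64_t maxGridDim) {
  int64_t numBlocks = 1;
  for (int64_t dim : parallelDims) {
    int64_t size = (dim == kDynamic) ? contextLength : dim;
    // Compare before multiplying so the running product cannot overflow int64_t.
    if (numBlocks > maxGridDim / size)
      return true;
    numBlocks *= size;
  }
  return false;
}
```

A caller would select the slowpath when this returns true. With the default context length of 512, parallel dims `{64, kDynamic}` give an estimate of 32768 blocks, which stays within 65535, while `{8192, kDynamic}` gives 4194304 and would trigger the slowpath.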
1 parent 3eff4be, commit 62bc699
Showing 2 changed files with 37 additions and 3 deletions.