Arm backend: Add support for StaticCache in TOSA+INT#18124
tom-arm wants to merge 1 commit into pytorch:main
* Create pass to decompose index_copy to index_put
* Enable INT tests for all targets except U55
* Adjust insert cast tests as the graph should delegate

Signed-off-by: Tom Allsop <tom.allsop@arm.com>
Change-Id: I139728a78884ff1cbdc72568938f7b43f0b12279
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18124

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Awaiting Approval, 17 New Failures, 1 Cancelled Job, 5 Pending
As of commit f05ee4e with merge base 76dfb19.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR extends the Arm backend’s INT/TOSA lowering path to support transformers.cache_utils.StaticCache by adjusting the pre-quantization graph transformations and enabling previously-xfailed INT tests (with U55 remaining xfailed due to SCATTER limitations).
Changes:
- Add a new transform-for-annotation decomposition pass to rewrite aten.index_copy{,_} into aten.index_put{,_} ahead of quantization/lowering.
- Update INT test expectations/config to align with the new decomposition behavior (including disabling fold_quantize where needed for StaticCache).
- Minor cleanup in RewriteIndexPutPass by removing an unused parameter from _expand_none_indices.
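
To illustrate why rewriting index_copy as index_put preserves results, here is a minimal pure-Python sketch over nested lists. It is not the code in DecomposeIndexCopyPass: the helper names (index_copy_dim0, index_put_2d, decompose_index_copy) are hypothetical, only dim 0 of a 2-D value is modelled, and the real pass operates on graph nodes rather than on data.

```python
from copy import deepcopy


def index_copy_dim0(dest, index, source):
    """Reference model of index_copy along dim 0: result row index[i] becomes source[i]."""
    out = deepcopy(dest)
    for i, row in enumerate(index):
        out[row] = list(source[i])
    return out


def index_put_2d(dest, indices, values):
    """Reference model of index_put (no accumulate) with explicit row/column index lists."""
    out = deepcopy(dest)
    rows, cols = indices
    for r, c, v in zip(rows, cols, values):
        out[r][c] = v
    return out


def decompose_index_copy(dest, index, source):
    """Express the dim-0 index_copy as an index_put by expanding each
    row index into one (row, col) coordinate per element (hypothetical helper)."""
    width = len(dest[0])
    rows = [r for r in index for _ in range(width)]
    cols = [c for _ in index for c in range(width)]
    values = [v for src_row in source for v in src_row]
    return index_put_2d(dest, (rows, cols), values)


# A toy "cache" with four slots of width two; write two new rows at positions 2 and 0.
cache = [[0, 0], [0, 0], [0, 0], [0, 0]]
positions = [2, 0]
new_rows = [[5, 6], [7, 8]]

direct = index_copy_dim0(cache, positions, new_rows)
decomposed = decompose_index_copy(cache, positions, new_rows)
assert direct == decomposed
```

Both paths write the same elements to the same coordinates, which is the property the decomposition relies on. How the actual pass builds the index tensors and handles other dimensions is not shown in this conversation.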
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| backends/arm/test/passes/test_insert_int32_casts_after_int64_placeholders_pass.py | Updates INT test expectations to match index_copy→index_put decomposition and associated cast op presence. |
| backends/arm/test/modules/test_static_cache.py | Enables TOSA+INT/VGF quant StaticCache tests (removing prior xfails), sets fold_quantize=False, and refines U55 xfail reasoning. |
| backends/arm/_passes/rewrite_index_put_pass.py | Removes an unused parameter from _expand_none_indices and updates its call site. |
| backends/arm/_passes/insert_int32_casts_after_int64_placeholders.py | Drops index_copy{,_} from the “requires i64 input” callsite-cast map now that index_copy is decomposed earlier. |
| backends/arm/_passes/decompose_index_copy_pass.py | Introduces DecomposeIndexCopyPass to decompose aten.index_copy{,_} during transform-for-annotation. |
| backends/arm/_passes/arm_pass_manager.py | Wires DecomposeIndexCopyPass into the transform-for-annotation pipeline. |
| backends/arm/_passes/__init__.py | Exports DecomposeIndexCopyPass from the passes package. |
class DecomposeIndexCopyPass(GetDecompositionPass):
    """Decomposes aten.index_copy into aten.index_put, as well as it's
Docstring grammar: "it's" is the contraction for "it is"; here it should be the possessive "its" ("...as well as its surrounding operators").
Suggested change:
-    """Decomposes aten.index_copy into aten.index_put, as well as it's
+    """Decomposes aten.index_copy into aten.index_put, as well as its
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell