Skip to content

Conversation

@DannyYuyang-quic
Contributor

Summary

Fix follow-up to: #16684

  • add the sliding_window and local_rope_theta parameters to ModelArgs so Gemma2 configs load correctly in the static llama flow.
  • consolidate all local-attention-related config under ModelArgs for consistency (a sketch of the resulting fields follows below).
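
A minimal sketch of the shape of the change, assuming a dataclass-style ModelArgs; only sliding_window, local_rope_theta, and layer_types come from this PR, while the surrounding fields and defaults are illustrative assumptions, not the actual ExecuTorch definition:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ModelArgs:
    # Illustrative subset of existing fields (assumed, not the real definition).
    dim: int = 2048
    n_layers: int = 26
    n_heads: int = 8
    rope_theta: float = 10000.0

    # Local-attention-related fields consolidated here (defaults assumed).
    sliding_window: Optional[int] = None       # window size for sliding (local) attention layers
    local_rope_theta: Optional[float] = None   # RoPE base frequency used by local attention layers
    layer_types: Optional[List[str]] = None    # e.g. ["sliding_attention", "full_attention", ...]
```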

Test plan

python backends/qualcomm/tests/test_qnn_delegate.py TestExampleLLMScript.test_static_llm_model -m SM8750 -s ${SERIAL_NUM} -b build-android -a . --executorch_root . --model_name gemma2-2b

@pytorch-bot

pytorch-bot bot commented Jan 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16687

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 6018b6b with merge base fed6ff1:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Jan 19, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@DannyYuyang-quic
Contributor Author

DannyYuyang-quic commented Jan 19, 2026

Follow‑up to: #16684 — thanks to @mergennachin for adding the missing files for Gemma2.

@cccclai @mergennachin hi, while validating the Gemma2 config, I noticed that the recently introduced sliding_window parameter in 2b_config.json was not yet part of ModelArgs, which broke the Gemma2 config loading path in the static llama flow (ModelArgs(**json.load(f))). So this PR adds the missing field to ModelArgs.

In addition, the Transformers 5.0 upgrade changed how configs are loaded, which makes maintenance more challenging, so this also seemed like a good moment to move all local-attention-related parameters into ModelArgs.
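
For context, a minimal sketch of the failure mode described above, using stand-in dataclasses rather than the actual ExecuTorch ModelArgs:

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class OldModelArgs:              # stand-in for ModelArgs before this PR
    dim: int = 2304
    n_layers: int = 26

@dataclass
class NewModelArgs(OldModelArgs):  # stand-in for ModelArgs after this PR
    sliding_window: Optional[int] = None
    local_rope_theta: Optional[float] = None

config = json.loads('{"dim": 2304, "n_layers": 26, "sliding_window": 4096}')

try:
    OldModelArgs(**config)       # unknown key -> TypeError: unexpected keyword argument 'sliding_window'
except TypeError as e:
    print("load fails:", e)

print(NewModelArgs(**config))    # loads once the field exists
```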

cc: @haowhsu-quic @jethroqti

Comment on lines -24 to +25
"layer_types": ["local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global", "local", "global"],
"rope_local_base_freq": 10000.0
"layer_types": ["sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention", "sliding_attention", "full_attention"]
Contributor Author

@DannyYuyang-quic DannyYuyang-quic Jan 19, 2026


@cccclai @mergennachin
Regarding the layer_types added in #16684, I noticed that the current config uses:

"layer_types": ["local", "global", "local", "global", ...]

In this PR, I switched the names to:

"layer_types": ["sliding_attention", "full_attention", ...]

This change was only to preserve HuggingFace’s original naming; there is no other intention behind it.
We’re completely open to reverting to "local" / "global" if that is the preferred naming style on your side.
Please let me know which naming you prefer, and I’ll adjust it accordingly.
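
For illustration, a sketch of how the two naming schemes could both be accepted when the config is consumed; the helper below is hypothetical, not code from this PR:

```python
# Hypothetical helper: normalize either naming scheme to a local/global decision.
_LOCAL_TYPES = {"sliding_attention", "local"}
_GLOBAL_TYPES = {"full_attention", "global"}

def is_local_attention(layer_type: str) -> bool:
    if layer_type in _LOCAL_TYPES:
        return True
    if layer_type in _GLOBAL_TYPES:
        return False
    raise ValueError(f"unknown layer_type: {layer_type}")

# Gemma2-2B alternates local and global attention across its 26 layers.
layer_types = ["sliding_attention", "full_attention"] * 13
print([is_local_attention(t) for t in layer_types[:4]])  # [True, False, True, False]
```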

Contributor

@mergennachin mergennachin left a comment


See inline

@DannyYuyang-quic DannyYuyang-quic force-pushed the dev1/danny/integrate_local_attn_related_params branch from 17c8298 to e2b7c0f Compare January 22, 2026 14:13
@mergennachin
Contributor

@DannyYuyang-quic

This is past the 1.1 branch cut; does it need to be cherry-picked? If so, please look at the cherry-pick instructions at #16365

@DannyYuyang-quic
Contributor Author

@pytorchbot cherry-pick --onto release/1.1 -c critical

@DannyYuyang-quic
Contributor Author

DannyYuyang-quic commented Jan 24, 2026

This is past the 1.1 branch cut; does it need to be cherry-picked? If so, please look at the cherry-pick instructions at #16365

Thanks for the cherry-pick instructions!
This needs to be cherry-picked, and I’ve already followed the instructions and applied the cherry-pick onto the release/1.1 branch.

@mergennachin
Contributor

mergennachin commented Jan 25, 2026

You need to land this first and then cherry-pick.

@cccclai
Contributor

cccclai commented Jan 26, 2026

Can you rebase? Then we can land.

@DannyYuyang-quic DannyYuyang-quic force-pushed the dev1/danny/integrate_local_attn_related_params branch from e2b7c0f to 6018b6b Compare January 27, 2026 02:23
@mergennachin
Contributor

FYI, we won't cherry-pick this to 1.1 since it's too late at this point...

@DannyYuyang-quic
Contributor Author

DannyYuyang-quic commented Jan 28, 2026

@cccclai, @mergennachin
No problem. If users run into this issue, I’ll point them to this PR and ask them to reference this patch before the next release.
Thanks for the update!

@mergennachin mergennachin merged commit 883af3f into pytorch:main Jan 28, 2026
144 checks passed