[llama] Remove unnecessary model attribute assignment on 'freqs_cis' #1766

Closed
wants to merge 2 commits into from

BowenBao
Contributor

@BowenBao BowenBao commented Jul 13, 2023

Fixes #1767, more details there.

Upstream PR: meta-llama/llama#349

@BowenBao BowenBao changed the title Remove unnecessary model attribute assignment on 'freqs_cis' [llama] Remove unnecessary model attribute assignment on 'freqs_cis' Jul 13, 2023
@msaroufim
Member

The CI failure seems to be a flake, not your fault. I'd retry triggering the build.

@xuzhao9
Contributor

xuzhao9 commented Jul 14, 2023

The upstream model code has a similar issue: https://github.com/facebookresearch/llama/blob/main/llama/model.py#L226

By convention, before we merge this PR, the PR author must also open a PR that fixes the upstream model code and reference that upstream PR link in the torchbench model code.

@@ -224,8 +224,8 @@ def forward(self, tokens: torch.Tensor, start_pos: int):
         h = self.tok_embeddings(tokens)
-        self.freqs_cis = self.freqs_cis.to(h.device)
-        freqs_cis = self.freqs_cis[start_pos : start_pos + seqlen]
+        freqs_cis = self.freqs_cis.to(h.device)
Contributor

Can we add a reference to meta-llama/llama#349 here?

Contributor Author

Done
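
For readers skimming the thread, here is a minimal sketch of what the hunk above changes. It is not the model code: `FakeTensor`, `Before`, and `After` are hypothetical stand-ins so the snippet runs without torch, and the final slice on the local variable in `After` is an assumption, since the hunk shown here is cut off before that line. The point is only that the old `forward` rebinds `self.freqs_cis` while the new one keeps the moved tensor in a local.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor (only .to() and slicing)."""

    def __init__(self, data, device="cpu"):
        self.data = list(data)
        self.device = device

    def to(self, device):
        # Return self when already on the device, otherwise a copy there.
        if device == self.device:
            return self
        return FakeTensor(self.data, device)

    def __getitem__(self, index):
        return FakeTensor(self.data[index], self.device)


class Before:
    """Old pattern: forward() reassigns a module attribute."""

    def __init__(self, freqs_cis):
        self.freqs_cis = freqs_cis

    def forward(self, h, start_pos):
        seqlen = len(h.data)
        self.freqs_cis = self.freqs_cis.to(h.device)  # mutates the module
        return self.freqs_cis[start_pos : start_pos + seqlen]


class After:
    """New pattern: forward() only binds a local variable."""

    def __init__(self, freqs_cis):
        self.freqs_cis = freqs_cis

    def forward(self, h, start_pos):
        seqlen = len(h.data)
        freqs_cis = self.freqs_cis.to(h.device)  # module left untouched
        # Assumed follow-up line: slice the local, not the attribute.
        return freqs_cis[start_pos : start_pos + seqlen]


table = FakeTensor(range(8))              # lives on "cpu"
h = FakeTensor([0, 0, 0], device="cuda")  # input on another device
before, after = Before(table), After(table)
out_before = before.forward(h, 2)
out_after = after.forward(h, 2)
```

Both versions return the same slice, but only `Before` ends up with a different object stored in `self.freqs_cis` after the call, which is the kind of attribute mutation the linked issue reports.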

Contributor

@xuzhao9 xuzhao9 left a comment

LGTM!

@facebook-github-bot
Contributor

@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@xuzhao9 merged this pull request in e8c1cf0.

Development

Successfully merging this pull request may close these issues.

[export][llama] AssertionError: Mutating module attribute freqs_cis during export
4 participants