Llama3.1 models do not allow configuring max_seq_len
#2202
Comments
hey @akashc1, I see. We should fix it, but to be clear, you don't have to redefine max_seq_len for the model. It is used only for the positional embeddings; you can leave it at 131k unless you are trying to go beyond that. For your specific case, where you just want to limit the sequence length in the batch, you should change it only in the tokenizer.
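For readers hitting the same thing, here is a minimal sketch of that suggestion, assuming the standard llama3_tokenizer arguments; the checkpoint path is hypothetical.

```python
# Sketch of the suggested workaround: cap sequence length on the tokenizer only
# and leave the model's positional-embedding capacity at its default.
from torchtune.models.llama3 import llama3_tokenizer

tokenizer = llama3_tokenizer(
    path="/checkpoints/Llama-3.1-405B/original/tokenizer.model",  # hypothetical path
    max_seq_len=16384,  # samples longer than this are capped by the tokenizer
)
# No max_seq_len override on the model builder; its positional embeddings keep
# covering the full default context window.
```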
@felipemello1 yes I understand that, however the transformer implementation does throw an error if it gets a sequence longer than its configured max_seq_len. This is an example config that I've been using to ensure the tokenizer & model both produce/expect the same thing:

max_seq_len: 16384

# Tokenizer
tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  ...
  max_seq_len: ${max_seq_len}

# Model Arguments
model:
  _component_: torchtune.models.llama3_1.lora_llama3_1_405b
  ...
  max_seq_len: ${max_seq_len}

# Dataset and Sampler
dataset:
  _component_: torchtune.datasets.instruct_dataset
  ...
  packed: True

Let me know if this explains what I'm running into, or if I can clarify further :)
Oh, I see! OK, there is a deeper problem: we set max_seq_len for 405B to 8k, which is wrong. The model was trained for >131k context. Thanks for raising it. I left a comment in your PR (https://github.com/pytorch/torchtune/pull/2203/files): let's just update the docstring and I will approve it. By the way, you might want to try Llama 3.3 70B; it has better or equivalent performance to 405B.
@ebsmothers, can you sanity check that it's wrong that we set max_seq_len to 8k? https://github.com/pytorch/torchtune/blob/aa8f365f91a69aa36aaea14cf6f03ccd45310bb6/torchtune/models/llama3_1/_model_builders.py#L239C9-L239C26
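To make the distinction concrete, here is a simplified, self-contained sketch of the difference between hardcoding max_seq_len inside a model builder and exposing it as a parameter. The function names are stand-ins, not the actual code in _model_builders.py, and the "proposed" variant illustrates what the issue is asking for rather than what the linked PR necessarily does.

```python
# Stand-in example only; real torchtune builders construct a TransformerDecoder.
def build_transformer(*, max_seq_len: int, **kwargs):
    """Pretend component builder: it already accepts max_seq_len."""
    return {"max_seq_len": max_seq_len, **kwargs}

def lora_llama3_1_405b_today(**lora_kwargs):
    # Current behavior: the context length is pinned inside the model builder,
    # so nothing a user puts in a config can reach it.
    return build_transformer(max_seq_len=8192, **lora_kwargs)

def lora_llama3_1_405b_proposed(max_seq_len: int = 131072, **lora_kwargs):
    # Requested behavior: expose the argument so model.max_seq_len can be set
    # from a config or the CLI, defaulting to Llama 3.1's full context window.
    return build_transformer(max_seq_len=max_seq_len, **lora_kwargs)

print(lora_llama3_1_405b_today())          # {'max_seq_len': 8192}
print(lora_llama3_1_405b_proposed(16384))  # {'max_seq_len': 16384}
```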
Llama 3.1 model builders hardcode the max context length, even though the component builders allow specifying it. And since the QLoRA versions also use these builders, they are affected too. This prevents anyone from specifying the model's max_seq_len from a config or CLI; e.g. a config that passes max_seq_len to the model (like the one shown above) will throw an error. In my workload, and I'm sure for others as well, I need to specify the context length differently.
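A rough illustration of why the override fails today, under the assumption that the model node is instantiated by forwarding its fields as keyword arguments to the builder named in _component_ (the config values below are made up):

```python
from omegaconf import OmegaConf
from torchtune import config

cfg = OmegaConf.create({
    "model": {
        "_component_": "torchtune.models.llama3_1.lora_llama3_1_405b",
        "lora_attn_modules": ["q_proj", "v_proj"],
        "max_seq_len": 16384,  # not accepted by the builder today
    }
})

# Errors out at instantiation time because lora_llama3_1_405b has no max_seq_len
# parameter; once the builder exposes it, the same config would work.
model = config.instantiate(cfg.model)
```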