🐛 [Bug] Unable to compile RoBERTa #3335
Comments
You didn't move model parameters and input tensors to CUDA device, hence the compilation failure.
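A quick way to catch this mistake before calling `torch_tensorrt.compile` is to assert that the model's parameters and the input tensors share a device. A minimal sketch of that check (using a toy `nn.Linear` on CPU purely to illustrate the pattern; in the actual setup both sides would be on `'cuda'`):

```python
import torch
import torch.nn as nn

# Toy model standing in for RoBERTa; the same check applies to any nn.Module.
model = nn.Linear(8, 2)
inputs = torch.randn(4, 8)

# Device of the model's parameters (all parameters normally share one device).
param_device = next(model.parameters()).device

# Verify the inputs live on the same device before compiling;
# a mismatch here is what triggers the failure described above.
assert inputs.device == param_device, (
    f"inputs on {inputs.device}, model on {param_device}"
)
print(param_device.type)
```

With both the toy model and the inputs left on CPU, the assertion passes and the script prints `cpu`; after `model.to('cuda')` without a matching `inputs.to('cuda')`, it would fail with a clear message instead of a compile-time error.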
I get the same error with:

```python
import torch
import torch_tensorrt
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# BEGIN CONFIG #
MODEL_DIR = 'roberta-base'
# END CONFIG #

model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR, attn_implementation='sdpa')
model = model.to('cuda')
model = model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
input_ids = [tokenizer.encode('Hello world')] * 128
input_ids = torch.stack([torch.tensor(ids) for ids in input_ids]).to('cuda')
attention_mask = torch.ones_like(input_ids).to('cuda')

model = torch_tensorrt.compile(model, inputs=(input_ids, attention_mask))
```

Here is the traceback:
The error is actually different now, and the bug has been fixed in #3258. You'll need to install the nightly version of torch/torchvision/torch_tensorrt at the moment.
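For reference, installing the nightly wheels typically looks like the following; the index URL pattern and the `cu124` CUDA tag are assumptions here, so adjust the tag to match the local CUDA toolkit:

```shell
# Install nightly PyTorch, TorchVision, and Torch-TensorRT wheels.
# cu124 is an assumed CUDA tag; pick the one matching your CUDA version.
pip install --pre torch torchvision torch-tensorrt \
    --index-url https://download.pytorch.org/whl/nightly/cu124
```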
@HolyWu Upon installing the nightly version of
It looks like the nightly version also broke my
I solved it by manually editing the two files that were modified by the patch. Much easier than installing the nightly 😆

EDIT: TensorRT seems slower than uncompiled for bfloat16 AMP.
Bug Description
When I try compiling `roberta-base`, I get this error:

To Reproduce

Run:
Expected behavior
The compilation works.
Environment
WSL 2, Torch-TensorRT version 2.5.0, PyTorch version 2.5.1, CUDA 12.4, Python 3.12.5