Bug: TypeError in SAM2ImagePredictor.predict() method #1431

Open
dongxiaolong opened this issue Dec 18, 2024 · 3 comments
Labels: bug, triaged

dongxiaolong commented Dec 18, 2024

@cpuhrsch Hi! I need your help with a bug in the `SAM2ImagePredictor.predict()` method.

# Bug: TypeError in SAM2ImagePredictor.predict() method

## Description
When calling `SAM2ImagePredictor.predict()`, one of two errors occurs, depending on the value of `return_logits`:
1. With `return_logits=False`: `RuntimeError: Boolean value of Tensor with more than one value is ambiguous`
2. With `return_logits=True`: `AssertionError` in `postprocess_masks_1_channel()`, because the tensor's channel dimension is not 1 as the function expects
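
Both failures can be reproduced in isolation, without SAM2, which may help narrow down where they originate. The sketch below is not torchao code. It only shows that PyTorch raises the first error whenever a multi-element tensor is evaluated as a boolean (for example in an `if` statement), and that a check of the form quoted in the second error fails for any tensor whose second dimension is not 1. The shape is the one reported in this issue; the variable names are made up for illustration.

```python
import torch

# Stand-in for whatever tensor reaches the failing code. The shape is the one
# reported in this issue; the name `t` is hypothetical.
t = torch.zeros(1, 256, 64, 64)

# 1. Using a multi-element tensor where a bool is expected.
try:
    if t:  # implicitly calls bool(t)
        pass
except RuntimeError as exc:
    bool_error = str(exc)
    print("RuntimeError:", bool_error)

# 2. A channel check like the one quoted from transforms.py.
try:
    assert t.size(1) == 1
except AssertionError:
    channel_check_failed = True
    print("AssertionError: size(1) is", t.size(1))
```

If this reading is right, the first error would suggest that a tensor is ending up where a boolean flag is expected, and the second that a tensor other than the single-channel mask is reaching the post-processing step.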

## Reproduction Steps
```python
import numpy as np
from PIL import Image

from torchao._models.sam2.build_sam import build_sam2
from torchao._models.sam2.sam2_image_predictor import SAM2ImagePredictor

# Initialize model
sam2_checkpoint = "sam2.1_hiera_large.pt"
model_cfg = "sam2.1_hiera_l.yaml"
sam2_model = build_sam2(model_cfg, sam2_checkpoint, device="cuda")
predictor = SAM2ImagePredictor(sam2_model)

# Load an RGB image as an HxWx3 uint8 array.
# "example.jpg" is a placeholder path; the original report did not show how `image` was created.
image = np.array(Image.open("example.jpg").convert("RGB"))

# Set image and input points
predictor.set_image(image)
input_point = np.array([[500, 375]])  # one (x, y) point prompt
input_label = np.array([1])  # 1 marks a foreground point

# This call raises the error
masks, scores, logits = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=True,
    return_logits=False,  # or True
)
```

## Error Messages

With `return_logits=False`:

`RuntimeError: Boolean value of Tensor with more than one value is ambiguous`

With `return_logits=True`:

`AssertionError` at `transforms.py:128`: `assert masks.size(1) == 1`

## Additional Context

The input tensor has shape `torch.Size([1, 256, 64, 64])`. The error seems to occur in the parameter passing between the `_predict()` and `_predict_masks_postprocess()` methods, specifically in how the `return_logits` parameter is handled.
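
If the cause is indeed an argument landing in the wrong parameter, one generic way to surface it earlier is to make flags keyword-only and to validate them before branching. The following is a minimal sketch of that idea in plain Python. `postprocess_sketch` and its parameters are hypothetical and do not mirror the actual signatures of `_predict()` or `_predict_masks_postprocess()`.

```python
def postprocess_sketch(masks, orig_hw, *, return_logits=False):
    """Hypothetical post-processing step; not the torchao implementation.

    The bare `*` makes `return_logits` keyword-only, so a caller cannot
    accidentally fill it with a positional argument.
    """
    if not isinstance(return_logits, bool):
        raise TypeError(
            f"return_logits must be a bool, got {type(return_logits).__name__}"
        )
    mode = "logits" if return_logits else "binary"
    return {"mode": mode, "orig_hw": orig_hw, "num_masks": len(masks)}


masks = [[0.2, -1.3], [0.7, 0.1], [-0.4, 0.9]]  # placeholder for three masks

# Correct call: the flag is passed by name.
result = postprocess_sketch(masks, (480, 640), return_logits=True)

# Mistaken call: an extra positional argument is rejected immediately
# instead of being silently bound to `return_logits`.
try:
    postprocess_sketch(masks, (480, 640), masks)
except TypeError as exc:
    positional_error = str(exc)
```

With this pattern, a mix-up fails at the call site with a `TypeError` that names the function, instead of later as an ambiguous-boolean error or a shape assertion deep inside post-processing.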

## Environment

- Python version: 3.11
- CUDA: enabled

Could you please help me understand what's going wrong here? Thank you in advance!

supriyar added the bug label on Dec 19, 2024

cpuhrsch (Contributor) commented

Hey @dongxiaolong - the copy of SAM2 in torchao isn't intended to be general purpose just yet; it is specialized for the example in https://github.com/pytorch/ao/tree/main/examples/sam2_amg_server . The assert fires because an assumption made during development is being violated.

dongxiaolong (Author) commented

Thank you for your reply.

cpuhrsch (Contributor) commented

I'll keep this issue open so I can revisit it later on when this example works.

cpuhrsch reopened this on Dec 20, 2024