Tool calling on Llama 3.2 not working with system prompt + image + zero temperature #4374

Open
p0deje opened this issue Dec 10, 2024 · 5 comments
Labels: bedrock-runtime, bug, p2, service-api

Comments

@p0deje

p0deje commented Dec 10, 2024

Describe the bug

When using Llama 3.2 on AWS Bedrock with the following combination of inputs, the tools are not respected and the response is plain text. It only happens when:

  • system prompt is present
  • temperature is 0
  • input contains a JPEG image (though it sometimes reproduces with PNG too)
  • tools are requested

Changing any of these (e.g. increasing temperature to 1) fixes the problem and toolUse is returned.
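
For context, a minimal sketch of the failing call might look like the following. This is not the exact reproduction script linked below; the model ID, region, image path, and tool schema here are illustrative assumptions.

```python
# Minimal sketch of the failing combination, NOT the exact reproduction gist.
# The model ID, region, image path, and tool schema are assumptions.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("image.jpg", "rb") as f:
    image_bytes = f.read()

# Example tool definition for the Converse API (assumed schema).
tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "response",
            "description": "Report whether the statement is true.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {"result": {"type": "boolean"}},
                "required": ["result"],
            }},
        }
    }]
}

response = client.converse(
    modelId="us.meta.llama3-2-11b-instruct-v1:0",               # assumed Llama 3.2 vision model ID
    system=[{"text": "Answer by calling the provided tool."}],  # 1) system prompt present
    inferenceConfig={"temperature": 0},                          # 2) temperature is 0
    messages=[{
        "role": "user",
        "content": [
            {"text": "Is 2+2=4 true or false?"},
            {"image": {"format": "jpeg", "source": {"bytes": image_bytes}}},  # 3) JPEG image
        ],
    }],
    toolConfig=tool_config,                                      # 4) tools requested
)

# With this combination the model answers in plain text instead of returning a toolUse block.
print(response["output"]["message"]["content"])
```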

Expected Behavior

The response contains a toolUse block with the tool call from the model.

Current Behavior

The response contains no toolUse block; the model returns plain text instead.

Reproduction Steps

Run the following script - https://gist.github.com/p0deje/82a09ddadf7c39c1b11f322f7de9dfb1

$ python boto.py
================
Tool use working
================
[{'toolUse': {'toolUseId': 'tooluse_gWFvnMgwRwC127Q9M1PuGQ', 'name': 'response', 'input': {'result': 'True'}}}]
================
Tool use broken
================
[{'text': 'The prompt is asking whether the statement "2+2=4" is true or false. To answer this, we can use the "response" function with the argument "result" set to True, since 2+2 indeed equals 4.\n\nHere is the JSON for the function call:\n\n{\n    "name": "response",\n    "parameters": {\n        "result": true\n    }\n}'}]
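
In other words, the difference between the two runs is whether the returned content list includes a toolUse block. A quick check, assuming a response dict returned by converse as in the sketch above:

```python
# Check whether the converse response actually contains a toolUse block
# (assumes `response` is the dict returned by client.converse above).
content = response["output"]["message"]["content"]
tool_uses = [block["toolUse"] for block in content if "toolUse" in block]
if tool_uses:
    print("Tool use working:", tool_uses)
else:
    print("Tool use broken, plain text only:", content)
```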

Possible Solution

No response

Additional Information/Context

Originally reported in langchain-ai/langchain-aws#285

SDK version used

1.35.64

Environment details (OS name and version, etc.)

macOS 15.1.1

@p0deje p0deje added the bug and needs-triage labels Dec 10, 2024
@mkagenius

[screenshot of the Llama 3.2 model card documentation]

For now, they don't support tool calling in vision models.

https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2/

@tim-finnigan tim-finnigan self-assigned this Dec 11, 2024
@tim-finnigan
Contributor

Thanks for reaching out. The Boto3 converse method makes a request to the underlying Converse API, so the issue here involves the API rather than Boto3 directly. It also sounds like this may currently be a limitation of Llama itself, as suggested in the comment above.

If there's feedback you'd like us to forward to the Bedrock team regarding this, please let us know. Perhaps some clarification could be added to the API documentation or to their User Guide: https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-runtime_example_bedrock-runtime_Converse_MetaLlama_section.htm (in which case you can also use the Provide feedback link at the bottom of that page to send feedback).

@tim-finnigan tim-finnigan closed this as not planned Dec 11, 2024
@tim-finnigan tim-finnigan added the bedrock-runtime and service-api labels and removed the needs-triage label Dec 11, 2024
@p0deje
Author

p0deje commented Dec 11, 2024

For now, they don't support tool calling in vision models.

If you look at my example, you can see that it actually works for text+image; the problem is that it doesn't work for some of the images. I believe this could be an issue with Bedrock rather than the model itself.

@tim-finnigan tim-finnigan reopened this Dec 11, 2024
@tim-finnigan
Contributor

For now, they don't support tool calling in vision models.

If you look at my example, you can see that it actually works for text+image; the problem is that it doesn't work for some of the images. I believe this could be an issue with Bedrock rather than the model itself.

Thanks for following up @p0deje — we can try reaching out to the Bedrock team for clarification on the expected behavior here.

@tim-finnigan tim-finnigan added the p2 label Dec 11, 2024
@mkagenius

Weird that the Llama team would explicitly write that vision models don't support tool calling (maybe that's the reason for the erratic, occasional tool-calling responses).

Or maybe Bedrock is calling the vision model with just the text input. This could be tested by asking something about the image along with tool calling, as in the sketch below.
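
A rough sketch of that test, reusing the client, image bytes, and tool config from the earlier sketch (all assumptions): if the model's answer reflects what is actually visible in the image, Bedrock is passing the image through.

```python
# Hypothetical test: ask about the image content while still requesting the tool.
# Reuses client, image_bytes, and tool_config from the earlier sketch (assumed).
result = client.converse(
    modelId="us.meta.llama3-2-11b-instruct-v1:0",  # assumed model ID
    system=[{"text": "Answer by calling the provided tool."}],
    inferenceConfig={"temperature": 0},
    messages=[{
        "role": "user",
        "content": [
            {"text": "Is there a red object in the image? Answer true or false via the response tool."},
            {"image": {"format": "jpeg", "source": {"bytes": image_bytes}}},
        ],
    }],
    toolConfig=tool_config,
)

# If the reply never reflects anything actually visible in the image,
# the image may not be reaching the model at all.
print(result["output"]["message"]["content"])
```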
