Prompt Caching Preview Not Working With Boto3 #4376

Armek · 2024-12-12T16:02:15Z

Describe the bug

I'm attempting to run the boto3 example that uses prompt caching in this AWS Blog post: https://aws.amazon.com/blogs/aws/reduce-costs-and-latency-with-amazon-bedrock-intelligent-prompt-routing-and-prompt-caching-preview/

Specifically this block of code:

import json

import boto3

MODEL_ID = "us.anthropic.claude-3-5-sonnet-20241022-v2:0"
AWS_REGION = "us-west-2"

bedrock_runtime = boto3.client(
    "bedrock-runtime",
    region_name=AWS_REGION,
)

DOCS = [
    "bedrock-or-sagemaker.pdf",
    "generative-ai-on-aws-how-to-choose.pdf",
    "machine-learning-on-aws-how-to-choose.pdf",
]

messages = []


def converse(new_message, docs=None, cache=False):
    # Start a new user turn unless the last message is already a user turn.
    if len(messages) == 0 or messages[-1]["role"] != "user":
        messages.append({"role": "user", "content": []})

    # Attach each document as raw bytes; name and format come from the file name.
    for doc in docs or []:
        print(f"Adding document: {doc}")
        name, doc_format = doc.rsplit('.', maxsplit=1)
        with open(doc, "rb") as f:
            doc_bytes = f.read()
        messages[-1]["content"].append({
            "document": {
                "name": name,
                "format": doc_format,
                "source": {"bytes": doc_bytes},
            }
        })

    messages[-1]["content"].append({"text": new_message})

    if cache:
        messages[-1]["content"].append({"cachePoint": {"type": "default"}})

    response = bedrock_runtime.converse(
        modelId=MODEL_ID,
        messages=messages,
    )

    output_message = response["output"]["message"]
    response_text = output_message["content"][0]["text"]

    print("Response text:")
    print(response_text)

    print("Usage:")
    print(json.dumps(response["usage"], indent=2))

    messages.append(output_message)


converse("Compare AWS Trainium and AWS Inferentia in 20 words or less.", docs=DOCS, cache=True)
converse("Compare Amazon Textract and Amazon Transcribe in 20 words or less.")
converse("Compare Amazon Q Business and Amazon Q Developer in 20 words or less.")

My organization has the prompt caching preview enabled and we are running the latest version of boto3 (1.35.79 at time of this post). I get the following exception when running the above example:

botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in messages[0].content[1]: "cachePoint", must be one of: text, image, document, video, toolUse, toolResult, guardContent

It appears that boto3 doesn't support this parameter yet, but the AWS blog shows it being used above. Is the version of boto3 that supports this not yet published or am I potentially doing something wrong?
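One way to check whether the installed SDK knows about the parameter is to look at the service model that botocore bundles, since ParamValidationError is raised from that model before any request is sent. The following is a minimal sketch, not an official diagnostic: the helper names are hypothetical, and the shape name "ContentBlock" is an assumption based on the Converse API's content block type.

```python
def content_block_members(service_model, shape_name="ContentBlock"):
    """Return the sorted member names the bundled model allows in a content block.

    "ContentBlock" is an assumed shape name; adjust it if the model differs.
    """
    return sorted(service_model.shape_for(shape_name).members)


def supports_member(service_model, member="cachePoint"):
    """Report whether the bundled model accepts `member` inside a content block."""
    return member in content_block_members(service_model)


def load_bedrock_runtime_model():
    """Load the bedrock-runtime service model shipped with the installed botocore."""
    import botocore.session  # imported lazily so the helpers above need no SDK

    return botocore.session.get_session().get_service_model("bedrock-runtime")
```

Calling supports_member(load_bedrock_runtime_model()) on 1.35.79 should return False if the diagnosis above is right, and content_block_members(...) should list the same names as the error message.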

Regression Issue

Select this option if this issue appears to be a regression.

Expected Behavior

I expect to be able to call the Converse API with prompt caching working.

Current Behavior

I get the following error:

botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in messages[0].content[1]: "cachePoint", must be one of: text, image, document, video, toolUse, toolResult, guardContent

Reproduction Steps

Run the code as documented in the AWS Blog: https://aws.amazon.com/blogs/aws/reduce-costs-and-latency-with-amazon-bedrock-intelligent-prompt-routing-and-prompt-caching-preview/

Specifically, the same block of code shown under "Describe the bug" above.

The required PDFs are linked in the blog post.

Possible Solution

No response

Additional Information/Context

The organization I'm authenticating with has been enabled for the prompt caching preview.

SDK version used

1.35.79

Environment details (OS name and version, etc.)

Windows 11 and Linux


tim-finnigan · 2024-12-12T22:25:16Z

Thanks for reaching out — this issue is with the Bedrock Runtime API rather than Boto3 directly. Per the blog post that you referenced:

Amazon Bedrock support for prompt caching is available in preview in US West (Oregon) for Anthropic’s Claude 3.5 Sonnet V2 and Claude 3.5 Haiku. Prompt caching is also available in US East (N. Virginia) for Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro. You can request access to the Amazon Bedrock prompt caching preview here.

You said that you already have prompt caching enabled. Can you confirm that it is enabled for us-west-2 and the account you are using?
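To confirm which account and region the script is actually using, something like the sketch below may help. It assumes the caller is allowed to call STS GetCallerIdentity, and describe_caller is a hypothetical helper name, not part of boto3.

```python
def describe_caller(sts_client, runtime_client):
    """Return the account, caller ARN, and region the given clients resolve to."""
    identity = sts_client.get_caller_identity()
    return {
        "account": identity["Account"],
        "arn": identity["Arn"],
        # The region the bedrock-runtime client was created with.
        "region": runtime_client.meta.region_name,
    }


# Example usage with the client from the script above (requires AWS credentials):
#   import boto3
#   print(describe_caller(boto3.client("sts"), bedrock_runtime))
```

The account ID and region printed here can then be compared with the ones that were enabled for the preview.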

In the meantime, I'll also try to get more clarification from the Bedrock team regarding the expected behavior. I saw that cachePoint is also referenced in the Amazon Bedrock User Guide: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html. However, it is not mentioned in the Converse API documentation.

Armek added the bug and needs-triage labels on Dec 12, 2024

tim-finnigan self-assigned this Dec 12, 2024

tim-finnigan added the investigating, service-api, p2, and bedrock-runtime labels on Dec 12, 2024

tim-finnigan added the response-requested label and removed the investigating and needs-triage labels on Dec 12, 2024
