
Inference Pipeline defaults to Training image instead of using specified Inference image #4888

Open
maslick opened this issue Oct 10, 2024 · 0 comments
maslick commented Oct 10, 2024

Describe the bug
We operate two SageMaker pipelines: training and batch inference. In the training pipeline, we create a model using the CreateModelStep API and explicitly specify the image_uri for inference. The step completes successfully, and a new model package version is created and registered in the Model Registry. Reviewing the inference specification of the generated model package confirms that image_uri correctly points to the specified inference Docker image.

In the inference pipeline, we import the model package version following the guidelines in the documentation [1] and a sample project [2]. However, the image_uri is not preserved and defaults back to the training image.

As a result, we are unable to run the Batch Transform step because the job incorrectly uses the training image instead of the specified inference image.

[1] https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-mkt-model-pkg-model.html#sagemaker-mkt-model-pkg-model-sdk
[2] https://github.com/aws-samples/aws-enterprise-mlops-framework/blob/b6eea322b44b6d90a110ae91298308ba060f96d1/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/train_deploy_batch_inference_product/seed_code/build_app/ml_pipelines/inference/pipeline.py#L221
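For reference, the registered inference specification can be inspected directly with boto3's describe_model_package call (a minimal sketch; the ARN below is a placeholder):

import boto3

sm = boto3.client("sagemaker")
pkg = sm.describe_model_package(
    ModelPackageName="arn:aws:sagemaker:eu-west-1:111111111111:model-package/example-group/1"  # placeholder ARN
)

# List the containers in the inference specification and their images
for container in pkg["InferenceSpecification"]["Containers"]:
    print(container["Image"], container.get("ModelDataUrl"))

As described above, this shows the correct inference image, so the model package itself is registered as expected; the problem appears only when the package is imported in the inference pipeline.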

To reproduce

# Training pipeline
from sagemaker.inputs import CreateModelInput
from sagemaker.model import Model
from sagemaker.workflow.steps import CreateModelStep
# (import of AutoGluonTabularPredictor omitted in the original snippet)

step_create_model = CreateModelStep(
    name=f'model-{project_id}',
    display_name="Create Model",
    model=Model(
        predictor_cls=AutoGluonTabularPredictor,
        image_uri=inferenceImageUriParameter.to_string(),  # inference image passed in as a pipeline parameter
        model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
        sagemaker_session=pipeline_session,
        role=role,
        name=f'model-{project_id}'
    ),
    inputs=CreateModelInput(
        instance_type="ml.m5.xlarge",
    ),
    depends_on=[step_train]
)

# Inference pipeline
from sagemaker.model import ModelPackage
from sagemaker.workflow.model_step import ModelStep

model_package = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,  # e.g. arn:aws:sagemaker:eu-west-1:77777777777:model-package/mlops-example-1-p-jmxjq22cpa2r/21
    sagemaker_session=pipeline_session,
)

step_create_model = ModelStep(
    name="CreateModel",
    display_name="Create Model",
    step_args=model_package.create(
        instance_type="ml.m5.xlarge"
    ),
    depends_on=[step_load_data]
)
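A possible workaround (a sketch only, not verified against SDK 2.232.2, reusing the role, pipeline_session, model_package_arn, and step_load_data variables from the snippet above): resolve the inference container from the model package with boto3 and build a plain Model with that image explicitly, instead of relying on ModelPackage to pick it up:

import boto3
from sagemaker.model import Model
from sagemaker.workflow.model_step import ModelStep

# Look up the registered inference container for this model package version
sm = boto3.client("sagemaker")
container = sm.describe_model_package(ModelPackageName=model_package_arn)[
    "InferenceSpecification"
]["Containers"][0]

# Build the model with the inference image and artifacts pinned explicitly
model = Model(
    image_uri=container["Image"],
    model_data=container["ModelDataUrl"],
    role=role,
    sagemaker_session=pipeline_session,
)

step_create_model = ModelStep(
    name="CreateModel",
    display_name="Create Model",
    step_args=model.create(instance_type="ml.m5.xlarge"),
    depends_on=[step_load_data],
)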

Expected behavior
The image_uri from the model package's inference specification is retained when the model is created, so the Batch Transform step runs with the inference image rather than the training image.

System information

  • SageMaker Python SDK version: 2.232.2
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.
  • CPU or GPU: N/A
  • Custom Docker image (Y/N): Y