
Inference Pipeline defaults to Training image instead of using specified Inference image #4888

Open
maslick opened this issue Oct 10, 2024 · 0 comments
maslick commented Oct 10, 2024

Describe the bug
We operate two SageMaker pipelines: training and batch inference. In the training pipeline, we create a model using the CreateModelStep API and explicitly specify the image_uri for inference. The step completes successfully, and a new model package version is created and registered in the Model Registry. Reviewing the inference specification of the generated model package confirms that image_uri correctly points to the specified inference Docker image.

In the inference pipeline, we import the model package version following the guidelines in the documentation [1] and a sample project [2]. However, the image_uri is not preserved and defaults back to the training image.

As a result, we are unable to run the Batch Transform step because the job incorrectly uses the training image instead of the specified inference image.

[1] https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-mkt-model-pkg-model.html#sagemaker-mkt-model-pkg-model-sdk
[2] https://github.com/aws-samples/aws-enterprise-mlops-framework/blob/b6eea322b44b6d90a110ae91298308ba060f96d1/mlops-multi-account-cdk/mlops-sm-project-template/mlops_sm_project_template/templates/train_deploy_batch_inference_product/seed_code/build_app/ml_pipelines/inference/pipeline.py#L221
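For reference, the registered inference specification can be inspected directly with boto3's describe_model_package call (a minimal sketch; the ARN below is a placeholder):

import boto3

sm = boto3.client("sagemaker")
pkg = sm.describe_model_package(
    ModelPackageName="arn:aws:sagemaker:eu-west-1:111111111111:model-package/example-group/1"  # placeholder ARN
)

# List the containers in the inference specification and their images
for container in pkg["InferenceSpecification"]["Containers"]:
    print(container["Image"], container.get("ModelDataUrl"))

As described above, this shows the correct inference image, so the model package itself is registered as expected; the problem appears only when the package is imported in the inference pipeline.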

To reproduce

# Training pipeline
from sagemaker.inputs import CreateModelInput
from sagemaker.model import Model
from sagemaker.workflow.steps import CreateModelStep
# (import of AutoGluonTabularPredictor omitted in the original snippet)

step_create_model = CreateModelStep(
    name=f'model-{project_id}',
    display_name="Create Model",
    model=Model(
        predictor_cls=AutoGluonTabularPredictor,
        image_uri=inferenceImageUriParameter.to_string(),  # inference image passed in as a pipeline parameter
        model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
        sagemaker_session=pipeline_session,
        role=role,
        name=f'model-{project_id}'
    ),
    inputs=CreateModelInput(
        instance_type="ml.m5.xlarge",
    ),
    depends_on=[step_train]
)

# Inference pipeline
from sagemaker.model import ModelPackage
from sagemaker.workflow.model_step import ModelStep

model_package = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,  # e.g. arn:aws:sagemaker:eu-west-1:77777777777:model-package/mlops-example-1-p-jmxjq22cpa2r/21
    sagemaker_session=pipeline_session,
)

step_create_model = ModelStep(
    name="CreateModel",
    display_name="Create Model",
    step_args=model_package.create(
        instance_type="ml.m5.xlarge"
    ),
    depends_on=[step_load_data]
)
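A possible workaround (a sketch only, not verified against SDK 2.232.2, reusing the role, pipeline_session, model_package_arn, and step_load_data variables from the snippet above): resolve the inference container from the model package with boto3 and build a plain Model with that image explicitly, instead of relying on ModelPackage to pick it up:

import boto3
from sagemaker.model import Model
from sagemaker.workflow.model_step import ModelStep

# Look up the registered inference container for this model package version
sm = boto3.client("sagemaker")
container = sm.describe_model_package(ModelPackageName=model_package_arn)[
    "InferenceSpecification"
]["Containers"][0]

# Build the model with the inference image and artifacts pinned explicitly
model = Model(
    image_uri=container["Image"],
    model_data=container["ModelDataUrl"],
    role=role,
    sagemaker_session=pipeline_session,
)

step_create_model = ModelStep(
    name="CreateModel",
    display_name="Create Model",
    step_args=model.create(instance_type="ml.m5.xlarge"),
    depends_on=[step_load_data],
)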

Expected behavior
The image_uri from the model package's inference specification is retained when the model is created, so the Batch Transform step runs with the inference image rather than the training image.

System information

  • SageMaker Python SDK version: 2.232.2
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.
  • CPU or GPU: N/A
  • Custom Docker image (Y/N): Y