Using a custom SM image based on sagemaker-distribution hangs and fails in SM studio #178
Comments
Can you provide the environment.yml packages? And are these logs from the /aws/sagemaker/studio CloudWatch log group?
Yes, these are logs from that group. This is a bare-bones example of an environment.yml that fails for me.
But is it correct that this should be able to work as a custom JupyterLab image in the new Studio as well? If it helps, I can provide my config too.
Our team has struggled with this as well. I tried my best to reproduce your image based on the Dockerfile and env.yml and was able to get it to work. The main difference is that instead of relying on the app-image-config property:
we define the ENTRYPOINT and CMD in our Dockerfile directly, in accordance with https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-jl-image-specifications.html. This was because we had a hard time getting that property to work. Below is the Dockerfile I used (the micromamba config is due to our proxy):
Your logs seem to suggest that the CMD portion of this is missing since you do not get these logs (last two):
NOTE: I pushed the image to ECR and then just used the console to create and attach the image to the domain. We use CDK to do our actual deployments. Also, your app image config will have to at least have an empty
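For anyone who wants to check their own Dockerfile for the point above (ENTRYPOINT and CMD defined directly in the image), here is a rough Python sketch. It is only a sanity check under some assumptions: the helper names and the sample Dockerfile are made up for illustration, it does not understand heredocs, and it cannot see instructions inherited from the base image.

```python
def dockerfile_instructions(text: str) -> set[str]:
    """Return the upper-cased instruction names found in a Dockerfile."""
    found = set()
    continued = False  # True while inside a backslash-continued instruction
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments do not end a continuation
        if not continued:
            # The first token of a fresh line is the instruction name.
            found.add(line.split(None, 1)[0].upper())
        continued = line.endswith("\\")
    return found


def missing_startup_instructions(text: str) -> list[str]:
    """List which of ENTRYPOINT and CMD this Dockerfile does not define."""
    present = dockerfile_instructions(text)
    return [name for name in ("ENTRYPOINT", "CMD") if name not in present]


# Hypothetical Dockerfile: the base image name is a placeholder.
sample = """\
FROM example-base:latest
RUN echo one && \\
    echo cmd
ENTRYPOINT ["jupyter-lab"]
"""

print(missing_startup_instructions(sample))  # ['CMD']
```

The continuation handling is there so that a word like `cmd` on a wrapped RUN line is not mistaken for a CMD instruction. An empty result only means both instructions appear in the file, not that their values match what Studio expects.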
I am not quite sure where to report this, but since the docs outline how to build a custom image, I will try here.
I am building this custom image, pushing it to ECR, adding it to SageMaker images, and creating an app image config, as one would according to the docs.
I am defining my Docker image like this:
The only difference I can see in the logs is these two lines at 2024-02-09T10:23:34.006+01:00 and 2024-02-09T10:23:34.006+01:00
In the working images they have a URL that's configured correctly.
I am in VPC-only mode for the domain, but I don't see how that should change anything, since the sagemaker-distribution image works fine.
Would appreciate any pointers.