Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Python 3.10 for SageMaker Studio #104

Merged
merged 7 commits into from
Jun 11, 2024

Conversation

athewsey
Copy link

@athewsey athewsey commented May 31, 2024

Issue #, if available: #94, #103

Description of changes:

  • Updated pyproject.toml to allow Python ^3.10 and re-ran poetry lock to resolve the dependencies
  • Added executed_notebooks/ and .venv/ to gitignore, and tried to start structuring the gitignore a bit since it's becoming large
  • Make copy_s3_content.sh ensure the existence of FMBENCH_WRITE_DIR
  • Make copy_s3_content.sh and debug.sh actually fail (set -e) if any of its commands fail - rather than the previous default behaviour of just carrying on.
  • Make debug.sh actually force re-installation of the wheel on a re-run with same pyproject version, and clean up old wheels if the pyproject version has been incremented (instead of trying to install any the old versions floating around too)
  • Added fix for Error: need to escape, but no escapechar set #103 which I encountered in SM Notebook Instance environment (but still not sure why other people have not been coming across it? It's suspicious...)
  • Swapped wget for curl in copy_s3_content.sh to work around wget failing in SMStudio with 'Unable to locally verify the issuer's authority' aws/sagemaker-distribution#435

Testing done:

To be honest, I find the local debugging procedures for the library a bit sketchy at the moment and still can't get the thing working from source on my laptop. Sharing exact steps I took to verify in different environments:

  1. Verified on a fresh SM Notebook Instance environment (ml.t3.2xlarge):
# (SM NBIs don't have Poetry by default)
curl -sSL https://install.python-poetry.org | python3 -
export PATH="/home/ec2-user/.local/bin:$PATH"

# Fetch source:
cd ~/SageMaker
git clone https://github.com/athewsey/foundation-model-benchmarking-tool fmb
cd fmb
git checkout feat/py310

# Build & test:
poetry env use 3.10
poetry install
./copy_s3_content.sh
./debug.sh
  1. Verified on a fresh SM Studio JupyterLab environment (ml.t3.2xlarge, SM Distribution v1.8):
# (SMStudio doesn't have Poetry by default)
curl -sSL https://install.python-poetry.org | python3 -
export PATH="/home/sagemaker-user/.local/bin:$PATH"

# Fetch source:
cd ~
git clone https://github.com/athewsey/foundation-model-benchmarking-tool fmb
cd fmb
git checkout feat/py310

# Build & test:
./copy_s3_content.sh
./debug.sh

In both cases the default/debug benchmark job seems to run successfully and generate reports.

⚠️ But as further discussed in comment thread, this does not in itself guarantee the built library will be nice to install in those environments, and in fact I think we need to significantly loosen other dependency constraints to make that happen!


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link
Collaborator

@antara678 antara678 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be best to have support for issue (#103) resolved in the same pull request to address all dependencies related to adding support for Python 3.10

athewsey added 3 commits June 11, 2024 10:35
Update copy_s3_content to ensure the FMBENCH_WRITE_DIR is also
created, and fail the whole script if any of the commands error out.
athewsey added 4 commits June 11, 2024 10:54
Since debug.sh does an install of dist/*.whl, it should also
explicitly remove any old versions' artifacts from this folder -
otherwise it fails if it runs in the same environment after pyproject
version was incremented.
Since debug.sh rebuilds the wheel, it should also force
re-installation of the new wheel rather than allowing cache of the
previous version.
Fix aws-samples#103 `Error: need to escape, but no escapechar set` observed on
SageMaker Notebook Instance (pandas v2.1.4)
Recently I've been observing wget fails in SMStudio as per:
aws/sagemaker-distribution#435 - so to get
fmbench debugged successfully in this environment I had to swap out
wget for curl calls in the `copy_s3_content` script.
@athewsey
Copy link
Author

As discussed with @antara678, appreciate the feedback that we need to be clear about what works & does not. Rebased the changes against current main and added some extras (that I was originally planning to leave out to keep this PR as small as possible) to push us closer to a nice end-state. Updated the PR description itself rather than adding all the validation comments here, to keep things clear.


Outstanding gap vs the ideal

The current proposed change does not by itself make FMBench "nice" to install on SageMaker Studio or NBIs, which is my end goal...

The major blocker to that is FMBench is currently pinning so many dependencies to exact versions: See also #105.

As discussed at great length on this nice blog post, applications should be specific about pinning package versions to drive reliable deployments - but libraries should be as loosely specified as possible (while still working), so that users can install them on a range of environments.

At the moment if you simply %pip install dist/fmbench-1.0.45-py3-none-any.whl then it will install on SMStudio, and I can import it, but because it overrides a bunch of library versions there are a load of conflicts that might break other stuff (including aiobotocore, sparkmagic, autogluon).

...But I thought #105 was a bit far-reaching to try and address with my first PR to the repo 😅 Was hoping we could just make some incremental progress of being able to install it with complaints.

Copy link
Collaborator

@antara678 antara678 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update looks good

@antara678 antara678 changed the base branch from main to python-3.10 June 11, 2024 16:22
@antara678 antara678 merged commit 74cd06e into aws-samples:python-3.10 Jun 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants