
Support CustomAttributes in script mode (for PyTorch, HuggingFace, etc) #110

Open
athewsey opened this issue Aug 17, 2022 · 1 comment

@athewsey

Describe the feature you'd like

Users of {PyTorch, HuggingFace, XYZ...} SageMaker DLCs in script mode should be able to access the SageMaker CustomAttributes request header (and perhaps other current/future request context?) via script-mode function overrides (like input_fn and transform_fn), just as TensorFlow users already can, as pointed out in aws/sagemaker-pytorch-inference-toolkit#111.

How would this feature be used? Please describe.

CustomAttributes could be useful for a range of purposes, as outlined in the service doc. One particular use case I have come across multiple times now is building endpoints that process images and video chunks, where we'd like the main request body (and its content type) to be the image/video itself, with additional metadata (video stream ID, language, feature flags, etc.) passed alongside it.
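For context, the client side of this pattern already works today: the boto3 SageMaker runtime client accepts a CustomAttributes header on invocation. A minimal sketch (the endpoint name, file, and attribute payload are placeholders):

```python
import json
import boto3

smr = boto3.client("sagemaker-runtime")

with open("frame_000123.jpg", "rb") as f:
    response = smr.invoke_endpoint(
        EndpointName="my-video-endpoint",  # placeholder
        ContentType="image/jpeg",          # the request body stays the raw image
        # Metadata rides in the CustomAttributes header, not the body:
        CustomAttributes=json.dumps(
            {"stream_id": "cam-42", "language": "en", "flags": ["blur_faces"]}
        ),
        Body=f.read(),
    )
```

The gap is on the serving side: script-mode overrides in the non-TensorFlow containers have no way to read that header.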

Today, AFAIK, users looking to consume this header would need to plan and build quite complex modifications to the DLC serving containers. It's a steep curve to go from the nice function-override API to having to understand the ecosystem of:

  • Transformers vs Handler Services vs Handlers
  • "Default" vs final implementations
  • Relationship between TorchServe (or SageMaker MMS), the sagemaker-inference-toolkit, and the toolkit for their framework of choice, e.g. sagemaker-huggingface-inference-toolkit

To support consuming the extra context simply via custom inference script overrides (input_fn etc.), I believe a change to this sagemaker-inference-toolkit library would be needed/best.
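To make the desired developer experience concrete, here's a hypothetical script-mode override; the context argument and its custom_attributes attribute do not exist today and are purely illustrative:

```python
import json


def input_fn(input_data, content_type, context=None):
    # `context` is hypothetical: the proposal is for the toolkit to pass
    # request context (including CustomAttributes) through to overrides.
    metadata = {}
    if context is not None and context.custom_attributes:
        # CustomAttributes is an opaque string; this example assumes JSON.
        metadata = json.loads(context.custom_attributes)
    # ... deserialize `input_data` according to `content_type`, using
    # `metadata` to pick up stream ID, language, feature flags, etc. ...
    return input_data, metadata
```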

Describe alternatives you've considered

  1. Are there any nice code samples out there demonstrating minimal tweaks to the transformer/handler stacks of containers like PyTorch and HF?
    • I'm not aware of anything in the gap between script mode and big serving-stack changes... but would love to hear!
  2. Breaking change to the handler override function APIs
    • Because there's no dict/namespace-like context object or **kwargs flexibility in the current APIs for input_fn / predict_fn / output_fn / transform_fn, there's nowhere to put additional data without breaking things.
    • The API of these functions could be amended to accept some kind of context object through which CustomAttributes (and any future requirements) could be surfaced.
    • If I understand correctly, Transformer._default_transform_fn defines the signatures expected of these functions in script mode. Transformer.transform seems to dictate the expected API of a transform_fn override, but it doesn't pass through additional context either.
  3. Non-breaking change via Python inspect.signature
    • If a breaking change to these override APIs is not possible, perhaps this library's default Transformer could use inspect.signature(...) to check the provided function's signature at run time and pass extra context arguments only if the signature accepts them (sketched below)?
    • This would allow e.g. def input_fn(input_data, content_type) and def input_fn(input_data, content_type, extra_context) in user code to both work correctly with the library.
    • The change could be made to this base library without requiring default handlers in downstream libraries (e.g. sagemaker-pytorch-inference-toolkit, sagemaker-huggingface-inference-toolkit) to be updated straight away.

Even if (3) is the selected option, I'd suggest introducing an extensible object/namespace argument like context, rather than a specific item like custom_attributes, to avoid growing the complexity of the default transformer code and API as further fields need to be added in future.
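Here's a rough sketch of option (3), assuming the Transformer holds a reference to the user override and some request-scoped context object; the helper names are hypothetical, not existing toolkit API:

```python
import inspect


def _accepts_extra_context(fn):
    """True if `fn` can take a third positional argument (e.g. a context)."""
    params = list(inspect.signature(fn).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_POSITIONAL for p in params):
        return True  # *args absorbs the extra argument
    positional = [
        p for p in params
        if p.kind in (inspect.Parameter.POSITIONAL_ONLY,
                      inspect.Parameter.POSITIONAL_OR_KEYWORD)
    ]
    return len(positional) >= 3


def _call_input_fn(input_fn, input_data, content_type, context):
    """Dispatch to the user's override, passing `context` only if supported."""
    if _accepts_extra_context(input_fn):
        return input_fn(input_data, content_type, context)
    # Existing two-argument overrides keep working unchanged.
    return input_fn(input_data, content_type)
```

With that in place, both def input_fn(input_data, content_type) and def input_fn(input_data, content_type, extra_context) would dispatch correctly, so downstream toolkits could adopt the new argument at their own pace.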

Additional context

As far as I can tell, this library's default Transformer (and therefore its user function override logic) is directly inherited by the framework-specific toolkits mentioned above (e.g. sagemaker-pytorch-inference-toolkit and sagemaker-huggingface-inference-toolkit). So I think adding CustomAttributes support in this toolkit would bubble through to at least those frameworks, once their DLCs are rebuilt against the updated version.

@Alex-Wenner-FHR

I also need this feature. I am using a PyTorch DLC with a custom inference script. When I print out the context, it is just a content-type string, although the documentation implies it is some sort of object (and shows an example of one), which is exactly what I need. I need the ability to pass CustomAttributes during invocation, then receive and handle them in the endpoint. This issue seems to have been stale since 2022; given that the docs imply the behavior, I'd imagine this is a feature that should be integrated.
