Describe the bug
When defining a ProcessingStep using the Python SDK, the pipeline compiler complains if the code= argument is not specified. However, the SDK documentation and code have code=None as a default (which is invalid), and the AWS documentation for processing steps states that the code parameter may be None if the code already exists in the container. In this case the ScriptProcessor's image already contains the code, and the command= parameter defines how to execute it.
To reproduce
Defining a processing step without a code argument will cause an error.
Expected behavior
If a ScriptProcessor based on a custom image is used, the command should just be run directly. No specific code needs to be uploaded or pulled into the container. The expected behaviour can currently be obtained with the SDK by pointing code to any dummy file on S3 or the local machine. The file is then pushed to the container, but the command specified by the ScriptProcessor is still executed.
Screenshots or logs
ValueError: code None url scheme b'' is not recognized. Please pass a file path or S3 url
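To make the message easier to read, here is a minimal stand-in for this kind of check, written with the standard library only. It is not the SDK's implementation; the function name `check_code_location` and its return values are hypothetical. It only illustrates that a local file path or an S3 URL is accepted while `None` is rejected.

```python
from urllib.parse import urlparse


def check_code_location(code):
    """Hypothetical stand-in for a code= check; NOT the SDK's actual code."""
    if code is None:
        # Mirrors the situation in this issue: code= was left at its default.
        raise ValueError(
            "code None is not recognized. Please pass a file path or S3 url"
        )
    scheme = urlparse(code).scheme
    if scheme in ("", "file"):
        return "local"  # plain path such as dummy.py or /opt/ml/dummy.py
    if scheme == "s3":
        return "s3"  # e.g. s3://my-bucket/dummy.py (placeholder bucket name)
    raise ValueError(
        f"code {code} url scheme {scheme} is not recognized. "
        "Please pass a file path or S3 url"
    )
```

Under this sketch, any dummy local path or S3 URL passes, which matches the workaround described under "Expected behavior".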
I have received an internal customer ticket on the same topic and responded to that. Not sure if that was from you, so replying here as well.
The ScriptProcessor, as its name suggests, is for the use case of supplying a custom script or code. That's why it has the code argument as required. In other words, ScriptProcessor is not for the use case of Bring Your Own Processor Container.
However, there is a more general class you can use, Processor, for which you don't need to supply the code. Instead, you supply an image URI, which can be your custom image. This class works with ProcessingStep as well. See the example below:
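A minimal sketch of this approach follows; it is not the maintainer's original snippet. The helper names, the step name, and the instance type are placeholders chosen for illustration. It assumes `sagemaker.processing.Processor` and `sagemaker.workflow.steps.ProcessingStep` from the SageMaker Python SDK v2, and that the custom image either defines its own entrypoint or receives one through `entrypoint`.

```python
def processor_kwargs(image_uri, role, entrypoint=None):
    """Collect arguments for Processor. Note that there is no code= entry."""
    kwargs = {
        "image_uri": image_uri,           # your custom image
        "role": role,                     # IAM role ARN used by the job
        "instance_count": 1,              # placeholder value
        "instance_type": "ml.m5.xlarge",  # placeholder value
    }
    if entrypoint is not None:
        # Optional: command to run inside the container, as a list of strings.
        kwargs["entrypoint"] = list(entrypoint)
    return kwargs


def build_step(image_uri, role, entrypoint=None):
    """Build a ProcessingStep from a plain Processor (needs the SDK and AWS config)."""
    # Imported lazily so processor_kwargs can be used without the SDK installed.
    from sagemaker.processing import Processor
    from sagemaker.workflow.steps import ProcessingStep

    processor = Processor(**processor_kwargs(image_uri, role, entrypoint))
    # No code= argument is passed; the container runs what the image defines.
    return ProcessingStep(name="CustomImageProcessing", processor=processor)
```

Calling `build_step` with your image URI and role ARN should return a step that can be added to a pipeline's `steps` list. Inputs, outputs, and job arguments are omitted here and would be passed to `ProcessingStep` as usual.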