AWS S3 output fails to upload objects in partitioned path #2869

Open
bkh-kl opened this issue Sep 12, 2024 · 6 comments
Labels
aws (Issues relating to AWS), bug, needs more info (An issue that may be a bug or useful feature, but requires more information)

Comments

@bkh-kl

bkh-kl commented Sep 12, 2024

Hello!

I'm using aws_s3 output and would like to utilize certain metadata in the path to enable AWS Glue to identify the partition keys based on the key=value format. (AWS doc)

This is the path example I'd like to upload my objects into: bucket/events/year=2024/month=09/object.gz
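A minimal sketch of how such a partitioned path might be configured, for illustration only. The bucket name and the metadata keys `year` and `month` are assumptions and would need to match whatever the pipeline actually sets upstream:

```yaml
output:
  aws_s3:
    bucket: my-bucket   # hypothetical bucket name
    # Hive-style key=value partitions built from message metadata.
    # The metadata keys "year" and "month" are assumed to be set upstream.
    path: 'events/year=${! metadata("year") }/month=${! metadata("month") }/${! uuid_v4() }.gz'
```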

However, the moment I add the = character in the path, the output fails with the following error message:

Failed to send message to aws_s3: operation error S3: PutObject, https response error StatusCode: 403, RequestID: XYZ, HostID: XYZ, api error SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.

Is this error caused by a misconfiguration on my side, or does the output not support this yet?

I also searched your documentation but couldn't find out whether = must be escaped before it can be used in a path.

Thank you!

@mihaitodor added the bug, needs investigation (It looks as though have all the information needed but investigation is required), and aws labels on Sep 15, 2024
@mihaitodor
Collaborator

Hey @bkh-kl 👋 Thanks for reporting this issue! Unfortunately, I wasn't able to reproduce it using the Localstack Docker container, which seems to accept that path just fine. I also tried replacing = with %3D and the libraries don't attempt to decode it, so you'd get year%3D2024/month%3D09 in the path, which I guess isn't ideal.

I do wonder, though, if the issue might be caused by metadata instead (see docs here). Can you please add a log processor with message: ${! metadata() } to check if any of the metadata fields have invalid values? Also, you could try removing the metadata fields with a mapping processor with meta = deleted().
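The two checks suggested above could be wired into the pipeline roughly as follows. This is only a sketch: the log level is an assumption, and the two processors are alternatives that would normally be tried one at a time rather than together.

```yaml
pipeline:
  processors:
    # Option 1: log all metadata attached to each message so that
    # unexpected or invalid values become visible.
    - log:
        level: INFO   # assumed level; adjust as needed
        message: '${! metadata() }'

    # Option 2: remove all metadata before the message reaches the output.
    - mapping: |
        meta = deleted()
```

If the error disappears once the metadata is removed, that would point at a metadata value rather than the = character in the path.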

@mihaitodor added the needs more info label and removed the needs investigation label on Sep 15, 2024
@bkh-kl
Author

bkh-kl commented Sep 16, 2024

Thanks @mihaitodor

I removed the metadata function from the path and tried a fixed value that contains the = character:

path: 'v1/events/year=55/stream_2-${! uuid_v4() }.parquet'

You are right! With the Localstack S3 bucket it works correctly when I use the above path, as you can see in the following screenshot:

Screenshot 2024-09-16 at 11 36 34

However, when I switch the same stream to AWS S3 Bucket, the same error appears:

{"@service":"redpanda-connect","label":"s3_output","level":"error","msg":"Failed to send message to aws_s3: operation error S3: PutObject, https response error StatusCode: 403, RequestID: XYZ, HostID: XYZ, api error SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.","path":"root.output","stream":"stream-2","time":"2024-09-16T10:12:05Z"}

I have also set force_path_style_urls to false for both buckets and both streams.

@mihaitodor
Collaborator

Thanks for checking @bkh-kl! Dunno how to reproduce it without an AWS account, but I see some other projects do use percent encoding for paths (for example peak/s5cmd#280). Maybe give it a shot and see what happens:

path: '${! ["v1", "events", "year=55", "stream_2-%s.parquet".format(uuid_v4())].map_each(e -> e.escape_url_query()).join("/") }'

@bkh-kl
Author

bkh-kl commented Sep 18, 2024

Unfortunately, that took the encoding literally:

Screenshot 2024-09-18 at 15 38 16

@mihaitodor
Collaborator

OK, thanks for checking! We'll have to try and reproduce it somehow and see what we can do to fix this. If you have experience with Go, please try and see if you can get a hello world example which works.

@bkh-kl
Author

bkh-kl commented Sep 18, 2024

Thanks @mihaitodor! I don't have experience with Go, but I will definitely give it a try.

2 participants