Add support for changing the UNLOAD S3 file suffix #74

Open
dschiavu opened this issue May 11, 2022 · 4 comments

@dschiavu

dschiavu commented May 11, 2022

Hi,

It would be great if the UNLOAD command's S3 filename suffix could be made configurable. Currently it's hardcoded as 0000_part_00 in the Writer class:

val resultKey = s"${unloadCommand.destination.prefix}0000_part_00"

Thanks,
Danijel

@ocadaruma
Collaborator

Thanks for raising an issue.

Is it possible to change the S3 filename suffix in AWS Redshift? (Sorry, I no longer use AWS at work, so I'm not sure about this.)

If so, this makes sense; otherwise, could you elaborate on the motivation a bit?

@dschiavu
Author

dschiavu commented Jun 2, 2022

Thanks for getting back to me.

Unfortunately, it's not possible to tell Redshift to use an alternative S3 UNLOAD file suffix; see https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html. It always adds 000 to the file name, and this suffix is hardcoded.

See also https://stackoverflow.com/questions/39255954/redshift-unloads-file-name

Hence, to make redshift-fake mimic the official AWS Redshift behaviour as closely as possible, I'd suggest changing 0000_part_00 (the current hardcoded suffix) to 000. Or, even better, allow specifying a custom suffix while keeping 000 as the sensible default.
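
To make the suggestion concrete, here is a rough sketch of how a configurable suffix with a 000 default might look. All names in it (UnloadDestination, UnloadKey, resultKey, DefaultSuffix) are made up for illustration and are not taken from the redshift-fake code base; only the string interpolation follows the shape of the existing line in the Writer class.

```scala
// Hypothetical sketch: these names are illustrative assumptions,
// not the actual redshift-fake types.
final case class UnloadDestination(prefix: String)

object UnloadKey {
  // Default suffix as proposed in this thread.
  val DefaultSuffix: String = "000"

  // Builds the S3 object key by appending the suffix to the UNLOAD prefix.
  // Callers that pass no suffix get the default.
  def resultKey(destination: UnloadDestination, suffix: String = DefaultSuffix): String =
    s"${destination.prefix}$suffix"
}
```

With this shape, UnloadKey.resultKey(UnloadDestination("unload/venue_")) would use the default, while passing "0000_part_00" as the second argument would reproduce the current behaviour for anyone who relies on it.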

@dschiavu
Author

FYI, I'm working on this feature now and plan to raise a pull request soon.

@dschiavu
Author

dschiavu commented Jul 1, 2022

Done in #75

ocadaruma added a commit that referenced this issue Jul 8, 2022
…o_official_spec

[#74] Change UNLOAD command's CSV file suffix to match official AWS Redshift: