[FEA] Reimplement $ transpilation using cuDF new line terminator support #11554

Open
NVnavkumar opened this issue Oct 1, 2024 · 0 comments
Labels
? - Needs Triage Need team to review and classify feature request New feature or request

Comments

@NVnavkumar
Collaborator

Is your feature request related to a problem? Please describe.

cuDF added support for multiple new-line characters in rapidsai/cudf#15961, which makes it possible to support the different Java Unicode line terminator characters. Using it requires passing a flag to the cuDF regex APIs to enable this mode, and updating the transpiler to a simpler implementation of $, which then only needs to add support for the \r\n combination in addition to the individual characters supported by cuDF:

  • \n line-feed (already supported)
  • \r carriage-return
  • \u0085 next line (NEL)
  • \u2028 line separator
  • \u2029 paragraph separator
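
For reference, the target semantics can be sketched outside of the plugin. The snippet below is only an illustration in Python's `re` module of how Java's non-MULTILINE `$` treats the terminators listed above, with \r\n counted as a single terminator. It is not the transpiler code and does not use any cuDF API; `JAVA_DOLLAR` and `java_style_end_anchor` are hypothetical names, and the helper naively handles only a single trailing `$`.

```python
import re

# Hypothetical emulation of Java's non-MULTILINE `$` (an assumption for
# illustration, not the spark-rapids transpiler output): match at end of
# input, optionally before exactly one trailing line terminator.
# \r\n counts as one terminator, and the (?<!\r) guard keeps the anchor
# from matching in the middle of a \r\n pair.
JAVA_DOLLAR = r"(?=(?:\r\n|[\r\u0085\u2028\u2029]|(?<!\r)\n)?\Z)"


def java_style_end_anchor(pattern: str) -> str:
    """Swap a trailing `$` for the emulation above.

    Naive on purpose: only a single trailing `$` is handled, and a
    pattern ending in an escaped `\\$` is left untouched.
    """
    if pattern.endswith("$") and not pattern.endswith(r"\$"):
        return pattern[:-1] + JAVA_DOLLAR
    return pattern


if __name__ == "__main__":
    anchored = re.compile(java_style_end_anchor("a$"))
    terminators = ["\n", "\r", "\r\n", "\u0085", "\u2028", "\u2029"]
    for t in terminators:
        # Each single trailing terminator still lets `a$` match.
        print(repr(t), bool(anchored.search("a" + t)))
    # Two trailing terminators are not a match.
    print(repr("\n\n"), bool(anchored.search("a\n\n")))
```

Python's own `$` only recognizes \n, so `re.search("a$", "a\r")` finds nothing, while the emulated anchor matches. This is the same kind of gap the new cuDF mode is meant to close for the individual characters, leaving \r\n as the remaining special case.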

Additional context

This might fix failing tests here:

Also, can look into a possible solution for:
