Address dask_cudf.read_csv chunksize deprecation #4379

Merged: 2 commits, May 8, 2024

mroeschke
Contributor

xref #4271

In dask_cudf.read_csv, chunksize was deprecated in favor of blocksize.

Also removed an unsupported chunksize argument from a cudf.read_csv call.
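
For readers updating similar call sites, the change can be sketched roughly as below. This is only an illustration, not the diff from this PR: the helper name `read_edgelist` and the "256 MiB" default are assumptions made for the example, and `dask_cudf` is imported lazily because it needs a GPU-enabled RAPIDS install.

```python
def read_edgelist(path, blocksize="256 MiB"):
    """Read a CSV file into a dask_cudf DataFrame, partitioned by `blocksize`.

    `read_edgelist` is a hypothetical helper used only for illustration.
    """
    # Imported inside the function so this module can be imported
    # on machines without a GPU or without RAPIDS installed.
    import dask_cudf

    # Before (deprecated keyword):
    #     dask_cudf.read_csv(path, chunksize=blocksize)
    # After:
    return dask_cudf.read_csv(path, blocksize=blocksize)
```

Only the keyword name changes at the call site; the value passed keeps the same meaning, as confirmed later in this thread.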

Contributor

@acostadon acostadon left a comment

Looks good

@alexbarghi-nv added the bug (Something isn't working) and breaking (Breaking change) labels and removed the python and benchmarks labels on May 1, 2024
Contributor

@rlratzel rlratzel left a comment

Thank you very much, @mroeschke !

So is blocksize simply a rename of the old chunksize arg, or are there bigger differences? I'm just asking in case we also need to update get_chunksize() to return something that's computed differently.

@mroeschke
Contributor Author

So is blocksize simply a rename of the old chunksize arg, or are there bigger differences?

I believe the change is just a pure rename based on rapidsai/cudf#12394, but tagging @rjzamora just to confirm.

@rjzamora
Member

rjzamora commented May 3, 2024

I believe the change is just a pure rename based on rapidsai/cudf#12394, but tagging @rjzamora just to confirm.

Yeah, I believe someone arbitrarily chose the name chunksize in the early days of dask-cudf, even though the upstream dask.dataframe function was already using the name blocksize for the exact same purpose. When we started closing the gap between the dask_cudf and dask.dataframe APIs a couple years ago, we deprecated chunksize in favor of blocksize.
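
As a rough illustration of how a pure keyword rename like this is commonly deprecated, here is a generic sketch. It is not dask_cudf's actual implementation: `rename_kwarg` and `read_csv_stub` are hypothetical names, and the choice of `FutureWarning` is an assumption.

```python
import functools
import warnings


def rename_kwarg(old, new):
    """Decorator: accept `old` as a deprecated alias for keyword `new`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old in kwargs:
                if new in kwargs:
                    raise TypeError(
                        f"pass only {new!r}, not both {old!r} and {new!r}"
                    )
                warnings.warn(
                    f"{old!r} is deprecated; use {new!r} instead",
                    FutureWarning,
                    stacklevel=2,
                )
                # Forward the old keyword's value under the new name.
                kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rename_kwarg("chunksize", "blocksize")
def read_csv_stub(path, blocksize="default"):
    # Hypothetical stand-in for a reader; it just echoes its arguments.
    return {"path": path, "blocksize": blocksize}
```

With a shim like this, old callers keep working but see a warning, which is why downstream projects update their call sites before the alias is removed.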

@vyasr vyasr requested a review from rlratzel May 7, 2024 23:11
Contributor

@rlratzel rlratzel left a comment

Thank you!

@rlratzel
Contributor

rlratzel commented May 8, 2024

/merge

@rapids-bot rapids-bot bot merged commit 78227b3 into rapidsai:branch-24.06 May 8, 2024
132 checks passed
@mroeschke mroeschke deleted the warning/csv/chunksize branch May 8, 2024 16:34
Labels
breaking (Breaking change), bug (Something isn't working)
5 participants