# The `BatchSplittingSampler` cannot handle empty batches #522
Summary:

## Background

Poisson sampling can sometimes result in an empty input batch, especially if the sampling rate (i.e. the expected batch size) is small. This is not out of the ordinary and should be handled accordingly: gradients (signal) should be set to 0 and noise should still be added. We've made an [attempt](https://github.com/pytorch/opacus/blob/main/opacus/data_loader.py#L31) to support this behaviour, but it wasn't fully covered with tests and got broken over time. As a result, at the moment we have a DataLoader that is capable of producing zero-sized batches, a GradSampleModule that only partially supports them, and a DPOptimizer that doesn't support them at all.

This PR addresses Issue #522 (thanks xichens for reporting).

## Improvements

This diff fixes the following:

* DPOptimizer can now handle empty batches
* BatchMemoryManager can now handle empty batches
* Adds a PrivacyEngine test with empty batches
* Adds a BatchMemoryManager test with empty batches
* DataLoader now respects the dtype of the inputs (previously, empty batches only worked with float input tensors)
* ExpandedWeights still can't process empty batches, which we call out in our readme (FYI samdow)

Pull Request resolved: #530
Reviewed By: alexandresablayrolles
Differential Revision: D40676213
Pulled By: ffuuugor
fbshipit-source-id: dc637fd91a3c20d481d22c5de97d22d42e423a71
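The intended behaviour described above — zero signal but noise still added for an empty batch — can be illustrated with a minimal NumPy sketch. This is an assumption about the semantics, not Opacus's actual DPOptimizer code; the function name `noisy_grad_sum` and the omission of per-sample clipping are simplifications for illustration.

```python
import numpy as np

def noisy_grad_sum(per_sample_grads, noise_multiplier, max_grad_norm, rng):
    # Sketch (not Opacus's implementation): with Poisson sampling, an empty
    # batch contributes zero signal, but calibrated Gaussian noise must still
    # be added so each optimizer step remains a valid noisy release.
    signal = per_sample_grads.sum(axis=0)  # shape (d,); all zeros if the batch is empty
    noise = rng.normal(0.0, noise_multiplier * max_grad_norm, size=signal.shape)
    return signal + noise

rng = np.random.default_rng(0)
empty_batch = np.empty((0, 10))               # zero-sized batch, gradient dim d = 10
g = noisy_grad_sum(empty_batch, 1.0, 1.0, rng)
print(g.shape)  # (10,) — a well-defined noisy gradient even with no samples
```

Summing over the empty leading axis yields a zero vector of the right shape, which is why a zero-sized batch can still drive a normal optimizer step.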
## 🐛 Bug

When Poisson sampling is used, empty batches can occur. However, the `BatchSplittingSampler` from `batch_memory_manager.py`, which is called when using the `BatchMemoryManager`, cannot handle empty batches and will throw an error.

## To Reproduce

To reproduce it, see this colab link.

## Expected behavior

The wrapped batch sampler should handle empty batches properly.

## Additional context

I think the issue is with this line. When it is called, `batch_idxs` can be an empty list, since it comes from a `UniformWithReplacementSampler`, but `np.array_split` does not expect its first argument to be empty.