I'm excited that Nimble has such flexible encodings/compressions! It shouldn't be too hard to add Pcodec, which generally gets a much better compression ratio on numerical data than the traditional dictionary/RLE/.../LZ approach. Compression and decompression speeds could benefit too. This seems important, especially for an ML-focused columnar format.
Hi @mwlon, I'm just reading about Pcodec and it does seem like something that would be interesting to try out. Is this something you would like to do? We can help ensure that Nimble has the right extensibility APIs for you to add it, and we would be interested in experimental results.
I'm looking at the repo more now, but I don't see a spec doc. Does Nimble have a concept equivalent to Parquet's fine-grained "data pages"? If not, does it plan to have finer-grained pages in the future? This might affect a Pcodec implementation.