Initial release #22

Open: wants to merge 2 commits into base: main


hendrikvanantwerpen (Contributor) commented Oct 3, 2024

The bpe crate name was released, so I published an initial version of the crate to claim it.

Things changed:

  • Added some fields to the crate manifest.
  • Changed the serialization of token dictionaries. The problem was that crates.io has a size limit of 10MB, while our serialized BPE instances were around 15MB and 30MB. For now I've opted to serialize the token lists + hash factor, and build the BPE instance in the lazy function. The performance impact is limited to the first time the values are initialized, which happens only once per run. But I'm happy to iterate on this if necessary.
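
To illustrate the trade-off in the second bullet, here is a minimal sketch of the general pattern: ship only a compact token list plus a hash factor, and build the larger lookup structure lazily on first access. It is not the crate's actual code. `TOKENS`, `HASH_FACTOR`, `Dictionary`, `hash_bytes`, and `dictionary` are hypothetical names, the hash is a toy polynomial hash, and `std::sync::OnceLock` stands in for whatever lazy mechanism the crate really uses.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Hypothetical compact data that would be shipped with the crate
// instead of a fully serialized instance.
const TOKENS: &[&str] = &["a", "b", "ab", "abc"];
const HASH_FACTOR: u64 = 31;

// Counts how often the expensive build step runs (illustration only).
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the full instance: a derived lookup table
// that is larger than the token list it is built from.
struct Dictionary {
    factor: u64,
    hash_to_token: HashMap<u64, u32>,
}

// Toy polynomial hash over the token bytes (an assumption, not the real hash).
fn hash_bytes(bytes: &[u8], factor: u64) -> u64 {
    bytes
        .iter()
        .fold(0u64, |h, &b| h.wrapping_mul(factor).wrapping_add(b as u64 + 1))
}

impl Dictionary {
    // Rebuilds the derived table from the compact inputs.
    fn build(tokens: &[&str], factor: u64) -> Self {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        let hash_to_token: HashMap<u64, u32> = tokens
            .iter()
            .enumerate()
            .map(|(id, t)| (hash_bytes(t.as_bytes(), factor), id as u32))
            .collect();
        Dictionary { factor, hash_to_token }
    }

    // Looks up the token id for an exact token string, if present.
    fn token_id(&self, text: &str) -> Option<u32> {
        self.hash_to_token
            .get(&hash_bytes(text.as_bytes(), self.factor))
            .copied()
    }
}

// The build cost is paid on the first call only; later calls return
// the same cached instance.
fn dictionary() -> &'static Dictionary {
    static INSTANCE: OnceLock<Dictionary> = OnceLock::new();
    INSTANCE.get_or_init(|| Dictionary::build(TOKENS, HASH_FACTOR))
}
```

Calling `dictionary()` repeatedly returns the same reference, so the rebuild cost is incurred once per process, which matches the "only once per run" reasoning above.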

I decided to go ahead with the release to make sure we got the name. I imagine we'll do another release soon with any additional changes or polish we think are necessary.
