Use u32 everywhere to represent token IDs #288

Merged: 1 commit, Jul 22, 2024

Conversation

robertknight
Owner

Previously `rten-text` used `usize` for token IDs and `rten-generate` used `u32`, which required extra conversions between the two. Settle on `u32` everywhere, since every tokenizer that I'm aware of has hundreds of thousands of tokens at most, well within the range of `u32`.
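
For illustration, here is a minimal sketch of the kind of conversion that a `usize`/`u32` mismatch requires, and of what remains once IDs are `u32` throughout. The names `TokenId`, `to_generator_ids` and `lookup` are hypothetical and are not taken from the rten crates; this is not the code changed in this PR.

```rust
use std::num::TryFromIntError;

/// Hypothetical alias for a token ID; not part of the rten API.
type TokenId = u32;

/// With `usize` IDs on the tokenizer side, handing them to code that
/// expects `u32` needs a checked conversion of every element.
fn to_generator_ids(ids: &[usize]) -> Result<Vec<TokenId>, TryFromIntError> {
    ids.iter().map(|&id| TokenId::try_from(id)).collect()
}

/// With `u32` IDs everywhere, the only cast left is the widening one
/// used when an ID indexes into a slice (assumes a 32-bit or wider target).
fn lookup<'a>(vocab: &[&'a str], id: TokenId) -> Option<&'a str> {
    vocab.get(id as usize).copied()
}
```

A side effect of the narrower type is that a buffer of IDs takes half the memory on 64-bit targets, although the PR description only cites the removed conversions as motivation.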

@robertknight robertknight merged commit 1572cb2 into main Jul 22, 2024
2 checks passed
@robertknight robertknight deleted the u32-token-ids branch July 22, 2024 21:08