RepoNotes
Toggle theme
Loading ...
Sign in
explosion/curated-tokenizers
Lightweight piece tokenization library
12
7
2
Readme
Notes
New Note
Sign in to add notes for this repository
RN
explosion
1460
Followers
0
Following
78
Repos
No Twitter
Software company specializing in developer tools and tailored solutions for AI and Natural Language Processing
Branches
windows-ci
spacy-3.x
rustic
Releases
v2.0.0: Byte BPE encoding performance improvements
v0.9.2: picklable piece processors
v0.0.9: picklable piece processors
Commits
Set version to 2.0.0 (#59)
Set version to v2.0.0.dev0 (#58) * Set version to 2.0.0.dev0 * Set minimum version of Python to 3.9 Also update CI to test 3.9 as the lowest version.
bbpe: Add token LRU cache and avoid string lookups (#57) * bbpe: Add token LRU cache and avoid string lookups This change combines some performance improvemets: - We now cache token -> ids look
Issues
Test on Python 3.13 and upgrade GitHub Actions
Add Python 3.13 and 3.14 wheels for curated-tokenizers==0.0.9