almaghrabima/deeplatent-tokenizer
Tags: Arabic, English, tokenizer, sarf, morpheme, bpe, deeplatent, bilingual, arabic-english, arabic, morphology
License: cc-by-nc-4.0
Branch: main | deeplatent-tokenizer | 9.24 MB
1 contributor | History: 28 commits
Latest commit: almaghrabima | Update README: remove morpheme_map.json references (now bundled in suhail-nlp) | a28373d | verified | 3 months ago
File | Scan status | Size | Last commit | Updated
.gitattributes | Safe | 127 Bytes | Upload .gitattributes with huggingface_hub | 3 months ago
README.md | Safe | 6.3 kB | Update README: remove morpheme_map.json references (now bundled in suhail-nlp) | 3 months ago
special_tokens_map.json | Safe | 449 Bytes | Upload special_tokens_map.json with huggingface_hub | 3 months ago
token_bytes.pt | pickle; detected pickle imports (3): "collections.OrderedDict", "torch.IntStorage", "torch._utils._rebuild_tensor_v2" | 402 kB (xet) | Upload token_bytes.pt with huggingface_hub | 3 months ago
tokenizer.json | Safe | 7.54 MB | Upload tokenizer.json with huggingface_hub | 3 months ago
tokenizer.pkl | pickle; detected pickle imports (1): "tiktoken.core.Encoding" | 1.29 MB (xet) | Upload tokenizer.pkl with huggingface_hub | 3 months ago
tokenizer_config.json | Safe | 631 Bytes | Upload tokenizer_config.json with huggingface_hub | 3 months ago
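The "Detected Pickle imports" entries for token_bytes.pt and tokenizer.pkl name the modules and classes a pickle stream would import when loaded. As a rough illustration only, the sketch below shows one way such a list could be produced locally with Python's standard pickletools module, without unpickling anything. It is not the Hub's scanner: the function name list_pickle_imports is hypothetical, and the STACK_GLOBAL handling is a simple heuristic that assumes the module and attribute names are the two most recently pushed strings.

```python
import pickle
import pickletools
from collections import OrderedDict


def list_pickle_imports(data: bytes) -> set[str]:
    """Return 'module.name' globals referenced by a pickle stream.

    Hypothetical helper: it only walks opcodes and never executes the pickle.
    """
    imports = set()
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: arg has the form "module name"
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: heuristic, assumes the last two strings pushed
            # are the module and the attribute name
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


# Small in-memory demonstration; no file from the repository is read here.
demo = pickle.dumps(OrderedDict(a=1), protocol=4)
print(sorted(list_pickle_imports(demo)))
```

Running the same function over the raw bytes of a downloaded pickle file would give a comparable list. Note that a .pt file saved by recent PyTorch versions is typically a zip archive containing a pickle, so the inner pickle would have to be extracted first.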