almaghrabima/deeplatent-tokenizer

Languages: Arabic, English
Tags: tokenizer, sarf, morpheme, bpe, deeplatent, bilingual, arabic-english, arabic, morphology
deeplatent-tokenizer
9.24 MB
  • 1 contributor
History: 28 commits
almaghrabima
Update README: remove morpheme_map.json references (now bundled in suhail-nlp)
a28373d verified 3 months ago
  • .gitattributes
    127 Bytes
    Upload .gitattributes with huggingface_hub 3 months ago
  • README.md
    6.3 kB
    Update README: remove morpheme_map.json references (now bundled in suhail-nlp) 3 months ago
  • special_tokens_map.json
    449 Bytes
    Upload special_tokens_map.json with huggingface_hub 3 months ago
  • token_bytes.pt

    Detected Pickle imports (3)

    • "collections.OrderedDict",
    • "torch.IntStorage",
    • "torch._utils._rebuild_tensor_v2"

    402 kB
    xet
    Upload token_bytes.pt with huggingface_hub 3 months ago
  • tokenizer.json
    7.54 MB
    Upload tokenizer.json with huggingface_hub 3 months ago
  • tokenizer.pkl

    Detected Pickle imports (1)

    • "tiktoken.core.Encoding"

    1.29 MB
    xet
    Upload tokenizer.pkl with huggingface_hub 3 months ago
  • tokenizer_config.json
    631 Bytes
    Upload tokenizer_config.json with huggingface_hub 3 months ago
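
The "Detected Pickle imports" entries above name the globals a pickle file would import when loaded. As a rough illustration of how such a list can be produced without unpickling anything, here is a minimal sketch using Python's standard `pickletools` module. It is not the Hub's scanner: the function name `list_pickle_imports` is hypothetical, the `STACK_GLOBAL` handling is a simple heuristic, and it works on a raw pickle stream only (a `.pt` file saved by recent PyTorch versions is typically a zip archive with the pickle stored inside it, which would need to be extracted first).

```python
import collections
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> list[str]:
    """Return sorted 'module.name' globals referenced by a pickle stream.

    The stream is only disassembled, never executed. Hypothetical helper,
    written for illustration.
    """
    imports = set()
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" separated by a space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name were pushed as the two preceding
            # strings. Heuristic: this misses names fetched back from the memo.
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return sorted(imports)


# Example input: a small pickle built locally, not a file from this repository.
sample = pickle.dumps(collections.OrderedDict(a=1))
print(list_pickle_imports(sample))
```

Because only opcodes are read, this kind of inspection is safe to run on an untrusted file, whereas `pickle.load` can execute arbitrary code from it.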