seqSight_4096_512_89M-at-base-multi_16s_gt_lr_1e-04_genus

This model uses a custom GuidedTokenizer with motif-aware tokenization for genomic sequences.

Usage

from transformers import AutoModelForSequenceClassification
from guidedTokenizer import GuidedTokenizer  # custom class; guidedTokenizer.py must be on the import path

# Load the model and tokenizer from the same repository
repo_id = "vedantM/seqSight_4096_512_89M-at-base-multi_16s_gt_lr_1e-04_genus"
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = GuidedTokenizer.from_pretrained(repo_id)

# The tokenizer automatically loads the motifs from motif_config.json
print(f"Loaded {len(tokenizer.motifs)} motifs")

Files included

  • Standard model files (config.json, pytorch_model.bin, etc.)
  • Standard tokenizer files (tokenizer.json, tokenizer_config.json, etc.)
  • motif_config.json: Contains the motifs and configuration for the GuidedTokenizer
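
To check what the tokenizer will load, motif_config.json can be inspected directly. The following is a minimal sketch, not the tokenizer's own loading code. It assumes the file is JSON and that the motifs are stored either as a top-level list or under a "motifs" key. The schema is not documented here, so the helper name `read_motifs` and the key name are hypothetical and may need adjusting.

```python
import json
from pathlib import Path


def read_motifs(config_path):
    """Return the motif list from a motif_config.json-style file.

    Assumption (not confirmed by the model card): motifs are either the
    top-level JSON list or stored under a "motifs" key.
    """
    data = json.loads(Path(config_path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return data
    if isinstance(data, dict) and isinstance(data.get("motifs"), list):
        return data["motifs"]
    raise ValueError(f"No motif list found in {config_path}")
```

If the assumed layout holds, `len(read_motifs("motif_config.json"))` should agree with `len(tokenizer.motifs)` after loading the tokenizer.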

Note

This model requires the custom GuidedTokenizer class. Make sure guidedTokenizer.py is on your Python import path (for example, in your working directory) before running the usage code.
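
One way to fail early when the file is missing is to check that the module is importable before loading anything. This is a sketch using only the standard library. The helper name `ensure_module_available` and the optional `search_dir` argument are illustrative and not part of this repository.

```python
import importlib.util
import sys


def ensure_module_available(module_name="guidedTokenizer", search_dir=None):
    """Return True if `module_name` can be found on the import path.

    If `search_dir` is given, it is prepended to sys.path first, so a
    local copy of guidedTokenizer.py in that directory is picked up.
    """
    if search_dir is not None and search_dir not in sys.path:
        sys.path.insert(0, search_dir)
    return importlib.util.find_spec(module_name) is not None
```

Calling `ensure_module_available(search_dir=".")` before the import lets a script raise a clear error message instead of failing with a bare ImportError.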

Model size: 92.5M params (Safetensors, tensor type F32)