[To be released soon]

BHASHA-7B-2K-HI

A 7B foundation language model pre-trained on Hindi text with a 2048-token context window. Weights are initialised from the bhasha-7b-256-hi model. Uses an extended vocabulary with knowledge transfer within the embedding space.
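The card does not specify how the knowledge transfer into the extended embedding space is performed. One common heuristic when growing a vocabulary is to initialise the new token rows from the mean of the existing embeddings; the sketch below illustrates that idea only and is not the authors' confirmed method (the `extend_embeddings` helper, the noise scale, and the toy dimensions are all assumptions).

```python
import numpy as np

def extend_embeddings(old_emb: np.ndarray, new_vocab_size: int) -> np.ndarray:
    """Grow an embedding matrix to new_vocab_size rows.

    Rows for newly added tokens are initialised to the mean of the
    existing embeddings (one common knowledge-transfer heuristic, assumed
    here for illustration), plus small Gaussian noise so the new rows
    are not identical.
    """
    old_vocab_size, d_model = old_emb.shape
    assert new_vocab_size >= old_vocab_size
    mean_vec = old_emb.mean(axis=0)
    rng = np.random.default_rng(0)
    noise = rng.normal(scale=0.02,
                       size=(new_vocab_size - old_vocab_size, d_model))
    new_rows = mean_vec + noise
    # Original rows are kept unchanged; only new rows are appended.
    return np.concatenate([old_emb, new_rows], axis=0)

# Toy example with a small d_model: extend a 32000-token table
# to the card's 61772-token vocabulary size.
base = np.random.default_rng(1).normal(size=(32000, 8))
extended = extend_embeddings(base, 61772)
print(extended.shape)  # (61772, 8)
```

In practice the same effect is achieved on a loaded model by resizing its token embeddings and then overwriting the appended rows with the chosen initialisation.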

Model Description

| Hyperparameter   | Value                 |
|------------------|-----------------------|
| n_parameters     | 6,695,735,296 (6.69B) |
| n_layers         | 32                    |
| n_heads          | 32                    |
| d_model          | 4096                  |
| vocab size       | 61772                 |
| sequence length  | 2048                  |

This model is still being pre-trained. Updated weights, along with more details, will be available soon.

Follow us to get updates on the progress.

