sagar007/multigemma

Tags: Image-to-Text, PyTorch, English, multimodal, vision-language, gemma, clip, llava, lightning
You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Gated model: you can list files but not access them.

Preview of files found in this repository
  • inference_results
    Add inference result images (2 months ago)
  • multimodal-gemma-epoch=02-val
    Upload trained Multimodal Gemma 270M checkpoint (A100 optimized) (2 months ago)
  • multimodal-gemma-epoch=03-val
    Upload trained Multimodal Gemma 270M checkpoint (A100 optimized) (2 months ago)
  • .gitattributes (2.24 kB)
    Add inference result images (2 months ago)
  • README.md (3.73 kB)
    Update model card with inference images gallery (2 months ago)
  • final_model.ckpt (1.2 GB, xet)
    Full training on 157K LLaVA samples (3 epochs) - Loss: 1.333 (2 months ago)
  • last.ckpt (1.24 GB, xet)
    Upload trained Multimodal Gemma 270M checkpoint (A100 optimized) (2 months ago)
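
Because the repository is gated, the files listed above can only be fetched by an account that has accepted the access conditions. The following is a minimal sketch, not the author's documented procedure, of how an authenticated download request for one of these files could be built with the Python standard library. It assumes the Hub's usual `resolve` URL layout and a default branch named `main`; the helper name `build_download_request` and the token value `hf_xxx` are hypothetical placeholders.

```python
from urllib.parse import quote
from urllib.request import Request

HUB_ENDPOINT = "https://huggingface.co"


def build_download_request(repo_id, filename, token, revision="main"):
    """Build (but do not send) an authenticated request for one repo file.

    Hypothetical helper for illustration. `revision="main"` is an
    assumption about the repository's default branch.
    """
    if not token:
        # Gated repositories reject anonymous file downloads.
        raise ValueError("a gated repository needs an access token")
    # Percent-encode path parts so unusual characters stay valid in the URL.
    url = (
        f"{HUB_ENDPOINT}/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )
    # The access token is sent as a bearer token in the Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# "hf_xxx" is a placeholder, not a real token.
req = build_download_request(
    "sagar007/multigemma", "final_model.ckpt", token="hf_xxx"
)
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` would start the download, and should only succeed once the account that owns the token has accepted the conditions on the model page. The `huggingface_hub` library's `hf_hub_download` function is the more common route and adds local caching.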