Update model card metadata and add external links
This PR improves the model card for Innovator-VL-8B-Instruct:
- Updates the `pipeline_tag` to `image-text-to-text` to better reflect its multimodal capabilities.
- Adds `library_name: transformers` to the metadata to enable Hub integration features.
- Adds direct links to the paper, project page, and GitHub repository.
- Maintains the existing sample usage and architectural details.
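Taken together, these metadata edits leave the model card's YAML front matter reading as follows (content taken directly from the diff below):

```yaml
---
language:
- en
- zh
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---
```

The `pipeline_tag` drives the widget and task filtering on the Hub, while `library_name` tells the Hub which library's "Use this model" snippet to show.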
README.md
CHANGED

````diff
@@ -1,13 +1,16 @@
 ---
-license: mit
 language:
 - en
 - zh
-
+license: mit
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 
 # Innovator-VL-8B-Instruct
 
+[**Paper**](https://huggingface.co/papers/2601.19325) | [**Project Page**](https://innovatorlm.github.io/Innovator-VL) | [**GitHub**](https://github.com/InnovatorLM/Innovator-VL)
+
 ## Model Summary
 
 **Innovator-VL-8B-Instruct** is a multimodal instruction-following large language model designed for scientific understanding and reasoning. The model integrates strong general-purpose vision-language capabilities with enhanced scientific multimodal alignment, while maintaining a fully transparent and reproducible training pipeline.
@@ -136,7 +139,8 @@ print(output_text)
 ```bibtex
 @article{wen2026innovator,
 title={Innovator-VL: A Multimodal Large Language Model for Scientific Discovery},
-author={Wen, Zichen and Yang, Boxue and
+author={Wen, Zichen and Yang, Boxue and Bird, Shuang and Zhang, Yaojie and Han, Yuhang and Ke, Junlong and Wang, Cong and others},
 journal={arXiv preprint arXiv:2601.19325},
 year={2026}
-}
+}
+```
````