🧬 BioReason-Pro
Advancing Protein Function Prediction with
Multimodal Biological Reasoning



BioReason-Pro SFT

Supervised fine-tuned (SFT) checkpoint of BioReason-Pro, a multimodal reasoning LLM for protein function prediction. The model integrates ESM3 protein embeddings, a Gene Ontology (GO) graph encoder, and biological context (InterPro domains, STRING interactions) within a Qwen3-4B backbone to generate structured reasoning traces and functional annotations.

Training data: wanglab/bioreason-pro-sft-reasoning-data
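
A minimal sketch of fetching the checkpoint files and the training data from the Hugging Face Hub. It assumes the `huggingface_hub` and `datasets` packages are installed, that the checkpoint is hosted under the repository id `wanglab/bioreason-pro-sft`, and that the dataset loads without a configuration name. The helper names `download_checkpoint` and `load_training_data` are illustrative, not part of the project. Since the model couples a Qwen3-4B backbone with protein and GO encoders, inference may well require the project's own code, so this sketch only downloads files.

```python
# Hub identifiers. DATASET_ID is taken from this card; MODEL_ID is assumed
# to be the repository that this card describes.
MODEL_ID = "wanglab/bioreason-pro-sft"
DATASET_ID = "wanglab/bioreason-pro-sft-reasoning-data"


def download_checkpoint(repo_id: str = MODEL_ID) -> str:
    """Download every file in the model repository and return the local path."""
    # Imported lazily so this module can be imported without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


def load_training_data(dataset_id: str = DATASET_ID):
    """Load the SFT reasoning dataset (assumes a default configuration)."""
    from datasets import load_dataset

    return load_dataset(dataset_id)


# Example usage (requires network access):
#   checkpoint_dir = download_checkpoint()
#   training_data = load_training_data()
```

Both helpers cache their downloads locally, so repeated calls do not fetch the files again.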

Citation

If you find this work useful, please cite our papers:

@article {Fallahpour2026.03.19.712954,
    author = {Fallahpour, Adibvafa and Seyed-Ahmadi, Arman and Idehpour, Parsa and Ibrahim, Omar and Gupta, Purav and Naimer, Jack and Zhu, Kevin and Shah, Arnav and Ma, Shihao and Adduri, Abhinav and G{\"u}loglu, Talu and Liu, Nuo and Cui, Haotian and Jain, Arihant and de Castro, Max and Fallahpour, Amirfaham and Cembellin-Prieto, Antonio and Stiles, John S. and Nem{\v c}ko, Filip and Nevue, Alexander A. and Moon, Hyungseok C. and Sosnick, Lucas and Markham, Olivia and Duan, Haonan and Lee, Michelle Y. Y. and Salvador, Andrea F. M. and Maddison, Chris J. and Thaiss, Christoph A. and Ricci-Tam, Chiara and Plosky, Brian S. and Burke, Dave P. and Hsu, Patrick D. and Goodarzi, Hani and Wang, Bo},
    title = {BioReason-Pro: Advancing Protein Function Prediction with Multimodal Biological Reasoning},
    elocation-id = {2026.03.19.712954},
    year = {2026},
    doi = {10.64898/2026.03.19.712954},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2026/03/20/2026.03.19.712954},
    eprint = {https://www.biorxiv.org/content/early/2026/03/20/2026.03.19.712954.full.pdf},
    journal = {bioRxiv}
}

@misc{fallahpour2025bioreasonincentivizingmultimodalbiological,
      title={BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model}, 
      author={Adibvafa Fallahpour and Andrew Magnuson and Purav Gupta and Shihao Ma and Jack Naimer and Arnav Shah and Haonan Duan and Omar Ibrahim and Hani Goodarzi and Chris J. Maddison and Bo Wang},
      year={2025},
      eprint={2505.23579},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.23579}, 
}

Model size: 4B params (Safetensors)
Tensor type: BF16
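
As a rough guide to what these figures imply, the sketch below estimates the size of the raw weights from the parameter count and tensor type. It is a back-of-the-envelope calculation that assumes exactly 4e9 parameters (the card rounds to "4B") and counts weights only; activations, the KV cache, and any components stored outside this checkpoint are not included. The function name `weight_footprint_gb` is illustrative.

```python
# Bytes used per parameter for common tensor types.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4}


def weight_footprint_gb(num_params: float, dtype: str = "bf16") -> float:
    """Return the approximate size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


approx_gb = weight_footprint_gb(4e9, "bf16")
print(f"~{approx_gb:.0f} GB of weights")  # prints "~8 GB of weights"
```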