---
license: apache-2.0
pipeline_tag: text-generation
arxiv: 2512.24873
tags:
- agent
- moe
---
# ROME-30B-A3B
<p align="left" style="display: flex; gap: 8px; align-items: center;">
<a href="https://arxiv.org/pdf/2512.24873" target="_blank">
<img src="https://img.shields.io/badge/Paper-arXiv%3A2512.24873-red" alt="Paper">
</a>
<a href="https://faithful-almanac-add.notion.site/The-Bitter-Lesson-Behind-Building-Agentic-RL-in-Terminal-Environments-2eaddd45837f80c9ad2ed6a15ef3c1a1?pvs=74" target="_blank">
<img src="https://img.shields.io/badge/Blog-Notion-orange" alt="Blog">
</a>
<img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License">
<img src="https://img.shields.io/badge/Model%20Type-MoE-green" alt="Model Type">
</p>
---
**ROME** (**R**OME is **O**bviously an agentic **M**od**E**l) is an open-source **agentic model** incubated within the **ALE (Agentic Learning Ecosystem)**.
Rather than scaling performance purely by increasing parameter count, ROME reaches the performance of far larger models through full-stack infrastructure integration and advanced reinforcement-learning optimization.
<img src="https://rlhf.oss-cn-hangzhou.aliyuncs.com/iFLOW-ROME/performance.png" width="600"/>
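For reference, a minimal inference sketch with Hugging Face `transformers` is shown below. The repo id is a placeholder for this model's actual Hugging Face id, and the prompt and generation settings are illustrative assumptions, not values taken from this card.

```python
# Minimal inference sketch. Assumptions: the repo id below is a placeholder,
# and the checkpoint ships a chat template usable via apply_chat_template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ROME-30B-A3B"  # placeholder: substitute this card's actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # shard the MoE weights across available GPUs
)

messages = [{"role": "user", "content": "How do I find the largest file in a directory?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```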
---
## 🚀 Highlights
<img src="https://rlhf.oss-cn-hangzhou.aliyuncs.com/iFLOW-ROME/ALE.PNG" width="600"/>
### 🔧 ALE Full-Stack Infrastructure
- [**ROLL**](https://github.com/alibaba/ROLL) – Large-scale reinforcement learning optimization engine
- [**ROCK**](https://github.com/alibaba/ROCK) – Secure sandbox and environment orchestration for agent execution
- **iFlow CLI** – Unified agent framework and developer interface
### 🧠 IPA Policy Optimization Algorithm
- Introduces **Interaction-Perceptive Agentic Policy Optimization (IPA)**
- Performs credit assignment at the level of **Semantic Interaction Chunks** (illustrated in the sketch after this list)
- Significantly improves **training stability** and **success rates** on **long-horizon tasks**
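To convey the intuition behind chunk-level credit assignment, here is a minimal Python sketch. This is not the paper's implementation: the chunk boundaries, the discounted-return recursion, and the mean-return baseline are all assumptions made for illustration; IPA's actual estimator is defined in the technical report.

```python
# Illustrative sketch only, not IPA itself: assign credit to whole
# semantic interaction chunks instead of individual tokens.
from dataclasses import dataclass
from typing import List

@dataclass
class Chunk:
    """One semantic interaction chunk, e.g. a tool call plus its
    surrounding reasoning (boundary definition is an assumption here)."""
    num_tokens: int   # tokens covered by the chunk
    reward: float     # scalar reward attributed to the chunk

def chunk_level_advantages(chunks: List[Chunk], gamma: float = 1.0) -> List[float]:
    """Compute a return per chunk, subtract a baseline, and broadcast
    the resulting advantage to every token inside the chunk."""
    # Discounted return per chunk, accumulated backwards over the rollout.
    returns: List[float] = []
    g = 0.0
    for chunk in reversed(chunks):
        g = chunk.reward + gamma * g
        returns.append(g)
    returns.reverse()

    # Simple baseline: mean chunk return (the paper's estimator may differ).
    baseline = sum(returns) / len(returns)

    # All tokens in a chunk share the chunk's advantage, so credit lands
    # on whole interactions rather than scattering across tokens.
    token_advantages: List[float] = []
    for chunk, ret in zip(chunks, returns):
        token_advantages.extend([ret - baseline] * chunk.num_tokens)
    return token_advantages

# Example: a three-chunk rollout where only the final chunk solves the task.
trajectory = [Chunk(num_tokens=40, reward=0.0),
              Chunk(num_tokens=25, reward=0.0),
              Chunk(num_tokens=30, reward=1.0)]
print(chunk_level_advantages(trajectory))
```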
### 🚀 Strong Agentic Performance
- Despite being a **mid-sized model** (a 30B MoE with 3B active parameters), ROME outperforms similarly sized models on standard agent benchmarks:
  - **Terminal-Bench 2.0**: 24.72%
  - **SWE-bench Verified**: 57.40%
- Its performance is competitive with, and in some cases exceeds, that of models with over **100B parameters**
### 🔒 Production-Grade Safety
- Designed for autonomous agent execution in real environments
- Rigorously aligned and red-teamed against risks such as:
  - Unauthorized access
  - Illegal or unsafe tool invocation
- Built with **deployment-grade safety requirements** in mind
---
## 📊 Performance (Preview)
### Terminal-Based Benchmarks
| **Model** | **Terminal-Bench 2.0** | **SWE-bench Verified** |
| ---------------------------- | ---------------------- | ---------------------- |
| Qwen3-Coder-30B-A3B-Instruct | 13.48% | 46.33% |
| **ROME-30B-A3B** | **24.72%** | **57.40%** |
| GPT-OSS-120B | 21.12% | 43.93% |
| GLM-4.5 Air (106B) | 17.30% | 56.20% |
> See the technical report for full experimental details.
---
## 📜 Citation
If you find our work useful, please consider citing:
```bibtex
@article{rome2025ale,
title={Let It Flow: Agentic Crafting on Rock and Roll - Building the ROME Model within an Open Agentic Learning Ecosystem},
author={Wang, Weixun and Xu, XiaoXiao and An, Wanhe and Dai, Fangwen and others},
journal={arXiv preprint arXiv:2512.24873},
year={2025}
}
```