---
license: apache-2.0
pipeline_tag: text-generation
arxiv: 2512.24873
tags:
- agent
- moe
---

# ROME-30B-A3B

<p align="left" style="display: flex; gap: 8px; align-items: center;">
  <a href="https://arxiv.org/pdf/2512.24873" target="_blank">
    <img src="https://img.shields.io/badge/Paper-arXiv%3A2512.24873-red" alt="Paper">
  </a>
  <a href="https://faithful-almanac-add.notion.site/The-Bitter-Lesson-Behind-Building-Agentic-RL-in-Terminal-Environments-2eaddd45837f80c9ad2ed6a15ef3c1a1?pvs=74" target="_blank">
    <img src="https://img.shields.io/badge/Blog-Notion-orange" alt="Blog">
  </a>
  <img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License">
  <img src="https://img.shields.io/badge/Model%20Type-MoE-green" alt="Model Type">
</p>



---


**ROME** (**R**OME is **O**bviously an agentic **M**od**E**l) is an open-source **agentic model** incubated within the **ALE (Agentic Learning Ecosystem)**.

Rather than scaling performance purely by increasing parameter count, ROME reaches performance levels typical of much larger models through full-stack infrastructure integration and advanced reinforcement learning optimization.

<img src="https://rlhf.oss-cn-hangzhou.aliyuncs.com/iFLOW-ROME/performance.png" width="600"/>
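
The model is a standard causal LM and can be loaded with 🤗 Transformers. The sketch below is a minimal, illustrative example; the repository id `iflow-ai/ROME-30B-A3B`, the prompt, and the generation settings are placeholders and may differ from the official release.

```python
# Minimal inference sketch (repository id below is illustrative; replace it with
# the actual Hugging Face repo for ROME-30B-A3B).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "iflow-ai/ROME-30B-A3B"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load weights in their native precision
    device_map="auto",    # shard across available GPUs
)

messages = [
    {"role": "user", "content": "List the three largest files in the current directory."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```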


---



## 🚀 Highlights

<img src="https://rlhf.oss-cn-hangzhou.aliyuncs.com/iFLOW-ROME/ALE.PNG" width="600"/>


### 🔧 ALE Full-Stack Infrastructure
- [**ROLL**](https://github.com/alibaba/ROLL) – Large-scale reinforcement learning optimization engine  

- [**ROCK**](https://github.com/alibaba/ROCK) – Secure sandbox and environment orchestration for agent execution  

- **iFlow CLI** – Unified agent framework and developer interface  

  

### 🧠 IPA Policy Optimization Algorithm
- Introduces **Interaction-Perceptive Agentic Policy Optimization (IPA)**  
- Performs credit assignment at the level of **Semantic Interaction Chunks** (a rough illustration follows after this list)
- Significantly improves **training stability** and **success rates** on **long-horizon tasks**
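
The full definition of IPA is given in the technical report. As a rough illustration only (not the authors' implementation), the sketch below assigns one advantage value to every token of a contiguous interaction chunk, instead of estimating per-token advantages; the chunk boundaries, reward attribution, and return estimator here are all assumed placeholders.

```python
# Illustrative sketch of chunk-level credit assignment (not the official IPA code).
# Each "chunk" stands for one semantic interaction (e.g. a tool call or a reply);
# all tokens in a chunk share that chunk's advantage rather than per-token values.
from dataclasses import dataclass
from typing import List

@dataclass
class Chunk:
    start: int      # first token index of the interaction chunk
    end: int        # one past the last token index
    reward: float   # scalar reward attributed to this chunk

def chunk_advantages(chunks: List[Chunk], num_tokens: int,
                     gamma: float = 1.0, baseline: float = 0.0) -> List[float]:
    """Spread a chunk-level discounted return over every token of each chunk."""
    adv = [0.0] * num_tokens
    # Return-to-go computed at chunk granularity, walking backwards.
    running = 0.0
    returns: List[float] = []
    for c in reversed(chunks):
        running = c.reward + gamma * running
        returns.append(running)
    returns.reverse()
    for c, ret in zip(chunks, returns):
        for t in range(c.start, c.end):
            adv[t] = ret - baseline
    return adv

# Example: a two-chunk trajectory where only the final interaction is rewarded.
chunks = [Chunk(0, 40, reward=0.0), Chunk(40, 95, reward=1.0)]
advantages = chunk_advantages(chunks, num_tokens=95)
```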



### 🚀 Strong Agentic Performance
- Despite being a **mid-sized model** (30B MoE with 3B active parameters), ROME outperforms same-scale models on standard agent benchmarks:
  - **Terminal-Bench 2.0**: 24.72%
  - **SWE-bench Verified**: 57.40%
  
- Performance is competitive with, and in some cases exceeds, models with more than **100B parameters**

  

### 🔒 Production-Grade Safety
- Designed for autonomous agent execution in real environments  
- Rigorously aligned and red-teamed against risks such as:
  - Unauthorized access
  - Illegal or unsafe tool invocation
- Built with **deployment-grade safety guarantees** in mind

---



## 📊 Performance (Preview)

### Terminal-Based Benchmarks

| **Model**                    | **Terminal-Bench 2.0** | **SWE-bench Verified** |
| ---------------------------- | ---------------------- | ---------------------- |
| Qwen3-Coder-30B-A3B-Instruct | 13.48%                 | 46.33%                 |
| **ROME-30B-A3B**             | **24.72%**             | **57.40%**             |
| GPT-OSS-120B                 | 21.12%                 | 43.93%                 |
| GLM-4.5 Air (106B)           | 17.30%                 | 56.20%                 |

> See the technical report for full experimental details.

---



## 📜 Citation

If you find our work useful, please consider citing:

```bibtex
@article{rome2025ale,
  title={Let It Flow: Agentic Crafting on Rock and Roll - Building the ROME Model within an Open Agentic Learning Ecosystem},
  author={Wang, Weixun and Xu, XiaoXiao and An, Wanhe and Dai, Fangwen and others},
  journal={arXiv preprint arXiv:2512.24873},
  year={2025}
}
```