InCoder-32B: Industrial Code Foundation Model

InCoder-32B (Industrial-Coder-32B) is the first 32B-parameter code foundation model purpose-built for industrial code intelligence. While general code LLMs excel at standard programming tasks, their performance often degrades in industrial scenarios that require reasoning about hardware semantics, specialized language constructs, and strict resource constraints.

InCoder-32B unifies code intelligence across:

  • Chip Design (Verilog / RTL)
  • GPU Kernel Optimization (CUDA / Triton)
  • Embedded Systems (ARM Cortex-M, STM32)
  • Compiler Optimization (x86-64 assembly, LLVM)
  • 3D Modeling (CAD/CAM via CadQuery / OpenCascade)

The model natively supports a context window of up to 128K tokens.

Performance

General Code Benchmarks

Model                       Size  HumanEval  HumanEval+  MBPP  MBPP+  BCB Full  BCB Hard
Qwen2.5-Coder-32B-Instruct  32B   93.3       86.6        90.2  77.8   48.0      24.3
InCoder-32B                 32B   94.5       89.6        91.8  78.3   49.8      31.1
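HumanEval and MBPP results such as those above are conventionally reported as pass@k (the table does not state k; pass@1 is the usual default). As a reference point, not something specified by this card, the standard unbiased pass@k estimator can be sketched as:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n generated samples per task,
    c of which pass the tests, estimate the probability that at least
    one of k randomly drawn samples passes."""
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset
        # must contain a passing one.
        return 1.0
    # pass@k = 1 - C(n-c, k) / C(n, k), expanded as a product
    # to stay numerically stable.
    return 1.0 - math.prod((n - c - i) / (n - i) for i in range(k))
```

Per-task estimates are then averaged over the benchmark to produce a single score.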

Industrial Code Benchmarks

Domain       Benchmark              InCoder-32B     Claude-Sonnet-4.6  Qwen3.5-397B-A17B
Chip Design  VeriScope Score        80.7            87.7               73.1
GPU Optim.   KernelBench L1/L2/L3   22.2/36.0/14.0  11.1/28.0/2.0      4.0/10.0/0.0
3D Modeling  CAD-Coder Compile (%)  82.0            77.0               79.0

Training Pipeline: Code-Flow

InCoder-32B was developed using a three-stage Code-Flow pipeline:

  1. Pre-training & Annealing: Pre-training on a curated mix of industrial and general code, deduplicated at multiple levels.
  2. Mid-training (Context Extension): Progressive context extension from 8K to 128K tokens using synthetic industrial reasoning data and agent trajectories.
  3. Post-training: Execution-grounded SFT across hardware design, GPU kernels, and systems programming.
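The card does not detail the multi-level deduplication used in stage 1. As a minimal sketch under that assumption, the first level (exact, file-level dedup over whitespace-normalized content) might look like the following, with fuzzier levels (e.g. MinHash over token shingles) layered on top:

```python
import hashlib

def normalize(code: str) -> str:
    # Strip leading/trailing whitespace and blank lines so trivially
    # reformatted copies hash to the same digest.
    return "\n".join(line.strip() for line in code.splitlines() if line.strip())

def dedup_exact(files: list[str]) -> list[str]:
    """File-level exact deduplication by content hash; keeps the
    first occurrence of each normalized file."""
    seen, kept = set(), []
    for code in files:
        digest = hashlib.sha256(normalize(code).encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(code)
    return kept
```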

Quickstart

Installation

pip install -U "transformers>=4.57.1" accelerate safetensors

Usage with Transformers

Note: The model uses a custom architecture, so trust_remote_code=True is required.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Multilingual-Multimodal-NLP/IndustrialCoder"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",      # pick the dtype from the checkpoint config
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # custom architecture (see note above)
)

# Build a chat-formatted prompt and tokenize it.
messages = [{"role": "user", "content": "Optimize this CUDA kernel for better memory coalescing."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=2048,
        do_sample=True,  # enable sampling so temperature/top_p/top_k take effect
        temperature=0.6,
        top_p=0.85,
        top_k=20,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
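As a rough illustration of the decoding settings above (temperature 0.6, top-k 20, top-p 0.85), the sketch below shows how the three filters compose before a token is sampled. It mimics, but is not, the Transformers implementation:

```python
import math

def filter_logits(logits, top_k=20, top_p=0.85, temperature=0.6):
    """Apply temperature scaling, then top-k, then nucleus (top-p)
    truncation; return the token ids left as sampling candidates."""
    scaled = [l / temperature for l in logits]
    # Softmax over the temperature-scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    # Top-k: keep only the k most probable tokens.
    probs.sort(key=lambda t: t[1], reverse=True)
    probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append(i)
        mass += p
        if mass >= top_p:
            break
    return kept
```

Lower temperature sharpens the distribution, so fewer tokens survive the nucleus cutoff; this is why low-temperature settings are common for code generation, where precision matters more than diversity.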

Citation

@article{yang2026incoder,
  title={InCoder-32B: Code Foundation Model for Industrial Scenarios},
  author={Yang, Jian and Zhang, Wei and Wu, Jiajun and Cheng, Junhang and Guo, Shawn and Wang, Haowen and Gu, Weicheng and Du, Yaxin and Li, Joseph and Xu, Fanglin and others},
  journal={arXiv preprint arXiv:2603.16790},
  year={2026}
}

Disclaimer

The model may generate incorrect or unsafe code. Always review and test outputs in a sandboxed environment before production use. Industrial code (RTL, embedded firmware, GPU kernels) requires expert human review before deployment.
