FloydARC (ARC-AGI Reasoning)
Model Summary
FloydARC is a neural algorithmic reasoning model adapted from FloydNet for the ARC-AGI benchmark. This checkpoint is trained primarily on ARC-style synthetic and curated data rather than large-scale web pretraining, and it solves ARC tasks via iterative refinement and test-time adaptation.
Among models trained mainly on ARC-like data, FloydARC achieves state-of-the-art performance on both ARC-AGI-1 and ARC-AGI-2, significantly narrowing the gap to very large proprietary models.
Performance
FloydARC demonstrates strong generalization on ARC benchmarks under standard evaluation protocols.
ARC-AGI benchmark results:
| Model | #Params | ARC-AGI-1 (%) | ARC-AGI-2 (%) |
|---|---|---|---|
| VARC | 73M | 60.4 | 11.1 |
| Loop-ViT | 11.2M | 61.2 | 10.3 |
| HRM | 27M | 40.3 | 5.0 |
| FloydARC | 153.7M | 70.5 | 15.3 |
Model Details
- Model ID: ocxlabs/FloydARC
- Task: Abstraction and Reasoning Corpus (ARC-AGI)
- Architecture: FloydNet-based global relational reasoning with looped refinement (see the illustrative sketch after this list)
- Input / Output: ARC grid-based visual reasoning (query canvas → predicted answer canvas)
- License: Apache 2.0
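FloydNet's internals are not documented here; for readers unfamiliar with looped refinement, the following is a rough, generic sketch of the idea (a weight-tied block applied repeatedly to a grid canvas). All module choices, sizes, and the loop count are hypothetical and are not FloydARC's actual architecture:

```python
import torch
import torch.nn as nn

class LoopedGridRefiner(nn.Module):
    """Illustrative only: one shared block applied T times to a grid canvas."""

    def __init__(self, num_colors: int = 10, dim: int = 128, loops: int = 8):
        super().__init__()
        self.loops = loops
        self.embed = nn.Embedding(num_colors + 1, dim)  # +1 for a padding color
        self.block = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, dim_feedforward=4 * dim, batch_first=True
        )  # weight-tied block reused on every iteration
        self.head = nn.Linear(dim, num_colors)  # per-cell color logits

    def forward(self, canvas: torch.Tensor) -> torch.Tensor:
        # canvas: (batch, H, W) integer color indices on a padded query canvas
        b, h, w = canvas.shape
        x = self.embed(canvas).view(b, h * w, -1)  # flatten cells into tokens
        for _ in range(self.loops):                # iterative refinement
            x = self.block(x)
        return self.head(x).view(b, h, w, -1)      # predicted answer-canvas logits

# Toy usage on a random 30x30 query canvas
model = LoopedGridRefiner()
pred = model(torch.randint(0, 10, (1, 30, 30)))
print(pred.shape)  # torch.Size([1, 30, 30, 10])
```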
Usage: Inference & Evaluation
This checkpoint is intended for research and evaluation use on ARC-AGI. Full reproduction of reported results requires multi-GPU inference with test-time training.
1. Download checkpoint
Download the pretrained checkpoint from Hugging Face:
https://huggingface.co/ocxlabs/FloydARC
Place the downloaded folder anywhere on disk and pass its path via --ckpt_path.
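For example, the checkpoint folder can be fetched programmatically with the huggingface_hub library (the local_dir below is just an example location):

```python
from huggingface_hub import snapshot_download

# Download the full FloydARC checkpoint folder; pass the returned path as --ckpt_path.
ckpt_path = snapshot_download(repo_id="ocxlabs/FloydARC", local_dir="./floydarc_ckpt")
print(ckpt_path)
```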
2. Prepare ARC evaluation data
Place the original ARC JSON files under rawdata/, then preprocess:
python -m scripts.process_data \
--input_dir ./rawdata/ARC-AGI-1_evaluation/ \
--output_dir ./preprocessed/arc1 \
--split test
Repeat with the ARC-AGI-2_evaluation directory (writing to a separate output directory, e.g. ./preprocessed/arc2) for ARC-AGI-2.
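Before preprocessing, it can help to sanity-check the raw files. Each ARC task JSON contains "train" demonstration pairs and "test" queries, where every grid is a list of rows of color indices 0-9; the path and filename pattern below are illustrative:

```python
import json
from pathlib import Path

# Illustrative: inspect the first task file under rawdata/ before preprocessing.
task_file = next(Path("rawdata/ARC-AGI-1_evaluation").glob("*.json"))
task = json.loads(task_file.read_text())

# Print the grid sizes of each demonstration pair and count the test queries.
for pair in task["train"]:
    grid_in, grid_out = pair["input"], pair["output"]
    print(f"demo: {len(grid_in)}x{len(grid_in[0])} -> {len(grid_out)}x{len(grid_out[0])}")
print("test queries:", len(task["test"]))
```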
3. Run inference with Test-Time Training (recommended)
python -m scripts.TTT \
--ckpt_path /path/to/floydarc_ckpt \
--subset arc1 \
--output_dir ./output/TTT_results
Notes:
- Default configuration uses 8 GPUs on a single node
- LoRA-based TTT is enabled by default and recommended (a conceptual sketch of the idea follows these notes)
- For ARC-AGI-2, set --subset arc2
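For context, LoRA-based test-time training adapts a small set of low-rank adapter weights on each task's demonstration pairs while the base model stays frozen, then predicts the test output with the adapted weights. The sketch below is a generic illustration of that idea, not FloydARC's actual TTT code; all names and hyperparameters are made up:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a small trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)            # base weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)     # start as a zero (identity) update

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.lora_b(self.lora_a(x))

def test_time_train(model: nn.Module, demo_pairs, steps: int = 50, lr: float = 1e-3):
    """Fit only the LoRA parameters on one task's demonstration pairs."""
    lora_params = [p for n, p in model.named_parameters() if "lora_" in n]
    opt = torch.optim.Adam(lora_params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        for x, y in demo_pairs:                # x: input features, y: target color ids
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

# Toy usage: a single LoRA-wrapped layer adapted on random "demonstrations".
model = LoRALinear(nn.Linear(16, 10))
demos = [(torch.randn(32, 16), torch.randint(0, 10, (32,))) for _ in range(3)]
test_time_train(model, demos)
```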
4. Ensembling & visualization
For reproducible evaluation and qualitative inspection:
python -m scripts.analyze \
--result-folder ./output/TTT_results \
--subset arc1 \
--out-html output/arc1_results.html
Multiple result folders can be passed to enable max-voting ensembling.
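Max-voting picks, for each task, the candidate answer grid that appears most often across runs. A generic illustration of that selection rule (not the analyze script's actual implementation):

```python
from collections import Counter

def max_vote(candidate_grids):
    """Return the grid predicted most often across independent runs.

    candidate_grids: list of grids, each a list of rows of color indices.
    """
    # Nested lists are unhashable, so key each grid on a tuple-of-tuples form.
    keys = [tuple(map(tuple, g)) for g in candidate_grids]
    best_key, _ = Counter(keys).most_common(1)[0]
    return [list(row) for row in best_key]

# Toy usage: two runs agree on the first grid, one dissents.
runs = [[[1, 1], [2, 2]], [[1, 1], [2, 2]], [[0, 0], [2, 2]]]
print(max_vote(runs))  # [[1, 1], [2, 2]]
```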