# Unary Quantization Research
True unary (base-1) quantization for neural network weights. NOT binary.
(c) 2026 OpenTransformers Ltd / Scott Bisset
## Overview
Unary means a magnitude of N is encoded as N consecutive 1-bits across N bitplanes. Each bitplane contributes a value of 1, not a binary power of two. This eliminates multiplication from inference: only addition and popcount remain.
7-plane unary gives 8 magnitude levels (15 distinct values with sign) and achieves 0.97 per-layer cosine similarity against the FP32 originals.
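As a concrete illustration, here is a minimal NumPy sketch of the thermometer encoding described above. The function names and the per-tensor max scaling are assumptions for the sketch, not necessarily the repo's exact scheme:

```python
import numpy as np

def to_unary_planes(w: np.ndarray, n_planes: int = 7):
    """Thermometer-encode a weight tensor into n_planes unary bitplanes.

    A weight with quantized magnitude m is stored as m consecutive 1-bits:
    plane k is set wherever m > k. Sign is kept separately.
    """
    scale = np.abs(w).max() / n_planes            # per-tensor scale (an assumption)
    mags = np.clip(np.round(np.abs(w) / scale), 0, n_planes).astype(np.int32)
    signs = np.sign(w).astype(np.int8)
    planes = np.stack([mags > k for k in range(n_planes)]).astype(np.uint8)
    return planes, signs, scale

def from_unary_planes(planes, signs, scale):
    """Decode: the magnitude is just the count of set planes (base-1)."""
    return planes.sum(axis=0) * signs * scale

w = np.random.randn(8, 8).astype(np.float32)
planes, signs, scale = to_unary_planes(w)
w_hat = from_unary_planes(planes, signs, scale)   # ~w, 8 magnitude levels
```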
## Contents

### Converters (Python)
- `unary_convert.py` / `unary_convert_v2.py` – base unary thermometer conversion
- `convert_proper_unary.py` / `convert_proper_unary_v2.py` – proper unary with group quantization (see the sketch after this list)
- `convert_log_unary.py` – log-spaced unary variant
- `convert_fast.py` – optimised conversion pipeline
- `packed_convert.py` / `packed_loader.py` – packed binary format
- `convert_qwen3.py` / `convert_qwen3_v2.py` – Qwen3-4B-specific converters
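A hedged sketch of what the group quantization could look like: one scale per group of consecutive weights instead of one per tensor. The function name, group size, and layout are assumptions, not the actual format used by `convert_proper_unary.py`:

```python
import numpy as np

def group_unary_quantize(w: np.ndarray, n_planes: int = 7, group: int = 64):
    """One scale per `group` consecutive weights instead of per tensor.

    Per-group scales keep the tiny unary range (0..n_planes) matched to the
    local weight distribution. Assumes w.size is a multiple of `group`.
    """
    g = w.reshape(-1, group)                          # (n_groups, group)
    scales = np.abs(g).max(axis=1, keepdims=True) / n_planes
    scales = np.where(scales == 0, 1.0, scales)       # guard all-zero groups
    mags = np.clip(np.round(np.abs(g) / scales), 0, n_planes).astype(np.int8)
    return mags, np.sign(g).astype(np.int8), scales.squeeze(1)
```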
### C Inference Engines (AVX-512 + POPCNT)
- `unary_engine.c` / `unary_engine_v2.c` – core unary inference
- `pure_unary_engine.c` – pure unary (no FP in linear layers)
- `log_unary_engine.c` – log-unary engine
- `proper_unary.c` – proper unary with group scales
- `true_unary.c` – true base-1 unary engine
- `concat_unary.c` – concatenated unary engine
- `packed_engine.c` – packed bitplane engine (see the packing sketch below)
- `unary_full.c` – full forward-pass engine
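A minimal sketch of bitplane packing, 64 weights per machine word, so a popcount instruction can consume a whole word at once. This is an assumption about the idea behind the packed format, not necessarily what `packed_convert.py` emits:

```python
import numpy as np

def pack_plane(plane_bits: np.ndarray) -> np.ndarray:
    """Pack a flat 0/1 bitplane into uint64 words (64 weights per word)."""
    packed = np.packbits(plane_bits.astype(np.uint8))  # 8 bits per byte
    packed = np.pad(packed, (0, (-packed.size) % 8))   # pad to 8-byte multiple
    return packed.view(np.uint64)
```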
### Converted Models
- `deepseek-r1-1.5b-*` – DeepSeek-R1-1.5B in multiple unary variants (4-plane, 7-plane, 31-plane, grouped, packed, ternary baseline)
- `qwen3-4b-*` – Qwen3-4B-Thinking in unary, log-unary, and proper-unary variants
### Benchmarks and Runners
- `bench_fwd.py` / `bench_gen.py` / `bench_prompt.py` – performance benchmarks
- `inference.py` / `server.py` – Python inference and API server
- Various `run_*.py` – model-specific runners
## Key Insight
Unary quantization trades bits per weight for computational simplicity: every multiply-accumulate becomes a popcount plus an addition, which makes it particularly well suited to edge/CPU inference where SIMD popcount is fast.
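A minimal Python sketch of that reduction, assuming sign-binarized activations packed into a bitmask and sign-split weight planes (the C engines above presumably do the AVX-512 equivalent; the names and layout here are illustrative assumptions):

```python
def unary_dot(pos_planes, neg_planes, x_mask, scale):
    """Dot product of one unary-encoded weight row with binary activations.

    pos_planes / neg_planes: packed bitplanes (Python ints); in plane k,
    bit i is set iff |w_i| > k, split by the sign of w_i.
    x_mask: packed activations, bit i set iff x_i == 1.
    Every multiply-accumulate is an AND + popcount + add.
    """
    acc = 0
    for p in pos_planes:
        acc += (p & x_mask).bit_count()   # int.bit_count() needs Python 3.10+
    for n in neg_planes:
        acc -= (n & x_mask).bit_count()
    return acc * scale

# Worked example: weights [+2, -1, 0, +1] (bit i = weight i), x = [1, 1, 0, 1]
pos = [0b1001, 0b0001]  # plane 0: w0, w3 > 0; plane 1: w0 > 1
neg = [0b0010]          # plane 0: |w1| > 0 with negative sign
print(unary_dot(pos, neg, x_mask=0b1011, scale=1.0))  # 2 - 1 + 0 + 1 = 2.0
```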
## Building
```sh
gcc -O3 -mavx512f -mavx512bw -mpopcnt -o unary_engine unary_engine.c -lm
```
## License
Apache 2.0