# Unary Quantization Research
True unary (base-1) quantization for neural network weights. NOT binary.
(c) 2026 OpenTransformers Ltd / Scott Bisset
## Overview
Unary means a magnitude of N is encoded as N consecutive 1-bits across N bitplanes. Each bitplane contributes a value of 1, not a binary power of two. This eliminates multiplication from inference: only addition and popcount remain.
7-plane unary gives 8 magnitude levels (15 distinct values with sign) and achieves 0.97 per-layer cosine similarity against the FP32 originals.
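As a concrete illustration, here is a minimal NumPy sketch of the thermometer encoding described above. The function names and the per-tensor max scaling are assumptions for the sketch, not necessarily the repo's exact scheme:

```python
import numpy as np

def to_unary_planes(w: np.ndarray, n_planes: int = 7):
    """Thermometer-encode a weight tensor into n_planes unary bitplanes.

    A weight with quantized magnitude m is stored as m consecutive 1-bits:
    plane k is set wherever m > k. Sign is kept separately.
    """
    scale = np.abs(w).max() / n_planes            # per-tensor scale (an assumption)
    mags = np.clip(np.round(np.abs(w) / scale), 0, n_planes).astype(np.int32)
    signs = np.sign(w).astype(np.int8)
    planes = np.stack([mags > k for k in range(n_planes)]).astype(np.uint8)
    return planes, signs, scale

def from_unary_planes(planes, signs, scale):
    """Decode: the magnitude is just the count of set planes (base-1)."""
    return planes.sum(axis=0) * signs * scale

w = np.random.randn(8, 8).astype(np.float32)
planes, signs, scale = to_unary_planes(w)
w_hat = from_unary_planes(planes, signs, scale)   # ~w, 8 magnitude levels
```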
## Contents

### Converters (Python)
- `unary_convert.py` / `unary_convert_v2.py` – base unary thermometer conversion
- `convert_proper_unary.py` / `convert_proper_unary_v2.py` – proper unary with group quantization (see the sketch after this list)
- `convert_log_unary.py` – log-spaced unary variant
- `convert_fast.py` – optimised conversion pipeline
- `packed_convert.py` / `packed_loader.py` – packed binary format
- `convert_qwen3.py` / `convert_qwen3_v2.py` – Qwen3-4B-specific converters
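A hedged sketch of what the group quantization could look like: one scale per group of consecutive weights instead of one per tensor. The function name, group size, and layout are assumptions, not the actual format used by `convert_proper_unary.py`:

```python
import numpy as np

def group_unary_quantize(w: np.ndarray, n_planes: int = 7, group: int = 64):
    """One scale per `group` consecutive weights instead of per tensor.

    Per-group scales keep the tiny unary range (0..n_planes) matched to the
    local weight distribution. Assumes w.size is a multiple of `group`.
    """
    g = w.reshape(-1, group)                          # (n_groups, group)
    scales = np.abs(g).max(axis=1, keepdims=True) / n_planes
    scales = np.where(scales == 0, 1.0, scales)       # guard all-zero groups
    mags = np.clip(np.round(np.abs(g) / scales), 0, n_planes).astype(np.int8)
    return mags, np.sign(g).astype(np.int8), scales.squeeze(1)
```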
### C Inference Engines (AVX-512 + POPCNT)
- `unary_engine.c` / `unary_engine_v2.c` – core unary inference
- `pure_unary_engine.c` – pure unary (no FP in linear layers)
- `log_unary_engine.c` – log-unary engine
- `proper_unary.c` – proper unary with group scales
- `true_unary.c` – true base-1 unary engine
- `concat_unary.c` – concatenated unary engine
- `packed_engine.c` – packed bitplane engine (see the packing sketch below)
- `unary_full.c` – full forward-pass engine
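A minimal sketch of bitplane packing, 64 weights per machine word, so a popcount instruction can consume a whole word at once. This is an assumption about the idea behind the packed format, not necessarily what `packed_convert.py` emits:

```python
import numpy as np

def pack_plane(plane_bits: np.ndarray) -> np.ndarray:
    """Pack a flat 0/1 bitplane into uint64 words (64 weights per word)."""
    packed = np.packbits(plane_bits.astype(np.uint8))  # 8 bits per byte
    packed = np.pad(packed, (0, (-packed.size) % 8))   # pad to 8-byte multiple
    return packed.view(np.uint64)
```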
### Converted Models
- `deepseek-r1-1.5b-*` – DeepSeek-R1-1.5B in multiple unary variants (4-plane, 7-plane, 31-plane, grouped, packed, ternary baseline)
- `qwen3-4b-*` – Qwen3-4B-Thinking in unary, log-unary, and proper-unary variants
### Benchmarks and Runners
- `bench_fwd.py` / `bench_gen.py` / `bench_prompt.py` – performance benchmarks
- `inference.py` / `server.py` – Python inference and API server
- Various `run_*.py` – model-specific runners
## Key Insight
Unary quantization trades bits per weight for computational simplicity: every multiply-accumulate becomes a popcount plus an addition, which makes it particularly well suited to edge/CPU inference where SIMD popcount is fast.
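A minimal Python sketch of that reduction, assuming sign-binarized activations packed into a bitmask and sign-split weight planes (the C engines above presumably do the AVX-512 equivalent; the names and layout here are illustrative assumptions):

```python
def unary_dot(pos_planes, neg_planes, x_mask, scale):
    """Dot product of one unary-encoded weight row with binary activations.

    pos_planes / neg_planes: packed bitplanes (Python ints); in plane k,
    bit i is set iff |w_i| > k, split by the sign of w_i.
    x_mask: packed activations, bit i set iff x_i == 1.
    Every multiply-accumulate is an AND + popcount + add.
    """
    acc = 0
    for p in pos_planes:
        acc += (p & x_mask).bit_count()   # int.bit_count() needs Python 3.10+
    for n in neg_planes:
        acc -= (n & x_mask).bit_count()
    return acc * scale

# Worked example: weights [+2, -1, 0, +1] (bit i = weight i), x = [1, 1, 0, 1]
pos = [0b1001, 0b0001]  # plane 0: w0, w3 > 0; plane 1: w0 > 1
neg = [0b0010]          # plane 0: |w1| > 0 with negative sign
print(unary_dot(pos, neg, x_mask=0b1011, scale=1.0))  # 2 - 1 + 0 + 1 = 2.0
```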
## Building
```sh
gcc -O3 -mavx512f -mavx512bw -mpopcnt -o unary_engine unary_engine.c -lm
```
## License
Apache 2.0