WangKaiLin/PipeOwl-1.2


PipeOwl-1.2 (Geometric Embedding)

A transformer-free semantic retrieval engine.

PipeOwl performs deterministic vocabulary scoring over a static embedding field:

score = α⋅base + β⋅Δfield

where:

base = cosine similarity in embedding space
Δfield = static scalar field bias
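As a rough illustration of the formula above, the sketch below scores a whole vocabulary in one pass with NumPy. The function name `score_vocabulary`, the default weights, and the toy data are assumptions made for illustration and are not taken from engine.py. It also assumes the table rows are already unit length, so a dot product equals cosine similarity.

```python
import numpy as np

def score_vocabulary(query, table, delta_field, alpha=1.0, beta=0.1):
    """Return alpha * base + beta * delta_field for every vocabulary row.

    query:       (D,) query vector of any norm
    table:       (V, D) embedding table with unit-length rows (assumed)
    delta_field: (V,) static scalar bias, one value per vocabulary entry
    """
    q = query / np.linalg.norm(query)   # unit-normalize the query
    base = table @ q                    # cosine similarity, shape (V,)
    return alpha * base + beta * delta_field

# Toy data: 3 vocabulary entries in 2 dimensions (hypothetical values).
table = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
delta_field = np.array([0.0, 0.5, 0.0])
scores = score_vocabulary(np.array([2.0, 0.0]), table, delta_field)
```

With these toy values the cosine term is 1.0, 0.0, and 0.6 for the three rows, and the bias adds 0.05 to the second row only.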

Features:

Scoring is O(n) in the vocabulary size n.
No attention.
No transformer weights.

Patch Note

1.1

Fixed out-of-vocabulary (OOV) handling
Symbolic fallback
English fallback
Japanese fallback
PipeOwlConfig improvements
Tokenizer: max_len cap
load_assets: contiguous + row-normalize
Small benchmark

1.2

safetensors support
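The "contiguous + row-normalize" step from the 1.1 notes could look roughly like the sketch below. The helper name `prepare_table` and the float32 dtype are assumptions, not the actual `load_assets` code. Reading pipeowl.safetensors is left out because the tensor names stored in that file are not documented here.

```python
import numpy as np

def prepare_table(raw):
    """Return a C-contiguous float32 array with unit-length rows."""
    table = np.ascontiguousarray(raw, dtype=np.float32)   # row-major layout for fast row scans
    norms = np.linalg.norm(table, axis=1, keepdims=True)  # per-row L2 norm, shape (V, 1)
    norms[norms == 0.0] = 1.0   # keep all-zero rows as zeros instead of dividing by zero
    return table / norms

# Hypothetical input stored column-major, including one all-zero row.
raw = np.asfortranarray([[3.0, 4.0], [0.0, 0.0], [0.0, 2.0]])
table = prepare_table(raw)
```

Normalizing once at load time means the cosine term in the score reduces to a plain matrix-vector product at query time.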

Architecture

Static embedding table (V × D)
Aligned vocabulary index
Optional scalar bias field
Linear scoring
Pluggable decoder stage
Targets CPU environments and low-latency systems (e.g. input method editors, IMEs).
Single static field (~635 MB), no runtime model weights.
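To make the component list above concrete, here is a minimal sketch of how the pieces might fit together. `TinyFieldIndex`, `top_k_decoder`, and the toy vocabulary are hypothetical names and data, not the interface of engine.py. The table rows are assumed to be unit length.

```python
import numpy as np

class TinyFieldIndex:
    """Toy stand-in for the components listed above (not the real engine)."""

    def __init__(self, vocab, table, bias=None, alpha=1.0, beta=0.0):
        if len(vocab) != table.shape[0]:
            raise ValueError("vocabulary must align with table rows")
        self.vocab = list(vocab)   # aligned vocabulary index
        self.table = table         # static (V, D) embedding table
        # Optional scalar bias field; defaults to all zeros.
        self.bias = np.zeros(len(self.vocab)) if bias is None else bias
        self.alpha, self.beta = alpha, beta

    def score(self, query):
        q = query / np.linalg.norm(query)
        # Linear scoring: one pass over the V rows of the table.
        return self.alpha * (self.table @ q) + self.beta * self.bias

    def search(self, query, decoder):
        # The decoder stage is any callable taking (vocab, scores).
        return decoder(self.vocab, self.score(query))

def top_k_decoder(k):
    """Build a decoder that returns the k best (token, score) pairs."""
    def decode(vocab, scores):
        order = np.argsort(-scores)[:k]   # indices sorted by descending score
        return [(vocab[i], float(scores[i])) for i in order]
    return decode

index = TinyFieldIndex(
    vocab=["cat", "dog", "car"],
    table=np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]),
)
results = index.search(np.array([1.0, 0.1]), top_k_decoder(2))
```

Passing the decoder as a plain callable keeps the scoring pass independent of how results are selected. The full sort used here costs O(V log V); `np.argpartition` would be one way to keep selection linear in V.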

Attribution

The base embedding vectors were generated using BGE-M3 (Apache-2.0) via inference. This repository does not redistribute any original BGE weights.

Quickstart

pip install numpy safetensors
python quickstart.py

See full experimental notes here:

https://hackmd.io/@galaxy4552/SJ5DatsuZx

Repository Structure

pipeowl1.2/
 ├ README.md
 ├ config.json
 ├ LICENSE
 ├ quickstart.py
 ├ pipeowl.safetensors
 ├ vocabulary.json
 └ engine.py

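PipeOwl is a geometric retrieval system built on a static semantic field.

Core formula:

score = α⋅base + β⋅Δfield

where:

base = embedding cosine similarity
Δfield = static field offset
α / β are tunable weights

It provides a lightweight O(n) semantic scoring method suited to low-latency environments (such as input methods).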

LICENSE

MIT
