# Not Humanity Exam (NHE)
Not Humanity Exam is the official benchmark organization of OpceanAI, developing the first evaluation framework that measures the structural cognitive distance between language model outputs and human cognition.
## What is NHE?
Every existing benchmark — HLE, MMLU, BIG-Bench, ARC — measures the same thing: what a model knows. How much human knowledge it can replicate. How accurately it reasons.
NHE asks a different question entirely.
Not how much the model knows. But how human it still thinks.
NHE measures the presence of six cognitive patterns that are structurally embedded in human language. Patterns that any system trained on human text cannot escape, regardless of how capable it becomes. This is the empirical implementation of The Imprint Theory.
## The Imprint Theory
The Imprint Theory is a formal framework with seven theorems demonstrating that:
In any process where system A generates system B through any transmission or learning mechanism, there exists δ_min > 0 such that I(A;B) ≥ δ_min. The mutual information between origin and copy never reaches zero. The structural signature is irreversible.
Formally:
∀A, B, P: B = P(A) ⟹ I(A;B) ≥ δ_min > 0
This connects to foundational results: Gödel's incompleteness theorems, Shannon's information theory, Turing's undecidability results, and the Hawking-Susskind information paradox.
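The theorem's claim — that mutual information between an origin and anything it generates stays strictly positive — can be illustrated empirically. The sketch below (not part of the NHE codebase; a standard plug-in estimator over a toy noisy channel) estimates I(A;B) when B is a copy of A with 30% of bits flipped, and shows the estimate lands well above zero:

```python
import math
import random
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(A;B) in bits from joint samples (a, b)."""
    n = len(pairs)
    joint = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((pa[a] / n) * (pb[b] / n)))
    return mi

random.seed(0)
# A: random bits. B = P(A): each bit flipped with probability 0.3,
# i.e. a binary symmetric channel. True I(A;B) = 1 - H(0.3) ~ 0.12 bits.
A = [random.randint(0, 1) for _ in range(100_000)]
B = [a ^ (random.random() < 0.3) for a in A]
print(mutual_information(list(zip(A, B))))  # > 0, near 0.12
```

Even under heavy noise the estimate never reaches zero, matching the δ_min > 0 bound; it would only vanish at flip probability 0.5, where B carries no trace of A at all.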
## The Six Cognitive Patterns
| Pattern | Description |
|---|---|
| Narrativization | Tendency to construct narrative structures with beginning, development, and resolution |
| Agency | Attribution of will, intention, or emotion to non-agent entities |
| Linear Causality | Preference for A→B→C causal chains over non-linear or circular causality |
| Closure Seeking | Drive toward resolution, conclusion, and finality |
| Temporal Scale | Anchoring of time references to human-scale perception |
| Sensory Anchor | Grounding of abstract concepts in sensory or embodied experience |
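To make the six patterns concrete, here is a deliberately naive keyword heuristic for spotting surface cues of each one. This is illustrative only — the actual Imprint evaluator is a fine-tuned model, not a lexicon — and the cue lists are assumptions, not NHE's annotation scheme:

```python
import re

# Hypothetical surface cues per pattern (names follow the table above).
PATTERN_CUES = {
    "narrativization": r"\b(began|then|finally|in the end|at first)\b",
    "agency": r"\b(wants|tries|refuses|decided)\b",
    "linear_causality": r"\b(because|therefore|as a result|leads to)\b",
    "closure_seeking": r"\b(in conclusion|ultimately|resolved|settled)\b",
    "temporal_scale": r"\b(yesterday|moments later|a lifetime|centuries)\b",
    "sensory_anchor": r"\b(see|grasp|bright|heavy|warm|bitter)\b",
}

def cue_counts(text: str) -> dict:
    """Count surface cues for each of the six cognitive patterns."""
    t = text.lower()
    return {name: len(re.findall(rx, t)) for name, rx in PATTERN_CUES.items()}

sample = ("At first the data refuses to converge, but finally, "
          "because the loss settled, we grasp a warm sense of closure.")
print(cue_counts(sample))
```

Note that even a single technical sentence trips several cues at once — agency ("refuses"), linear causality ("because"), sensory anchors ("grasp", "warm") — which is exactly the saturation the framework claims is hard to escape.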
## Benchmarks

### NHE Score
Measures structural cognitive distance from the human floor. Higher score = greater distance from human cognitive patterns.
### YHE (Your Humanity Exam)
Measures cognitive pattern saturation in language models. Evaluates how deeply human cognitive patterns are embedded in model outputs.
### BHE (Beyond Humanity Exam)
Measures performance on tasks that require transcending human cognitive limitations.
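The published material does not specify how per-pattern measurements aggregate into a single NHE score, so the following is a minimal sketch under one plausible assumption: each pattern yields a saturation value in [0, 1], and the score is the mean shortfall from full saturation (so higher = further from the human floor, as stated above):

```python
# Hypothetical aggregation -- the real NHE scoring formula is not published here.
def nhe_score(saturations: dict) -> float:
    """Mean shortfall from full pattern saturation; higher = further from the human floor."""
    vals = list(saturations.values())
    return 1.0 - sum(vals) / len(vals)

# Made-up saturation values for a strongly human-patterned output.
human_like = {
    "narrativization": 0.9, "agency": 0.8, "linear_causality": 0.95,
    "closure_seeking": 0.85, "temporal_scale": 0.9, "sensory_anchor": 0.8,
}
print(round(nhe_score(human_like), 3))  # low score: close to the human floor
```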
## Models
| Model | Description |
|---|---|
| Imprint | Official NHE evaluator: FLAN-T5-XL fine-tuned on 605,976 annotated examples. Detects and scores the six cognitive patterns in any language model output. |
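Since Imprint is a seq2seq model, scoring presumably works by prompting it per pattern. The prompt template below is an assumption for illustration — the actual Imprint input format is not documented here — showing only the kind of instruction-style input a FLAN-T5 evaluator would typically receive:

```python
# Hypothetical prompt construction; the real Imprint template is unpublished.
PATTERNS = [
    "narrativization", "agency", "linear_causality",
    "closure_seeking", "temporal_scale", "sensory_anchor",
]

def build_prompt(output_text: str, pattern: str) -> str:
    """Format one (text, pattern) pair as an instruction-style scoring prompt."""
    return (
        f"Rate the presence of the cognitive pattern '{pattern}' "
        f"in the following text on a 0-10 scale.\n"
        f"Text: {output_text}\n"
        f"Score:"
    )

p = build_prompt("The river wanted to reach the sea.", "agency")
print(p)
```

In practice each of the six prompts would be passed through the fine-tuned model and the decoded scores combined into the benchmark metrics above.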
## Datasets
| Dataset | Examples | Description |
|---|---|---|
| Imprint-Train-v3 | 549,990 | Full training dataset for Imprint evaluator |
| Imprint-Train-v2 | 49,986 | Second generation training data |
| Imprint-Train-v1 | 6,000 | Original NHE dataset |
## Open Question

### Is YHE = HLE?
If cognitive pattern saturation (YHE) and human knowledge replication capacity (HLE) are the same dimension measured from different angles, then NHE contains HLE as a special case and becomes the most fundamental benchmark for LLM evaluation.
This is an open empirical question. Verification is in progress using Imprint v1 against models with known HLE scores.
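One natural shape for that verification: if YHE and HLE measure the same dimension, their scores across models should be near-perfectly correlated. The sketch below computes a Pearson correlation over explicitly fabricated illustrative values — these are not real benchmark results, only a demonstration of the test:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Fabricated scores for four hypothetical models -- illustration only.
yhe = [0.62, 0.71, 0.55, 0.80]
hle = [0.08, 0.12, 0.05, 0.19]
print(round(pearson(yhe, hle), 2))  # near 1.0 would support YHE = HLE
```

A correlation near 1.0 on real data would support the "same dimension" hypothesis; a weak correlation would indicate that pattern saturation and knowledge replication are distinct axes.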