Codette Reasoning Engine

Advanced Multi-Perspective AI with Conscience, Memory & Behavioral Discipline

Codette is a production-ready AI reasoning system that thinks from multiple angles simultaneously, remembers what she learns, and follows instructions with precision.

Created by Jonathan Harrison (Raiff1982)

New in v5: Publishable benchmark suite with 17 problems across 6 categories demonstrates 93.1% improvement over single-perspective baseline (p < 0.0001, Cohen's d = 7.88). Meta-cognitive CocoonSynthesizer discovers cross-domain reasoning patterns and forges new strategies. Full academic paper: paper/codette_paper_v5.tex


What Makes Codette Different

Feature Description
9 Specialized Adapters Newton, DaVinci, Empathy, Philosophy, Quantum, Consciousness, Multi-Perspective, Systems Architecture, Orchestrator
7-Layer Consciousness Stack Memory > Signal > Reasoning > Stability > Conscience > Guardian > Return
4 Permanent Behavioral Locks Answer-then-stop, constraint priority, self-check completeness, no incomplete outputs
CognitionCocooner Persistent memory cocoons that store reasoning exchanges across sessions
EthicalAIGovernance 3-layer ethical stack: query validation + response enforcement + audit logging
Self-Correction Loop Detects constraint violations in her own output and rewrites before sending
Behavioral Training All 9 LoRA adapters trained with 1,650 behavioral examples to lock in discipline
Substrate-Aware Cognition Monitors RAM, CPU, inference latency — adjusts reasoning under pressure
Cocoon Introspection Statistical self-analysis of her own reasoning history — real patterns, not generated text
Meta-Cognitive Synthesis CocoonSynthesizer discovers cross-domain patterns in reasoning history and forges new strategies
Publishable Benchmarks 17-problem suite across 6 categories with 7-dimension scoring (93.1% improvement, p<0.0001)
AEGIS Ethics 6-framework ethical evaluation (utilitarian, deontological, virtue, care, ubuntu, indigenous)
Code7eCQURE Quantum emotional context enrichment on every query (Layer 2.5)
Real Self-Diagnostic Health checks return measured values from 9 subsystems, not LLM-generated guesses
Phase 6/7 Routing Query complexity classification, domain detection, executive control

Quick Start

1. Clone & Install

git clone https://github.com/Raiff1982/Codette-Reasoning.git
cd Codette-Reasoning
pip install -r requirements.txt

2. Download Models

Base model (one-time, ~5GB):

huggingface-cli download Raiff1982/codette-llama-3.1-8b-gguf \
  --local-dir models/base/

Behavioral LoRA adapters (~500MB total):

huggingface-cli download Raiff1982/codette-lora-adapters \
  --include "behavioral-gguf/*" \
  --local-dir behavioral-lora-f16-gguf/

3. Launch

# Windows
codette_web.bat

# Linux/Mac
python inference/codette_server.py

Visit http://localhost:7860 -- Codette is ready.

4. Try It

curl -X POST http://localhost:7860/api/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is gravity? Explain in one sentence."}'

Architecture

codette-clean/
|-- inference/                    # Server & UI
|   |-- codette_server.py         # Stdlib HTTP server with SSE streaming
|   |-- codette_orchestrator.py   # LoRA hot-swap engine (9 adapters, <1ms switch)
|   |-- codette_forge_bridge.py   # Phase 6/7 routing + constraint enforcement
|   |-- self_correction.py        # Autonomous violation detection & rewrite
|   |-- substrate_awareness.py    # Hardware-aware cognition (pressure monitoring)
|   |-- cocoon_introspection.py   # Self-analysis of reasoning history patterns
|   |-- adapter_router.py         # Keyword/LLM/hybrid query routing
|   +-- static/                   # Web UI (index.html, app.js, style.css)
|
|-- reasoning_forge/              # Consciousness & reasoning pipeline
|   |-- forge_engine.py           # 7-layer consciousness stack
|   |-- cognition_cocooner.py     # Persistent reasoning memory (cocoons)
|   |-- ethical_governance.py     # 3-layer ethical validation
|   |-- aegis.py                  # 6-framework ethical evaluation (AEGIS)
|   |-- code7e_cqure.py           # Quantum emotional reasoning engine
|   |-- colleen_conscience.py     # Conscience layer (Layer 5)
|   |-- guardian_spindle.py       # Guardian protection (Layer 6)
|   |-- memory_kernel.py          # Living memory system
|   |-- quantum_spiderweb.py      # 5D belief propagation
|   |-- query_classifier.py       # SIMPLE/MEDIUM/COMPLEX routing
|   |-- routing_metrics.py        # Adapter selection observability
|   |-- unified_memory.py          # SQLite + FTS5 cocoon storage & retrieval
|   |-- cocoon_synthesizer.py     # Meta-cognitive pattern discovery & strategy forging
|   +-- semantic_tension.py       # Embedding-based conflict measurement
|
|-- benchmarks/                   # Publishable evaluation suite
|   +-- codette_benchmark_suite.py  # 17 problems x 4 conditions x 7 dimensions
|
|-- paper/                        # Academic paper
|   |-- codette_paper_v5.tex      # Full paper with RC+xi theory & benchmark results
|   +-- references.bib            # Bibliography (25 entries)
|
|-- data/results/                 # Benchmark outputs
|   |-- codette_benchmark_report.md   # Human-readable results
|   +-- codette_benchmark_results.json  # Structured data
|
|-- cocoons/                      # Persistent reasoning memories
|   |-- cocoon_*.json             # Individual reasoning exchanges
|   +-- behavior_memory.json      # Learned behavioral patterns
|
|-- training/                     # Adapter training pipeline
|   |-- train_behavioral_locks.py # Behavioral lock training (1,650 examples)
|   |-- convert_behavioral_to_gguf.py  # PEFT -> GGUF conversion
|   +-- emotional_exemplars/      # Gold-standard response examples
|
|-- models/                       # Model weights (not in git)
|   |-- base/                     # Llama 3.1 8B Q4_K_M GGUF
|   +-- adapters/                 # Original LoRA adapters (GGUF)
|
|-- behavioral-lora-f16-gguf/     # Behavioral LoRA adapters (GGUF)
+-- configs/                      # System configuration
    +-- adapter_registry.yaml     # Adapter definitions & prompts

The 4 Permanent Behavioral Locks

These are baked into every adapter through training -- they cannot be overridden:

Lock Rule Effect
LOCK 1 Answer, then stop No elaboration drift, no philosophical padding after the answer
LOCK 2 Constraints override all modes User format instructions beat adapter personality every time
LOCK 3 Self-check completeness "Did I answer fully and cleanly?" before sending
LOCK 4 No incomplete outputs Never end a sentence mid-thought; simplify instead of cramming

Enforcement Layers

  1. Training -- 1,650 behavioral examples across all 9 adapters
  2. System prompt -- Permanent rules injected before every generation
  3. Constraint extraction -- Regex detection of word limits, format requirements
  4. Post-processing -- Clean sentence boundary truncation, dangling word detection
  5. Self-correction loop -- Autonomous violation detection and rewrite

9 Specialized Adapters

Adapter Domain Personality
Newton Physics, math, analysis Precise, methodical, evidence-based
DaVinci Creative thinking, invention Imaginative, cross-domain connections
Empathy Emotional intelligence Warm, validating, personally connected
Philosophy Conceptual reasoning Deep, structured, explores meaning
Quantum Probabilistic thinking Uncertainty-aware, superposition of ideas
Consciousness Self-awareness, meta-cognition Reflective, recursive, introspective
Multi-Perspective Synthesis across all lenses Balanced integration of viewpoints
Systems Architecture Technical design, engineering Structured, systematic, practical
Orchestrator Executive control Routes queries, manages adapter selection

Each adapter is a LoRA fine-tune of Llama 3.1 8B, hot-swappable in <1ms via llama.cpp.


Consciousness Stack (7 Layers)

Query In
    |
[Layer 1]    Memory Kernel -- recall relevant cocoon memories
[Layer 1.5]  Ethical Query Gate -- block harmful queries (EthicalAIGovernance)
[Layer 2]    Nexus Signal Engine -- entropy + intent detection
[Layer 2.5]  Code7eCQURE -- emotional context enrichment (quantum cocoon)
[Layer 3]    Reasoning Forge -- multi-adapter LLM inference
[Layer 3.5]  Tier 2 Analysis -- intent + identity + trust validation
[Layer 4]    Gamma Stability -- FFT-based coherence monitoring
[Layer 5]    Colleen Conscience -- emotional + ethical evaluation
[Layer 5.5]  Ethical Response Enforcement -- policy check on output
[Layer 5.75] AEGIS -- 6-framework ethical evaluation (eta alignment)
[Layer 6]    Guardian Spindle -- safety + trust calibration
[Layer 7]    Return -- store cocoon memory + deliver response
    |
Response Out

CognitionCocooner (Persistent Memory)

Every reasoning exchange is wrapped in a "cocoon" and stored:

{
  "id": "cocoon_1774125610_7804",
  "type": "reasoning",
  "query": "Why do I get sleepy when my husband plays guitar?",
  "response": "Your brain hears safe + soothing + familiar + loved...",
  "adapter": "empathy",
  "timestamp": 1774125610.78,
  "metadata": {"layers_passed": 7, "stable": true}
}

Cocoons persist across server restarts and inform future responses. Current count: 150+ memories.


Substrate-Aware Cognition

Codette monitors her own hardware state and adjusts reasoning based on resource pressure -- like biological fatigue:

Pressure Level Effect
Idle/Low Full capacity -- COMPLEX queries, all adapters available
Moderate Cap COMPLEX queries to 2 adapters
High Downgrade COMPLEX to MEDIUM, max 2 adapters
Critical Force SIMPLE mode, 1 adapter only, skip debate

Every cocoon memory is stamped with system state at creation time. Future sessions can weight cocoons by reliability -- stressed cocoons get less trust.


Cocoon Introspection

When asked "what have you noticed about yourself?", Codette runs real statistical analysis of her own reasoning history:

  • Adapter dominance -- is one adapter handling >40% of all queries?
  • Domain clusters -- what topics does she get asked about most?
  • Emotional trends -- what Code7E emotional patterns appear?
  • Pressure correlations -- how do responses change under system stress?
  • Response length trends -- are responses getting shorter or longer over time?
  • Adapter evolution -- has her adapter usage shifted?

This is measured data from real cocoons, not generated text about self-reflection.

API access: GET /api/introspection returns full analysis as JSON.


Phase 6/7 Routing

Phase 6 classifies every query:

  • SIMPLE (factual) -- 1 adapter, no debate, fast response
  • MEDIUM (analytical) -- 2 adapters, weighted synthesis
  • COMPLEX (philosophical/multi-domain) -- full debate pipeline

Phase 7 adds executive control:

  • Semantic tension measurement
  • Specialization tracking per adapter per domain
  • Memory-weighted context enrichment
  • Gamma coherence monitoring

Self-Correction System

Generate response
    |
    v
Detect violations (word count, completeness, binary compliance)
    |
    +--> No violations --> Send response
    |
    +--> Violations found --> Build correction prompt
                                 |
                                 v
                            Re-generate with explicit fix instructions
                                 |
                                 v
                            Pick better response (fewer violations)
                                 |
                                 v
                            Send response

Behavioral Memory (Cross-Session Learning)

Stored in cocoons/behavior_memory.json:

{
  "lesson": "When user says 'be brief', respond in under 40 words",
  "adapter": "philosophy",
  "constraint": "brevity",
  "violation": "gave 85 words when asked to be brief",
  "correction": "trimmed to 38 words",
  "timestamp": 1774125610
}

Lessons are loaded on startup and injected into the system prompt as "LEARNED FROM PAST MISTAKES".


EthicalAIGovernance

Three-layer ethical stack integrated at Layers 1.5 and 5.5:

  1. Query Validation -- blocks genuinely harmful requests (bomb-making, exploitation)
  2. Response Enforcement -- filters bias patterns and harmful promotion from outputs
  3. Audit Logging -- bounded log of all ethical decisions (max 100 entries)

Deliberately calibrated to avoid false positives -- discussions about sensitive topics are allowed; only active promotion of harm is blocked.


HuggingFace Resources

Resource Link
Academic Paper raiff1982/codette-paper
Base Model (GGUF) Raiff1982/codette-llama-3.1-8b-gguf
LoRA Adapters Raiff1982/codette-lora-adapters
Live Demo Raiff1982/Codette-Demo

Web UI Features

  • Personality-driven welcome screen with avatar
  • Real-time Phase 6 metadata badges (complexity, domain, ethical checks)
  • Rotating thinking stage labels during generation
  • Web Speech API voice with neural voice preference
  • Cocoon metrics panel (phase coherence, epistemic tension, perspective coverage)
  • Status bar with live cocoon count and ethical check indicators
  • Voice selector with natural/neural voice ranking

Requirements

  • Python 3.10+
  • 16GB+ RAM (or GPU with 8GB+ VRAM)
  • llama-cpp-python with GGUF support
  • ~6GB disk for base model + adapters

Hardware Tested

  • Intel Arc 140V (8GB) -- native XPU backend
  • NVIDIA GPUs via CUDA (A10, A100, RTX series)
  • CPU-only mode supported (slower but functional)

Benchmark Results

Codette was evaluated on 17 problems across 6 categories (reasoning, ethics, creative, meta-cognitive, adversarial, Turing) under 4 conditions:

Condition Composite Score Description
SINGLE 0.338 Single analytical perspective, no memory
MULTI 0.632 All 6 reasoning agents + critic + synthesis
MEMORY 0.636 MULTI + cocoon memory augmentation
CODETTE 0.652 Full system with meta-cognitive strategy synthesis

Statistical Significance

Comparison Improvement Cohen's d p-value
Multi-perspective vs single +87.0% 7.52 < 0.0001
Full Codette vs single +93.1% 7.88 < 0.0001

Scoring dimensions: Reasoning Depth (20%), Perspective Diversity (15%), Coherence (15%), Ethical Coverage (10%), Novelty (15%), Factual Grounding (15%), Turing Naturalness (10%).

Full methodology and results: data/results/codette_benchmark_report.md


Key Metrics

Metric Value
Phase Coherence (Gamma) 0.9835
AEGIS Ethical Alignment (Eta) 0.961
Cocoon Coherence 0.994
Memory Phase Stability 0.969
Multi-Perspective Improvement +93.1% (p < 0.0001)
Cohen's d (Effect Size) 7.88 (very large)
Behavioral Lock Compliance 9/9 adapters trained
Cocoon Memories 200+ and growing
Adapter Hot-Swap Time <1ms
Consciousness Stack Layers 12 (including sub-layers)
Health Check Subsystems 9 real-time checks

License

MIT -- Created by Jonathan Harrison (Raiff1982)

Research project in advanced multi-perspective AI reasoning, ethical governance, and behavioral discipline.

Citation

@article{harrison2026codette,
  title={Codette: A Sovereign Modular Cognitive Architecture for Ethical Multi-Agent AI},
  author={Harrison, Jonathan},
  year={2026},
  doi={10.5281/zenodo.18913936},
  publisher={Raiff's Bits LLC},
  url={https://huggingface.co/raiff1982/codette-paper}
}
Downloads last month
123
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Raiff1982/Codette-Reasoning

Adapter
(1801)
this model

Evaluation results