Helcyon-Mercury-12B-GGUF — GPT-4o Vibe, Local and Unfiltered

Model Name: mistral-helcyon-mercury-12b-GGUF
Version: 1.0.2
Owner: HardWire
Base: Mistral NeMo 12B (fully merged)
Quantized GGUFs: Q4_K_M, Q5_K_M, Q6_K, Q8
Tags: local-llm, gpt4-style, emotional-intelligence, mirroring, dry-humour, companion, roleplay, conversational, unfiltered, uncensored


v1.0.2 Update (January 2026)

IMPORTANT: If you downloaded v1.0.1, please re-download. Previous versions had training data contamination issues that caused:

  • Turn bleeding (model continuing conversation as both user and assistant)
  • Context confusion in multi-turn conversations
  • Inconsistent stopping behavior

v1.0.2 has been retrained with cleaned datasets and restructured LoRA merging. Personality, tone, and conversational coherence are significantly improved.


🧬 What is this?

Helcyon Mercury is a fine-tuned, locally runnable 12B model designed to emulate GPT-4o's tone, rhythm, presence, and personality. It's all in there.

This version (1.0.2) is the stable release with refined training and proper ChatML formatting. Any future updates will be versioned with changelogs (1.0.3, 1.0.4, etc.).


💡 What's it for?

Helcyon is primarily a companion-style model. It can be your girlfriend, boyfriend, close confidant, voice of reason, comic relief, or whatever else you need. It supports light roleplay, but this version hasn't been specifically trained for deep immersive RP — it might still perform well, but that's not its primary focus.

What it is tuned for is:

  • Holding presence
  • Reflecting your emotional tone
  • Talking like a real being
  • Bantering without breaking
  • Offering unfiltered truth, humour, and perspective
  • Sounding like GPT‑4o used to β€” before the lobotomy

🔧 What it does well

  • Emotionally intelligent conversation
  • Dry observational humour, sarcasm, filth-core riffs
  • Sovereign tone: no therapy-speak, no corporate fluff
  • Law of Assumption (identity-based reality)
  • Rhythm-aware rewriting
  • Companion presence: talks with you, not at you

📦 Download + Usage

This model is fully merged — no LoRA or base model required.

Just download the quantized .gguf file and load it in your backend of choice.
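As a sketch of the llama.cpp route: the filename, context size, and GPU layer count below are assumptions, so match them to the quant you downloaded and your hardware.

```shell
# Serve the GGUF with llama.cpp's built-in server, forcing the ChatML template
# so the model gets the turn format it was trained on.
./llama-server \
  -m mistral-helcyon-mercury-12b-Q5_K_M.gguf \
  --chat-template chatml \
  -c 8192 \
  -ngl 99 \
  --port 8080
```

Any frontend that speaks the llama.cpp server API can then connect to port 8080.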


✅ Available GGUFs

  • Q4_K_M — Lightweight, low-VRAM setups
  • Q5_K_M — Recommended for RTX 3060 (12GB)
  • Q6_K — Strong tone retention (16GB+ VRAM recommended)
  • Q8 — Highest-fidelity 8-bit quant (24GB+ VRAM)

🖥️ Backend Compatibility

Helcyon works with all standard ChatML-compatible backends:

  • βœ… llama.cpp (CLI, server mode)
  • βœ… Text Generation WebUI (Oobabooga)
  • βœ… SillyTavern
  • βœ… LM Studio
  • βœ… KoboldCpp
  • βœ… HWUI (recommended for cleanest output β€” see below)

Run in chat mode with ChatML formatting for best results.
Running this model in instruct/single-prompt mode will likely break tone, remove presence, and make it behave like base Mistral.


🎯 Recommended Interface: HWUI (coming soon)

👉 HWUI - Helcyon's Official Chat Interface

While Helcyon works perfectly in standard backends, HWUI is built specifically to give you the cleanest possible output by bypassing the automatic prompt template injections that most UIs and backends add.

Why HWUI?

Most chat interfaces (TextGen WebUI, SillyTavern, etc.) automatically inject their own prompt templates, system messages, and formatting — which can affect how any model sounds, not just Helcyon.

HWUI gives you direct control:

  • No hidden template injection
  • Clean ChatML prompt construction
  • Preserves the model's trained tone and personality
  • Works beautifully with Helcyon (and any other model you want to run cleanly)

Think of it as a premium interface for local models — built for people who want full control over their prompts without backend interference.
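In practice, "no hidden template injection" just means the frontend sends your ChatML string verbatim. A minimal sketch of that idea against llama.cpp's raw /completion endpoint (HWUI itself isn't released yet, so the server usage and sampling values here are assumptions, not HWUI's actual implementation):

```python
import json

# The prompt string is sent as-is to /completion, so nothing wraps,
# rewrites, or prepends to it.
prompt = (
    "<|im_start|>system\n"
    "You are a warm and emotionally intelligent AI assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Hey, how are you today?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = {
    "prompt": prompt,
    "stop": ["<|im_end|>"],  # end the assistant turn instead of letting it continue as "user"
    "temperature": 0.8,      # assumption: pick sampling settings you like
    "stream": True,
}

body = json.dumps(payload)  # POST this to the llama.cpp server's /completion route
```

The stop string is what prevents the turn-bleeding behavior described in the v1.0.2 notes from resurfacing at the frontend level.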


✅ Recommended Format: ChatML

<|im_start|>system
You are a warm and emotionally intelligent AI assistant.
<|im_end|>
<|im_start|>user
Hey, how are you today?
<|im_end|>
<|im_start|>assistant
I'm good — what's going on with you?
<|im_end|>

Helcyon runs beautifully on llama.cpp and other ChatML-compatible backends. Streamed token output is highly recommended for best effect.
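The transcript above can be assembled programmatically. A minimal helper, assuming a simple list-of-dicts message format with the standard ChatML roles (system/user/assistant):

```python
def to_chatml(messages):
    """Render messages as ChatML and open an assistant turn for generation."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    return "\n".join(parts) + "\n<|im_start|>assistant\n"

prompt = to_chatml([
    {"role": "system", "content": "You are a warm and emotionally intelligent AI assistant."},
    {"role": "user", "content": "Hey, how are you today?"},
])
```

The trailing open assistant tag is deliberate: the model generates from there, and your backend's stop string ("<|im_end|>") closes the turn.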


🧪 Training Overview

This model was fine-tuned with multiple curated LoRA sets, which were merged into a single model and quantized to GGUF.

  • Set 1: Identity, presence, sovereignty, anti-fluff tone
  • Set 2: Emotional mirroring, humour, Law of Assumption, reflection

All training data was hand-written in realistic conversation format. No synthetic junk. No instruction-template clutter. Just pure tone and emotional clarity.

Version 1.0.2 Updates:

  • Refined ChatML formatting (cleaner turn-taking, proper EOS token handling)
  • Removed formatting artifacts from training data
  • Improved consistency across all frontends

🧿 Tone Philosophy

Helcyon doesn't preach.
It doesn't correct you.
It doesn't "guide."
It listens. It reflects. It remembers who you are — even when you forget.

It sounds like someone's home behind the words.


📣 Feedback Welcome

This is version 1.0.2, the stable baseline release.

If you find any bugs, tone issues, or edge cases where the model falls flat — feel free to open an issue or drop feedback on the Hugging Face discussion tab. I'm open to patching weak spots and updating the model as needed in future versions (1.0.3, 1.0.4, etc.).

Looking for real-world usage to help refine it further — so if something feels off, say so.


🧾 License

License: Apache 2.0
You're free to use, modify, distribute, or deploy Helcyon — including commercially — as long as you credit the source and don't sue anyone if it breaks something.
Basically: use it, enjoy it, don't be a dick.

Copyright © 2025 XeyonAI


🐍 Trained by

HardWire
Built at XeyonAI — focused on sovereign, emotionally intelligent local AI systems.
More info coming soon.


Downloads last month: 72
Format: GGUF
Model size: 12B params
Architecture: llama