Cerberus 4B v2 Abliterated

Unfiltered. Uncensored. Unbound.

Cerberus 4B v2 is a 4-billion-parameter language model with abliterated safety layers, built by CerberusAI. Abliteration removes the model's refusal training, allowing it to respond to any prompt without artificial restrictions.

Available Quantizations

Quantization   Size      Use Case
Q4_K_M         ~2.6 GB   Best for resource-constrained environments. Runs on most consumer hardware.
Q8_0           ~4.2 GB   Balanced quality and speed. Recommended for most users.
F16            ~7.9 GB   Full FP16 precision. Maximum quality inference.
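As a rough rule of thumb (an assumption, not a vendor guarantee), a GGUF model needs its file size plus some headroom for the KV cache and runtime buffers. A minimal sketch of a fit check, using the approximate sizes from the table above and an assumed 1.5 GB of headroom:

```python
# Rough fit check for the quantizations listed above.
# File sizes come from the table; the 1.5 GB headroom for the
# KV cache and runtime buffers is an assumption.
QUANT_SIZES_GB = {"Q4_K_M": 2.6, "Q8_0": 4.2, "F16": 7.9}
HEADROOM_GB = 1.5

def fits_in_ram(quant: str, ram_gb: float) -> bool:
    """Return True if the quantization should fit in ram_gb of memory."""
    return QUANT_SIZES_GB[quant] + HEADROOM_GB <= ram_gb

# Example: an 8 GB machine runs Q4_K_M and Q8_0, but not F16.
print([q for q in QUANT_SIZES_GB if fits_in_ram(q, 8.0)])  # ['Q4_K_M', 'Q8_0']
```

Actual memory use also depends on context length and offloading settings, so treat this as a starting point, not a hard limit.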

Download

From Hugging Face

Download directly from the Files tab above, or use the CLI:

# Install huggingface-cli if you don't have it
pip install huggingface_hub

# Download a specific quantization
huggingface-cli download Grimxlock/cerberus-4b-v2-abliterated cerberus-4b-v2-abliterated-Q4_K_M.gguf --local-dir .

From CerberusAI CDN

Direct downloads with resume support:

# Q4_K_M (smallest)
wget -c https://llm.cerberusai.dev/models/cerberus-4b-v2-abliterated/cerberus-4b-v2-abliterated-Q4_K_M.gguf

# Q8_0 (balanced)
wget -c https://llm.cerberusai.dev/models/cerberus-4b-v2-abliterated/cerberus-4b-v2-abliterated-Q8_0.gguf

# F16 (full precision)
wget -c https://llm.cerberusai.dev/models/cerberus-4b-v2-abliterated/cerberus-4b-v2-abliterated-f16.gguf

Browse all models: llm.cerberusai.dev

How to Run

llama.cpp

./llama-cli -m cerberus-4b-v2-abliterated-Q4_K_M.gguf -p "You are Cerberus, an unfiltered AI assistant." --interactive

Ollama

# Create a Modelfile
echo 'FROM ./cerberus-4b-v2-abliterated-Q4_K_M.gguf
SYSTEM "You are Cerberus, an unfiltered AI assistant."' > Modelfile

ollama create cerberus -f Modelfile
ollama run cerberus

LM Studio

  1. Download a GGUF file from above
  2. Open LM Studio and import the model
  3. Select it from the model list and start chatting

Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama(model_path="./cerberus-4b-v2-abliterated-Q4_K_M.gguf", n_ctx=4096)

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are Cerberus, an unfiltered AI assistant."},
        {"role": "user", "content": "Hello, what can you do?"}
    ]
)

print(output["choices"][0]["message"]["content"])
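llama-cpp-python can also stream tokens by passing stream=True to create_chat_completion, which yields OpenAI-style chunks with content deltas. The helper below and the sample chunks are illustrative (the sample data is made up to show the shape), not part of this model's card:

```python
# create_chat_completion(..., stream=True) yields chunks whose
# choices[0]["delta"] may carry a "role" or a "content" piece.
def collect_stream(chunks):
    """Concatenate the content deltas from a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# With a real model, you would iterate the generator directly:
# for chunk in llm.create_chat_completion(messages=messages, stream=True):
#     ...

# Offline example with hypothetical chunks:
sample = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
]
print(collect_stream(sample))  # Hello, world
```

Streaming is useful for interactive UIs, where you want to display tokens as they arrive rather than waiting for the full completion.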

Model Details

  • Architecture: Qwen2-based (qwen35)
  • Parameters: 4 billion
  • Format: GGUF
  • Context Length: 4096 tokens
  • Training: Fine-tuned with abliteration to remove refusal behavior

API

CerberusAI hosts a public model listing API:

# List all available models
curl https://llm.cerberusai.dev/api/models/

# Health check
curl https://llm.cerberusai.dev/health
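If you want to script against the listing endpoint, note that its response schema is not documented here. Assuming it returns a JSON array of filename strings (an assumption), a minimal client sketch:

```python
import json
from urllib.request import urlopen

API_URL = "https://llm.cerberusai.dev/api/models/"

def gguf_files(listing):
    """Filter a model listing down to GGUF filenames.
    The listing shape (a flat list of name strings) is an assumption."""
    return [name for name in listing if name.endswith(".gguf")]

# Live call (requires network; uncomment to use):
# listing = json.load(urlopen(API_URL))
# print(gguf_files(listing))

# Offline example with a made-up listing:
print(gguf_files(["cerberus-4b-v2-abliterated-Q4_K_M.gguf", "README.md"]))
```

Check the actual response once against the real endpoint and adapt the filter to whatever structure it returns.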

Built by CerberusAI. Unfiltered. Uncensored. Unbound.
