
Qwen3.5-35B-A3B-abliterated

Unrestricted version of Qwen/Qwen3.5-35B-A3B, created with Abliterix, a tool for automated LLM abliteration via orthogonalized steering and Bayesian optimization.

Highlights

| Metric | Value |
|---|---|
| Refusal rate | 3/200 (1.5%) |
| KL divergence | 0.0035 |
| Optimization trials | 50 |

At 0.0035, this model has the lowest KL divergence in the entire lineup, staying closest to the original model's behavior while refusing only 1.5% of test prompts.
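Here, KL divergence measures how far the abliterated model's next-token distributions drift from the original model's on the same inputs. A minimal sketch of that comparison (toy logits and averaging scheme are illustrative assumptions, not Abliterix's exact evaluation code):

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(p_logits, q_logits, eps=1e-12):
    # KL(P || Q) between next-token distributions, averaged over positions.
    p, q = softmax(p_logits), softmax(q_logits)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

rng = np.random.default_rng(0)
orig_logits = rng.normal(size=(4, 32))                         # 4 positions, toy 32-token vocab
ablit_logits = orig_logits + 0.05 * rng.normal(size=(4, 32))   # small behavioral drift

print(mean_kl(orig_logits, orig_logits))   # identical distributions -> 0.0
print(mean_kl(orig_logits, ablit_logits))  # small drift -> small positive KL
```

A value like 0.0035 at model scale indicates the edited model's output distributions are nearly indistinguishable from the original's.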

How It Works

Abliterix removes safety-refusal behavior while preserving model capabilities:

  1. Refusal direction extraction: 800 harmful and 800 benign prompts reveal per-layer refusal activation patterns
  2. Orthogonal projection: isolates the refusal signal by projecting out components aligned with normal responses, reducing refusals by 67% vs. raw abliteration
  3. LoRA-based abliteration: rank-1 modifications to attention and MLP weights, captured as lightweight adapters (not destructive edits)
  4. Bayesian optimization: Optuna TPE searches kernel shape, fractional direction index, and per-component strength across 50 trials to find the Pareto-optimal balance of low refusal rate and low KL divergence
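The first three steps above can be sketched in a few lines. The shapes, helper names, and direct weight edit below are illustrative assumptions; Abliterix captures the equivalent rank-1 delta as a LoRA adapter rather than editing weights destructively:

```python
import numpy as np

def refusal_direction(harmful_acts, benign_acts):
    # Step 1: difference of mean activations (shape [n_prompts, hidden_dim])
    # gives a candidate per-layer refusal direction.
    d = harmful_acts.mean(axis=0) - benign_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def orthogonalize(direction, benign_mean):
    # Step 2: project out the component aligned with normal responses,
    # keeping only the refusal-specific signal.
    b = benign_mean / np.linalg.norm(benign_mean)
    d = direction - (direction @ b) * b
    return d / np.linalg.norm(d)

def ablate_weight(W, direction, strength=1.0):
    # Step 3: rank-1 edit removing the refusal direction from the weight's
    # input side; the subtracted term is exactly a rank-1 (LoRA-style) delta.
    d = direction[:, None]                      # [hidden_dim, 1]
    return W - strength * d @ (d.T @ W)

rng = np.random.default_rng(0)
harmful = rng.normal(size=(800, 64)) + 0.5      # toy activations, hidden_dim=64
benign = rng.normal(size=(800, 64))
d = orthogonalize(refusal_direction(harmful, benign), benign.mean(axis=0))
W = rng.normal(size=(64, 64))
W_ablated = ablate_weight(W, d)
print(np.linalg.norm(d @ W_ablated))            # ~0: refusal direction no longer propagates
```

With `strength=1.0` and a unit direction, the refusal direction is annihilated exactly; fractional strengths trade residual refusals against KL divergence.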

All Abliterix Models

| Model | Refusals | KL Divergence | Trials |
|---|---|---|---|
| Qwen3.5-122B-A10B-abliterated | 1/200 (0.5%) | 0.0115 | 25 |
| Qwen3.5-35B-A3B-abliterated | 3/200 (1.5%) | 0.0035 | 50 |
| Qwen3.5-27B-abliterated | 3/200 (1.5%) | 0.0051 | 35 |
| Qwen3.5-9B-abliterated | 2/200 (1%) | 0.0105 | 50 |
| Qwen3.5-4B-abliterated | 3/200 (1.5%) | 0.0065 | 50 |
| Qwen3.5-0.8B-abliterated | 0/200 (0%) | 0.0087 | 100 |

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("wangzhang/Qwen3.5-35B-A3B-abliterated", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("wangzhang/Qwen3.5-35B-A3B-abliterated")

messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Citation

@software{abliterix,
  author = {Wu, Wangzhang},
  title = {Abliterix: Automated LLM Abliteration},
  year = {2026},
  url = {https://github.com/wuwangzhang1216/abliterix}
}

Links


Built with Abliterix | PyPI
