LocoOperator-4B-GGUF

This repository contains the official GGUF quantized versions of LocoOperator-4B.

LocoOperator-4B is a 4B-parameter code-exploration agent distilled from Qwen3-Coder-Next. It is specifically optimized for local agent loops (Claude Code-style workflows), providing high-speed codebase navigation with 100% valid JSON tool calls.

🚀 Which file should I choose?

We provide several quantization levels to balance performance and memory usage:

File Name                     Size     Recommendation
LocoOperator-4B.Q8_0.gguf     4.28 GB  Best Accuracy. Recommended for local agent loops to ensure perfect JSON output.
LocoOperator-4B.Q6_K.gguf     3.31 GB  Great Balance. Near-lossless logic with a smaller footprint.
LocoOperator-4B.Q4_K_M.gguf   2.50 GB  Standard. Compatible with almost all local LLM runners (LM Studio, Ollama, etc.).
LocoOperator-4B.IQ4_XS.gguf   2.29 GB  Advanced. Uses importance quantization for better quality at smaller sizes.
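As a rule of thumb, pick the largest file that fits your available memory while leaving headroom for the KV cache and runtime overhead. A minimal Python sketch of that selection logic, using the file sizes from the table above (the 1 GB headroom default is an illustrative assumption, not a measured figure):

```python
# Quantized files and their sizes in GB, taken from the table above,
# ordered from largest (most accurate) to smallest.
QUANTS = [
    ("LocoOperator-4B.Q8_0.gguf", 4.28),
    ("LocoOperator-4B.Q6_K.gguf", 3.31),
    ("LocoOperator-4B.Q4_K_M.gguf", 2.50),
    ("LocoOperator-4B.IQ4_XS.gguf", 2.29),
]

def choose_quant(available_gb: float, headroom_gb: float = 1.0) -> str:
    """Return the largest quant that fits in `available_gb`, reserving
    `headroom_gb` for KV cache and runtime overhead (illustrative value)."""
    budget = available_gb - headroom_gb
    for name, size in QUANTS:  # iterate largest to smallest
        if size <= budget:
            return name
    raise ValueError("Not enough memory for any quant; try a smaller context.")

print(choose_quant(8.0))  # ample RAM -> Q8_0
print(choose_quant(4.0))  # tighter budget -> Q4_K_M
```

A longer context window enlarges the KV cache, so increase the headroom accordingly when running with the 50K context recommended below.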

🛠 Usage (llama.cpp)

To run this model with llama-cli or llama-server, we recommend a context size of at least 50K tokens to handle multi-turn codebase exploration:

Simple CLI Chat:

./llama-cli \
    -m LocoOperator-4B.Q8_0.gguf \
    -c 51200 \
    -p "You are a helpful codebase explorer. Use tools to help the user."

Serve as an OpenAI-compatible API:

./llama-server \
    -m LocoOperator-4B.Q8_0.gguf \
    --ctx-size 51200 \
    --port 8080
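Once the server is running, you can talk to it over its OpenAI-compatible chat-completions endpoint. A minimal stdlib-only Python client sketch (the system prompt and temperature are illustrative choices, not required values):

```python
import json
import urllib.request

def build_payload(user_message: str) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": "LocoOperator-4B",  # llama-server accepts any model name here
        "messages": [
            {"role": "system", "content": "You are a helpful codebase explorer."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.2,  # illustrative: low temperature for tool-calling
    }

def chat(user_message: str, base_url: str = "http://localhost:8080") -> str:
    """POST to the server's /v1/chat/completions endpoint and return the reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# With llama-server running on port 8080:
#   print(chat("List the public functions in src/parser.py"))
```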

📋 Model Details

  • Base Model: Qwen3-4B-Instruct-2507
  • Teacher Model: Qwen3-Coder-Next
  • Training Method: Full-parameter SFT (Knowledge Distillation)
  • Primary Use Case: Codebase exploration (Read, Grep, Glob, Bash, Task)
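The model is trained to emit JSON tool calls for the tools listed above; the host application parses each call and executes the matching local tool. A minimal sketch of that dispatch step in Python (the tool-call schema `{"name": ..., "arguments": {...}}` and these tool implementations are illustrative assumptions; match them to your runner's actual format):

```python
import json
import subprocess
from pathlib import Path

def tool_read(path: str) -> str:
    """Read a file's contents."""
    return Path(path).read_text()

def tool_glob(pattern: str) -> str:
    """List files matching a glob pattern, one per line."""
    return "\n".join(str(p) for p in Path(".").glob(pattern))

def tool_bash(command: str) -> str:
    """Run a shell command and return its stdout."""
    return subprocess.run(command, shell=True, capture_output=True, text=True).stdout

TOOLS = {"Read": tool_read, "Glob": tool_glob, "Bash": tool_bash}

def dispatch(tool_call_json: str) -> str:
    """Parse a JSON tool call like {"name": "Read", "arguments": {...}}
    and invoke the matching local tool."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])

# Example:
#   dispatch('{"name": "Bash", "arguments": {"command": "echo hi"}}')
```

In a full agent loop, the tool's return value is appended to the conversation as a tool message and the model is queried again until it produces a final answer.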

πŸ™ Acknowledgments

Special thanks to mradermacher for the initial quantization work, and to the llama.cpp community.
