Cygnis Alpha 1.7B v0.1 - GGUF Model Card

Quick Start with Ollama

You can run Cygnis Alpha directly via Ollama for a fast, simplified local experience.

Run it instantly via your terminal:

ollama run CygnisAI/Cygnis-Alpha-1.7B-v0.1

1. Model Overview

Cygnis Alpha 1.7B v0.1 is a Small Language Model (SLM) optimized for ultra-fast local inference on CPUs. Based on the SmolLM2 architecture, it has been fine-tuned by Simonc-44 to develop a strong system identity and high efficiency.

This GGUF version is specifically designed to run on consumer-grade hardware (laptops, mini-PCs) without requiring a dedicated GPU.

  • Developer: Simonc-44 / CygnisAI
  • Architecture: SmolLM2 (Llama-like)
  • Format: GGUF (Available quantizations: Q4_K_M, Q8_0)
  • Capabilities: Chat, Instruction-following, Personal Assistant.

2. Technical Specifications

Feature              Detail
Model Type           Causal Language Model
Parameters           1.7B
Context Length       2048 tokens
Quantization         Q4_K_M (4-bit) & Q8_0 (8-bit)
Training Precision   bfloat16

Target Performance

  • Inference Speed (CPU): ~30-50 tokens/sec on typical consumer processors.
  • Memory Footprint: ~1.5 GB of RAM minimum (Q4_K_M version).
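As a rough sanity check on the footprint figure above, the on-disk size of a GGUF file is approximately parameters × bits-per-weight / 8; the bits-per-weight values below are approximations assumed here for the K-quant formats (which store block scales alongside the weights), not published numbers, and runtime RAM adds KV-cache and runtime overhead on top.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough GGUF file-size estimate: params * bits-per-weight / 8, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight (rules of thumb, not exact).
Q4_K_M_BPW = 4.85
Q8_0_BPW = 8.5

print(f"Q4_K_M: ~{gguf_size_gb(1.7e9, Q4_K_M_BPW):.2f} GB on disk")
print(f"Q8_0:   ~{gguf_size_gb(1.7e9, Q8_0_BPW):.2f} GB on disk")
```

The ~1 GB Q4_K_M file plus context buffers is consistent with the ~1.5 GB RAM minimum quoted above.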

3. Usage & Implementation

System Prompt Configuration (Recommended)

To ensure the model adheres to its identity, use the following template:

"You are Cygnis Alpha, a sovereign artificial intelligence designed by Simonc-44. You are polite, fast, and concise."

Python Integration (Llama-cpp-python)

from llama_cpp import Llama

llm = Llama(
    model_path="./models/cygnis-alpha-1.7b-v0.1.Q4_K_M.gguf",
    n_ctx=2048,           # Matches the model's maximum context length
    n_threads=4,          # Adjust based on your CPU core count
    chat_format="chatml"
)

# Example request using the recommended system prompt
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are Cygnis Alpha, a sovereign artificial intelligence designed by Simonc-44. You are polite, fast, and concise."},
        {"role": "user", "content": "Hello Cygnis, introduce yourself."},
    ]
)
print(response["choices"][0]["message"]["content"])
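For reference, chat_format="chatml" means llama-cpp-python renders the message list into the ChatML prompt layout before inference. A minimal sketch of that rendering (illustrative only; the library applies it internally, so you never build this string yourself):

```python
def to_chatml(messages: list[dict]) -> str:
    """Render a message list into the ChatML prompt layout."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # The generation prompt: the model completes the assistant turn.
    return prompt + "<|im_start|>assistant\n"

messages = [
    {"role": "system", "content": "You are Cygnis Alpha."},
    {"role": "user", "content": "Hello!"},
]
print(to_chatml(messages))
```

Seeing the raw layout helps when debugging identity drift: if the system prompt is missing from the rendered string, the model falls back to its base persona.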

4. Evaluation & Improvements

Cygnis Alpha v0.1 brings the following improvements over previous iterations:

  • Stable Identity: Reduced hallucinations regarding the model's origin and its creator, Simonc-44.
  • CPU Optimization: Near-instant response times even on older generation processors.
  • Formatting: Improved handling of bullet points and structured responses.

5. Ethics & Limitations

Limitations

  • Factual Knowledge: Due to its reduced size (1.7B), the model may make mistakes on highly specific historical or technical facts.
  • Complex Reasoning: For advanced mathematical or logic tasks, the Cygnis Beta range is recommended.

Security Policy

The use of Cygnis Alpha for illegal or malicious activities is strictly prohibited. The model is provided under the Apache 2.0 license.


6. Citation

@misc{allal2025smollm2smolgoesbig,
      title={SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model}, 
      author={Loubna Ben Allal and others},
      year={2025},
      eprint={2502.02737},
      archivePrefix={arXiv},
}