Upload README.md with huggingface_hub

This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% ma

## Key Features

- ✅ **Direct Output Format** - Clean code responses without verbose preambles
- ✅ **High Accuracy** - 87% token-level accuracy on Python tasks
- ✅ **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but isn't narrated)

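The 87% figure above is a token-level match rate. As a rough illustration of how such a metric is typically computed (this `token_accuracy` helper is a sketch for clarity, not part of this repo or its evaluation harness):

```python
def token_accuracy(predicted: list[int], reference: list[int]) -> float:
    """Fraction of reference positions where the predicted token id matches."""
    if not reference:
        return 0.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)

# 3 of the 4 reference positions match
print(token_accuracy([1, 2, 3, 4], [1, 2, 9, 4]))  # 0.75
```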
## Usage

### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B'
)
tokenizer = AutoTokenizer.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B'
)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt')
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0]))
```

### Ollama

```bash
# Pull from Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b

# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```

### llama.cpp

```bash
# Download GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf

# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```

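For a sense of the "direct output" style described in Key Features, a response to the binary search prompt would ideally be a bare, preamble-free function. The snippet below is an illustrative hand-written example of that style, not an actual sample from the model:

```python
def binary_search(items: list[int], target: int) -> int:
    """Return the index of target in sorted items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # 3
```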
## File Structure

- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full precision GGUF (7.9GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2GB)

## Best Use Cases
## Training Notes

E1 focused on direct output format. Training data contained no chain-of-thought examples, resulting in suppressed `<think>` tag behavior. Internal reasoning capability is preserved (evidenced by accuracy gains), but output format is optimized for production code generation.

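Because the `<think>` mechanism is suppressed rather than removed, callers may still want to defensively strip any reasoning spans that leak into generated text. A minimal sketch (the `strip_think_tags` helper is hypothetical, not shipped with this model):

```python
import re

def strip_think_tags(text: str) -> str:
    """Remove any <think>...</think> spans that leak into model output."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

print(strip_think_tags("<think>outline the loop</think>def add(a, b):\n    return a + b"))
```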
**E2 Development:** Next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.
## Citation

```bibtex
@misc{nanbeige-python-deepthink-e1,
  title={Nanbeige 4.1 Python DeepThink 3B},
  author={deltakitsune},
  publisher={HuggingFace},
  url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```

## License