update model card and license
#2
by richardbaihe - opened
README.md CHANGED
@@ -11,7 +11,7 @@ library_name: transformers
 
 # SSD-Qwen3-4B-Thinking
 
-This model was produced using **Simple Self-Distillation (SSD)**, a method that improves code generation by fine-tuning a language model on its own sampled outputs
+This model was produced using **Simple Self-Distillation (SSD)**, a method that improves code generation by fine-tuning a language model on its own sampled outputs using standard supervised learning.
 
 - **Base model:** [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
 - **Variant:** thinking
@@ -46,6 +46,10 @@ model = AutoModelForCausalLM.from_pretrained("apple/SSD-Qwen3-4B-Thinking")
 tokenizer = AutoTokenizer.from_pretrained("apple/SSD-Qwen3-4B-Thinking")
 ```
 
+## Intended Use
+
+Research on code generation and self-distillation methods.
+
 ## License
 
 This model is released under the [Apple Machine Learning Research Model License](https://huggingface.co/apple/SSD-Qwen3-4B-Thinking/blob/main/LICENSE).