scthornton committed
Commit 95712b6 · verified · Parent: 8c41485

Upgrade model card: badges, quick start, training details, collection table, citations

Files changed (1):
  1. README.md +120 -37

README.md CHANGED
@@ -1,60 +1,143 @@
  ---
  library_name: peft
  license: apache-2.0
- base_model: ibm-granite/granite-20b-code-instruct-8k
  tags:
- - base_model:adapter:ibm-granite/granite-20b-code-instruct-8k
- - lora
- - transformers
- pipeline_tag: text-generation
  model-index:
- - name: granite-20b-code-securecode
- results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # granite-20b-code-securecode

- This model is a fine-tuned version of [ibm-granite/granite-20b-code-instruct-8k](https://huggingface.co/ibm-granite/granite-20b-code-instruct-8k) on the None dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 1
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 16
- - optimizer: Use paged_adamw_8bit with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 100
- - num_epochs: 3

- ### Training results

- ### Framework versions

- - PEFT 0.18.1
- - Transformers 5.1.0
- - Pytorch 2.7.1+cu128
- - Datasets 2.21.0
- - Tokenizers 0.22.2
  ---
  library_name: peft
+ pipeline_tag: text-generation
  license: apache-2.0
+ language:
+ - code
+ base_model:
+ - ibm-granite/granite-20b-code-instruct-8k
  tags:
+ - securecode
+ - security
+ - owasp
+ - code-generation
+ - secure-coding
+ - lora
+ - qlora
+ - vulnerability-detection
+ - cybersecurity
+ datasets:
+ - scthornton/securecode
  model-index:
+ - name: granite-20b-code-securecode
+ results: []
+ ---
+
+ # Granite 20B Code SecureCode
+
+ [![Parameters](https://img.shields.io/badge/parameters-20B-blue.svg)](#model-details) [![Dataset](https://img.shields.io/badge/dataset-2,185_examples-green.svg)](https://huggingface.co/datasets/scthornton/securecode) [![OWASP](https://img.shields.io/badge/OWASP-Top_10_2021_+_LLM_Top_10-red.svg)](#security-coverage) [![Method](https://img.shields.io/badge/method-QLoRA-purple.svg)](#training-details) [![License](https://img.shields.io/badge/license-Apache_2.0-orange.svg)](https://opensource.org/licenses/Apache-2.0)
+
+ **Security-aware code generation model built on IBM's Granite Code family, the largest model in the SecureCode collection. Fine-tuned on 2,185 real-world vulnerability examples covering the OWASP Top 10 (2021) and the OWASP LLM Top 10 (2025).**
+
+ [Dataset](https://huggingface.co/datasets/scthornton/securecode) | [Paper](https://huggingface.co/papers/2512.18542) | [Model Collection](https://huggingface.co/collections/scthornton/securecode) | [perfecXion.ai](https://perfecxion.ai) | [Blog Post](https://huggingface.co/blog/scthornton/securecode-models)
+
  ---

+ ## What This Model Does

+ Granite 20B Code SecureCode was fine-tuned to recognize vulnerability patterns and produce secure implementations. Every training example includes:

+ - **Real-world incident grounding** — Tied to documented CVEs and breach reports
+ - **Vulnerable + secure implementations** — Side-by-side comparison
+ - **Attack demonstrations** — Concrete exploit code
+ - **Defense-in-depth guidance** — SIEM rules, logging, monitoring, infrastructure hardening
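The vulnerable-plus-secure pairing can be illustrated with a minimal, self-contained sketch. This is a generic example of the SQL injection pattern from the OWASP Top 10, not a sample drawn from the SecureCode dataset:

```python
import sqlite3

# Toy in-memory database with a single user row
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user_vulnerable(name):
    # VULNERABLE: string interpolation lets attacker-controlled input
    # rewrite the query (SQL injection, OWASP A03:2021 Injection)
    query = f"SELECT role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_secure(name):
    # SECURE: a parameterized query treats the input strictly as data
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
print(find_user_vulnerable(payload))  # injection matches every row: [('admin',)]
print(find_user_secure(payload))      # no user has that literal name: []
```

The training data pairs patterns like these with the incident grounding and defense-in-depth context described above.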

+ ---

+ ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | **Base Model** | [ibm-granite/granite-20b-code-instruct-8k](https://huggingface.co/ibm-granite/granite-20b-code-instruct-8k) |
+ | **Parameters** | 20B |
+ | **Architecture** | GPT (IBM Granite Code) |
+ | **Method** | QLoRA (4-bit quantization + LoRA) |
+ | **LoRA Rank** | 16 |
+ | **LoRA Alpha** | 32 |
+ | **Training Data** | [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) (2,185 examples) |
+ | **Training Time** | ~1h 19min |
+ | **Hardware** | 2x NVIDIA A100 40GB (GCP) |
+ | **Framework** | PEFT 0.18.1, Transformers 5.1.0, PyTorch 2.7.1 |

+ ---

+ ## Quick Start

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ from peft import PeftModel
+
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "ibm-granite/granite-20b-code-instruct-8k",
+     device_map="auto",
+     quantization_config=BitsAndBytesConfig(load_in_4bit=True),
+ )
+ model = PeftModel.from_pretrained(base_model, "scthornton/granite-20b-code-securecode")
+ tokenizer = AutoTokenizer.from_pretrained("scthornton/granite-20b-code-securecode")
+
+ prompt = "Write a secure JWT authentication handler in Python with proper token validation"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```

+ ---

+ ## Training Details

+ | Hyperparameter | Value |
+ |----------------|-------|
+ | Learning Rate | 2e-4 |
+ | Batch Size | 1 |
+ | Gradient Accumulation | 16 |
+ | Epochs | 3 |
+ | Scheduler | Cosine |
+ | Warmup Steps | 100 |
+ | Optimizer | paged_adamw_8bit |
+ | Final Loss | 1.639 |
+ | Final Loss | 1.639 |
98
 
99
+ ---
100
 
101
+ ## SecureCode Model Collection
102
+
103
+ | Model | Parameters | Base | Training Time | Link |
104
+ |-------|------------|------|---------------|------|
105
+ | Llama 3.2 3B | 3B | Meta Llama 3.2 | 1h 5min | [scthornton/llama-3.2-3b-securecode](https://huggingface.co/scthornton/llama-3.2-3b-securecode) |
106
+ | Qwen Coder 7B | 7B | Qwen 2.5 Coder | 1h 24min | [scthornton/qwen-coder-7b-securecode](https://huggingface.co/scthornton/qwen-coder-7b-securecode) |
107
+ | CodeGemma 7B | 7B | Google CodeGemma | 1h 27min | [scthornton/codegemma-7b-securecode](https://huggingface.co/scthornton/codegemma-7b-securecode) |
108
+ | DeepSeek Coder 6.7B | 6.7B | DeepSeek Coder | 1h 15min | [scthornton/deepseek-coder-6.7b-securecode](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode) |
109
+ | CodeLlama 13B | 13B | Meta CodeLlama | 1h 32min | [scthornton/codellama-13b-securecode](https://huggingface.co/scthornton/codellama-13b-securecode) |
110
+ | Qwen Coder 14B | 14B | Qwen 2.5 Coder | 1h 19min | [scthornton/qwen2.5-coder-14b-securecode](https://huggingface.co/scthornton/qwen2.5-coder-14b-securecode) |
111
+ | StarCoder2 15B | 15B | BigCode StarCoder2 | 1h 40min | [scthornton/starcoder2-15b-securecode](https://huggingface.co/scthornton/starcoder2-15b-securecode) |
112
+ | **Granite 20B** | **20B** | **IBM Granite Code** | **1h 19min** | **This model** |
113
+
114
+ ---
115
+
116
+ ## Citation
117
+
118
+ ```bibtex
119
+ @misc{thornton2025securecode,
120
+ title={SecureCode v2.0: A Production-Grade Dataset for Training Security-Aware Code Generation Models},
121
+ author={Thornton, Scott},
122
+ year={2025},
123
+ publisher={perfecXion.ai},
124
+ url={https://perfecxion.ai/articles/securecode-v2-dataset-paper.html},
125
+ note={Model: https://huggingface.co/scthornton/granite-20b-code-securecode}
126
+ }
127
+ ```
128
+
129
+ ---
130
+
131
+ ## Links
132
+
133
+ - **Dataset**: [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) (2,185 examples)
134
+ - **Paper**: [SecureCode v2.0](https://huggingface.co/papers/2512.18542)
135
+ - **Model Collection**: [SecureCode Models](https://huggingface.co/collections/scthornton/securecode) (8 models)
136
+ - **Blog Post**: [Training Security-Aware Code Models](https://huggingface.co/blog/scthornton/securecode-models)
137
+ - **Publisher**: [perfecXion.ai](https://perfecxion.ai)
138
+
139
+ ---
140
 
141
+ ## License
142
 
143
+ Apache 2.0