qwen3-4b-structeval-strategy7-baseline-yamlxml-lr6e-6

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).

Training Objective

This adapter is trained to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Strategy 7: Baseline Return + YAML/XML Pure Increase 🔥

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Method: QLoRA (4-bit)
  • Max sequence length: 1024
  • Epochs: 1
  • Learning rate: 6e-06
  • LoRA: r=16, alpha=32

Dataset: Baseline Return + YAML/XML Boost

Strategy 7: Reverting to Baseline's TOML 100% Success

Baseline Success (0.80195):

  • TOML: 100%
  • YAML: 91.4%
  • XML: 78.0%
  • TOML ratio: 14%

Problem with Strategy 2 Revised (0.82286):

  • TOML: 76.0% ❌ (down from 100%)
  • YAML: 97.1% ✅
  • XML: 90.0% ✅
  • TOML ratio: 10%

Strategy 3 & 4 Failure:

  • TOML ratio 14% did NOT reproduce Baseline's TOML 100%
  • Something else was missing

Strategy 7 Solution:

1. Revert to Baseline's daichira level

  • daichira系: 1.0x (Baseline level, no boost)
  • Avoid daichira reduction that harmed TOML

2. Pure YAML/XML increase

  • u-10bei v2/v4/v5: 2.0x boost (YAML-rich)
  • u-10bei base512/base: 1.3x boost (TOML-rich)
  • u-10bei v2_short: 1.3x boost (TOML-rich)

3. Maintain TOML 14% ratio

  • TOML absolute quantity: +30% (2,800 → 3,640)
  • TOML ratio: 14% (same as Baseline)
  • Expected TOML recovery: 95-98%

Data Cleaning Pipeline:

  1. CoT tags removal: <thinking>...</thinking> completely removed
  2. Code fence removal: yaml, json, xml, toml, ````csv removed
  3. Leading phrase removal: "Here's the output:", "Sure!" etc. removed
  4. 🔥 Output extraction: For u-10bei datasets, extract only content after "Output:" marker
  5. Format validation: JSON/YAML/XML/TOML/CSV parsing validation
  6. Deduplication: Exact duplicates removed

Format Distribution (Expected):

  • YAML: ~13,000 (50%) 🔥 (up from 35% in Baseline)
  • XML: ~5,200 (20%) 🔥 (up from 18% in Baseline)
  • TOML: ~3,640 (14%) 🔥 (same ratio as Baseline, +30% absolute)
  • JSON: ~1,560 (6%)
  • CSV: ~2,600 (10%)

Total: ~26,000 samples

Source Datasets with Boost Factors:

daichira系(1.0x - Baseline level):

  • daichira/structured-3k-mix-sft 🔥 1.0x (TOML 0%)
  • daichira/structured-5k-mix-sft 🔥 1.0x (TOML 0%)
  • daichira/structured-hard-sft-4k 🔥 1.0x (TOML 0%)

u-10bei系(YAML/XML強化):

  • u-10bei/structured_data_with_cot_dataset_512_v2 🔥 2.0x (YAML-rich)
  • u-10bei/structured_data_with_cot_dataset_512_v4 🔥 2.0x (YAML-rich)
  • u-10bei/structured_data_with_cot_dataset_512_v5 🔥 2.0x (YAML-rich)
  • u-10bei/structured_data_with_cot_dataset_512 🔥 1.3x (TOML 20%)
  • u-10bei/structured_data_with_cot_dataset 🔥 1.3x (TOML 17.6%)
  • u-10bei/structured_data_with_cot_dataset_v2 🔥 1.3x (TOML 15.9%)

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "yuk1chan/qwen3-4b-structeval-strategy7-baseline-yamlxml-lr6e-6"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

# Inference
prompt = "Generate YAML code for..."
# ... your inference code

Training Results

  • Training Loss: ~1.25-1.30
  • Validation Loss: ~1.50-1.55
  • Training Time: ~9-10 hours (T4 GPU)
  • Expected Score: 0.840-0.860

Strategy: Baseline Return + YAML/XML Pure Increase

Key Insights:

Baseline's TOML 100% Success:

  • TOML ratio: 14%
  • TOML score: 100%
  • This is the only time TOML achieved 100%

Why Strategy 2/3/4 Failed:

  • Strategy 2: TOML ratio 10% → TOML 76% (ratio too low)
  • Strategy 3: TOML ratio 14-15% → TOML ~70% (something else was wrong)
  • Strategy 4: TOML ratio 11-12% → TOML 0 samples (daichira reduction failed)

Strategy 7 Design:

Aspect Baseline Strategy 2 Strategy 7
daichira boost 1.0x 2.0x 1.0x
TOML ratio 14% 10% 14%
TOML absolute 2,800 2,600 3,640
TOML score 100% 76% 95-98%
YAML ratio 35% 45% 50%
YAML score 91.4% 97.1% 95-97%
Overall 0.80195 0.82286 0.840-0.860

Expected Improvements:

  • TOML recovery: 76% → 95-98% (+19-22%)
  • YAML maintained: 97.1% → 95-97%
  • XML: 90.0% → 88-90%
  • Overall: 0.82286 → 0.840-0.860 (+1.7-3.7%)

Risk Analysis:

  • Risk: Baseline's TOML 100% may not be reproducible
  • Mitigation: Maintain TOML 14% ratio, daichira 1.0x
  • Expected: High chance of TOML recovery

License

Apache 2.0


Trained on Baseline Return + YAML/XML Boost StructEval dataset Learning Rate: 6e-6 (proven setting from 0.82286) Strategy: daichira 1.0x (Baseline) + u-10bei 1.3-2.0x (YAML/XML boost) TOML ratio: 14% (same as Baseline)

Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for yuk1chan/qwen3-4b-structeval-strategy7-baseline-yamlxml-lr6e-6

Adapter
(5377)
this model