# 🎯 GameTheory-Formulator-Model

Phase 3 of the Alogotron Game Theory AI Pipeline — A QLoRA adapter that teaches language models to translate real-world scenarios into formal game theory formulations.

## Overview

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Method | QLoRA (4-bit NF4 quantization + LoRA) |
| Task | Real-world scenario → formal game theory formulation |
| Dataset | Alogotron/GameTheory-Formulator (1,215 examples) |
| Training | SFT, 1 epoch, ~24 minutes on 2x RTX 3090 |
| Eval Accuracy | 100.0% valid formulations on held-out set |

## The Alogotron Game Theory Pipeline

This model is part of a 3-phase training pipeline:

| Phase | Model | Task | Method |
|---|---|---|---|
| Phase 1 | GameTheory-Solver | Solve formal GT problems | SFT on 2,913 problems → 94% accuracy |
| Phase 2 | GameTheory-Reasoner | Enhanced reasoning | GRPO on same dataset |
| Phase 3 | GameTheory-Formulator (this model) | Real-world → formal GT | SFT on 1,215 formulation problems |

## What This Model Does

Given a real-world scenario (business competition, political negotiation, security analysis, etc.), this model:

  1. 📋 Formulation Steps — Walks through the reasoning to identify the game structure
  2. 🎮 Formal Game Model — Identifies players, strategies, payoffs, information structure, and solution concept
  3. 🧮 Solution — Solves the formulated game (Nash equilibrium, dominant strategies, etc.)
  4. 🌍 Real-World Interpretation — Translates the mathematical solution back to actionable insights

## Training Details

### QLoRA Configuration

| Parameter | Value |
|---|---|
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit NF4 with double quantization |
| Trainable params | 80.7M / 7.7B (1.05%) |
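In PEFT terms, the adapter settings above correspond roughly to the following configuration. This is a sketch, not the published training script: `lora_dropout` and `bias` are assumptions, since the card does not state them.

```python
from peft import LoraConfig

# Sketch of the adapter configuration from the table above.
# lora_dropout and bias are assumptions -- the card does not state them.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,  # assumption
    bias="none",        # assumption
    task_type="CAUSAL_LM",
)
```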

### Training Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 1 |
| Batch size (per device) | 2 |
| Gradient accumulation | 4 |
| Effective batch size | 16 |
| Learning rate | 5e-5 (cosine schedule) |
| Optimizer | paged_adamw_8bit |
| Max sequence length | 2048 |
| Packing | Enabled |
| Gradient checkpointing | Enabled |
| Hardware | 2x NVIDIA RTX 3090 (24GB each) |
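A `trl` SFT configuration matching the table might look like the sketch below. Parameter names follow recent `trl` versions; the output directory, `bf16`, and warmup behavior are assumptions not stated in the card.

```python
from trl import SFTConfig

# Sketch of the hyperparameters from the table above; output_dir and
# bf16 are assumptions -- the card does not state them.
training_args = SFTConfig(
    output_dir="gametheory-formulator",  # hypothetical path
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,  # effective batch: 2 x 4 x 2 GPUs = 16
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    max_seq_length=2048,
    packing=True,
    gradient_checkpointing=True,
    bf16=True,  # assumption
)
```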

### Training Metrics

| Metric | Value |
|---|---|
| Train loss | 1.0992 |
| Eval loss | 0.8492 |
| Training time | 24.3 minutes |
| Dataset size | 1,215 examples |
| Train split | 1,093 examples |
| Eval split | 122 examples |

## Evaluation Results

Tested on 20 held-out examples across 6 domains and 3 difficulty levels:

| Metric | Score |
|---|---|
| Valid formulations | 100.0% |
| All sections present | 100.0% |
| All GT elements identified | 100.0% |
| Avg response length | 1,821 chars |

### By Domain

| Domain | Valid |
|---|---|
| Business | 8/8 (100%) |
| Security | 5/5 (100%) |
| Politics | 2/2 (100%) |
| Auctions | 2/2 (100%) |
| Technology | 2/2 (100%) |
| Social | 1/1 (100%) |

### By Difficulty

| Difficulty | Valid |
|---|---|
| Easy | 5/5 (100%) |
| Medium | 9/9 (100%) |
| Hard | 6/6 (100%) |
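The evaluation script is not published, so the exact validity criterion is unknown. A plausible sketch of the "valid formulation" check, consistent with the "all sections present" and "all GT elements identified" metrics above, is a simple structural test:

```python
# Hypothetical reconstruction of the validity check -- the actual
# evaluation code is not published with this card.
REQUIRED_SECTIONS = [
    "Formulation Steps",
    "Formal Game Model",
    "Solution",
    "Real-World Interpretation",
]

GT_ELEMENTS = ["Players", "Strategies", "Payoffs", "Solution Concept"]

def is_valid_formulation(response: str) -> bool:
    """A response counts as valid when every required output section
    and every core game-theory element appears in the text."""
    return (all(s in response for s in REQUIRED_SECTIONS)
            and all(e in response for e in GT_ELEMENTS))
```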

## Usage

### With PEFT + Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Load the Formulator adapter
model = PeftModel.from_pretrained(base_model, "Alogotron/GameTheory-Formulator-Model")
model.eval()

# Create a prompt
messages = [
    {"role": "system", "content": "You are a game theory expert. Given a real-world scenario, formulate it as a formal game theory model. Identify the players, strategies, payoffs, and information structure. Then solve the game and interpret the results."},
    {"role": "user", "content": "Two coffee shops on the same street must decide whether to offer a loyalty program. If both offer it, they split customers evenly but incur costs. If neither offers it, they split evenly with no extra cost. If only one offers it, that shop attracts 70% of customers."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.3, top_p=0.9)

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
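Downstream code often needs one section of the response, such as the formal game model, rather than the full text. Since the model emits its four sections as headings, a small parser suffices. This is a sketch under the assumption that each heading appears on its own line, in order:

```python
# Split a Formulator response into its four sections. Assumes each
# section heading appears on its own line (possibly as a markdown
# heading), in order -- a sketch, not part of the released tooling.
SECTIONS = ["Formulation Steps", "Formal Game Model",
            "Solution", "Real-World Interpretation"]

def split_sections(response: str) -> dict:
    result = {}
    current = None
    for line in response.splitlines():
        stripped = line.strip().lstrip("#").strip()
        if stripped in SECTIONS:
            current = stripped
            result[current] = []
        elif current is not None:
            result[current].append(line)
    # Join each section's lines back into a single text block
    return {k: "\n".join(v).strip() for k, v in result.items()}
```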

### Example Output

Input Scenario:

Two airline companies, Stellar and Haven, each control roughly half the market. They are independently deciding their pricing for the upcoming quarter...

Model Output:

Formulation Steps

Step 1 - Stellar and Haven are each making pricing decisions that directly affect each other's profits...

Step 2 - Players: Stellar and Haven...

Step 3 - Strategies: Each firm can either 'Maintain Prices' or 'Cut Prices'...

Formal Game Model

Game Type: Simultaneous
Players: Stellar, Haven
Strategies: Maintain Prices, Cut Prices
Payoffs: Both Maintain: (54, 54), Both Cut: (18, 18)...
Solution Concept: Nash Equilibrium

Solution

Both firms will cut prices. Cutting is a dominant strategy for each...

Real-World Interpretation

This is a classic Prisoner's Dilemma. Both companies rationally choose to cut prices, resulting in lower profits than cooperation would yield...
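The dominance claim in the example can be checked mechanically. The sketch below uses plain Python; the off-diagonal payoffs (72, 12) are hypothetical fill-ins, since the card elides them, chosen only to preserve the Prisoner's Dilemma structure:

```python
# Payoff matrix for the airline example. (54, 54) and (18, 18) come
# from the card; the off-diagonal payoffs are HYPOTHETICAL fill-ins.
M, C = "Maintain", "Cut"
payoffs = {
    (M, M): (54, 54),
    (M, C): (12, 72),  # row maintains, column cuts (hypothetical)
    (C, M): (72, 12),  # row cuts, column maintains (hypothetical)
    (C, C): (18, 18),
}

def is_dominant(strategy, payoffs, strategies=(M, C)):
    """True if `strategy` is strictly best for the row player
    against every opponent strategy."""
    return all(
        payoffs[(strategy, opp)][0] > payoffs[(alt, opp)][0]
        for opp in strategies
        for alt in strategies if alt != strategy
    )
```

With these payoffs, `is_dominant(C, payoffs)` is `True` and `is_dominant(M, payoffs)` is `False`, matching the example's conclusion that cutting is a dominant strategy.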

## Dataset

Trained on Alogotron/GameTheory-Formulator — 1,215 expert-crafted formulation problems across 6 domains:

  • Business (290): Pricing, market entry, production, R&D, supply chain
  • Security (230): Cybersecurity, threat modeling, defense allocation
  • Politics (195): Elections, negotiations, voting, international relations
  • Social (190): Social dilemmas, public goods, coordination, trust
  • Technology (165): Platform competition, standards, adoption, innovation
  • Auctions (145): First-price, second-price, common value, combinatorial
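The dataset can be loaded with `load_dataset("Alogotron/GameTheory-Formulator")` from the `datasets` library. As a quick sanity check, the per-domain counts above sum exactly to the stated dataset size:

```python
# Per-domain example counts from the list above.
domain_counts = {
    "Business": 290,
    "Security": 230,
    "Politics": 195,
    "Social": 190,
    "Technology": 165,
    "Auctions": 145,
}

assert sum(domain_counts.values()) == 1215  # matches the dataset size
```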

## Related Models & Datasets

| Resource | Link |
|---|---|
| Phase 1: Solver Model | Alogotron/GameTheory-Solver |
| Phase 2: Reasoner Model | Alogotron/GameTheory-Reasoner |
| Solver Dataset | Alogotron/GameTheory-Bench |
| Formulator Dataset | Alogotron/GameTheory-Formulator |

## Limitations

  • Trained on synthetic formulation data; may not handle all real-world edge cases
  • Formulation quality depends on scenario clarity and completeness
  • Best suited for classical game theory formulations (simultaneous, sequential, auctions)
  • Does not cover cooperative game theory or mechanism design (yet)

## Citation

```bibtex
@misc{alogotron-formulator-2025,
  title={GameTheory-Formulator-Model: Real-World Scenario to Game Theory Formulation},
  author={Alogotron},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/Alogotron/GameTheory-Formulator-Model}
}
```