---
license: apache-2.0
library_name: transformers
tags:
- chess
- self-play
pipeline_tag: text-generation
base_model:
- FlameF0X/ChessSLM
---

# ChessSLM-RL

**ChessSLM-RL** is an improved version of **ChessSLM**, a small language model designed to play chess using natural language move generation. It applies reinforcement learning (RL) so that the model hallucinates less and plays a bit more deliberately. Despite having only **30M parameters**, it is capable of competing with, and occasionally outperforming, larger language models in chess-playing tasks.

The model is based on the pre-trained ChessSLM model and is fine-tuned using RL with Stockfish, rewarding good moves and penalizing bad ones so that the model plays more legal moves and attempts fewer illegal ones.

Play against ChessSLM [here](https://flamef0x.github.io/other/chess/chess).

---

## Overview

- **Architecture:** GPT-2
- **Parameters:** ~30M
- **Training data:** Self-play with Stockfish (SF) evaluation
- **Task:** Autoregressive chess move generation

---

## Capabilities

ChessSLM can play chess by generating moves sequentially in SAN (Standard Algebraic Notation). It has been evaluated in matches against several language models, including:

- Claude [won against it]
- Gemini [lost against it]
- Qwen
- GPT-2
- GPT-Neo
- Pythia
- LLaMA
- Mistral
- other small chess-oriented models

The model achieves an average rating of **around 1054 Elo** against other language models despite its small size.

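To put an Elo figure like this in context, the rating gap between two players can be converted into an expected score per game. The sketch below assumes the ratings follow the standard logistic Elo model with a 400-point scale; this card does not state how its ratings were computed, so treat the result as illustrative. The function name `expected_score` is hypothetical, and the two ratings are taken from the benchmark table in this card. Under these assumptions, a 1054-rated model is expected to score roughly 0.42 points per game against a 1111-rated one.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) of player A against
    player B under the standard Elo model (assumed here, not confirmed
    by the model card)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings copied from the benchmark table in this card.
CHESSSLM_RL = 1054
PYTHIA_70M_DEDUPED = 1111

if __name__ == "__main__":
    score = expected_score(CHESSSLM_RL, PYTHIA_70M_DEDUPED)
    print(f"Expected score vs. pythia-70m-deduped: {score:.3f}")
```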

---

## Benchmark Results

| Model | Elo Rating |
|------|------------|
| EleutherAI/pythia-70m-deduped | 1111 |
| mlabonne/chesspythia-70m | 1101 |
| nlpguy/amdchess-v9 | 1094 |
| nlpguy/smolchess-v2 | 1093 |
| DedeProGames/mini-chennus | 1083 |
| distilbert/distilgpt2 | 1061 |
| DedeProGames/dialochess | 1059 |
| facebook/opt-125m | 1057 |
| **FlameF0X/ChessSLM** | **1054** |
| **FlameF0X/ChessSLM-RL** | **1054** |
| mlabonne/grandpythia-200k-70m | 1050 |
| DedeProGames/Chesser-248K-Mini | 1048 |

---

## Limitations

Like many language-model-based chess systems, ChessSLM has several limitations:

- **Illegal move hallucinations:** The model may occasionally generate moves that violate chess rules.
- **No board-state verification:** Moves are generated purely from learned patterns rather than a validated game state.
- **Limited strategic depth:** While competitive at lower Elo levels, it cannot match dedicated chess engines.

These limitations are common for **pure language-model chess agents** that do not use external rule engines.

---

## Summary

ChessSLM shows that **very small language models can achieve meaningful chess performance** when trained on domain-specific data.

It serves as a lightweight baseline for exploring **LLM-based chess agents** and **specialized small language models (SLMs)**.
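
The illegal-move limitation described under Limitations is usually handled outside the model, by checking each generated move against the legal moves of the current position and resampling on failure. The following is a minimal sketch of that idea, not part of ChessSLM itself: `first_legal_move`, `propose`, and `legal_san` are hypothetical names, `propose` stands in for one sampling call to the model, and the legal-move list is assumed to come from an external rule engine (for example a chess library), which this card does not specify.

```python
from typing import Callable, Iterable, Optional


def first_legal_move(
    propose: Callable[[], str],
    legal_san: Iterable[str],
    max_attempts: int = 5,
) -> Optional[str]:
    """Sample moves until one is legal, or give up after max_attempts.

    propose:   hypothetical callable returning one SAN move string per call
               (a stand-in for sampling from the language model).
    legal_san: SAN strings of all legal moves in the current position,
               assumed to be supplied by an external rule engine.
    """
    legal = set(legal_san)
    for _ in range(max_attempts):
        candidate = propose().strip()
        if candidate in legal:
            return candidate
    return None  # caller decides the fallback, e.g. resign or pick any legal move


if __name__ == "__main__":
    # Stub generator: the first sample is an illegal hallucination, the second is legal.
    samples = iter(["Ke9", "Nf3"])
    move = first_legal_move(lambda: next(samples), ["e4", "d4", "Nf3", "c4"])
    print(move)  # prints "Nf3"
```

Returning `None` instead of raising keeps the fallback policy with the caller; an evaluation harness might count it as a forfeit, while an interactive game might substitute a legal move.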