---
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- mathematics
- conjectures
- theorem-proving
- reasoning
- qlora
- lora
- peft
- formal-math
- lean
- research
base_model: "Qwen/Qwen2.5-0.5B-Instruct"
datasets:
- NorthernTribe-Research/math-conjecture-training-corpus
model-index:
- name: math-conjecture-model
  results: []
---

# Math Conjecture SOTA 0.5B

Math Conjecture SOTA 0.5B is a research-focused language model adapted for **mathematical reasoning**, **conjecture analysis**, **proof-style generation**, and **formalization-aware responses**.

It is part of the broader **Math Conjecture Training Corpus** effort, which builds open-license training pipelines for unsolved conjectures, structured reasoning, competition mathematics, proof traces, and Lean-oriented theorem workflows.

This checkpoint is intended for research and experimentation around long-form mathematical reasoning rather than proof certification.

---

## Model Details

### Model description

This model is fine-tuned to produce more structured mathematical outputs, including:

- intuition-first explanations
- stepwise proof sketches
- conjecture decomposition
- theorem-style reasoning
- informal-to-formal transition hints
- Lean-aware reasoning patterns

The surrounding project supports multiple development paths, including supervised fine-tuning, state-of-the-art math profiles, scratch initialization, evaluation, self-improvement data generation, adapter merge, Hub publishing, and local `llama.cpp` inference after conversion.

### Model type

Parameter-efficient fine-tuned causal language model for math reasoning.
### Hub repo

- **Model repo:** `NorthernTribe-Research/math-conjecture-model`
- **Dataset repo:** `NorthernTribe-Research/math-conjecture-training-corpus`
- **Trainer Space repo:** `NorthernTribe-Research/math_trainer`

---

## Parameter Count Visualization

| Metric | Value |
|---|---:|
| Total parameters | 502.831M (502,830,976) |
| Trainable parameters | 8.798M (8,798,208) |
| Frozen parameters | 494.033M (494,032,768) |
| Trainable ratio | 1.7497% |

**Trainable share:** `[#---------------------------] 1.7497%`

This checkpoint uses a parameter-efficient adaptation setup in which only a small fraction of model weights are trainable while the vast majority remain frozen. The project documentation states that training summaries record `model.parameter_counts` with total, trainable, frozen, and ratio fields, and that the Hub README is auto-updated with this visualization when `push_to_hub` is enabled.

---

## Training Reference

- **Summary source:** `workspace/runs/math-conjecture-sota-050b-quick/training_summary.json`

This model card is derived from the repository README together with the latest training summary. The repo also includes a helper for refreshing the model card directly from those sources: `scripts/update_model_card.py`.
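The counts in the table above can be reproduced from the raw totals with a few lines of arithmetic. The helper below is a hypothetical illustration (`parameter_report` is not part of the repo; the repo's own tooling writes these fields to `model.parameter_counts` in the training summary):

```python
def parameter_report(total: int, trainable: int, bar_width: int = 29) -> str:
    """Summarize parameter counts and render a trainable-share bar.

    Hypothetical helper for illustration only; not part of the repository.
    """
    frozen = total - trainable          # frozen = total - trainable
    ratio = 100.0 * trainable / total   # trainable share as a percentage
    filled = max(1, round(bar_width * trainable / total))
    bar = "#" * filled + "-" * (bar_width - filled)
    return (
        f"Total: {total:,} | Trainable: {trainable:,} | "
        f"Frozen: {frozen:,} | Ratio: {ratio:.4f}% [{bar}]"
    )

# Numbers from the table above: 8,798,208 trainable of 502,830,976 total.
print(parameter_report(502_830_976, 8_798_208))
```

Running this on the checkpoint's counts recovers the 1.7497% trainable ratio and the 494,032,768 frozen-parameter figure shown in the table.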
---

## Intended Uses

### Direct use

This model is suitable for:

- mathematical reasoning research
- conjecture exploration demos
- proof-sketch generation
- theorem-style answer generation
- reasoning benchmark experiments
- formalization-oriented prompting
- research prototypes on Hugging Face Spaces

### Downstream use

Potential downstream uses include:

- math-focused copilots
- conjecture analysis interfaces
- educational proof assistants
- formal/informal bridge systems
- evaluation pipelines for reasoning-heavy LLMs

### Out-of-scope use

This model is **not** intended to be treated as:

- a formal theorem prover
- a replacement for proof assistants
- a certified symbolic solver
- a source of guaranteed-correct proofs
- an authoritative system for high-stakes mathematical claims

---

## Training Data

This model is part of a larger corpus-building effort designed for training math AI systems aimed at **unsolved conjecture reasoning**. The v1 dataset pipeline combines local conjecture data with curated open-license Hugging Face datasets covering competition mathematics, structured reasoning, and formal proof corpora.

The repository also documents broader open math sources, such as:

- OpenR1-Math-220k
- FineProofs-SFT
- LeanStatement_CoT
- NuminaMath-LEAN
- DeepSeek-Prover-V1

These sources are included to improve coverage across proof traces, theorem formalization, and reasoning-heavy mathematical examples. The project documentation also emphasizes open-license, practical-size, and deterministic filtering choices in the corpus pipeline.
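The deterministic-filtering idea mentioned above can be sketched with a content-hash gate: whether an example is kept depends only on its text, so re-running the pipeline reproduces the same subset regardless of shuffle order or machine. This is an illustrative sketch, not the repo's actual filter; the `keep_example` name and the 25% sample rate are invented for the example:

```python
import hashlib

def keep_example(text: str, sample_rate: float = 0.25) -> bool:
    """Deterministically decide whether to keep a training example.

    Illustrative sketch only; the corpus pipeline's real filters live in
    the repository. The decision is a pure function of the text, so the
    same example is always kept or dropped across runs.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # quasi-uniform value in [0, 1]
    return bucket < sample_rate

# Hypothetical usage: filter a tiny corpus reproducibly.
corpus = ["Prove that 2 is prime.", "State Goldbach's conjecture."]
kept = [t for t in corpus if keep_example(t)]
```

Because the gate hashes content rather than calling a random number generator, adding or removing unrelated examples never changes the keep/drop decision for the rest of the corpus.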
---

## Training Procedure

The model-development workspace supports:

- SFT training
- SOTA math profile training
- scratch initialization
- scratch dry-runs
- self-improvement / rejection-sampling data generation
- evaluation with richer reasoning metrics
- adapter merge and Hub publish flows

The SOTA curriculum profiles responses across four bands:

- simple
- intermediate
- advanced
- Lean-formalized

### Evaluation capabilities

The project documentation states that the SOTA evaluation flow includes:

- `difficulty_band_metrics`
- `response_profile_metrics`
- `simple_to_lean`
- `selected_pass_at_k`
- `consensus_rate`
- `consensus_pass_at_k`

It also supports stricter grading controls, such as optional SymPy symbolic verification and substring-match controls.

### Self-improvement flow

The repository includes a rejection-sampling data-generation path via `generate_rft_data.py`, used to create self-improvement training data from model outputs.

---

## Project Architecture

The wider project includes:

- merged dataset construction
- validation and release packaging
- model development configs and scripts
- a Space trainer app
- a conjecture-lab Space app
- rollout runbooks
- state-of-the-art math blueprints

This means the model should be understood as one artifact within a broader research pipeline rather than a standalone checkpoint.

---

## Example Usage

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "NorthernTribe-Research/math-conjecture-model"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto"
)

prompt = """Analyze the following mathematical conjecture.

Conjecture: If a sequence is eventually periodic, then its generating function is rational.

Return:
1. Intuition
2. Proof sketch
3. Key assumptions
4. Formalization notes
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Prompting style

This model generally benefits from prompts that request structure explicitly, such as:

- intuition
- proof sketch
- formalization notes
- assumptions
- edge cases
- candidate counterexamples
- Lean-style outline

### Example prompt

```text
Consider the statement:

"If a + b is even, then a and b have the same parity."

Provide:
1. An intuitive explanation
2. A proof
3. A compact formalization outline
4. Any assumptions or edge cases
```

---

## Local Inference

The project documentation confirms a local inference path using `llama.cpp` after:

1. merging the adapter into a full model
2. converting the merged model to GGUF
3. optionally quantizing for lighter deployment

This supports local experimentation outside standard Transformers-based serving.

---

## Limitations

This is a research model and may still:

- generate invalid proofs that sound convincing
- confuse symbolic plausibility with correctness
- fail on deep multi-hop theorem reasoning
- produce incomplete or brittle formalization hints
- overuse familiar proof templates
- hallucinate mathematical structure

All substantive outputs should be independently verified with formal tools, symbolic systems, or expert review.

---

## Risks and Recommendations

Because the model is optimized for proof-style text generation, users may over-trust fluent mathematical output.
For that reason:

- verify results independently
- use proof assistants or symbolic systems when correctness matters
- treat outputs as research assistance, not certification
- benchmark behavior before deployment in educational or analytical systems

---

## Project Documentation

The repository includes documentation and workflow support for:

- dataset release
- model development
- rollout automation
- Space deployment
- evaluation
- adapter merge and publish
- model-card refresh automation

---

## Citation

```bibtex
@misc{northerntribe_math_conjecture_sota_05b_2026,
  title        = {Math Conjecture SOTA 0.5B},
  author       = {NorthernTribe Research},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {Model repository}
}
```

---

## Disclaimer

This model is intended for research, experimentation, and educational exploration in mathematical reasoning. It does not guarantee theorem validity, proof correctness, or novel mathematical discovery.