---
language:
  - en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
  - mathematics
  - conjectures
  - theorem-proving
  - reasoning
  - qlora
  - lora
  - peft
  - formal-math
  - lean
  - research
base_model: Qwen/Qwen2.5-0.5B-Instruct
datasets:
  - NorthernTribe-Research/math-conjecture-training-corpus
model-index:
  - name: math-conjecture-model
    results: []
---

Math Conjecture SOTA 0.5B

Math Conjecture SOTA 0.5B is a research-focused language model adapted for mathematical reasoning, conjecture analysis, proof-style generation, and formalization-aware responses. It is part of the broader Math Conjecture Training Corpus effort, which builds open-license training pipelines for unsolved conjectures, structured reasoning, competition mathematics, proof traces, and Lean-oriented theorem workflows.

This checkpoint is intended for research and experimentation around long-form mathematical reasoning rather than proof certification.


Model Details

Model description

This model is fine-tuned to produce more structured mathematical outputs, including:

  • intuition-first explanations
  • stepwise proof sketches
  • conjecture decomposition
  • theorem-style reasoning
  • informal-to-formal transition hints
  • Lean-aware reasoning patterns

The surrounding project supports multiple development paths including supervised fine-tuning, state-of-the-art math profiles, scratch initialization, evaluation, self-improvement data generation, adapter merge, Hub publishing, and local llama.cpp inference after conversion.

Model type

Parameter-efficient fine-tuned causal language model for math reasoning.

Hub repo

  • Model repo: NorthernTribe-Research/math-conjecture-model
  • Dataset repo: NorthernTribe-Research/math-conjecture-training-corpus
  • Trainer Space repo: NorthernTribe-Research/math_trainer

Parameter Count Visualization

| Metric | Value |
| --- | --- |
| Total parameters | 502.831M (502,830,976) |
| Trainable parameters | 8.798M (8,798,208) |
| Frozen parameters | 494.033M (494,032,768) |
| Trainable ratio | 1.7497% |

Trainable share: [#---------------------------] 1.7497%

This checkpoint uses a parameter-efficient adaptation setup in which only a small fraction of model weights are trainable while the vast majority remain frozen. The project documentation explicitly states that training summaries record model.parameter_counts with total, trainable, frozen, and ratio fields, and that the Hub README is auto-updated with this visualization when push_to_hub is enabled.
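The reported ratio follows directly from the counts above; a quick check, using only the numbers from the table (the 28-slot bar width is an assumption for illustration):

```python
# Recompute the trainable ratio and bar from the reported parameter counts.
total = 502_830_976
trainable = 8_798_208
frozen = total - trainable

ratio = trainable / total * 100  # percent of weights updated during fine-tuning

# Render a simple 28-slot progress bar like the one shown above.
slots = 28
filled = max(1, round(slots * ratio / 100))
bar = "#" * filled + "-" * (slots - filled)

print(f"Frozen parameters: {frozen:,}")
print(f"Trainable share: [{bar}] {ratio:.4f}%")
```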


Training Reference

  • Summary source: workspace/runs/math-conjecture-sota-050b-quick/training_summary.json
  • This model card is derived from the repository README together with the latest training summary. The repo also includes a helper for refreshing the model card directly from those sources: scripts/update_model_card.py.

Intended Uses

Direct use

This model is suitable for:

  • mathematical reasoning research
  • conjecture exploration demos
  • proof-sketch generation
  • theorem-style answer generation
  • reasoning benchmark experiments
  • formalization-oriented prompting
  • research prototypes on Hugging Face Spaces

Downstream use

Potential downstream uses include:

  • math-focused copilots
  • conjecture analysis interfaces
  • educational proof assistants
  • formal/informal bridge systems
  • evaluation pipelines for reasoning-heavy LLMs

Out-of-scope use

This model is not intended to be treated as:

  • a formal theorem prover
  • a replacement for proof assistants
  • a certified symbolic solver
  • a source of guaranteed-correct proofs
  • an authoritative system for high-stakes mathematical claims

Training Data

This model is part of a larger corpus-building effort for training math AI systems on unsolved-conjecture reasoning. The v1 dataset pipeline combines local conjecture data with curated open-license Hugging Face datasets covering competition mathematics, structured reasoning, and formal proof corpora. The repository also documents broader open math sources such as:

  • OpenR1-Math-220k
  • FineProofs-SFT
  • LeanStatement_CoT
  • NuminaMath-LEAN
  • DeepSeek-Prover-V1

These sources are included to improve coverage across proof traces, theorem formalization, and reasoning-heavy mathematical examples. The project documentation also emphasizes open-license, practical-size, and deterministic filtering choices in the corpus pipeline.
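The exact filters live in the corpus pipeline itself; as an illustration of what "deterministic filtering" can mean in practice, a content-hash rule makes sampling decisions reproducible across runs (the function name and threshold here are hypothetical, not the project's API):

```python
import hashlib

def keep_example(text: str, sample_permille: int = 500) -> bool:
    """Deterministically keep roughly sample_permille/1000 of examples.

    The decision depends only on the example's content, so re-running the
    pipeline reproduces exactly the same subset (unlike random.random()).
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return int(digest, 16) % 1000 < sample_permille

# The same input always yields the same keep/drop decision.
print(keep_example("If a + b is even, then a and b have the same parity."))
```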


Training Procedure

The model-development workspace supports:

  • SFT training
  • SOTA math profile training
  • scratch initialization
  • scratch dry-runs
  • self-improvement / rejection-sampling data generation
  • evaluation with richer reasoning metrics
  • adapter merge and Hub publish flows

The SOTA curriculum profiles responses across four difficulty bands:

  • simple
  • intermediate
  • advanced
  • Lean-formalized

Evaluation capabilities

The project documentation states that the SOTA evaluation flow includes:

  • difficulty_band_metrics
  • response_profile_metrics
  • simple_to_lean
  • selected_pass_at_k
  • consensus_rate
  • consensus_pass_at_k

It also supports stricter grading options, such as optional SymPy symbolic verification and substring-match controls.
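To see what SymPy-backed verification adds over plain string matching, consider this sketch (an illustration only, not the project's grader; `answers_match` is a hypothetical name):

```python
import sympy

def answers_match(predicted: str, reference: str) -> bool:
    """Grade a symbolic answer by equivalence rather than surface form."""
    try:
        diff = sympy.simplify(sympy.sympify(predicted) - sympy.sympify(reference))
        return diff == 0
    except (sympy.SympifyError, TypeError):
        # Fall back to a normalized string comparison when parsing fails.
        return predicted.replace(" ", "") == reference.replace(" ", "")

# "2*(x + 1)" and "2*x + 2" differ as strings but are symbolically equal.
print(answers_match("2*(x + 1)", "2*x + 2"))
print(answers_match("x**2", "x + 2"))
```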

Self-improvement flow

The repository includes a rejection-sampling data-generation path via generate_rft_data.py, used to create self-improvement training data from model outputs.
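The internals of generate_rft_data.py are not reproduced here, but rejection-sampling data generation generally follows this shape (the sampler and checker below are hypothetical stand-ins, not the script's API):

```python
import random
from typing import Callable

def rejection_sample(
    prompt: str,
    sampler: Callable[[str], str],        # e.g. a wrapped model.generate call
    checker: Callable[[str, str], bool],  # accepts (prompt, candidate)
    k: int = 8,
) -> list[str]:
    """Draw k candidates and keep only those the checker accepts.

    Accepted (prompt, candidate) pairs become new supervised training data.
    """
    candidates = [sampler(prompt) for _ in range(k)]
    return [c for c in candidates if checker(prompt, c)]

# Toy stand-ins: "sample" an integer answer, accept only correct ones.
rng = random.Random(0)
kept = rejection_sample(
    "2 + 2 = ?",
    sampler=lambda p: str(rng.randint(2, 5)),
    checker=lambda p, c: c == "4",
)
print(kept)  # only candidates equal to "4" survive
```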


Project Architecture

The wider project includes:

  • merged dataset construction
  • validation and release packaging
  • model development configs and scripts
  • a Space trainer app
  • a conjecture-lab Space app
  • rollout runbooks
  • state-of-the-art math blueprints

This means the model should be understood as one artifact within a broader research pipeline rather than a standalone checkpoint.


Example Usage

Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "NorthernTribe-Research/math-conjecture-model"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    # half precision on GPU, full precision on CPU
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

prompt = """Analyze the following mathematical conjecture.

Conjecture:
If a sequence is eventually periodic, then its generating function is rational.

Return:
1. Intuition
2. Proof sketch
3. Key assumptions
4. Formalization notes
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,       # sampling rather than greedy decoding
    temperature=0.7,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
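Incidentally, the conjecture used in this example prompt is a standard theorem, which makes it a convenient smoke test; a compact version of the argument the model should recover:

```latex
Let $(a_n)$ satisfy $a_{n+p} = a_n$ for all $n \ge N$. Split the series:
\[
G(x) = \sum_{n=0}^{\infty} a_n x^n
     = \underbrace{\sum_{n=0}^{N-1} a_n x^n}_{\text{polynomial}}
     + x^N \sum_{j=0}^{p-1} a_{N+j}\, x^j \sum_{k=0}^{\infty} x^{kp}
     = P(x) + \frac{x^N C(x)}{1 - x^p},
\]
a ratio of polynomials, hence rational.
```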

Prompting style

This model generally benefits from prompts that request structure explicitly, such as:

  • intuition
  • proof sketch
  • formalization notes
  • assumptions
  • edge cases
  • candidate counterexamples
  • Lean-style outline

Example prompt

Consider the statement:

"If a + b is even, then a and b have the same parity."

Provide:
1. An intuitive explanation
2. A proof
3. A compact formalization outline
4. Any assumptions or edge cases
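For the parity example, a complete formalization is short enough to show in full. This sketch assumes Lean 4 with Mathlib, where `Int.even_add` states `Even (m + n) ↔ (Even m ↔ Even n)`:

```lean
import Mathlib

/-- If `a + b` is even, then `a` and `b` have the same parity. -/
theorem same_parity (a b : ℤ) (h : Even (a + b)) : Even a ↔ Even b :=
  Int.even_add.mp h
```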

Local Inference

The project documentation describes a local inference path using llama.cpp, reached by:

  1. merging the adapter into a full model
  2. converting the merged model to GGUF
  3. optionally quantizing for lighter deployment
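The three steps typically look like the following. File names and the merged-model directory are illustrative assumptions; the converter script and the llama-quantize and llama-cli binaries ship with llama.cpp:

```shell
# 1. Merge the LoRA adapter into the base model (e.g. via PEFT's
#    merge_and_unload), producing a standalone HF model directory ./merged-model.

# 2. Convert the merged model to GGUF with llama.cpp's converter.
python convert_hf_to_gguf.py ./merged-model --outfile math-conjecture.gguf

# 3. Optionally quantize for lighter deployment (Q4_K_M is a common choice).
./llama-quantize math-conjecture.gguf math-conjecture-q4_k_m.gguf Q4_K_M

# 4. Run local inference.
./llama-cli -m math-conjecture-q4_k_m.gguf -p "State the parity of 2n + 1."
```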

This supports local experimentation outside standard Transformers-based serving.


Limitations

This is a research model and may still:

  • generate invalid proofs that sound convincing
  • confuse symbolic plausibility with correctness
  • fail on deep multi-hop theorem reasoning
  • produce incomplete or brittle formalization hints
  • overuse familiar proof templates
  • hallucinate mathematical structure

All substantive outputs should be independently verified with formal tools, symbolic systems, or expert review.


Risks and Recommendations

Because the model is optimized for proof-style text generation, users may over-trust fluent mathematical output. For that reason:

  • verify results independently
  • use proof assistants or symbolic systems when correctness matters
  • treat outputs as research assistance, not certification
  • benchmark behavior before deployment in educational or analytical systems

Project Documentation

The repository includes documentation and workflow support for:

  • dataset release
  • model development
  • rollout automation
  • Space deployment
  • evaluation
  • adapter merge and publish
  • model-card refresh automation

Citation

@misc{northerntribe_math_conjecture_sota_05b_2026,
  title        = {Math Conjecture SOTA 0.5B},
  author       = {NorthernTribe Research},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {Model repository}
}

Disclaimer

This model is intended for research, experimentation, and educational exploration in mathematical reasoning. It does not guarantee theorem validity, proof correctness, or novel mathematical discovery.