---
language:
  - en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
  - mathematics
  - conjectures
  - theorem-proving
  - reasoning
  - qlora
  - lora
  - peft
  - formal-math
  - lean
  - research
base_model: Qwen/Qwen2.5-0.5B-Instruct
datasets:
  - NorthernTribe-Research/math-conjecture-training-corpus
model-index:
  - name: math-conjecture-model
    results: []
---

Math Conjecture SOTA 0.5B

Math Conjecture SOTA 0.5B is a research-focused language model adapted for mathematical reasoning, conjecture analysis, proof-style generation, and formalization-aware responses. It is part of the broader Math Conjecture Training Corpus effort, which builds open-license training pipelines for unsolved conjectures, structured reasoning, competition mathematics, proof traces, and Lean-oriented theorem workflows.

This checkpoint is intended for research and experimentation around long-form mathematical reasoning rather than proof certification.


Model Details

Model description

This model is fine-tuned to produce more structured mathematical outputs, including:

  • intuition-first explanations
  • stepwise proof sketches
  • conjecture decomposition
  • theorem-style reasoning
  • informal-to-formal transition hints
  • Lean-aware reasoning patterns

The surrounding project supports multiple development paths including supervised fine-tuning, state-of-the-art math profiles, scratch initialization, evaluation, self-improvement data generation, adapter merge, Hub publishing, and local llama.cpp inference after conversion.

Model type

Parameter-efficient fine-tuned causal language model for math reasoning.

Hub repo

  • Model repo: NorthernTribe-Research/math-conjecture-model
  • Dataset repo: NorthernTribe-Research/math-conjecture-training-corpus
  • Trainer Space repo: NorthernTribe-Research/math_trainer

Parameter Count Visualization

| Metric | Value |
| --- | --- |
| Total parameters | 502.831M (502,830,976) |
| Trainable parameters | 8.798M (8,798,208) |
| Frozen parameters | 494.033M (494,032,768) |
| Trainable ratio | 1.7497% |

Trainable share: [#---------------------------] 1.7497%

This checkpoint uses a parameter-efficient adaptation setup in which only a small fraction of model weights are trainable while the vast majority remain frozen. The project documentation explicitly states that training summaries record model.parameter_counts with total, trainable, frozen, and ratio fields, and that the Hub README is auto-updated with this visualization when push_to_hub is enabled.
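The reported ratio follows directly from the counts above; a quick check, using only the numbers from the table (the 28-slot bar width is an assumption for illustration):

```python
# Recompute the trainable ratio and bar from the reported parameter counts.
total = 502_830_976
trainable = 8_798_208
frozen = total - trainable

ratio = trainable / total * 100  # percent of weights updated during fine-tuning

# Render a simple 28-slot progress bar like the one shown above.
slots = 28
filled = max(1, round(slots * ratio / 100))
bar = "#" * filled + "-" * (slots - filled)

print(f"Frozen parameters: {frozen:,}")
print(f"Trainable share: [{bar}] {ratio:.4f}%")
```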


Training Reference

  • Summary source: workspace/runs/math-conjecture-sota-050b-quick/training_summary.json
  • This model card is derived from the repository README together with the latest training summary. The repo also includes a helper for refreshing the model card directly from those sources: scripts/update_model_card.py.

Intended Uses

Direct use

This model is suitable for:

  • mathematical reasoning research
  • conjecture exploration demos
  • proof-sketch generation
  • theorem-style answer generation
  • reasoning benchmark experiments
  • formalization-oriented prompting
  • research prototypes on Hugging Face Spaces

Downstream use

Potential downstream uses include:

  • math-focused copilots
  • conjecture analysis interfaces
  • educational proof assistants
  • formal/informal bridge systems
  • evaluation pipelines for reasoning-heavy LLMs

Out-of-scope use

This model is not intended to be treated as:

  • a formal theorem prover
  • a replacement for proof assistants
  • a certified symbolic solver
  • a source of guaranteed-correct proofs
  • an authoritative system for high-stakes mathematical claims

Training Data

This model is part of a larger corpus-building effort for training math AI systems on unsolved-conjecture reasoning. The v1 dataset pipeline combines local conjecture data with curated open-license Hugging Face datasets covering competition mathematics, structured reasoning, and formal proof corpora. The repository also documents broader open math sources such as:

  • OpenR1-Math-220k
  • FineProofs-SFT
  • LeanStatement_CoT
  • NuminaMath-LEAN
  • DeepSeek-Prover-V1

These sources are included to improve coverage across proof traces, theorem formalization, and reasoning-heavy mathematical examples. The project documentation also emphasizes open-license, practical-size, and deterministic filtering choices in the corpus pipeline.
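The exact filters live in the corpus pipeline itself; as an illustration of what "deterministic filtering" can mean in practice, a content-hash rule makes sampling decisions reproducible across runs (the function name and threshold here are hypothetical, not the project's API):

```python
import hashlib

def keep_example(text: str, sample_permille: int = 500) -> bool:
    """Deterministically keep roughly sample_permille/1000 of examples.

    The decision depends only on the example's content, so re-running the
    pipeline reproduces exactly the same subset (unlike random.random()).
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return int(digest, 16) % 1000 < sample_permille

# The same input always yields the same keep/drop decision.
print(keep_example("If a + b is even, then a and b have the same parity."))
```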


Training Procedure

The model-development workspace supports:

  • SFT training
  • SOTA math profile training
  • scratch initialization
  • scratch dry-runs
  • self-improvement / rejection-sampling data generation
  • evaluation with richer reasoning metrics
  • adapter merge and Hub publish flows

The SOTA curriculum profiles responses across four difficulty bands:

  • simple
  • intermediate
  • advanced
  • Lean-formalized

Evaluation capabilities

The project documentation states that the SOTA evaluation flow includes:

  • difficulty_band_metrics
  • response_profile_metrics
  • simple_to_lean
  • selected_pass_at_k
  • consensus_rate
  • consensus_pass_at_k

It also supports stricter grading options, such as optional SymPy symbolic verification and substring-match controls.
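To see what SymPy-backed verification adds over plain string matching, consider this sketch (an illustration only, not the project's grader; `answers_match` is a hypothetical name):

```python
import sympy

def answers_match(predicted: str, reference: str) -> bool:
    """Grade a symbolic answer by equivalence rather than surface form."""
    try:
        diff = sympy.simplify(sympy.sympify(predicted) - sympy.sympify(reference))
        return diff == 0
    except (sympy.SympifyError, TypeError):
        # Fall back to a normalized string comparison when parsing fails.
        return predicted.replace(" ", "") == reference.replace(" ", "")

# "2*(x + 1)" and "2*x + 2" differ as strings but are symbolically equal.
print(answers_match("2*(x + 1)", "2*x + 2"))
print(answers_match("x**2", "x + 2"))
```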

Self-improvement flow

The repository includes a rejection-sampling data-generation path via generate_rft_data.py, used to create self-improvement training data from model outputs.
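The internals of generate_rft_data.py are not reproduced here, but rejection-sampling data generation generally follows this shape (the sampler and checker below are hypothetical stand-ins, not the script's API):

```python
import random
from typing import Callable

def rejection_sample(
    prompt: str,
    sampler: Callable[[str], str],        # e.g. a wrapped model.generate call
    checker: Callable[[str, str], bool],  # accepts (prompt, candidate)
    k: int = 8,
) -> list[str]:
    """Draw k candidates and keep only those the checker accepts.

    Accepted (prompt, candidate) pairs become new supervised training data.
    """
    candidates = [sampler(prompt) for _ in range(k)]
    return [c for c in candidates if checker(prompt, c)]

# Toy stand-ins: "sample" an integer answer, accept only correct ones.
rng = random.Random(0)
kept = rejection_sample(
    "2 + 2 = ?",
    sampler=lambda p: str(rng.randint(2, 5)),
    checker=lambda p, c: c == "4",
)
print(kept)  # only candidates equal to "4" survive
```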


Project Architecture

The wider project includes:

  • merged dataset construction
  • validation and release packaging
  • model development configs and scripts
  • a Space trainer app
  • a conjecture-lab Space app
  • rollout runbooks
  • state-of-the-art math blueprints

This means the model should be understood as one artifact within a broader research pipeline rather than a standalone checkpoint.


Example Usage

Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "NorthernTribe-Research/math-conjecture-model"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    # half precision on GPU, full precision on CPU
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

prompt = """Analyze the following mathematical conjecture.

Conjecture:
If a sequence is eventually periodic, then its generating function is rational.

Return:
1. Intuition
2. Proof sketch
3. Key assumptions
4. Formalization notes
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,       # sampling rather than greedy decoding
    temperature=0.7,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
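Incidentally, the conjecture used in this example prompt is a standard theorem, which makes it a convenient smoke test; a compact version of the argument the model should recover:

```latex
Let $(a_n)$ satisfy $a_{n+p} = a_n$ for all $n \ge N$. Split the series:
\[
G(x) = \sum_{n=0}^{\infty} a_n x^n
     = \underbrace{\sum_{n=0}^{N-1} a_n x^n}_{\text{polynomial}}
     + x^N \sum_{j=0}^{p-1} a_{N+j}\, x^j \sum_{k=0}^{\infty} x^{kp}
     = P(x) + \frac{x^N C(x)}{1 - x^p},
\]
a ratio of polynomials, hence rational.
```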

Prompting style

This model generally benefits from prompts that request structure explicitly, such as:

  • intuition
  • proof sketch
  • formalization notes
  • assumptions
  • edge cases
  • candidate counterexamples
  • Lean-style outline

Example prompt

Consider the statement:

"If a + b is even, then a and b have the same parity."

Provide:
1. An intuitive explanation
2. A proof
3. A compact formalization outline
4. Any assumptions or edge cases
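For the parity example, a complete formalization is short enough to show in full. This sketch assumes Lean 4 with Mathlib, where `Int.even_add` states `Even (m + n) ↔ (Even m ↔ Even n)`:

```lean
import Mathlib

/-- If `a + b` is even, then `a` and `b` have the same parity. -/
theorem same_parity (a b : ℤ) (h : Even (a + b)) : Even a ↔ Even b :=
  Int.even_add.mp h
```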

Local Inference

The project documentation describes a local inference path using llama.cpp, reached by:

  1. merging the adapter into a full model
  2. converting the merged model to GGUF
  3. optionally quantizing for lighter deployment
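The three steps typically look like the following. File names and the merged-model directory are illustrative assumptions; the converter script and the llama-quantize and llama-cli binaries ship with llama.cpp:

```shell
# 1. Merge the LoRA adapter into the base model (e.g. via PEFT's
#    merge_and_unload), producing a standalone HF model directory ./merged-model.

# 2. Convert the merged model to GGUF with llama.cpp's converter.
python convert_hf_to_gguf.py ./merged-model --outfile math-conjecture.gguf

# 3. Optionally quantize for lighter deployment (Q4_K_M is a common choice).
./llama-quantize math-conjecture.gguf math-conjecture-q4_k_m.gguf Q4_K_M

# 4. Run local inference.
./llama-cli -m math-conjecture-q4_k_m.gguf -p "State the parity of 2n + 1."
```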

This supports local experimentation outside standard Transformers-based serving.


Limitations

This is a research model and may still:

  • generate invalid proofs that sound convincing
  • confuse symbolic plausibility with correctness
  • fail on deep multi-hop theorem reasoning
  • produce incomplete or brittle formalization hints
  • overuse familiar proof templates
  • hallucinate mathematical structure

All substantive outputs should be independently verified with formal tools, symbolic systems, or expert review.


Risks and Recommendations

Because the model is optimized for proof-style text generation, users may over-trust fluent mathematical output. For that reason:

  • verify results independently
  • use proof assistants or symbolic systems when correctness matters
  • treat outputs as research assistance, not certification
  • benchmark behavior before deployment in educational or analytical systems

Project Documentation

The repository includes documentation and workflow support for:

  • dataset release
  • model development
  • rollout automation
  • Space deployment
  • evaluation
  • adapter merge and publish
  • model-card refresh automation

Citation

@misc{northerntribe_math_conjecture_sota_05b_2026,
  title        = {Math Conjecture SOTA 0.5B},
  author       = {NorthernTribe Research},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {Model repository}
}

Disclaimer

This model is intended for research, experimentation, and educational exploration in mathematical reasoning. It does not guarantee theorem validity, proof correctness, or novel mathematical discovery.