Commit History

last commit
2624b79

DeepParmar committed on

Update FINDINGS_PAPER.md with latest benchmarks and standardize DeepSeek-V3 ID
12b944d

DeepParmar committed on

Add detailed model performance reasoning across all benchmark documentation
40ab31f

DeepParmar committed on

Update docs with latest HF Native and OpenRouter benchmark scores
bd428dc

DeepParmar committed on

final commit v2
9e79ae0

DeepParmar committed on

Compliance fix: Move inference.py to repo root strictly enforcing OpenEnv hackathon submission rules
4e7c1df

DeepParmar committed on

Refine UI layout and remove raw terminal logs from benchmark records
5966a06

DeepParmar committed on

Update HF space sync URL to DeepParmar/code-review
3aa41ae

DeepParmar committed on

Untrack AUDIT_RESULTS.md and add to gitignore per user request
0793608

DeepParmar committed on

Track server/ and fix gitignore
f72c6b2

DeepParmar committed on

Restore server folder and requirements.txt per user request
2b2ebe7

DeepParmar committed on

Final cleanup: Remove redundant testing scripts, un-track logs, sanitize comments
c43ae5c

DeepParmar committed on

Fix NoneType subscript bug in json parser pipeline
8ab3fe3

DeepParmar committed on

Update master record with massive confidence table and exact native module names
48ab79c

DeepParmar committed on

Update master records with exclusive latest HF and OR runs
3129333

DeepParmar committed on

Final master benchmark annotation record and raw logs
4385d2b

DeepParmar committed on

Add compiled benchmark_comparison, HF Native serverless testing logs
88518e4

DeepParmar committed on

Add final senior review checklist, final test-2last.txt tests with 5 frontier models against live HF Space!
8cddc5b

DeepParmar committed on

Update last-test.txt and final-result.txt with fresh benchmark data
f068648

DeepParmar committed on

Finalize submission: Add final-result.txt, clean up OpenRouter API keys from scripts, remove pycache, update logs
149378d

DeepParmar committed on

Add extreme final submission tests (48 tests: math, load, cross-file, adversarial, compliance)
4757a2e

DeepParmar committed on

Update Hugging Face Space URL in sync workflow
dfae0f1

DeepParmar committed on

chore: push HF space update and remove secrets
9de9c34

DeepParmar committed on

fix: clamp inference score strictly to 0.999 to avoid float format rounding to 1.000
41aa728

DeepParmar committed on

chore: audit findings fixes, openenv yaml updates, run_benchmark config, and audit results report
d64e3c6

DeepParmar committed on

experimental
27d7338

DeepParmar committed on

Initial commit
4a310a7

DeepParmar committed on