My TIGER app is now fully working again, with fixes and full compatibility with Gradio 6!
It lets you:
- Separate multiple speakers from an audio file
- Extract each speaker directly from a video
- Split audio into dialog, music, and sound effects (DnR)
- Apply DnR separation directly to videos
All powered by lightweight TIGER models for fast and efficient speech separation.
Things our clients and open-source users actually said to us this year:
"Finally, someone built a synthetic PII training dataset for German."
"Does it have localised information? Not just the language, the actual format. That must have been a lot of work that we can save on our side."
"We operate in 12 EU countries. Your dataset is the only one that covers all of them, which has helped us a lot with compliance, especially because it's synthetic."
Every language has strong PII localization: names, addresses, IDs, phone numbers, and dates in the real format of that country.
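Country-specific formats are the hard part. A minimal sketch of what locale-aware validation looks like, using postal codes as the example; the patterns below are simplified illustrations, not the dataset's actual generators:

```python
import re

# Simplified, illustrative postal-code patterns per country.
# Real PII localization also covers names, IDs, phone numbers, and dates.
POSTAL_CODE_PATTERNS = {
    "DE": r"\d{5}",           # Germany: 10115
    "NL": r"\d{4} [A-Z]{2}",  # Netherlands: 1012 AB
    "PL": r"\d{2}-\d{3}",     # Poland: 00-950
    "PT": r"\d{4}-\d{3}",     # Portugal: 1000-001
}

def is_valid_postal_code(country: str, value: str) -> bool:
    """Check a postal code against its country's expected format."""
    pattern = POSTAL_CODE_PATTERNS.get(country)
    return bool(pattern and re.fullmatch(pattern, value))
```

The same per-country table approach extends to every other PII field, which is why format coverage (not just language coverage) is the bulk of the work.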
Introducing GRM2, a powerful 3-billion-parameter model designed for long-term reasoning and high performance on complex tasks.
Even with only 3 billion parameters, it outperforms Qwen3-32B on several benchmarks and complex reasoning tasks.
It can also generate extensive, complex code of over 1,000 lines, use tools comparably to much larger models, and is well suited to agentic tasks.
GRM2 is licensed under Apache 2.0, making it an ideal base for fine-tuning on other tasks.
How do you find ideas to try next? I'm tracking multiple topics tied to the projects we're building at Remyx. Every morning I get a feed of papers ranked by relevance to those topics. No more good ideas lost because they didn't trend on X.
>_ Can an LLM execute logic gates and boolean arithmetic?
We need to create datasets:
- Neural Arithmetic and Logic Unit (NALU), 32 bits
- Neural Application Binary Interface (NABI), 32 bits
Optimal instruction set = RV32IMAF
This opens the way for LLMs to write and execute code themselves, without an external CLI.
The more of us who want it, the more possible it will become...
PhysiQuanty/Binary-Addition-LLM-POC (10-bit binary addition: with carry propagation the next token is deterministic, so sampling no longer has any effect.)
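The 10-bit addition task itself is easy to state. Here is a minimal reference implementation of the carry-propagation target the model has to reproduce (plain Python for illustration, not the POC's own code):

```python
def binary_add(a: str, b: str, width: int = 10) -> str:
    """Add two fixed-width binary strings with explicit carry propagation.

    This is the deterministic next-token target: given the operands,
    every output bit is fully determined, so there is nothing to sample.
    """
    carry = 0
    out = []
    # Walk from the least-significant bit, propagating the carry.
    for bit_a, bit_b in zip(reversed(a.zfill(width)), reversed(b.zfill(width))):
        total = int(bit_a) + int(bit_b) + carry
        out.append(str(total % 2))
        carry = total // 2
    # Overflow beyond `width` bits is dropped, as in fixed-width hardware.
    return "".join(reversed(out))
```

Because the mapping is deterministic, a model that has truly learned the carry chain should place essentially all probability mass on the correct next bit.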
Releasing gradio-sync3dcompare v0.0.22: a Gradio custom component for synchronized 3D model comparison
One component. Side-by-side. Perfectly in sync.
What's included:
- Supports GLB and PLY files
- Renders as point clouds or native meshes
- Synchronized orbit, zoom, and pan across all viewports
- Auto point sizing with manual override
- Configurable zoom range and reset controls
pip install gradio-sync3dcompare
Built on Gradio 6.10.0; drops into any gr.Blocks app with a single import.
See it in action in the video below. It shows a real-world comparison of two 3D point clouds reconstructed from stereo depth estimation, one from FoundationStereo and one from RAFTStereo. Both models are exported as GLB files directly from the depth output and loaded side-by-side into the component. Every orbit, zoom, and pan is perfectly mirrored across both viewports, making it easy to spot structural differences between the two reconstructions at any angle.
Feedback on supported formats, rendering features, or comparison workflows is very welcome!
World Model Bench: does your world model actually think?
FID measures realism. FVD measures smoothness. But neither tells you whether the model understood the scene.
We just released WM Bench, the first benchmark for cognitive intelligence in world models. The core question: when a beast charges from 3 meters away, does the model know to sprint, not walk? Does it respond differently to a human vs. an animal? Does it remember that the left corridor was blocked two steps ago?
Those are cognitive questions. No existing benchmark asks them. So we built one.
- P1 Perception (25%): Can it read the scene?
- P2 Cognition (45%): Does it predict threats, escalate emotions, use memory?
- P3 Embodiment (30%): Does the body respond with the right motion?
All evaluation is via simple JSON I/O: no 3D engine, no special hardware. Any model with an API can participate.
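Since the interface is plain JSON, a submission harness can be a few lines. A hypothetical sketch of the exchange (the field names here are my guesses for illustration, not WM Bench's actual schema):

```python
import json

def answer_step(observation: dict) -> dict:
    """Toy policy: sprint away from close threats, walk otherwise.

    NOTE: "threat_distance_m" and "action" are hypothetical field names
    used only to illustrate the JSON-in / JSON-out evaluation loop.
    """
    threat = observation.get("threat_distance_m")
    action = "sprint" if threat is not None and threat < 5.0 else "walk"
    return {"action": action}

# Round-trip through JSON strings, as a benchmark harness would.
request = json.dumps({"threat_distance_m": 3.0, "entity": "beast"})
response = json.loads(json.dumps(answer_step(json.loads(request))))
```

Any model that can read one JSON object and emit another can plug into a loop like this, which is what makes the "no 3D engine, no special hardware" claim work.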
We also built PROMETHEUS as a live reference implementation: it runs in your browser on a T4, no install needed. It combines FloodDiffusion motion generation with an LLM cognitive brain (Perceive → Predict → Decide → Act), and scored 726/1000 (Grade B) on Track C, the only directly verified model so far. Submissions from other teams are very welcome.
We annotated 119K medical images with two frontier VLMs (Qwen 3.5, Kimi K2.5), cross-validated at 93% agreement, and produced 110K training records, all for under $500. Fine-tuning 3 small models (2-3B params) improved all benchmarks: the best model reaches +15.0% average exact match.
Everything is open-sourced: datasets, adapters, and code.
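Cross-validating two annotators at the record level reduces to a simple exact-match agreement rate. A minimal sketch of that computation (the labels below are illustrative, not from the released dataset):

```python
def exact_match_agreement(annotations_a: list, annotations_b: list) -> float:
    """Fraction of records where two annotators produced identical labels."""
    assert len(annotations_a) == len(annotations_b), "need paired annotations"
    matches = sum(a == b for a, b in zip(annotations_a, annotations_b))
    return matches / len(annotations_a)

# Toy example: two models labeling the same four images.
model_1 = ["pneumonia", "normal", "fracture", "normal"]
model_2 = ["pneumonia", "normal", "fracture", "effusion"]
rate = exact_match_agreement(model_1, model_2)  # 3 of 4 records agree
```

Keeping only the records where both models agree is what turns 119K raw annotations into a smaller but cleaner training set.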
Most kymatio tests were run on standard PyTorch models, which reached higher accuracy than plain conv or transformer baselines before overfitting, though not in every instance. The most commonly tested low-sample CIFAR-10 and CIFAR-100 runs yielded more accuracy for less compute. Those runs are in the hypersphere-experiments notebooks and are viewable via the Hugging Face TensorBoard metrics.
The accuracy, retention, agreement/disagreement behavior, and sheer capacity of the refined SVD kernel show that full Procrustes alignment is not just crucial to distillation, but also entirely representable within the student encoders themselves.
This structure can re-impose the representation layer by layer, which is what I tested, and about 30 other tests show the capture system can act as all of the following simultaneously: a global regularizer, a selector, a behavioral adjudicator, an encoding stabilizer, a trajectory accumulator, and an anchored differentiation unit.
The preliminary rapid-iteration kernel shows that these structures not only represent useful behavior, but that noise drift can be accounted for directly: GELU, drop path, dropout, and similar elements learn to ignore the very noise that accumulates.
Based on these tests and examples, attention is also validated here: geometric structure is preserved after attention selection.
This encoding structure is substantially more durable than I can give it credit for.
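For reference, full orthogonal Procrustes alignment between a student's and a teacher's feature spaces is a single SVD. A minimal NumPy sketch of that alignment step (just the textbook operation, not the refined kernel discussed above):

```python
import numpy as np

def procrustes_align(student: np.ndarray, teacher: np.ndarray) -> np.ndarray:
    """Orthogonal matrix R minimizing ||student @ R - teacher||_F.

    This is the alignment a distillation objective asks the student's
    encoder to absorb into its own weights.
    """
    u, _, vt = np.linalg.svd(student.T @ teacher)
    return u @ vt

# Sanity check: if the teacher is an exact rotation of the student's
# features, Procrustes recovers that rotation.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8))
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
r = procrustes_align(x, x @ q)
```

Because the minimizer is closed-form, it is cheap enough to re-run layer by layer, which is what makes the layer-wise re-imposition experiments above tractable.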
Surge is coming, exactly as predicted. Late, I admit.