AI & ML interests

None defined yet.

Recent Activity

abidlabs  updated a bucket 1 day ago
trackio/trackio-wheels
abidlabs  published a bucket 20 days ago
trackio/trackio-wheels
View all activity

abidlabs 
updated a bucket 1 day ago
qgallouedec 
posted an update 2 days ago
view post
Post
1603
TRL v1.2 introduces the SSDTrainer 🚀

Simple Self-Distillation (SSD) from Apple's paper "Embarrassingly Simple Self-Distillation Improves Code Generation" is now available as an experimental trainer in TRL.

The recipe is as minimal as the name suggests: sample completions from the model itself at a training-time temperature, then fine-tune on those raw, unverified samples with plain cross-entropy. No reward model. No verifier. No teacher model. No reinforcement learning. Just prompts and the model.

from trl.experimental.ssd import SSDConfig, SSDTrainer

trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(temperature=0.6, top_k=20, top_p=0.95),
    train_dataset=dataset,
)
trainer.train()


v1.2 also ships expanded tool-calling support (LLaMA 3.1 / 3.2, DeepSeek-V3), another round of KTO ↔ DPO alignment getting us closer to promoting KTO to stable, a big GRPO simplification for overlong tool results, deprecation of use_transformers_paged, and key fixes for VLM response parsing.

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.2.0
qgallouedec 
posted an update 19 days ago
view post
Post
2306
TRL v1.0 is out!

Hugging Face's TRL library is downloaded 3 million times a month. Over 130k models trained with it are public on the Hub, and major projects like @unsloth and @axolotl-ai-co build directly on top of it. v1.0 is the moment we acknowledged that responsibility explicitly, with a real stability contract.

The field hasn't settled. Building stable software in a domain that keeps invalidating its own assumptions is the actual problem we're solving. The answer is a design that can absorb the next shift without breaking what people rely on.

What's in v1.0:
Deep Hugging Face integration, low infrastructure burden
What's next: asynchronous GRPO, better scaling support, and making training legible enough that agents can inspect and steer it.

pip install --upgrade trl


Read more: hf.co/blog/trl-v1
abidlabs 
published a bucket 20 days ago
qgallouedec 
posted an update about 2 months ago
view post
Post
3008
@CohereLabs just released 🌿 Tiny Aya: a fully open-source 3B parameter model that speaks 70+ languages 🌍! But there’s a catch:

Tiny Aya is just a language model. It doesn’t support tool calling, the key capability that turns frontier models into powerful *agents*.
So the real question is:

How hard is it to turn Tiny Aya into an agent?

Turns out… it’s simple, thanks to Hugging Face TRL.
We’re sharing a hands-on example showing how to train Tiny Aya to turn it into a tool-calling agent using TRL, unlocking what could become the first *massively multilingual open agent*.

Small model. Global reach. Agent capabilities.

👉 https://github.com/huggingface/trl/blob/main/examples/notebooks/sft_tool_calling.ipynb
  • 1 reply
·
abidlabs 
posted an update 6 months ago
view post
Post
10669
Why I think local, open-source models will eventually win.

The most useful AI applications are moving toward multi-turn agentic behavior: systems that take hundreds or even thousands of iterative steps to complete a task, e.g. Claude Code, computer-control agents that click, type, and test repeatedly.

In these cases, the power of the model is not how smart it is per token, but in how quickly it can interact with its environment and tools across many steps. In that regime, model quality becomes secondary to latency.

An open-source model that can call tools quickly, check that the right thing was clicked, or verify that a code change actually passes tests can easily outperform a slightly “smarter” closed model that has to make remote API calls for every move.

Eventually, the balance tips: it becomes impractical for an agent to rely on remote inference for every micro-action. Just as no one would tolerate a keyboard that required a network request per keystroke, users won’t accept agent workflows bottlenecked by latency. All devices will ship with local, open-source models that are “good enough” and the expectation will shift toward everything running locally. It’ll happen sooner than most people think.
  • 8 replies
·
abidlabs 
posted an update 7 months ago
abidlabs 
posted an update 11 months ago
abidlabs 
posted an update 12 months ago
view post
Post
5275
HOW TO ADD MCP SUPPORT TO ANY 🤗 SPACE

Gradio now supports MCP! If you want to convert an existing Space, like this one hexgrad/Kokoro-TTS, so that you can use it with Claude Desktop / Cursor / Cline / TinyAgents / or any LLM that supports MCP, here's all you need to do:

1. Duplicate the Space (in the Settings Tab)
2. Upgrade the Gradio sdk_version to 5.28 (in the README.md)
3. Set mcp_server=True in launch()
4. (Optionally) add docstrings to the function so that the LLM knows how to use it, like this:

def generate(text, speed=1):
    """
    Convert text to speech audio.

    Parameters:
        text (str): The input text to be converted to speech.
        speed (float, optional): Playback speed of the generated speech.


That's it! Now your LLM will be able to talk to you 🤯
abidlabs 
posted an update 12 months ago
view post
Post
2849
Hi folks! Excited to share a new feature from the Gradio team along with a tutorial.

If you don't already know, Gradio is an open-source Python library used to build interfaces for machine learning models. Beyond just creating UIs, Gradio also exposes API capabilities and now, Gradio apps can be launched Model Context Protocol (MCP) servers for LLMs.

If you already know how to use Gradio, there are only two additional things you need to do:
* Add standard docstrings to your function (these will be used to generate the descriptions for your tools for the LLM)
* Set mcp_server=True in launch()


Here's a complete example (make sure you already have the latest version of Gradio installed):


import gradio as gr

def letter_counter(word, letter):
    """Count the occurrences of a specific letter in a word.
    
    Args:
        word: The word or phrase to analyze
        letter: The letter to count occurrences of
        
    Returns:
        The number of times the letter appears in the word
    """
    return word.lower().count(letter.lower())

demo = gr.Interface(
    fn=letter_counter,
    inputs=["text", "text"],
    outputs="number",
    title="Letter Counter",
    description="Count how many times a letter appears in a word"
)

demo.launch(mcp_server=True)



This is a very simple example, but you can add the ability to generate Ghibli images or speak emotions to any LLM that supports MCP. Once you have an MCP running locally, you can copy-paste the same app to host it on [Hugging Face Spaces](https://huggingface.co/spaces/) as well.

All free and open-source of course! Full tutorial: https://www.gradio.app/guides/building-mcp-server-with-gradio
  • 2 replies
·