[
  {
    "additions": 4,
    "author": "cjkindel",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? `_can_set_attn_implementation` and `_can_set_experts_implementation` both do a direct subscript lookup into `sys.modules`: ```python class_module = sys.modules[cls.__module__] ``` If the module is not registered und\u2026",
    "changed_files": 1,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44978",
    "created_at": "2026-03-24T21:01:11Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44978/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44978",
    "labels": [],
    "merged": false,
    "number": 44978,
    "review_comments_count": 0,
    "state": "open",
    "title": "fix: handle absent sys.modules entry in modeling_utils",
    "updated_at": "2026-03-24T21:12:19Z"
  },
  {
    "additions": 2,
    "author": "hmellor",
    "author_association": "MEMBER",
    "body_excerpt": "- Adds a type hint to `ModernVBertForMaskedLM.__init__` - Removes `tie_word_embeddings` from `Qwen2VLTextConfig` (and therefore also `Qwen2_5_VLTextConfig`) because it's not valid for these models - Remove hack from `ColQwen2Config` (and t\u2026",
    "changed_files": 6,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44976",
    "created_at": "2026-03-24T19:26:33Z",
    "deletions": 10,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44976/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44976",
    "labels": [],
    "merged": true,
    "number": 44976,
    "review_comments_count": 3,
    "state": "closed",
    "title": "Fix tie_word_embedding issues with `Qwen2VL`",
    "updated_at": "2026-03-24T20:55:15Z"
  },
  {
    "additions": 6971,
    "author": "philippguevorguian",
    "author_association": "NONE",
    "body_excerpt": null,
    "changed_files": 20,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44975",
    "created_at": "2026-03-24T17:12:31Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44975/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44975",
    "labels": [],
    "merged": false,
    "number": 44975,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: rebase main; clean config reads, ImageProcessor backend, misc cleanup",
    "updated_at": "2026-03-24T17:13:42Z"
  },
  {
    "additions": 799,
    "author": "3outeille",
    "author_association": "MEMBER",
    "body_excerpt": null,
    "changed_files": 6,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44974",
    "created_at": "2026-03-24T16:13:25Z",
    "deletions": 82,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44974/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44974",
    "labels": [],
    "merged": false,
    "number": 44974,
    "review_comments_count": 0,
    "state": "open",
    "title": "Refactor core_model_loading to support FSDP shard-on-read loading",
    "updated_at": "2026-03-24T16:28:48Z"
  },
  {
    "additions": 22,
    "author": "andylizf",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? Adds `.item()` to `max_seqlen = (cu_seqlens[1:] - cu_seqlens[:-1]).max()` in all vision attention modules that pass this value to `flash_attn_varlen_func`. ### Context On **released versions** (e.g. 4.52.4), using\u2026",
    "changed_files": 19,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44973",
    "created_at": "2026-03-24T15:42:32Z",
    "deletions": 22,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44973/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44973",
    "labels": [],
    "merged": false,
    "number": 44973,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix max_seqlen type in vision attention for torch.compile + FA2",
    "updated_at": "2026-03-24T15:46:30Z"
  },
  {
    "additions": 17,
    "author": "Abdennacer-Badaoui",
    "author_association": "MEMBER",
    "body_excerpt": "As per title. Updating Gemma3/Gemma3n expectations.",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44972",
    "created_at": "2026-03-24T15:11:50Z",
    "deletions": 12,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44972/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44972",
    "labels": [],
    "merged": true,
    "number": 44972,
    "review_comments_count": 10,
    "state": "closed",
    "title": "[AMD CI] Gemma3/Gemma3n Expectations",
    "updated_at": "2026-03-24T16:30:03Z"
  },
  {
    "additions": 0,
    "author": "ArthurZucker",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Removed the tokenizer_class attr was never there to begin with, and kwargs are now supported. This was failing some test on vllm ci. Fixes https://buildkite.com/vllm/ci/builds/57601/steps/canvas?sid=019d1aec-aa5a-41\u2026",
    "changed_files": 4,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44971",
    "created_at": "2026-03-24T14:59:36Z",
    "deletions": 11,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44971/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44971",
    "labels": [],
    "merged": true,
    "number": 44971,
    "review_comments_count": 1,
    "state": "closed",
    "title": "[ `vllm x v5`] nit",
    "updated_at": "2026-03-24T17:40:05Z"
  },
  {
    "additions": 20,
    "author": "IlyasMoutawwakil",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflec\u2026",
    "changed_files": 5,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44970",
    "created_at": "2026-03-24T13:49:21Z",
    "deletions": 76,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44970/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44970",
    "labels": [],
    "merged": false,
    "number": 44970,
    "review_comments_count": 1,
    "state": "open",
    "title": "Fix CPU 16 bytes alignment issue using equivalent fallback",
    "updated_at": "2026-03-24T20:45:07Z"
  },
  {
    "additions": 4,
    "author": "tarekziade",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Extends the CI so we can use Make and read toml files",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 4,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44968",
    "created_at": "2026-03-24T11:43:24Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44968/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44968",
    "labels": [],
    "merged": false,
    "number": 44968,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Added Make to the docker and `tomli` to `.[quality]`",
    "updated_at": "2026-03-24T15:06:29Z"
  },
  {
    "additions": 87,
    "author": "Qubitium",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fix: FA kernel launches currently are not thread-safe (nogil) in multi-gpu env. This simple patch fixes the issue. ```py # Set the correct CUDA context before launching the FlashAttention kernel. with torch.cuda.dev\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44967",
    "created_at": "2026-03-24T11:33:45Z",
    "deletions": 84,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44967/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44967",
    "labels": [],
    "merged": false,
    "number": 44967,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix FA kernel launch needs correct cuda device ctx in multi-gpu env",
    "updated_at": "2026-03-24T13:49:04Z"
  },
  {
    "additions": 8,
    "author": "pramilajangid",
    "author_association": "NONE",
    "body_excerpt": "Fixes #44964 ## Summary This PR restores backward compatibility for `CommonKwargs` in `transformers.processing_utils`, which is still referenced by some remote processor implementations. ## Problem After the typed-dict cleanup (commit `533\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44966",
    "created_at": "2026-03-24T11:06:57Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44966/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44966",
    "labels": [],
    "merged": false,
    "number": 44966,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix backward compatibility for CommonKwargs in processing_utils (brea\u2026",
    "updated_at": "2026-03-24T12:48:44Z"
  },
  {
    "additions": 37,
    "author": "ydshieh",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflec\u2026",
    "changed_files": 2,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44965",
    "created_at": "2026-03-24T10:59:31Z",
    "deletions": 32,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44965/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44965",
    "labels": [],
    "merged": false,
    "number": 44965,
    "review_comments_count": 0,
    "state": "open",
    "title": "try",
    "updated_at": "2026-03-24T11:19:27Z"
  },
  {
    "additions": 3,
    "author": "josh-kean",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fixes an import in src/transformers/video_processing_utils.py that was causing the main build to fail Fixes # 44933 ## Code Agent Policy The Transformers repo is currently being overwhelmed by a large number of PRs\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44958",
    "created_at": "2026-03-23T20:07:09Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44958/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44958",
    "labels": [],
    "merged": false,
    "number": 44958,
    "review_comments_count": 1,
    "state": "open",
    "title": "fixed import error with PILImageResampling",
    "updated_at": "2026-03-24T13:53:00Z"
  },
  {
    "additions": 1473,
    "author": "bigshanedogg",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "> **Draft PR \u2014 waiting for issue approval.** This PR is opened alongside the issue request. > It will be marked ready for review after a maintainer gives the go-ahead on the issue. # What does this PR do? Adds native Transformers support f\u2026",
    "changed_files": 12,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44956",
    "created_at": "2026-03-23T19:34:30Z",
    "deletions": 0,
    "draft": true,
    "files_url": "https://github.com/huggingface/transformers/pull/44956/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44956",
    "labels": [],
    "merged": false,
    "number": 44956,
    "review_comments_count": 0,
    "state": "open",
    "title": "[WIP] Add HyperCLOVAX model",
    "updated_at": "2026-03-23T19:38:26Z"
  },
  {
    "additions": 0,
    "author": "stevhliu",
    "author_association": "MEMBER",
    "body_excerpt": "removes outdated qa pipeline reference",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44954",
    "created_at": "2026-03-23T17:20:37Z",
    "deletions": 5,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44954/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44954",
    "labels": [],
    "merged": false,
    "number": 44954,
    "review_comments_count": 0,
    "state": "open",
    "title": "[docs] pipeline cleanup",
    "updated_at": "2026-03-23T17:30:10Z"
  },
  {
    "additions": 861,
    "author": "zucchini-nlp",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Decouples `kwargs` manipulation from hub's strict decorator, and ensures that all subclasses of a `PreTrainedConfig` accept any kwargs which is what we supported prev. Not all remote code has `@strict` or has an `__\u2026",
    "changed_files": 536,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44953",
    "created_at": "2026-03-23T17:13:39Z",
    "deletions": 824,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44953/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44953",
    "labels": [],
    "merged": true,
    "number": 44953,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Config kwargs",
    "updated_at": "2026-03-24T14:14:46Z"
  },
  {
    "additions": 10,
    "author": "Jess-Co-Del",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fixes the non existence of output dictionary change, when parameter output_hidden_states=True is passed to models like CLIP or SigLip. This is especially pertinent for the vision model config. According to #42759 no\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 4,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44952",
    "created_at": "2026-03-23T17:02:50Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44952/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44952",
    "labels": [],
    "merged": false,
    "number": 44952,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix: Add correct return behaviour when output_hidden_states=True for CLIP and SIGLIP vision models",
    "updated_at": "2026-03-24T11:19:35Z"
  },
  {
    "additions": 113,
    "author": "hemantmm",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? This pull request adds routing replay functionality for mixture-of-experts (MoE) model types by giving users the option to override router probabilities while processing a forward pass through their models. <!-- Con\u2026",
    "changed_files": 4,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44951",
    "created_at": "2026-03-23T16:29:46Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44951/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44951",
    "labels": [],
    "merged": false,
    "number": 44951,
    "review_comments_count": 0,
    "state": "open",
    "title": "feat: Add router_logits override to enable Routing Replay for MoE models",
    "updated_at": "2026-03-23T17:06:24Z"
  },
  {
    "additions": 1197,
    "author": "Cyrilvallez",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? As per the title. This PR finally makes mamba layer caches first class citizen, and adds native support for them. It supports the following layers combinations: - all mamba layers - alternating attention layer/mamba\u2026",
    "changed_files": 61,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 4,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44950",
    "created_at": "2026-03-23T16:25:13Z",
    "deletions": 4090,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44950/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44950",
    "labels": [],
    "merged": false,
    "number": 44950,
    "review_comments_count": 0,
    "state": "open",
    "title": "[Cache] Native mamba & hybrid cache",
    "updated_at": "2026-03-24T21:21:00Z"
  },
  {
    "additions": 80,
    "author": "Charly21r",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fixes #44936 This PR fixes an issue with `NotebookProgressCallback` in the `Trainer` where calling evaluate() before or after training would crash due to the training tracker being `None`. The callback now properly\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44949",
    "created_at": "2026-03-23T16:07:50Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44949/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44949",
    "labels": [],
    "merged": false,
    "number": 44949,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix: NotebookProgressCallback crash when evaluating with the Trainer",
    "updated_at": "2026-03-24T15:10:21Z"
  },
  {
    "additions": 1,
    "author": "heycorgi",
    "author_association": "NONE",
    "body_excerpt": "# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflec\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44948",
    "created_at": "2026-03-23T15:33:56Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44948/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44948",
    "labels": [],
    "merged": false,
    "number": 44948,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Create aa.py",
    "updated_at": "2026-03-23T15:34:35Z"
  },
  {
    "additions": 79,
    "author": "zucchini-nlp",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? The doc was generated by Claude. I deleted unnecessary repetitions and fixed a few moments to be more precise. We don't really need to merge it now so if you think the text is too LLM, feel free to take this as an i\u2026",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44947",
    "created_at": "2026-03-23T13:23:04Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44947/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44947",
    "labels": [],
    "merged": false,
    "number": 44947,
    "review_comments_count": 7,
    "state": "open",
    "title": "Add doc page for capturing outputs",
    "updated_at": "2026-03-23T17:04:23Z"
  },
  {
    "additions": 14,
    "author": "BSchilperoort",
    "author_association": "NONE",
    "body_excerpt": "# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflec\u2026",
    "changed_files": 13,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "canonical",
    "comments_count": 5,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44946",
    "created_at": "2026-03-23T12:18:34Z",
    "deletions": 14,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44946/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44946",
    "labels": [],
    "merged": true,
    "number": 44946,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Correct docstrings for `from_pretrained` (url input deprecated)",
    "updated_at": "2026-03-23T13:05:16Z"
  },
  {
    "additions": 71,
    "author": "zucchini-nlp",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? @hmellor",
    "changed_files": 5,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44943",
    "created_at": "2026-03-23T10:58:40Z",
    "deletions": 9,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44943/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44943",
    "labels": [],
    "merged": true,
    "number": 44943,
    "review_comments_count": 1,
    "state": "closed",
    "title": "Clearer type hints and fix rope validation in configs",
    "updated_at": "2026-03-23T13:32:11Z"
  },
  {
    "additions": 220,
    "author": "hmellor",
    "author_association": "MEMBER",
    "body_excerpt": null,
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44942",
    "created_at": "2026-03-23T10:46:23Z",
    "deletions": 5,
    "draft": true,
    "files_url": "https://github.com/huggingface/transformers/pull/44942/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44942",
    "labels": [],
    "merged": false,
    "number": 44942,
    "review_comments_count": 0,
    "state": "open",
    "title": "Add inference time layer fusion optimisations via `PreTrainedModel.from_pretrained(fuse_layers=True)`",
    "updated_at": "2026-03-23T11:57:40Z"
  },
  {
    "additions": 4,
    "author": "ydshieh",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Fix the failing job after #43514 (the fix is effective, see [here](https://github.com/huggingface/transformers/actions/runs/23433395911/job/68165255513?pr=44941)) [Update Transformers metadata](https://github.com/h\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44941",
    "created_at": "2026-03-23T10:42:09Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44941/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44941",
    "labels": [],
    "merged": true,
    "number": 44941,
    "review_comments_count": 1,
    "state": "closed",
    "title": "Fix failing job `Update Transformers metadata` after #43514",
    "updated_at": "2026-03-23T13:41:39Z"
  },
  {
    "additions": 138,
    "author": "Qubitium",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Model loading of same model path but 2 different threads (2 different instances) have meta device tensor issues: unloaded meta/empty embedding/lm-head when it should not be empty post model load. Cause: `tie_weight(\u2026",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44940",
    "created_at": "2026-03-23T09:55:57Z",
    "deletions": 10,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44940/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44940",
    "labels": [],
    "merged": false,
    "number": 44940,
    "review_comments_count": 3,
    "state": "open",
    "title": "fix tie_weights skipping logic is not tied to model thread scope",
    "updated_at": "2026-03-24T15:24:41Z"
  },
  {
    "additions": 2038,
    "author": "tarekziade",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Refactored and cleaned up model linter - separated package - one rule per module - refactored legacy checks into their own rules - simplified pattern, duplication removal",
    "changed_files": 25,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 6,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44939",
    "created_at": "2026-03-23T08:45:36Z",
    "deletions": 1446,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44939/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44939",
    "labels": [],
    "merged": true,
    "number": 44939,
    "review_comments_count": 5,
    "state": "closed",
    "title": "refactor: mlinter as its own package",
    "updated_at": "2026-03-24T07:56:15Z"
  },
  {
    "additions": 2,
    "author": "VanshikaSohal",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? Fixes two small but impactful bugs in the BART documentation: 1. **Variable shadowing bug**: In the Pipeline example, the variable was named `pipeline` which shadows the imported `pipeline` function. Renamed to `fi\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44935",
    "created_at": "2026-03-22T18:45:01Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44935/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44935",
    "labels": [],
    "merged": true,
    "number": 44935,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix variable shadowing in pipeline example and typo in BART docs (BERT \u2192 BART)",
    "updated_at": "2026-03-23T14:28:04Z"
  },
  {
    "additions": 9,
    "author": "Sai-Suraj-27",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fixes this failing [T5ModelIntegrationTest](https://github.com/huggingface/transformers/actions/runs/23230643883/job/67524758706#step:14:1449) & this [Qwen2IntegrationTest](https://github.com/huggingface/transformer\u2026",
    "changed_files": 2,
    "cluster_id": "cluster-44848-5",
    "cluster_ids": [
      "cluster-44848-5"
    ],
    "cluster_role": "canonical",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44934",
    "created_at": "2026-03-22T18:03:34Z",
    "deletions": 7,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44934/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44934",
    "labels": [],
    "merged": true,
    "number": 44934,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix failing `T5ModelIntegrationTest`",
    "updated_at": "2026-03-24T14:50:10Z"
  },
  {
    "additions": 1,
    "author": "r266-tech",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? Fixes #44908 The `get_inverse_sqrt_schedule` function accepts `timescale` and `last_epoch` parameters, but `get_scheduler` was not forwarding `scheduler_specific_kwargs` to it. This caused user-provided kwargs like\u2026",
    "changed_files": 1,
    "cluster_id": "cluster-44908-3",
    "cluster_ids": [
      "cluster-44908-3"
    ],
    "cluster_role": "member",
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44932",
    "created_at": "2026-03-22T17:30:56Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44932/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44932",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44932,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix: Pass scheduler_specific_kwargs to inverse_sqrt scheduler",
    "updated_at": "2026-03-23T12:44:16Z"
  },
  {
    "additions": 1,
    "author": "r266-tech",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? Fixes a v5 regression where `CamembertForMaskedLM` (and all CamemBERT masked-LM tasks) produces near-zero, near-uniform logits, making the model completely non-functional. ### Root cause In v5, `modeling_utils.get_\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 8,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44931",
    "created_at": "2026-03-22T17:28:57Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44931/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44931",
    "labels": [],
    "merged": true,
    "number": 44931,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix(camembert): add tie_word_embeddings=True to CamembertConfig",
    "updated_at": "2026-03-23T10:47:49Z"
  },
  {
    "additions": 103,
    "author": "javierdejesusda",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "## Summary - Fixes #44912 \u2014 MXFP4 quantization error messages combine `is_triton_available()` and `is_kernels_available()` into a single `kernels_available` boolean, making it impossible to identify which dependency is missing - Split the\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44930",
    "created_at": "2026-03-22T17:27:20Z",
    "deletions": 13,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44930/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44930",
    "labels": [],
    "merged": true,
    "number": 44930,
    "review_comments_count": 3,
    "state": "closed",
    "title": "fix: split MXFP4 dependency checks for specific error messages",
    "updated_at": "2026-03-24T15:33:14Z"
  },
  {
    "additions": 26,
    "author": "ydshieh",
    "author_association": "MEMBER",
    "body_excerpt": "## Problem In `TokenizersBackend.convert_to_native_format()`, when a tokenizer has a custom `__init__` (the `elif` branch), `tokenizer.json` was parsed **twice**: 1. `TokenizerFast.from_file(fast_tokenizer_file)` \u2014 full Rust parse includin\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44927",
    "created_at": "2026-03-22T15:33:23Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44927/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44927",
    "labels": [],
    "merged": true,
    "number": 44927,
    "review_comments_count": 6,
    "state": "closed",
    "title": "fix: improve processor loading performance by avoiding redundant tokenizer parsing",
    "updated_at": "2026-03-23T11:03:52Z"
  },
  {
    "additions": 25,
    "author": "yonigozlan",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Solve import errors when trying to import `from transformers.models.llama4.image_processing_llama4_fast import Llama4ImageProcessorFast` for example",
    "changed_files": 2,
    "cluster_id": "cluster-44897-2",
    "cluster_ids": [
      "cluster-44897-2"
    ],
    "cluster_role": "canonical",
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44926",
    "created_at": "2026-03-22T14:46:17Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44926/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44926",
    "labels": [],
    "merged": true,
    "number": 44926,
    "review_comments_count": 1,
    "state": "closed",
    "title": "Fix backward compatibility for full path imports of Fast Image Processors",
    "updated_at": "2026-03-23T14:16:49Z"
  },
  {
    "additions": 482,
    "author": "kashif",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? This PR adds a first-class MoE routing capture/replay API for Qwen2Moe and introduces shared MoE routing helpers for reuse by other MoE model families. It adds: - a structured `MoERouting` payload in modeling output\u2026",
    "changed_files": 7,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 4,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44925",
    "created_at": "2026-03-22T14:04:40Z",
    "deletions": 24,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44925/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44925",
    "labels": [],
    "merged": false,
    "number": 44925,
    "review_comments_count": 0,
    "state": "open",
    "title": "[MOE]  MoE routing capture and replay support",
    "updated_at": "2026-03-24T12:49:30Z"
  },
  {
    "additions": 9,
    "author": "Qubitium",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fix two `nogil` threading bugs (reproduced on 3.14) : 1. Continus Batching crashes with torch graph errors with 2 threads on 2 separate model instances (same model path, but two distinct instances). Cause is missing\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 13,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44924",
    "created_at": "2026-03-22T11:46:49Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44924/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44924",
    "labels": [],
    "merged": true,
    "number": 44924,
    "review_comments_count": 1,
    "state": "closed",
    "title": "Continuous batching thread safety",
    "updated_at": "2026-03-24T05:42:56Z"
  },
  {
    "additions": 3,
    "author": "prakhar-agarwal",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "Addresses issue #44843. Verified with isolated repro logic. Changes made: Updated the logic to properly identify local and offline scenarios upfront. Now, is_local is correctly set to True if: 1. is_offline_mode() is active. 2. The local_f\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44923",
    "created_at": "2026-03-22T05:20:22Z",
    "deletions": 1,
    "draft": true,
    "files_url": "https://github.com/huggingface/transformers/pull/44923/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44923",
    "labels": [],
    "merged": false,
    "number": 44923,
    "review_comments_count": 0,
    "state": "open",
    "title": "fix: avoid unconditional model_info call in _patch_mistral_regex",
    "updated_at": "2026-03-22T05:24:11Z"
  },
  {
    "additions": 10,
    "author": "s-zx",
    "author_association": "NONE",
    "body_excerpt": "## What does this PR do? Fixes #44849. When `output_hidden_states=True` (or `output_attentions=True`) is passed to `model.generate()`, the `@capture_outputs` decorator reads the flag value but leaves it in `**kwargs`. These flags then prop\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44922",
    "created_at": "2026-03-22T01:21:22Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44922/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44922",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44922,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: pop output_* flags from kwargs in capture_outputs to prevent submodule leakage",
    "updated_at": "2026-03-23T12:38:56Z"
  },
  {
    "additions": 4,
    "author": "s-zx",
    "author_association": "NONE",
    "body_excerpt": "## What does this PR do? Fixes #44918. `compute_3d_position_ids` in the Qwen2.5-VL / Qwen3-VL / Qwen3.5 model families destructures `inputs_embeds.shape` into exactly three variables: ```python batch_size, seq_length, _ = inputs_embeds.sha\u2026",
    "changed_files": 4,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44921",
    "created_at": "2026-03-22T00:39:01Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44921/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44921",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44921,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: use shape index access in compute_3d_position_ids for Qwen VL models",
    "updated_at": "2026-03-23T10:00:51Z"
  },
  {
    "additions": 15,
    "author": "s-zx",
    "author_association": "NONE",
    "body_excerpt": "## What does this PR do? Fixes `num_labels` not being propagated from `Qwen3_5Config` to its `text_config` when loading via `AutoConfig.from_pretrained(model, num_labels=N)`. **Root cause:** `Qwen3_5Config.__post_init__` initializes `text_\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44920",
    "created_at": "2026-03-22T00:01:59Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44920/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44920",
    "labels": [],
    "merged": false,
    "number": 44920,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: propagate num_labels/id2label to text_config in Qwen3_5Config",
    "updated_at": "2026-03-23T12:06:04Z"
  },
  {
    "additions": 18,
    "author": "s-zx",
    "author_association": "NONE",
    "body_excerpt": "## What does this PR do? Fixes a crash in `Qwen2_5_VLProcessor.__call__` when processing batched inputs without padding (`padding=False`). **Root cause:** When the tokenizer returns sequences of different lengths (ragged list), `np.array(t\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44919",
    "created_at": "2026-03-21T23:57:37Z",
    "deletions": 5,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44919/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44919",
    "labels": [],
    "merged": false,
    "number": 44919,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: handle ragged batch inputs in Qwen2_5_VLProcessor mm_token_type_ids computation",
    "updated_at": "2026-03-23T10:38:30Z"
  },
  {
    "additions": 5,
    "author": "s-zx",
    "author_association": "NONE",
    "body_excerpt": "## Summary `GPTNeoXConfig.convert_rope_params_to_dict` unconditionally overwrote `rope_parameters[\"partial_rotary_factor\"]` with the default `0.25` when `rotary_pct` was absent from kwargs. On every `from_pretrained` call, `rotary_pct` is\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44917",
    "created_at": "2026-03-21T23:34:32Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44917/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44917",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44917,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix(gpt-neox): preserve rotary_pct across save/load cycle",
    "updated_at": "2026-03-23T12:37:48Z"
  },
  {
    "additions": 8,
    "author": "s-zx",
    "author_association": "NONE",
    "body_excerpt": "## Summary Importing `DebertaV2Model` (or anything that depends on it, e.g. `gliner`) raises `IndentationError` on Python 3.13 because `torch.jit.script` calls `inspect.getsource()`, dedents the snippet, and passes it to `ast.parse()`. Pyt\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44916",
    "created_at": "2026-03-21T23:34:07Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44916/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44916",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44916,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix(deberta-v2): move \"Copied from\" comments above @torch.jit.script for Python 3.13 compat",
    "updated_at": "2026-03-23T12:34:24Z"
  },
  {
    "additions": 90,
    "author": "maxsloef-goodfire",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? `clean_up_tokenization` applies English-specific string replacements (` .` \u2192 `.`, ` ?` \u2192 `?`, ` ,` \u2192 `,`, etc.) to decoded text. This was designed for BERT-era WordPiece tokenizers where decoding produced artifacts\u2026",
    "changed_files": 4,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44915",
    "created_at": "2026-03-21T20:45:03Z",
    "deletions": 6,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44915/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44915",
    "labels": [],
    "merged": false,
    "number": 44915,
    "review_comments_count": 1,
    "state": "open",
    "title": "fix: skip `clean_up_tokenization` for BPE tokenizers in `PreTrainedTokenizerFast`",
    "updated_at": "2026-03-23T18:45:52Z"
  },
  {
    "additions": 1,
    "author": "maxsloef-goodfire",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? The `Llama3Converter` in `convert_llama_weights_to_hf.py` hardcodes `clean_up_tokenization_spaces=True` (line 468). This causes `tokenizer.decode()` to silently strip spaces before punctuation for all converted Lla\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44914",
    "created_at": "2026-03-21T20:25:51Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44914/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44914",
    "labels": [],
    "merged": true,
    "number": 44914,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: set `clean_up_tokenization_spaces=False` in Llama 3 tokenizer conversion",
    "updated_at": "2026-03-23T08:38:18Z"
  },
  {
    "additions": 8,
    "author": "ouroborosscr",
    "author_association": "FIRST_TIMER",
    "body_excerpt": "Qwen3.5 uses 3D position_ids [3, batch, seq_len] for multi-dimensional rotary embedding. _is_packed_sequence() misinterprets this as a packed sequence, causing cu_seqlens to be constructed with 3x the actual token count. Flash attention th\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 6,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44911",
    "created_at": "2026-03-21T15:42:57Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44911/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44911",
    "labels": [],
    "merged": false,
    "number": 44911,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix flash attention crash with 3D position_ids (Qwen3.5)",
    "updated_at": "2026-03-24T14:35:57Z"
  },
  {
    "additions": 1,
    "author": "anshuS1310",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "The `get_scheduler` function was identifying the `inverse_sqrt` scheduler type but failing to pass `**scheduler_specific_kwargs` to the underlying `get_inverse_sqrt_schedule` function. This caused user-defined parameters like `timescale` t\u2026",
    "changed_files": 1,
    "cluster_id": "cluster-44908-3",
    "cluster_ids": [
      "cluster-44908-3"
    ],
    "cluster_role": "canonical",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44909",
    "created_at": "2026-03-21T09:59:07Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44909/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44909",
    "labels": [],
    "merged": true,
    "number": 44909,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix: Update optimization.py",
    "updated_at": "2026-03-24T13:06:15Z"
  },
  {
    "additions": 200,
    "author": "syncdoth",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "Fixes #44906 ## Summary - Remove `.expand_as(inputs_embeds)` from placeholder mask creation in `get_placeholder_mask` and equivalent inline patterns across all VLM models. `masked_scatter` natively broadcasts `(B, S, 1)` \u2192 `(B, S, H)`, mak\u2026",
    "changed_files": 71,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44907",
    "created_at": "2026-03-21T06:07:35Z",
    "deletions": 222,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44907/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44907",
    "labels": [],
    "merged": false,
    "number": 44907,
    "review_comments_count": 0,
    "state": "open",
    "title": "Remove unnecessary expand_as in get_placeholder_mask across VLMs",
    "updated_at": "2026-03-23T12:20:03Z"
  },
  {
    "additions": 13,
    "author": "NicoleRobin",
    "author_association": "NONE",
    "body_excerpt": "## Summary - 13 i18n README files used `./awesome-transformers.md` which resolves relative to the `i18n/` directory and leads to a 404 - Replace with the absolute GitHub URL so links work from any location - `README_ko.md` was already corr\u2026",
    "changed_files": 13,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44905",
    "created_at": "2026-03-21T03:25:56Z",
    "deletions": 13,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44905/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44905",
    "labels": [],
    "merged": true,
    "number": 44905,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix(i18n): replace broken relative links to awesome-transformers.md with absolute URLs",
    "updated_at": "2026-03-23T12:47:56Z"
  },
  {
    "additions": 101,
    "author": "vivekvar-dl",
    "author_association": "NONE",
    "body_excerpt": "# Fix granite_speech config loading failure with int multiplier fields ## Fixes #44877 ### Problem Loading `granite_speech` configs fails with `StrictDataclassFieldValidationError` when multiplier fields (e.g., `embedding_multiplier`) are\u2026",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44904",
    "created_at": "2026-03-21T03:12:37Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44904/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44904",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44904,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix(granite_speech): convert int to float for multiplier fields in text_config",
    "updated_at": "2026-03-23T10:37:38Z"
  },
  {
    "additions": 16,
    "author": "yonigozlan",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Some remote code models are using `get_size_dict` directly, and now that size is converted to SizeDict in init, we need to support it as input in `get_size_dict`",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44903",
    "created_at": "2026-03-21T01:25:53Z",
    "deletions": 7,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44903/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44903",
    "labels": [],
    "merged": true,
    "number": 44903,
    "review_comments_count": 5,
    "state": "closed",
    "title": "Support SizeDict import in get_size_dict",
    "updated_at": "2026-03-23T10:28:52Z"
  },
  {
    "additions": 3,
    "author": "guoyangzhen",
    "author_association": "NONE",
    "body_excerpt": "## Problem `_split_tokens_on_unicode()` crashes with `IndexError: string index out of range` when the decoded token stream ends with a dangling Unicode replacement character (\\uFFFD). The computed index `unicode_offset + decoded.index(repl\u2026",
    "changed_files": 1,
    "cluster_id": "cluster-44869-3",
    "cluster_ids": [
      "cluster-44869-3"
    ],
    "cluster_role": "canonical",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44902",
    "created_at": "2026-03-20T22:08:49Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44902/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44902",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44902,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: Whisper word timestamp OOB access on trailing replacement char",
    "updated_at": "2026-03-23T11:59:14Z"
  },
  {
    "additions": 19,
    "author": "harshaljanjani",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "### What does this PR do? The following failing Perceiver use case was identified and fixed in this PR: \u2192 c6d2848a23 ([\ud83d\udea8 Fix torch.jit.trace for interpolate_pos_encoding in all vision models](https://github.com/huggingface/transformers/pul\u2026",
    "changed_files": 2,
    "cluster_id": "cluster-44848-5",
    "cluster_ids": [
      "cluster-44848-5"
    ],
    "cluster_role": "member",
    "comments_count": 4,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44899",
    "created_at": "2026-03-20T20:02:10Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44899/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44899",
    "labels": [],
    "merged": false,
    "number": 44899,
    "review_comments_count": 2,
    "state": "open",
    "title": "fix(models): Fix Perceiver interpolate_pos_encoding interpolating to the source size",
    "updated_at": "2026-03-23T12:06:49Z"
  },
  {
    "additions": 14,
    "author": "yonigozlan",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Add compatibility with remote code importing image_processing_utils_fast modules and methods using `from transformers.image_processing_utils_fast import ...`",
    "changed_files": 2,
    "cluster_id": "cluster-44897-2",
    "cluster_ids": [
      "cluster-44897-2"
    ],
    "cluster_role": "member",
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44897",
    "created_at": "2026-03-20T19:30:32Z",
    "deletions": 5,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44897/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44897",
    "labels": [],
    "merged": true,
    "number": 44897,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Add backward compatibility for direct imports from legacy `image_processing_utils_fast`",
    "updated_at": "2026-03-20T20:00:12Z"
  },
  {
    "additions": 354,
    "author": "stevhliu",
    "author_association": "MEMBER",
    "body_excerpt": "updates the continuous batching docs - new page for the API reference - adds sections for new features like CUDA graphs, async batching, prefix caching, logprobs (depending on when its merged) - clearer example of generation with varying l\u2026",
    "changed_files": 4,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44896",
    "created_at": "2026-03-20T19:09:41Z",
    "deletions": 81,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44896/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44896",
    "labels": [],
    "merged": false,
    "number": 44896,
    "review_comments_count": 0,
    "state": "open",
    "title": "[docs] continuous batching",
    "updated_at": "2026-03-20T19:31:26Z"
  },
  {
    "additions": 57,
    "author": "SunMarc",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? This PR enables static FP8 experts. This also works on multi-gpu with device-map. A fix for that was to set was to set `torch.cuda.set_device()`. Triton's JIT compiler uses he active device context to determine whic\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 4,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44895",
    "created_at": "2026-03-20T19:01:35Z",
    "deletions": 10,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44895/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44895",
    "labels": [],
    "merged": true,
    "number": 44895,
    "review_comments_count": 4,
    "state": "closed",
    "title": "Add static FP8 expert support ",
    "updated_at": "2026-03-24T14:27:31Z"
  },
  {
    "additions": 10,
    "author": "ydshieh",
    "author_association": "MEMBER",
    "body_excerpt": "## Problem `ProcessorMixin.to_dict()` was calling `copy.deepcopy(self.__dict__)` on the entire processor, including the tokenizer, even though the tokenizer is always deleted from the output immediately after (since tokenizers are saved se\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44894",
    "created_at": "2026-03-20T18:57:53Z",
    "deletions": 9,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44894/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44894",
    "labels": [],
    "merged": true,
    "number": 44894,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix `processing_utils.py`: avoid deepcopying tokenizer in `ProcessorMixin` to improve performance",
    "updated_at": "2026-03-23T10:09:02Z"
  },
  {
    "additions": 18,
    "author": "ai-man-codes",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fixes #43011 `StaticLayer` was missing a `.crop(max_length)` method, so implemented that according to the discussion of the issue. Added `StaticLayer.crop(max_length)` to match the API of StaticCache with the Dynami\u2026",
    "changed_files": 1,
    "cluster_id": "cluster-44846-2",
    "cluster_ids": [
      "cluster-44846-2"
    ],
    "cluster_role": "canonical",
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44893",
    "created_at": "2026-03-20T17:48:23Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44893/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44893",
    "labels": [],
    "merged": false,
    "number": 44893,
    "review_comments_count": 0,
    "state": "open",
    "title": "add `StaticLayer.crop()` to match `DynamicLayer` API",
    "updated_at": "2026-03-20T18:08:10Z"
  },
  {
    "additions": 51,
    "author": "he-yufeng",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "Fixes #44821 The `elif is_remote_url(...)` / `download_url(...)` branch in `get_image_processor_dict` was accidentally removed during the image processor refactor in #43514. This caused `AutoImageProcessor.from_pretrained(url)` to break wi\u2026",
    "changed_files": 5,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44892",
    "created_at": "2026-03-20T16:21:25Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44892/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44892",
    "labels": [],
    "merged": false,
    "number": 44892,
    "review_comments_count": 4,
    "state": "closed",
    "title": "Fix AutoImageProcessor.from_pretrained failing on URL input",
    "updated_at": "2026-03-24T13:30:38Z"
  },
  {
    "additions": 507,
    "author": "kashif",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Add a MoERouterHealthCallback to log MoE router-health metrics. <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the\u2026",
    "changed_files": 7,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44891",
    "created_at": "2026-03-20T16:17:05Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44891/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44891",
    "labels": [],
    "merged": false,
    "number": 44891,
    "review_comments_count": 0,
    "state": "open",
    "title": "[Trainer] add MoERouterHealthCallback Callback",
    "updated_at": "2026-03-20T16:28:43Z"
  },
  {
    "additions": 72,
    "author": "Rocketknight1",
    "author_association": "MEMBER",
    "body_excerpt": "As discussed on Slack, this is the first phase of our approach to controlling the code agent epidemic. This PR places large warnings in both the pull request template and `CONTRIBUTING.md`, which should hopefully be seen by most contributo\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 4,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44890",
    "created_at": "2026-03-20T16:12:45Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44890/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44890",
    "labels": [],
    "merged": true,
    "number": 44890,
    "review_comments_count": 7,
    "state": "closed",
    "title": "Add big angry code agent warnings!",
    "updated_at": "2026-03-23T11:54:48Z"
  },
  {
    "additions": 86,
    "author": "roycho96",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? Calling `trainer.evaluate()` before `trainer.train()` with DeepSpeed is broken in three ways: 1. **ZeRO-3 stale state crash:** `evaluate()` creates an inference engine. `train()` starts with `accelerator.free_memor\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44889",
    "created_at": "2026-03-20T15:08:32Z",
    "deletions": 21,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44889/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44889",
    "labels": [],
    "merged": false,
    "number": 44889,
    "review_comments_count": 0,
    "state": "open",
    "title": "[DeepSpeed] Fix evaluate()/predict() before train()",
    "updated_at": "2026-03-21T11:06:07Z"
  },
  {
    "additions": 2,
    "author": "Cyrilvallez",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? In general, it should be much better to let the kernel do what it wants for perfs! There's no reasons to have troubles from it!",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44888",
    "created_at": "2026-03-20T14:45:28Z",
    "deletions": 22,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44888/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44888",
    "labels": [],
    "merged": false,
    "number": 44888,
    "review_comments_count": 0,
    "state": "open",
    "title": "Remove explicit cuda stream in nemotron_h",
    "updated_at": "2026-03-23T15:14:13Z"
  },
  {
    "additions": 2,
    "author": "Cyrilvallez",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? As per the title. On currently pinned version, when we run this small snippet (which is called on some model's `__init__` functions \ud83d\ude05): ```python from transformers.integrations.hub_kernels import lazy_load_kernel ca\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44887",
    "created_at": "2026-03-20T14:00:33Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44887/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44887",
    "labels": [],
    "merged": true,
    "number": 44887,
    "review_comments_count": 1,
    "state": "closed",
    "title": "Bump kernels version dependency to avoid crashes",
    "updated_at": "2026-03-20T19:01:51Z"
  },
  {
    "additions": 14,
    "author": "m-matthias",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Prevent crash in class LwDetrImageLoss when using it with float16 automatic mixed precision on a Cuda device. torch.pow causes an autocast to float32 when used with Cuda, which caused a type mismatch at ``` pos_weig\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44886",
    "created_at": "2026-03-20T13:56:08Z",
    "deletions": 12,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44886/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44886",
    "labels": [],
    "merged": true,
    "number": 44886,
    "review_comments_count": 4,
    "state": "closed",
    "title": "LwDetrImageLoss: Fix dtype casting to prevent crash when using amp on cuda device",
    "updated_at": "2026-03-24T17:02:32Z"
  },
  {
    "additions": 2,
    "author": "guoyangzhen",
    "author_association": "NONE",
    "body_excerpt": "## Problem In _split_tokens_on_unicode(), when the decoded token stream ends with a dangling Unicode replacement character (U+FFFD), the computed index can equal len(decoded_full), causing IndexError: string index out of range. The failing\u2026",
    "changed_files": 1,
    "cluster_id": "cluster-44869-3",
    "cluster_ids": [
      "cluster-44869-3"
    ],
    "cluster_role": "member",
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44885",
    "created_at": "2026-03-20T13:03:54Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44885/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44885",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44885,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: prevent IndexError in Whisper word timestamp decode",
    "updated_at": "2026-03-23T12:01:50Z"
  },
  {
    "additions": 14,
    "author": "hmellor",
    "author_association": "MEMBER",
    "body_excerpt": "Some libraries that use Transformers (i.e. vLLM) use `|` on the `size` config. This PR adds `__or__` and `__ror__` so that the following works: ```console $ {\"longest_edge\": 20} | SizeDict(height=10, width=20) {'longest_edge': 20, 'height'\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44884",
    "created_at": "2026-03-20T11:52:15Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44884/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44884",
    "labels": [],
    "merged": true,
    "number": 44884,
    "review_comments_count": 2,
    "state": "closed",
    "title": "Add missing dunder methods to `SizeDict`",
    "updated_at": "2026-03-20T12:21:12Z"
  },
  {
    "additions": 2,
    "author": "Cyrilvallez",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Fix https://github.com/huggingface/transformers/issues/44589.",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44883",
    "created_at": "2026-03-20T11:43:13Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44883/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44883",
    "labels": [],
    "merged": true,
    "number": 44883,
    "review_comments_count": 2,
    "state": "closed",
    "title": "Fix dtype guessing from state dict",
    "updated_at": "2026-03-20T13:12:34Z"
  },
  {
    "additions": 1,
    "author": "itazap",
    "author_association": "MEMBER",
    "body_excerpt": "fixes ```python model = \"meta-llama/Llama-4-Maverick-17B-128E-Instruct\" tok_auto = AutoTokenizer.from_pretrained(model) print(f\"AutoTokenizer: {tok_auto('hello')}\") ``` ``` The above exception was the direct cause of the following exceptio\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44882",
    "created_at": "2026-03-20T11:31:20Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44882/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44882",
    "labels": [],
    "merged": false,
    "number": 44882,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix config type",
    "updated_at": "2026-03-20T16:34:20Z"
  },
  {
    "additions": 142,
    "author": "zucchini-nlp",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? As per title, we don't need a weird way to filter out kwargs anymore because now we don't rely on `tokenizer.apply_chat_template`. I didn't delete the unused `TypedDict` yet and will deprecate for at least 3 minor r\u2026",
    "changed_files": 6,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44881",
    "created_at": "2026-03-20T10:44:06Z",
    "deletions": 82,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44881/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44881",
    "labels": [],
    "merged": true,
    "number": 44881,
    "review_comments_count": 10,
    "state": "closed",
    "title": "Allow arbitrary template kwargs in processors",
    "updated_at": "2026-03-24T11:14:33Z"
  },
  {
    "additions": 34,
    "author": "itazap",
    "author_association": "MEMBER",
    "body_excerpt": "incorrect model list update",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44880",
    "created_at": "2026-03-20T10:37:13Z",
    "deletions": 5,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44880/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44880",
    "labels": [],
    "merged": true,
    "number": 44880,
    "review_comments_count": 0,
    "state": "closed",
    "title": "incorrect model list update",
    "updated_at": "2026-03-24T09:27:24Z"
  },
  {
    "additions": 448,
    "author": "tarekziade",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? - uses the Makefile as a single source of truth for running QA checks - adds `tomli` so `make` commands can read the `toml` file when needed - adds a `checkers` Python module that wraps and orchestrates all `checks`\u2026",
    "changed_files": 7,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44879",
    "created_at": "2026-03-20T10:24:29Z",
    "deletions": 90,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44879/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44879",
    "labels": [],
    "merged": false,
    "number": 44879,
    "review_comments_count": 6,
    "state": "open",
    "title": "refactor: unify QA calls",
    "updated_at": "2026-03-24T15:33:37Z"
  },
  {
    "additions": 8,
    "author": "Cyrilvallez",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? `check_docstrings` has been complaining for a while about those.",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44878",
    "created_at": "2026-03-20T10:01:08Z",
    "deletions": 8,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44878/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44878",
    "labels": [],
    "merged": true,
    "number": 44878,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix nemotron config docstrings",
    "updated_at": "2026-03-20T10:11:04Z"
  },
  {
    "additions": 1,
    "author": "Cyrilvallez",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do?",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44876",
    "created_at": "2026-03-20T09:49:54Z",
    "deletions": 7,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44876/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44876",
    "labels": [],
    "merged": true,
    "number": 44876,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix nemotron_h modular",
    "updated_at": "2026-03-20T10:00:35Z"
  },
  {
    "additions": 872,
    "author": "tarekziade",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Refactors `src/transformers/cli/serve.py` to reduce nesting depth, eliminate code duplication, and improve maintainability. No behavioral changes and the public API is unchanged. Also added a module docstring to exp\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 5,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44875",
    "created_at": "2026-03-20T09:06:34Z",
    "deletions": 701,
    "draft": true,
    "files_url": "https://github.com/huggingface/transformers/pull/44875/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44875",
    "labels": [],
    "merged": false,
    "number": 44875,
    "review_comments_count": 0,
    "state": "open",
    "title": "refactor: improved the cli server module code organization",
    "updated_at": "2026-03-23T08:08:17Z"
  },
  {
    "additions": 2,
    "author": "hmellor",
    "author_association": "MEMBER",
    "body_excerpt": "`Llama4`'s was incorrect and causing `StrictDataclassFieldValidationErrors`. `AFMoE`'s was was fine but now it's more specific.",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44874",
    "created_at": "2026-03-20T09:05:02Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44874/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44874",
    "labels": [],
    "merged": true,
    "number": 44874,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix `layer_types` type hint for `AFMoE` and `Llama4`",
    "updated_at": "2026-03-20T12:03:58Z"
  },
  {
    "additions": 75,
    "author": "sergiopaniego",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? ## Problem Online RL training (GRPO, RLOO, PPO) with all VL models using MRoPE with rope_deltas (Qwen2-VL, Qwen2.5-VL, Qwen3-VL, Qwen3.5, GLM4V, PaddleOCR-VL, Ernie4.5-VL-MoE, etc.) crashes with `RuntimeError: Sizes\u2026",
    "changed_files": 15,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44873",
    "created_at": "2026-03-20T08:38:03Z",
    "deletions": 30,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44873/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44873",
    "labels": [],
    "merged": true,
    "number": 44873,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix VL model rope_deltas batch size mismatch in online RL training",
    "updated_at": "2026-03-20T13:51:08Z"
  },
  {
    "additions": 2,
    "author": "IvanFan-Van",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "## Description Update outdated comment that references non-existent file `generation_utils_samplers.py` ## Changes Detail - The comment on line 1200 states \"all samplers can be found in `generation_utils_samplers.py`\" - In reality, all sam\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44872",
    "created_at": "2026-03-20T05:45:46Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44872/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44872",
    "labels": [],
    "merged": false,
    "number": 44872,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix: Update outdated sampler comment in generation/utils.py",
    "updated_at": "2026-03-20T05:45:46Z"
  },
  {
    "additions": 666,
    "author": "JonusClapshaw",
    "author_association": "NONE",
    "body_excerpt": "# What does this PR do? Fixes #42200 `prediction_step` is type-hinted to return `Optional[torch.Tensor]` for logits, but when no `preprocess_logits_for_metrics` is provided it could return a tuple instead of a tensor. This caused `torch_pa\u2026",
    "changed_files": 33,
    "cluster_id": "cluster-44861-3",
    "cluster_ids": [
      "cluster-44861-3"
    ],
    "cluster_role": "member",
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44870",
    "created_at": "2026-03-20T02:28:27Z",
    "deletions": 3,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44870/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44870",
    "labels": [],
    "merged": false,
    "number": 44870,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: ensure prediction_step returns tensor for logits, not tuple #42200",
    "updated_at": "2026-03-20T17:51:19Z"
  },
  {
    "additions": 98,
    "author": "sdharani91",
    "author_association": "NONE",
    "body_excerpt": "# What does this PR do? Fixes #44717 This PR fixes packed-sequence handling for the Qwen3.5 linear-attention fast path. Before this change, Qwen3.5 produced different outputs for: a padded representation of multiple sequences a packed repr\u2026",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44867",
    "created_at": "2026-03-19T17:31:45Z",
    "deletions": 5,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44867/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44867",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44867,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Pass packed boundary metadata to Qwen3.5 linear-attention fast kernels",
    "updated_at": "2026-03-20T13:15:15Z"
  },
  {
    "additions": 78,
    "author": "Cyrilvallez",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? As per the title",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 8,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44866",
    "created_at": "2026-03-19T17:27:58Z",
    "deletions": 75,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44866/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44866",
    "labels": [],
    "merged": true,
    "number": 44866,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Align lfm2 cache to other mamba caches",
    "updated_at": "2026-03-20T10:50:28Z"
  },
  {
    "additions": 496,
    "author": "tarekziade",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Added Rule 11 forward() must not access non-nn.Module attributes on submodules (breaks pipeline parallelism with Identity replacement). we want to make sure we just use metadata in config and elesewere when in that\u2026",
    "changed_files": 10,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 6,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44865",
    "created_at": "2026-03-19T16:39:59Z",
    "deletions": 26,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44865/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44865",
    "labels": [],
    "merged": true,
    "number": 44865,
    "review_comments_count": 1,
    "state": "closed",
    "title": "chore(typing): added rule 11",
    "updated_at": "2026-03-23T12:29:21Z"
  },
  {
    "additions": 99,
    "author": "SunMarc",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? This PR switches FP8 per-tensor implementation to rely on the official torch impl `torch._scaled_mm`. Note that `torch._scaled_mm` don't explicitly support per tensor. We hack the api a bit as it only support per ro\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44864",
    "created_at": "2026-03-19T16:19:53Z",
    "deletions": 12,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44864/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44864",
    "labels": [],
    "merged": false,
    "number": 44864,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Switch FP8 per tensor quant to use `torch._scaled_mm`",
    "updated_at": "2026-03-20T19:05:05Z"
  },
  {
    "additions": 19,
    "author": "gh-wf",
    "author_association": "NONE",
    "body_excerpt": "Some models (e.g. Nemotron-H) define `_tied_weights_keys` as a list, which caused `AttributeError: 'list' object has no attribute 'keys'` when calling `save_pretrained` during full finetuning. # What does this PR do? `_get_tied_weight_keys\u2026",
    "changed_files": 2,
    "cluster_id": "cluster-44861-3",
    "cluster_ids": [
      "cluster-44861-3"
    ],
    "cluster_role": "canonical",
    "comments_count": 0,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44862",
    "created_at": "2026-03-19T15:14:12Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44862/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44862",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44862,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: handle list-type _tied_weights_keys in _get_tied_weight_keys",
    "updated_at": "2026-03-20T09:47:09Z"
  },
  {
    "additions": 111,
    "author": "remi-or",
    "author_association": "MEMBER",
    "body_excerpt": "Right now, the continuous batching tests all use similar mechanisms, namely: 1. loading a model and a tokenizer 2. preparing data for generate or generate_batch 3. running generate to compare its outputs with generate_batch This PR adds 3\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44858",
    "created_at": "2026-03-19T13:22:04Z",
    "deletions": 188,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44858/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44858",
    "labels": [],
    "merged": true,
    "number": 44858,
    "review_comments_count": 0,
    "state": "closed",
    "title": "[CB] [Minor] Simplify test suite",
    "updated_at": "2026-03-24T11:44:39Z"
  },
  {
    "additions": 63,
    "author": "ydshieh",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? We had (flaky) ```bash tests/models/nemotron_h/test_modeling_nemotron_h.py::NemotronHModelTest::test_sdpa_can_compile_dynamic Fatal Python error: Segmentation fault ``` `NemotronHBlock.forward` creates a temporary `\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 6,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44854",
    "created_at": "2026-03-19T10:54:36Z",
    "deletions": 56,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44854/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44854",
    "labels": [],
    "merged": true,
    "number": 44854,
    "review_comments_count": 5,
    "state": "closed",
    "title": "Fix core dumped when `NemotronH` is torch compiled",
    "updated_at": "2026-03-20T14:29:16Z"
  },
  {
    "additions": 99,
    "author": "sergiopaniego",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? `Zamba2MambaMixer.__init__` calls `lazy_load_kernel(\"mamba-ssm\")` and `lazy_load_kernel(\"causal-conv1d\")` unconditionally. Models that inherit from it (like NemotronH) and set `use_mamba_kernels=False` in their conf\u2026",
    "changed_files": 3,
    "cluster_id": "cluster-44848-5",
    "cluster_ids": [
      "cluster-44848-5"
    ],
    "cluster_role": "member",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44853",
    "created_at": "2026-03-19T10:22:40Z",
    "deletions": 72,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44853/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44853",
    "labels": [],
    "merged": false,
    "number": 44853,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix Zamba2MambaMixer ignoring use_mamba_kernels=False",
    "updated_at": "2026-03-23T14:14:40Z"
  },
  {
    "additions": 15,
    "author": "Sai-Suraj-27",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fixes these failing [Qwen3OmniModelIntegrationTests](https://github.com/huggingface/transformers/actions/runs/23230643883/job/67524756897#step:14:1131) <img width=\"2292\" height=\"161\" alt=\"image\" src=\"https://github.\u2026",
    "changed_files": 3,
    "cluster_id": "cluster-44848-5",
    "cluster_ids": [
      "cluster-44848-5"
    ],
    "cluster_role": "member",
    "comments_count": 10,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44848",
    "created_at": "2026-03-19T07:30:39Z",
    "deletions": 14,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44848/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44848",
    "labels": [],
    "merged": false,
    "number": 44848,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix failing `Qwen3OmniModelIntegrationTests`",
    "updated_at": "2026-03-22T10:42:09Z"
  },
  {
    "additions": 72,
    "author": "tarekziade",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Activated `anti-slop` action. - min-account-age: 30 (most slop accounts are < 1 month old) - detect-spam-usernames: true (Catches obvious spam patterns) - min-profile-completeness: 3 (Low bar, but catches bare-bones\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44847",
    "created_at": "2026-03-19T07:15:38Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44847/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44847",
    "labels": [],
    "merged": false,
    "number": 44847,
    "review_comments_count": 2,
    "state": "open",
    "title": "ci: add anti-slop action",
    "updated_at": "2026-03-24T16:04:30Z"
  },
  {
    "additions": 64,
    "author": "RicardoLee510520",
    "author_association": "FIRST_TIMER",
    "body_excerpt": "# What does this PR do? Updated the DeiT model card to follow the new standardized format: - Replaced verbose paper abstract with concise model description - Added Pipeline and AutoModel usage examples - Renamed \"Usage tips\" to \"Notes\" - U\u2026",
    "changed_files": 1,
    "cluster_id": "cluster-44846-2",
    "cluster_ids": [
      "cluster-44846-2"
    ],
    "cluster_role": "member",
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44846",
    "created_at": "2026-03-19T06:30:53Z",
    "deletions": 90,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44846/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44846",
    "labels": [],
    "merged": false,
    "number": 44846,
    "review_comments_count": 0,
    "state": "closed",
    "title": "[Docs] Update DeiT model card to new format",
    "updated_at": "2026-03-20T05:30:17Z"
  },
  {
    "additions": 15,
    "author": "jiqing-feng",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? Fixes `torch.compile` failure for Mllama after #42848 introduced a new unified attention mask creation path. The root cause is a **torch inductor C++ codegen bug**: when `padding_mask_function` uses advanced tensor\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44845",
    "created_at": "2026-03-19T06:14:54Z",
    "deletions": 5,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44845/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44845",
    "labels": [],
    "merged": false,
    "number": 44845,
    "review_comments_count": 1,
    "state": "open",
    "title": "Fix Mllama torch.compile failure caused by new attention mask logic",
    "updated_at": "2026-03-23T02:50:36Z"
  },
  {
    "additions": 482,
    "author": "stevhliu",
    "author_association": "MEMBER",
    "body_excerpt": "backfills empty model cards like gptoss and nemotronh with a `model-card.md` skill i created. its pretty minimal at the moment and just includes a brief intro and code examples. let me know if there is anything else we should add!",
    "changed_files": 12,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44837",
    "created_at": "2026-03-18T21:45:31Z",
    "deletions": 102,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44837/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44837",
    "labels": [],
    "merged": true,
    "number": 44837,
    "review_comments_count": 0,
    "state": "closed",
    "title": "[docs] model cards",
    "updated_at": "2026-03-20T22:40:41Z"
  },
  {
    "additions": 187,
    "author": "remi-or",
    "author_association": "MEMBER",
    "body_excerpt": "## Summary This PR adds the `return_logprobs` flag to the continuous batching, enabling the user to retrieve the log probabilites of the tokens generated. # Tests Added a test to compare with regular generate and it passes. All tests pass.\u2026",
    "changed_files": 6,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44835",
    "created_at": "2026-03-18T17:48:15Z",
    "deletions": 83,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44835/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44835",
    "labels": [],
    "merged": true,
    "number": 44835,
    "review_comments_count": 0,
    "state": "closed",
    "title": "[CB] Add an option to return logprobs",
    "updated_at": "2026-03-23T18:35:31Z"
  },
  {
    "additions": 192,
    "author": "IlyasMoutawwakil",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflec\u2026",
    "changed_files": 4,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44832",
    "created_at": "2026-03-18T15:33:15Z",
    "deletions": 155,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44832/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44832",
    "labels": [],
    "merged": false,
    "number": 44832,
    "review_comments_count": 10,
    "state": "open",
    "title": "DeepGEMM",
    "updated_at": "2026-03-24T13:39:07Z"
  },
  {
    "additions": 3308,
    "author": "lashahub",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "This PR adds `AudioFlamingoNext` as a separate model name that inherits directly from `MusicFlamingo` #43538 and keeps the same architecture and behavior. Changes: - add `audioflamingonext` model files - register it in the auto mappings -\u2026",
    "changed_files": 35,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44830",
    "created_at": "2026-03-18T14:31:45Z",
    "deletions": 61,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44830/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44830",
    "labels": [],
    "merged": false,
    "number": 44830,
    "review_comments_count": 1,
    "state": "open",
    "title": "Add AudioFlamingoNext model",
    "updated_at": "2026-03-20T09:14:54Z"
  },
  {
    "additions": 80,
    "author": "3outeille",
    "author_association": "MEMBER",
    "body_excerpt": "https://github.com/huggingface/transformers/pull/44825",
    "changed_files": 6,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 7,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44827",
    "created_at": "2026-03-18T13:36:53Z",
    "deletions": 14,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44827/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44827",
    "labels": [],
    "merged": false,
    "number": 44827,
    "review_comments_count": 6,
    "state": "open",
    "title": "Fix Mistral4 tests",
    "updated_at": "2026-03-23T16:47:42Z"
  },
  {
    "additions": 28,
    "author": "JJJYmmm",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Fix https://github.com/QwenLM/Qwen3.5/issues/97. This PR adds `enable_thinking` to the chat-template kwargs. With this change, `enable_thinking` is treated as a template-level argument in the tokenize=True path, so\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 6,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44817",
    "created_at": "2026-03-18T10:44:11Z",
    "deletions": 6,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44817/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44817",
    "labels": [],
    "merged": false,
    "number": 44817,
    "review_comments_count": 0,
    "state": "closed",
    "title": "[Misc] add enable_thinking to template kwargs",
    "updated_at": "2026-03-20T14:56:04Z"
  },
  {
    "additions": 135,
    "author": "ArthurZucker",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflec\u2026",
    "changed_files": 6,
    "cluster_id": "cluster-44815-10",
    "cluster_ids": [
      "cluster-44815-10"
    ],
    "cluster_role": "member",
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44815",
    "created_at": "2026-03-18T09:54:18Z",
    "deletions": 23,
    "draft": true,
    "files_url": "https://github.com/huggingface/transformers/pull/44815/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44815",
    "labels": [],
    "merged": false,
    "number": 44815,
    "review_comments_count": 2,
    "state": "open",
    "title": "Dequant fix",
    "updated_at": "2026-03-24T14:39:52Z"
  },
  {
    "additions": 137,
    "author": "stevhliu",
    "author_association": "MEMBER",
    "body_excerpt": "updates the peft docs: - a more complete training section with a full code snippet, describe saving behavior, resuming from a checkpoint, and distributed training - adds some undocumented API methods (`delete_adapter`, `active_adapters`) -\u2026",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44804",
    "created_at": "2026-03-18T00:08:54Z",
    "deletions": 89,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44804/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44804",
    "labels": [],
    "merged": true,
    "number": 44804,
    "review_comments_count": 3,
    "state": "closed",
    "title": "[docs] peft",
    "updated_at": "2026-03-23T17:14:58Z"
  },
  {
    "additions": 1341,
    "author": "yonigozlan",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? (Finally) add support for checking+fixing both generated files and modular files in `check_auto_docstrings`. Also `auto_docstring` was recently added to configs, and this PR updates `check_auto_docstrings` to suppor\u2026",
    "changed_files": 244,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 6,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44803",
    "created_at": "2026-03-17T22:40:45Z",
    "deletions": 1105,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44803/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44803",
    "labels": [],
    "merged": true,
    "number": 44803,
    "review_comments_count": 24,
    "state": "closed",
    "title": "Support Modular (!!) + Configs in `check_auto_docstrings`",
    "updated_at": "2026-03-24T17:59:12Z"
  },
  {
    "additions": 333,
    "author": "stevhliu",
    "author_association": "MEMBER",
    "body_excerpt": "updates the Hardware section of the docs for training: - combined CPU/Distributed CPU into a single doc - add more info to the Gaudi doc (mixed precision, torch.compile, distributed training) - add more info to the MPS doc (mixed precision\u2026",
    "changed_files": 9,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44799",
    "created_at": "2026-03-17T17:19:51Z",
    "deletions": 627,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44799/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44799",
    "labels": [],
    "merged": false,
    "number": 44799,
    "review_comments_count": 19,
    "state": "open",
    "title": "[docs] training on specific hardware",
    "updated_at": "2026-03-23T09:09:32Z"
  },
  {
    "additions": 1,
    "author": "vasqu",
    "author_association": "MEMBER",
    "body_excerpt": "Depends on #44887 and kernels being version `12.3` Works OOB with little changes! Example script for demonstration: ```python from transformers import AutoModelForCausalLM, AutoTokenizer fa_version = 4 #model_id = \"openai/gpt-oss-20b\" mode\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44797",
    "created_at": "2026-03-17T15:35:59Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44797/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44797",
    "labels": [],
    "merged": true,
    "number": 44797,
    "review_comments_count": 0,
    "state": "closed",
    "title": "[`FA4`] Add kernels fallback",
    "updated_at": "2026-03-20T19:03:24Z"
  },
  {
    "additions": 5110,
    "author": "SunMarc",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? This PR refactors transformers serve so that it is not in a single file. We split it into multiple files with clear responsabilities. - serve_refactored.py \u2014 only CLI args + wiring - server.py \u2014 FastAPI routes and m\u2026",
    "changed_files": 12,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44796",
    "created_at": "2026-03-17T13:04:06Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44796/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44796",
    "labels": [],
    "merged": false,
    "number": 44796,
    "review_comments_count": 0,
    "state": "open",
    "title": "[refactor] Serving into proper modules",
    "updated_at": "2026-03-24T14:57:47Z"
  },
  {
    "additions": 72,
    "author": "tarekziade",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? This patch - adds a simple cache to the model linter so we skip files that did not change and were valid - reworks `Makefile` targets",
    "changed_files": 6,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44790",
    "created_at": "2026-03-17T08:54:47Z",
    "deletions": 19,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44790/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44790",
    "labels": [],
    "merged": true,
    "number": 44790,
    "review_comments_count": 1,
    "state": "closed",
    "title": "feat: added cache to the model linter",
    "updated_at": "2026-03-24T15:28:29Z"
  },
  {
    "additions": 0,
    "author": "BillionClaw",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "The pipeline() docstring included an example using the 'question-answering' task, but this task is not in SUPPORTED_TASKS and will raise an error when used. Remove this outdated example to avoid confusing users following the documentation.\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 9,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44788",
    "created_at": "2026-03-17T08:38:25Z",
    "deletions": 5,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44788/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44788",
    "labels": [],
    "merged": false,
    "number": 44788,
    "review_comments_count": 0,
    "state": "closed",
    "title": "docs(pipelines): remove outdated question-answering example",
    "updated_at": "2026-03-23T17:19:33Z"
  },
  {
    "additions": 5,
    "author": "bensons",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? Some model repos provide `extra_special_tokens` as a list in their tokenizer_config.json, which caused an `AttributeError: 'list' object has no attribute 'keys'`. This converts list inputs to a dict mapping each tok\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44781",
    "created_at": "2026-03-17T04:59:02Z",
    "deletions": 2849,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44781/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44781",
    "labels": [],
    "merged": false,
    "number": 44781,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix `_set_model_specific_special_tokens` to accept list-format `extra_special_tokens`",
    "updated_at": "2026-03-23T17:18:37Z"
  },
  {
    "additions": 20,
    "author": "michaelbenayoun",
    "author_association": "MEMBER",
    "body_excerpt": "The function `add_tensor_parallel_hooks_to_module` has unused parameters, in this PR we: - Remove `tp_plan`, which is not used. - Remove `parameter_name` which is not used - Remove `layer_name`. This parameter is only used for logging purp\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44768",
    "created_at": "2026-03-16T18:29:52Z",
    "deletions": 9,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44768/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44768",
    "labels": [],
    "merged": false,
    "number": 44768,
    "review_comments_count": 3,
    "state": "open",
    "title": "Remove unused parameters and improve add_tensor_parallel_hooks_t\u2026",
    "updated_at": "2026-03-24T19:23:13Z"
  },
  {
    "additions": 19,
    "author": "harshaljanjani",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "### What does this PR do? The following failing tests were identified and fixed in this PR: \u2192 **PaliGemma 2:** The [PaliGemma 1 test class](https://github.com/huggingface/transformers/blob/main/tests/models/paligemma/test_modeling_paligemm\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 5,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44765",
    "created_at": "2026-03-16T17:26:22Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44765/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44765",
    "labels": [],
    "merged": true,
    "number": 44765,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix(testing): Fix PaliGemma 2 and PaddleOCR-VL test failures on main",
    "updated_at": "2026-03-20T13:55:55Z"
  },
  {
    "additions": 2090,
    "author": "juliendenize",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflec\u2026",
    "changed_files": 15,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 12,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44760",
    "created_at": "2026-03-16T15:54:11Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44760/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44760",
    "labels": [
      "New model"
    ],
    "merged": true,
    "number": 44760,
    "review_comments_count": 8,
    "state": "closed",
    "title": "Add Mistral 4",
    "updated_at": "2026-03-20T10:44:48Z"
  },
  {
    "additions": 339,
    "author": "anuq",
    "author_association": "NONE",
    "body_excerpt": "## What does this PR do? Fixes #35141. When `tie_word_embeddings=False`, calling `resize_token_embeddings()` creates a new `nn.Linear` for the LM head via `_get_resized_lm_head()`. The new module's weight and bias tensors do **not** carry\u2026",
    "changed_files": 4,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44711",
    "created_at": "2026-03-14T19:21:21Z",
    "deletions": 205,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44711/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44711",
    "labels": [
      "Code agent slop"
    ],
    "merged": false,
    "number": 44711,
    "review_comments_count": 0,
    "state": "closed",
    "title": "fix: mark new lm_head params as `_is_hf_initialized` after `resize_token_embeddings`",
    "updated_at": "2026-03-20T13:36:58Z"
  },
  {
    "additions": 15,
    "author": "he-yufeng",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "## What does this PR do? Fixes `AutoProcessor.from_pretrained` silently dropping hub kwargs like `force_download`, `cache_dir`, `token`, `revision`, etc. ### The bug The existing code on line ~300 filters kwargs using `inspect.signature(ca\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44710",
    "created_at": "2026-03-14T18:33:53Z",
    "deletions": 2,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44710/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44710",
    "labels": [],
    "merged": false,
    "number": 44710,
    "review_comments_count": 0,
    "state": "open",
    "title": "Fix AutoProcessor.from_pretrained silently dropping hub kwargs",
    "updated_at": "2026-03-23T13:34:17Z"
  },
  {
    "additions": 219,
    "author": "hmellor",
    "author_association": "MEMBER",
    "body_excerpt": "These models have `base_model_pp_plan`s but currently do not work because the base model's forward pass depends on all the `layers` being `Qwen2VLDecoderLayer`. i.e. if one of the layers is removed/replaced with `Identity`, `decoder_layer.\u2026",
    "changed_files": 52,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44699",
    "created_at": "2026-03-14T11:44:24Z",
    "deletions": 148,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44699/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44699",
    "labels": [],
    "merged": true,
    "number": 44699,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix several based models' pipeline parallel support",
    "updated_at": "2026-03-20T13:53:27Z"
  },
  {
    "additions": 4,
    "author": "harshaljanjani",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "### What does this PR do? The following failing tests were identified and fixed in this PR: \u2192 **Kyutai Speech-To-Text**: [The PR [processors] Unbloating simple processors](https://github.com/huggingface/transformers/pull/40377), [refactore\u2026",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 2,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44695",
    "created_at": "2026-03-14T09:05:35Z",
    "deletions": 9,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44695/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44695",
    "labels": [],
    "merged": false,
    "number": 44695,
    "review_comments_count": 3,
    "state": "open",
    "title": "fix(testing): Fix Kyutai Speech-To-Text, LLaVA-OneVision, and LongCatFlash test failures on main CI  ",
    "updated_at": "2026-03-23T11:51:26Z"
  },
  {
    "additions": 408,
    "author": "Rocketknight1",
    "author_association": "MEMBER",
    "body_excerpt": "We've had `parse_response()` in the library for a while, but it's been a soft launch / prototype feature. This PR cleans it up and documents it, making it an official feature! The API is largely unchanged from the prototype, but we drop `x\u2026",
    "changed_files": 5,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44674",
    "created_at": "2026-03-13T15:41:42Z",
    "deletions": 34,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44674/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44674",
    "labels": [],
    "merged": false,
    "number": 44674,
    "review_comments_count": 11,
    "state": "open",
    "title": "Officially launch parse_response",
    "updated_at": "2026-03-24T14:23:26Z"
  },
  {
    "additions": 19,
    "author": "dacorvo",
    "author_association": "MEMBER",
    "body_excerpt": "Fixes #44677 ## Summary - Add `base_model_tp_plan` to `OlmoeConfig`, enabling `from_pretrained(tp_plan=\"auto\")` for OLMoE models - Add `TensorParallelTesterMixin` to OLMoE tests for TP validation coverage - Uses `\"colwise\"` for `q_norm` an\u2026",
    "changed_files": 2,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 6,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44668",
    "created_at": "2026-03-13T14:45:22Z",
    "deletions": 1,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44668/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44668",
    "labels": [],
    "merged": false,
    "number": 44668,
    "review_comments_count": 4,
    "state": "open",
    "title": "Add `base_model_tp_plan` to `OlmoeConfig`",
    "updated_at": "2026-03-24T15:01:02Z"
  },
  {
    "additions": 7084,
    "author": "CyrilSterling",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "# What does this PR do? This PR supports PenguinVL model. Paper: https://arxiv.org/abs/2603.06569 Github repo: https://github.com/tencent-ailab/Penguin-VL HuggingFace Model: https://huggingface.co/collections/tencent/ai-lab ## Before submi\u2026",
    "changed_files": 20,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 7,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44662",
    "created_at": "2026-03-13T13:02:26Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44662/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44662",
    "labels": [],
    "merged": false,
    "number": 44662,
    "review_comments_count": 94,
    "state": "open",
    "title": "[model] Add PenguinVL implementation",
    "updated_at": "2026-03-20T10:00:17Z"
  },
  {
    "additions": 18,
    "author": "kaixuanliu",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "@zucchini-nlp, can you help review? Thx! unit tests to reproduce this bug: `tests/models/phi4_multimodal/test_modeling_phi4_multimodal.py::Phi4MultimodalIntegrationTest::test_audio_text_generation`",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44653",
    "created_at": "2026-03-13T07:14:25Z",
    "deletions": 9,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44653/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44653",
    "labels": [],
    "merged": false,
    "number": 44653,
    "review_comments_count": 7,
    "state": "closed",
    "title": "Fix `AutoImageProcessor` to correctly detect local implementation whe\u2026",
    "updated_at": "2026-03-20T10:33:32Z"
  },
  {
    "additions": 1,
    "author": "ydshieh",
    "author_association": "MEMBER",
    "body_excerpt": "# What does this PR do? Our beautiful Dashboard is missing ..... damm",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 1,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44597",
    "created_at": "2026-03-11T13:53:02Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44597/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44597",
    "labels": [],
    "merged": true,
    "number": 44597,
    "review_comments_count": 0,
    "state": "closed",
    "title": "Fix CircleCI summary report not showing due to missing dependency",
    "updated_at": "2026-03-20T07:33:38Z"
  },
  {
    "additions": 7,
    "author": "jiqing-feng",
    "author_association": "CONTRIBUTOR",
    "body_excerpt": "Fixes Llama4 model loading under BitsAndBytes (BNB) quantization mode. Router quantized incorrectly causes shape mismatch: Llama4Router inherits from nn.Linear, so BNB quantizes its weight into a packed format. However, super().forward() c\u2026",
    "changed_files": 1,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 13,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44588",
    "created_at": "2026-03-11T01:42:33Z",
    "deletions": 0,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44588/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44588",
    "labels": [],
    "merged": false,
    "number": 44588,
    "review_comments_count": 3,
    "state": "open",
    "title": "Fix llama4 bnb mode",
    "updated_at": "2026-03-24T14:42:22Z"
  },
  {
    "additions": 33,
    "author": "wilnn",
    "author_association": "FIRST_TIME_CONTRIBUTOR",
    "body_excerpt": "\u2026kpoint when `save_strategy` is `best` # What does this PR do? fix load_best_model_checkpoint_at_end do not load the best model checkpoint at the end when `save_strategy` is `\"best\"` Fixes # (issue) fix load_best_model_checkpoint_at_end do\u2026",
    "changed_files": 3,
    "cluster_id": null,
    "cluster_ids": [],
    "cluster_role": null,
    "comments_count": 3,
    "conversation_url": "https://github.com/huggingface/transformers/pull/44583",
    "created_at": "2026-03-10T22:37:36Z",
    "deletions": 4,
    "draft": false,
    "files_url": "https://github.com/huggingface/transformers/pull/44583/files",
    "html_url": "https://github.com/huggingface/transformers/pull/44583",
    "labels": [],
    "merged": false,
    "number": 44583,
    "review_comments_count": 2,
    "state": "open",
    "title": "fix load_best_model_checkpoint_at_end do not load the best model chec\u2026",
    "updated_at": "2026-03-21T21:18:14Z"
  }
]