
Commit cab4108

Squashed commit of the following:
commit 7a9592b
Author: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Date: Tue Nov 4 14:32:04 2025 -0700

    🐍 Drop Python 3.9 (huggingface#4183)

commit 7f15a7f
Author: Harras Mansoor <98635627+Harras3@users.noreply.github.com>
Date: Wed Nov 5 02:06:31 2025 +0500

    Removed outdated warning about batch contamination (huggingface#4423)

commit 8b0a3ce
Author: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Date: Tue Nov 4 21:37:39 2025 +0100

    Update tokenizer apply_chat_template with return_dict=True default (huggingface#4448)

commit d9f9e2b
Author: Pramodith Ballapuram <16939722+pramodith@users.noreply.github.com>
Date: Tue Nov 4 19:56:58 2025 +0000

    Support casting to fp32 when word embeddings are tied to lm_head (huggingface#4446)

commit 4e138ab
Author: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Date: Tue Nov 4 15:15:23 2025 +0100

    Upload notebook with T4 selected (huggingface#4449)
1 parent: 6d6a603

99 files changed: +2068 additions, -1928 deletions

.github/workflows/tests.yml

Lines changed: 1 addition & 1 deletion

@@ -40,7 +40,7 @@ jobs:
     name: Tests
     strategy:
       matrix:
-        python-version: ['3.9', '3.10', '3.11', '3.12', '3.13']
+        python-version: ['3.10', '3.11', '3.12', '3.13']
       fail-fast: false
     runs-on:
       group: aws-g4dn-2xlarge

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 repos:
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.11.10
+    rev: v0.13.3
     hooks:
       - id: ruff-check
         types_or: [ python, pyi ]

CONTRIBUTING.md

Lines changed: 0 additions & 18 deletions

@@ -285,24 +285,6 @@ def replicate_str(string: str, n: int, sep: str = " ") -> str:
   * **Definite Articles:** Removed definite articles where possible to streamline language. (Eg: Changed "The string to replicate" to "String to replicate")
   * **Type Annotations:**
     * Always include type definitions, indicating if a parameter is optional and specifying the default value.
-    * Note that `Optional` means that the value can be `None`, and `*optional*` means that it is not required for the user to pass a value.
-      E.g., for arguments that can't be `None` and aren't required:
-
-      ```txt
-      foo (`int`, *optional*, defaults to `4`):
-      ```
-
-      For arguments that can be `None` and are required:
-
-      ```txt
-      foo (`Optional[int]`):
-      ```
-
-      for arguments that can be `None` and aren't required (in this case, if the default value is `None`, you can omit it):
-
-      ```txt
-      foo (`Optional[int]`, *optional*):
-      ```
 
   * **String Defaults:**
     * Ensured that default string values are wrapped in double quotes:
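The removed bullets documented the older `Optional[...]` docstring convention. A minimal sketch of how a nullable, not-required argument might be documented against the `X | None` annotations this commit moves to; `truncate_str` and its docstring wording are hypothetical, not taken from the updated guide:

```python
def truncate_str(string: str, max_length: int | None = None) -> str:
    """
    Truncate a string to at most `max_length` characters.

    Args:
        string (`str`):
            String to truncate.
        max_length (`int` or `None`, *optional*, defaults to `None`):
            Maximum number of characters to keep. If `None`, the string is returned unchanged.
    """
    return string if max_length is None else string[:max_length]
```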

docs/source/lora_without_regret.md

Lines changed: 1 addition & 1 deletion

@@ -141,7 +141,7 @@ For reinforcement learning, the blog uses a math reasoning task that we can repr
 ```python
 def strip_reasoning_accuracy_reward(
     completions: list[list[dict[str, str]]], solution: list[str], **kwargs
-) -> list[Optional[float]]:
+) -> list[float | None]:
     """Reward function that strips reasoning tags and checks mathematical accuracy.
 
     This function:
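The new `float | None` spelling uses PEP 604 union syntax, which only became valid as a runtime annotation in Python 3.10, in line with the dropped 3.9 support above. A minimal standalone sketch of the annotation in use; the `first_reward` helper is made up for illustration and is not TRL code:

```python
def first_reward(rewards: list[float | None]) -> float | None:
    """Return the first non-None reward, or None if every entry is None."""
    # `float | None` is runtime-valid on Python 3.10+; on 3.9 the same annotation
    # needed `typing.Optional[float]` or `from __future__ import annotations`.
    return next((r for r in rewards if r is not None), None)


print(first_reward([None, 0.5, 1.0]))  # 0.5
print(first_reward([None, None]))      # None
```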

docs/source/reducing_memory_usage.md

Lines changed: 0 additions & 3 deletions

@@ -90,9 +90,6 @@ from trl import SFTConfig
 training_args = SFTConfig(..., packing=True, max_length=512)
 ```
 
-> [!WARNING]
-> Packing may cause batch contamination, where adjacent sequences influence one another. This can be problematic for some applications. For more details, see [#1230](https://github.com/huggingface/trl/issues/1230).
-
 ## Liger for reducing peak memory usage
 
 > [Liger Kernel](https://github.com/linkedin/Liger-Kernel) is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%.

examples/datasets/hh-rlhf-helpful-base.py

Lines changed: 2 additions & 3 deletions

@@ -14,7 +14,6 @@
 
 import re
 from dataclasses import dataclass, field
-from typing import Optional
 
 from datasets import load_dataset
 from huggingface_hub import ModelCard
@@ -42,15 +41,15 @@ class ScriptArguments:
     repo_id: str = field(
         default="trl-lib/hh-rlhf-helpful-base", metadata={"help": "Hugging Face repository ID to push the dataset to."}
     )
-    dataset_num_proc: Optional[int] = field(
+    dataset_num_proc: int | None = field(
         default=None, metadata={"help": "Number of workers to use for dataset processing."}
     )
 
 
 def common_start(str1: str, str2: str) -> str:
     # Zip the two strings and iterate over them together
     common_chars = []
-    for c1, c2 in zip(str1, str2):
+    for c1, c2 in zip(str1, str2, strict=True):
         if c1 == c2:
             common_chars.append(c1)
         else:
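Adding `strict=True` to `zip` (available since Python 3.10) changes its failure mode: instead of silently truncating to the shorter input, it raises `ValueError` when the input lengths differ. A small sketch of the difference, separate from the TRL script itself:

```python
a, b = "chosen", "chose"

# Plain zip stops at the shorter input and never complains.
print([c1 == c2 for c1, c2 in zip(a, b)])  # [True, True, True, True, True]

# strict=True raises if the lengths differ.
try:
    list(zip(a, b, strict=True))
except ValueError as err:
    print(err)  # e.g. "zip() argument 2 is shorter than argument 1"
```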

examples/datasets/llava_instruct_mix.py

Lines changed: 1 addition & 2 deletions

@@ -14,7 +14,6 @@
 
 import ast
 from dataclasses import dataclass, field
-from typing import Optional
 
 from datasets import load_dataset
 from huggingface_hub import ModelCard
@@ -43,7 +42,7 @@ class ScriptArguments:
         default="trl-lib/llava-instruct-mix",
         metadata={"help": "Hugging Face repository ID to push the dataset to."},
     )
-    dataset_num_proc: Optional[int] = field(
+    dataset_num_proc: int | None = field(
         default=None,
         metadata={"help": "Number of workers to use for dataset processing."},
     )

examples/datasets/lm-human-preferences-descriptiveness.py

Lines changed: 1 addition & 2 deletions

@@ -13,7 +13,6 @@
 # limitations under the License.
 
 from dataclasses import dataclass, field
-from typing import Optional
 
 from datasets import load_dataset
 from huggingface_hub import ModelCard
@@ -42,7 +41,7 @@ class ScriptArguments:
         default="trl-lib/lm-human-preferences-descriptiveness",
         metadata={"help": "Hugging Face repository ID to push the dataset to."},
     )
-    dataset_num_proc: Optional[int] = field(
+    dataset_num_proc: int | None = field(
         default=None,
         metadata={"help": "Number of workers to use for dataset processing."},
     )

examples/datasets/lm-human-preferences-sentiment.py

Lines changed: 1 addition & 2 deletions

@@ -13,7 +13,6 @@
 # limitations under the License.
 
 from dataclasses import dataclass, field
-from typing import Optional
 
 from datasets import load_dataset
 from huggingface_hub import ModelCard
@@ -42,7 +41,7 @@ class ScriptArguments:
         default="trl-lib/lm-human-preferences-sentiment",
         metadata={"help": "Hugging Face repository ID to push the dataset to."},
     )
-    dataset_num_proc: Optional[int] = field(
+    dataset_num_proc: int | None = field(
         default=None,
         metadata={"help": "Number of workers to use for dataset processing."},
     )

examples/datasets/math_shepherd.py

Lines changed: 2 additions & 3 deletions

@@ -15,7 +15,6 @@
 import re
 from dataclasses import dataclass, field
 from itertools import chain
-from typing import Optional
 
 from datasets import load_dataset
 from huggingface_hub import ModelCard
@@ -44,7 +43,7 @@ class ScriptArguments:
         default="trl-lib/math_shepherd",
         metadata={"help": "Hugging Face repository ID to push the dataset to."},
     )
-    dataset_num_proc: Optional[int] = field(
+    dataset_num_proc: int | None = field(
         default=None,
         metadata={"help": "Number of workers to use for dataset processing."},
     )
@@ -64,7 +63,7 @@ def process_example(example):
     labels = [example["label"][idx] == "+" for idx in indexes]
 
     # Split the inputs into steps (caution, the first step is missing here, it is the prompt)
-    steps = [inputs[i:j] for i, j in zip(chain([0], indexes), chain(indexes, [None]))]
+    steps = [inputs[i:j] for i, j in zip(chain([0], indexes), chain(indexes, [None]), strict=True)]
 
     # Remove the last step (single ⶻ)
     steps = steps[:-1]
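In this comprehension, `chain([0], indexes)` and `chain(indexes, [None])` always yield the same number of items (`len(indexes) + 1`), so `strict=True` cannot fire here and simply guards against an accidental length mismatch. A standalone sketch of the same boundary-pairing pattern, with made-up token ids and boundary positions rather than real Math-Shepherd data:

```python
from itertools import chain

# Hypothetical stand-in for the tokenized example; step boundaries at 3 and 6 are made up.
inputs = [10, 11, 12, 99, 20, 21, 99, 30]
indexes = [3, 6]

# Pairs consecutive boundaries (0, 3), (3, 6), (6, None); both chains yield
# len(indexes) + 1 items, so strict=True never raises for well-formed input.
steps = [inputs[i:j] for i, j in zip(chain([0], indexes), chain(indexes, [None]), strict=True)]
print(steps)  # [[10, 11, 12], [99, 20, 21], [99, 30]]
```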
