samples is newline delimited #1930

Merged
merged 7 commits into from
Jun 13, 2024
4 changes: 2 additions & 2 deletions .github/workflows/new_tasks.yml
@@ -20,13 +20,13 @@ jobs:
with:
fetch-depth: 2 # OR "2" -> To retrieve the preceding commit.

-# Uses the tj-actions/changed-files@v37 action to check for changes.
+# Uses the tj-actions/changed-files action to check for changes.
# Outputs provided here: https://github.com/tj-actions/changed-files#outputs
# The `files_yaml` input optionally takes a yaml string to specify filters,
# and prepends the filter name to the standard output names.
- name: Check task folders
id: changed-tasks
-uses: tj-actions/changed-files@v37.1.2
+uses: tj-actions/changed-files@v44.5.2
with:
# tasks checks the tasks folder and api checks the api folder for changes
files_yaml: |
2 changes: 1 addition & 1 deletion .github/workflows/unit_tests.yml
@@ -32,7 +32,7 @@ jobs:
env:
SKIP: "no-commit-to-branch,mypy"

-uses: pre-commit/action@v3.0.0
+uses: pre-commit/action@v3.0.1
# # mypy turned off for now
# - name: Lint with mypy
# run: mypy . --ignore-missing-imports --check-untyped-defs --explicit-package-bases --warn-unreachable
17 changes: 8 additions & 9 deletions .pre-commit-config.yaml
@@ -29,8 +29,7 @@ repos:
- id: mixed-line-ending
args: [--fix=lf]
- repo: https://github.com/astral-sh/ruff-pre-commit
-# Ruff version.
-rev: v0.2.2
+rev: v0.4.8
hooks:
# Run the linter.
- id: ruff
@@ -39,17 +38,17 @@ repos:
# Run the formatter.
- id: ruff-format
- repo: https://github.com/codespell-project/codespell
-rev: v2.2.6
+rev: v2.3.0
hooks:
- id: codespell
exclude: >
(?x)^(
.*\.json|ignore.txt|lm_eval/tasks/.*|.*yaml|.*\.ipynb
)$
args: [--check-filenames, --check-hidden, --ignore-words=ignore.txt]
-- repo: https://github.com/pre-commit/mirrors-mypy
-rev: v1.5.1
-hooks:
-- id: mypy
-additional_dependencies: [".[sentencepiece,multilingual,promptsource,gptq]", "types-PyYAML", "types-requests"]
-exclude: ^tests/.*$
+# - repo: https://github.com/pre-commit/mirrors-mypy
+# rev: v1.5.1
+# hooks:
+# - id: mypy
+# additional_dependencies: [".[sentencepiece,multilingual,promptsource,gptq]", "types-PyYAML", "types-requests"]
+# exclude: ^tests/.*$
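
The codespell `exclude` pattern in this file is a verbose-mode (`(?x)`) regular expression, so the line breaks and indentation inside the parentheses are ignored. As a rough sketch of which paths it skips, assuming pre-commit applies the pattern with Python's `re.search` against repo-relative paths (the helper `is_excluded` is hypothetical and only for illustration):

```python
import re

# The exclude pattern from the codespell hook, copied as written.
# (?x) enables verbose mode, so whitespace and newlines in the pattern are ignored.
EXCLUDE = re.compile(
    r"""(?x)^(
        .*\.json|ignore.txt|lm_eval/tasks/.*|.*yaml|.*\.ipynb
    )$"""
)


def is_excluded(path: str) -> bool:
    """Hypothetical helper: True if codespell would skip this path."""
    return EXCLUDE.search(path) is not None


checks = {
    "ignore.txt": is_excluded("ignore.txt"),
    "lm_eval/tasks/mmlu/_generate_configs.py": is_excluded(
        "lm_eval/tasks/mmlu/_generate_configs.py"
    ),
    "lm_eval/api/task.py": is_excluded("lm_eval/api/task.py"),
    "results.json": is_excluded("results.json"),
    "samples.jsonl": is_excluded("samples.jsonl"),
}
```

Under that assumption, everything under `lm_eval/tasks/` is skipped, while `lm_eval/api/task.py` is still spell-checked. A `.jsonl` file does not match the `.*\.json` alternative because the pattern is anchored with `$`.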
18 changes: 9 additions & 9 deletions lm_eval/api/task.py
@@ -67,9 +67,9 @@ class TaskConfig(dict):
training_split: Optional[str] = None
validation_split: Optional[str] = None
test_split: Optional[str] = None
-fewshot_split: Optional[
-str
-] = None # TODO: assert that this not None if num_fewshot > 0. (?) assert if this is same split as one evaling (?)
+fewshot_split: Optional[str] = (
+None # TODO: assert that this not None if num_fewshot > 0. (?) assert if this is same split as one evaling (?)
+)
# formatting / prompting options.
# see docs/advanced_task_guide.md for more info
process_docs: Optional[Callable] = None
@@ -92,9 +92,9 @@ class TaskConfig(dict):
filter_list: Optional[Union[str, list]] = None
should_decontaminate: bool = False
doc_to_decontamination_query: Optional[str] = None
-metadata: Optional[
-dict
-] = None # by default, not used in the code. allows for users to pass arbitrary info to tasks
+metadata: Optional[dict] = (
+None # by default, not used in the code. allows for users to pass arbitrary info to tasks
+)

def __post_init__(self) -> None:
if self.generation_kwargs is not None:
@@ -229,9 +229,9 @@ def __init__(
self._config: TaskConfig = TaskConfig({**config}) if config else TaskConfig()

self._filters = [build_filter_ensemble("none", [["take_first", None]])]
-self.fewshot_rnd: Optional[
-random.Random
-] = None # purposely induce errors in case of improper usage
+self.fewshot_rnd: Optional[random.Random] = (
+None # purposely induce errors in case of improper usage
+)

def download(
self,
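
The comment on `fewshot_rnd` describes a fail-fast default: the attribute starts as `None`, presumably so that sampling before a seed has been set raises an error instead of silently using an unseeded generator. A minimal sketch of that pattern follows. The class and method names (`SamplerSketch`, `set_fewshot_seed`, `sample_fewshot`) are hypothetical and are not taken from `lm_eval`.

```python
import random
from typing import List, Optional


class SamplerSketch:
    """Hypothetical stand-in for a task object that owns a few-shot RNG."""

    def __init__(self) -> None:
        # Deliberately None: using it before seeding should fail loudly.
        self.fewshot_rnd: Optional[random.Random] = None

    def set_fewshot_seed(self, seed: int) -> None:
        self.fewshot_rnd = random.Random(seed)

    def sample_fewshot(self, docs: List[str], k: int) -> List[str]:
        # Raises AttributeError while fewshot_rnd is still None.
        return self.fewshot_rnd.sample(docs, k)


sampler = SamplerSketch()
try:
    sampler.sample_fewshot(["a", "b", "c"], 2)
    failed_early = False
except AttributeError:
    failed_early = True

sampler.set_fewshot_seed(1234)
picked = sampler.sample_fewshot(["a", "b", "c"], 2)
```

The reformatting in this hunk only moves the `None` default into parentheses; the runtime behaviour of the annotation and default is unchanged.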
1 change: 0 additions & 1 deletion lm_eval/filters/decontamination.py
@@ -4,7 +4,6 @@

@register_filter("decontaminate")
class DecontaminationFilter(Filter):
-
"""
A filter which evaluates
"""
2 changes: 1 addition & 1 deletion lm_eval/loggers/evaluation_tracker.py
@@ -259,7 +259,7 @@ def save_results_samples(
path.mkdir(parents=True, exist_ok=True)

file_results_samples = path.joinpath(
-f"samples_{task_name}_{self.date_id}.json"
+f"samples_{task_name}_{self.date_id}.jsonl"
)

for sample in samples:
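
The rename to `.jsonl` matches the PR title: the samples file is newline-delimited, with one JSON object per line, so it cannot be parsed with a single `json.load` call. A minimal sketch of writing and reading such a file follows. The helper names and the sample fields (`doc_id`, `resps`) are assumptions for illustration, not the harness's actual schema or API.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def write_samples_jsonl(path: Path, samples: List[Dict[str, Any]]) -> None:
    """Hypothetical writer: one JSON object per line."""
    with path.open("w", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")


def read_samples_jsonl(path: Path) -> List[Dict[str, Any]]:
    """Parse each non-empty line as its own JSON document."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Illustrative records only; the field names are assumed.
samples = [{"doc_id": 0, "resps": ["A"]}, {"doc_id": 1, "resps": ["B"]}]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "samples_demo_task_2024-06-13.jsonl"
    write_samples_jsonl(path, samples)
    loaded = read_samples_jsonl(path)
```

Any downstream script that globs for `samples_*.json` would need to be updated to match the new extension.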
3 changes: 2 additions & 1 deletion lm_eval/models/textsynth.py
@@ -1,4 +1,4 @@
-""" TextSynth API
+"""TextSynth API
Implementation provided by Fabrice Bellard:
https://github.com/EleutherAI/lm-evaluation-harness/issues/295

@@ -11,6 +11,7 @@

Homepage: https://textsynth.com/index.html
"""
+
import logging
import os

1 change: 1 addition & 0 deletions lm_eval/tasks/aclue/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all other splits with this YAML
"""
+
import argparse
import os

1 change: 1 addition & 0 deletions lm_eval/tasks/bbh/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all other splits with this YAML
"""
+
import argparse
import os
import re
1 change: 1 addition & 0 deletions lm_eval/tasks/belebele/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all other splits with this YAML
"""
+
import argparse
import os

1 change: 1 addition & 0 deletions lm_eval/tasks/bigbench/push_bigbench_dataset.py
@@ -8,6 +8,7 @@
`pip install "bigbench @ https://storage.googleapis.com/public_research_data/bigbench/bigbench-0.0.1.tar.gz"`
and is included so that the bigbench dependency can be avoided.
"""
+
import bigbench.api.util as bb_utils
import datasets
from tqdm import tqdm
1 change: 1 addition & 0 deletions lm_eval/tasks/ceval/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all other splits with this YAML
"""
+
import argparse
import os

1 change: 1 addition & 0 deletions lm_eval/tasks/cmmlu/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all other splits with this YAML
"""
+
import argparse
import os

1 change: 1 addition & 0 deletions lm_eval/tasks/csatqa/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all other splits with this YAML
"""
+
import argparse
import os

2 changes: 0 additions & 2 deletions lm_eval/tasks/fda/task.py
@@ -1,5 +1,3 @@
-"""
-"""
import re
from typing import List

1 change: 1 addition & 0 deletions lm_eval/tasks/ifeval/instructions.py
@@ -13,6 +13,7 @@
# limitations under the License.

"""Library of instructions."""
+
import collections
import json
import logging
1 change: 1 addition & 0 deletions lm_eval/tasks/ifeval/instructions_registry.py
@@ -13,6 +13,7 @@
# limitations under the License.

"""Registry of all instructions."""
+
from lm_eval.tasks.ifeval import instructions


1 change: 1 addition & 0 deletions lm_eval/tasks/mmlu/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all "other" splits with this YAML
"""
+
import argparse
import logging
import os
2 changes: 0 additions & 2 deletions lm_eval/tasks/squad_completion/task.py
@@ -1,5 +1,3 @@
-"""
-"""
import re
from typing import List

1 change: 1 addition & 0 deletions lm_eval/tasks/squadv2/task.py
@@ -13,6 +13,7 @@

Homepage: https://rajpurkar.github.io/SQuAD-explorer/
"""
+
from functools import partial
from math import exp

2 changes: 1 addition & 1 deletion lm_eval/tasks/tinyBenchmarks/utils_winogrande.py
@@ -1,4 +1,4 @@
-""" This code mirrors the utils of the original winogrande task """
+"""This code mirrors the utils of the original winogrande task"""


def doc_to_text(doc):
1 change: 1 addition & 0 deletions lm_eval/tasks/tmmluplus/default/_generate_configs.py
@@ -1,6 +1,7 @@
"""
Take in a YAML, and output all "other" splits with this YAML
"""
+
import argparse
import os

2 changes: 1 addition & 1 deletion scripts/clean_training_data/README.md
@@ -10,7 +10,7 @@ It uses the approach described in the [GPT-3 paper](https://arxiv.org/abs/2005.1
the match, splitting the training data into chunks
3) Any chunks less than `minimum_slice_length` are removed
4) Training data sets split into more than `too_dirty_cutoff` are considered
-completey contaminated and removed
+completely contaminated and removed

OpenAI used:
```
1 change: 1 addition & 0 deletions scripts/make_table_results.py
@@ -2,6 +2,7 @@
Usage:
python make_table_tasks.py --output <markdown_filename>
"""
+
import json
import logging
import os
1 change: 1 addition & 0 deletions scripts/make_table_tasks.py
@@ -2,6 +2,7 @@
Usage:
python make_table_tasks.py --output <markdown_filename>
"""
+
import argparse
import logging
