Unitxt 1.7.0
What Changed in Unitxt 1.7.0
This release introduces a few significant changes that modify existing conventions:
- Instructions renamed to system_prompts
This means that from now on, to define a new system-level instruction, you can use this code:
from unitxt.catalog import add_to_catalog
from unitxt.system_prompts import TextualSystemPrompt

system_prompt = TextualSystemPrompt(  # <<<< Class name has changed
    "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n"
)
add_to_catalog(system_prompt, "system_prompts.models.alpaca", overwrite=True)  # <<<< Catalog name has changed
It also means that all the system-level instructions were moved in the catalog under system_prompts instead of instructions.
This change breaks old instruction definitions, but it was necessary to enable the next, very useful change.
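As a plain-Python illustration (not the actual Unitxt implementation), a system prompt is a task-independent preamble that is prepended once to the fully rendered example; the function name here is hypothetical:

```python
# Illustration only, not the Unitxt implementation: a system prompt is a
# task-independent preamble prepended once to the rendered example.
ALPACA_SYSTEM_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
)

def apply_system_prompt(system_prompt: str, rendered_example: str) -> str:
    """Prepend the system-level prompt to an already-rendered example."""
    return system_prompt + rendered_example

print(apply_system_prompt(ALPACA_SYSTEM_PROMPT, "User: text: The pond froze solid.\nAgent:"))
```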
- Templates can now (1) generate a task-specific instruction once at the head of the example, and (2) add a few words the model will say before its final prediction
This change was requested by many people.
For example, consider this CoLA dataset example:
User: Classify the grammatical acceptability of the following text to one of these options: unacceptable, acceptable. text: Fred watered the plants flat.
Agent: acceptable
User: Classify the grammatical acceptability of the following text to one of these options: unacceptable, acceptable. text: The pond froze solid.
Agent:
The instruction "Classify the ..." is repeated for every demonstration. Also, with the old template there was no way to add a few words that the agent says before the prediction, for instance: "Agent: The class is ". The new changes enable both of these important features.
The old way of defining templates for classification was:
add_to_catalog(
    InputOutputTemplate(
        input_format="Classify the {type_of_class} of the following {text_type} to one of these options: {classes}. {text_type}: {text}",
        output_format="{label}",
    ),
    "templates.classification.multi_class.default_no_instruction",
    overwrite=True,
)
It is now defined this way:
add_to_catalog(
    InputOutputTemplate(
        input_format="{text_type}: {text}",  # <<<< Changed
        output_format="{label}",
        target_prefix="The {type_of_class} is ",  # <<<< Added
        instruction="Classify the {type_of_class} of the following {text_type} to one of these options: {classes}.\n",  # <<<< Added
    ),
    "templates.classification.multi_class.instruction",
    overwrite=True,
)
The new template fields instruction and target_prefix will produce this example:
Classify the grammatical acceptability of the following text to one of these options: unacceptable, acceptable.
User: text: Fred watered the plants flat.
Agent: The grammatical acceptability is acceptable
User: text: The pond froze solid.
Agent: The grammatical acceptability is
Notice how the instruction appears only once, and the target prefix appears right after 'Agent:'.
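The way the three fields combine can be sketched in plain Python. This is an illustration only: the render function and its arguments are hypothetical, and the real Unitxt renderer is far more general.

```python
# Illustration only: how instruction, input_format, and target_prefix combine.
# The field names mirror the release notes; this is not the Unitxt renderer.
def render(instruction, input_format, target_prefix, demos, query):
    lines = [instruction.format(**query).strip()]           # instruction appears once
    for demo in demos:                                      # demonstrations
        lines.append("User: " + input_format.format(**demo))
        lines.append("Agent: " + target_prefix.format(**demo) + demo["label"])
    lines.append("User: " + input_format.format(**query))   # the actual query
    lines.append("Agent: " + target_prefix.format(**query)) # model completes from here
    return "\n".join(lines)

example = render(
    instruction="Classify the {type_of_class} of the following {text_type} to one of these options: {classes}.\n",
    input_format="{text_type}: {text}",
    target_prefix="The {type_of_class} is ",
    demos=[{"type_of_class": "grammatical acceptability", "text_type": "text",
            "classes": "unacceptable, acceptable",
            "text": "Fred watered the plants flat.", "label": "acceptable"}],
    query={"type_of_class": "grammatical acceptability", "text_type": "text",
           "classes": "unacceptable, acceptable", "text": "The pond froze solid."},
)
print(example)
```

Running this prints the CoLA example above: one instruction line, then the demonstration and query turns, each Agent turn starting with the target prefix.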
Read more in the tutorial on preparing templates.
- Loading from catalog with modifications
Now you can load an item from the catalog and change its fields. For example, if you want to use a task but with a different metric, you can use this syntax:
card = TaskCard(
    loader=LoadHF(path="glue", name="cola"),
    preprocess_steps=[...],
    task="tasks.classification.multi_class[metrics=[metrics.matthews_correlation]]",  # <<<< Modified
    templates="templates.classification.multi_class.all",
)
add_to_catalog(card, "cards.cola", overwrite=True)
Read more in the tutorial on loading from the catalog.
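The reference syntax is name[field=value,...], where a value may itself be a [list]. A rough sketch of how such a reference could be split into a catalog name and overrides is below; the real Unitxt parser handles many more cases, so treat this purely as an illustration of the syntax:

```python
# Illustration only: split "name[field=value,...]" into (name, overrides),
# tolerating nested [list] values. Not the real Unitxt parser.
def parse_catalog_reference(ref: str):
    if "[" not in ref:
        return ref, {}
    name, rest = ref.split("[", 1)
    body = rest[:-1]  # drop the trailing "]"
    parts, depth, current = [], 0, ""
    for ch in body:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:  # split only on top-level commas
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    overrides = {}
    for part in parts:
        key, value = part.split("=", 1)
        overrides[key] = value
    return name, overrides

name, overrides = parse_catalog_reference(
    "tasks.classification.multi_class[metrics=[metrics.matthews_correlation]]"
)
print(name)       # tasks.classification.multi_class
print(overrides)  # {'metrics': '[metrics.matthews_correlation]'}
```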
- Renaming of additional_inputs to task_data
To more accurately represent the origin of certain fields within our system, we've renamed the additional_inputs parameter to task_data. The new name underscores that these fields are derived directly from the task definition itself. This change is important for the integrity and reliability of metrics, as it ensures these fields are validated against the task schema. Consequently, developers writing metrics for specific tasks can determine which fields are available to them simply by referring to the task schema. This alignment between task definitions and metrics development makes the workflow more intuitive and efficient for unitxt contributors.
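For illustration, a hypothetical metric function (not the real Unitxt metric API) can safely rely on each instance's task_data carrying exactly the fields the task schema declares:

```python
# Illustration only: a hypothetical per-instance metric. Because task_data is
# validated against the task schema, fields such as "classes" (declared by the
# multi-class classification task) are guaranteed to be present.
def in_class_accuracy(predictions, references, task_data):
    correct = 0
    for pred, refs, data in zip(predictions, references, task_data):
        assert "classes" in data, "guaranteed by the task schema"
        if pred in data["classes"] and pred in refs:
            correct += 1
    return correct / len(predictions)

score = in_class_accuracy(
    predictions=["acceptable", "unacceptable"],
    references=[["acceptable"], ["acceptable"]],
    task_data=[{"classes": ["unacceptable", "acceptable"]}] * 2,
)
print(score)  # 0.5
```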
Release Changes
Bug Fixes:
- Fix parser to allow source name that starts with numeric by @marukaz in #530
- Avoid race condition when download files to IBM COS cache by @yoavkatz in #536
- Updating perplexity computation, to apply exp(-x) by @assaftibm in #534
- Avoid duplicate values in UI by @Roni-Friedman in #552
- Fixed the test that generated a new entry in the catalog by @dafnapension in #550
- Fix artifact initialization dict creation to be recursive by @elronbandel in #559
- Enforce tests to use only local catalogs by @elronbandel in #564
- Fix multi label classification template and improve debugging by @yoavkatz in #571
- Fix classification code so multi-label metrics are not aware of 'none' by @yoavkatz in #580
- Fix MultiReferenceTemplate import by @perlitz in #583
- Add uncommitted processor by @elronbandel in #588
- Add missing processor in catalog by @yoavkatz in #590
- Docfix: Fix incorrect artifact names in Adding Dataset doc by @yifanmai in #591
- fixes to perplexity metric, updates to catalog by @assaftibm in #592
- Fix many datasets and templates by @elronbandel in #599
- Fix test of catalog preparation without huggingface access by @elronbandel in #601
- Fix format instruction same as source in templates by @dafnapension in #607
- Fixed formats and system prompts by @elronbandel in #604
- Add scipy to base requirements by @matanor in #611
- Reverse undocumented capitalization in templates by @elronbandel in #616
- Fix broken OptionalField in dataclass by @elronbandel in #619
- Fix some features of the Template for ffqa by @dafnapension in #613
- Fix problem in process_instance by @yoavkatz in #628
New Assets:
- Added table serializers operators and add Wikitq table question answering dataset by @csrajmohan in #544
- Added human eval dataset by @OfirArviv in #509
- Added Clinc and news datasets by @ilyashnil in #578
- Added cards for cohere for ai aya dataset by @dafnapension in #579
- Add multi class relation classification task and change nli datasets to use it by @elronbandel in #586
- Eval metrics by @lilacheden in #587
- Add tab_fact dataset, a dataset for classification of textual entailment from tables by @csrajmohan in #582
- Add filtered ffqa dataset by @marukaz in #593
- Add universal_ner by @elronbandel in #622
- Add atis dataset by @elronbandel in #629
Enhancements
- Tests can now also run on PRs from forks by @elronbandel in #537 #538
- Show artifact class details in the documentation. by @dafnapension in #528
- UI improvements by @Roni-Friedman in #541
- Update README.md by @eltociear in #540
- Add artifact_identifier to Artifact objects loaded from the catalog, linking them to their catalog name. by @matanor in #545 #547 #546
- allow imports list for executequery and filterbyquery and rename to ExecuteExpression and FilterByExpression by @dafnapension in #542
- Add tests for api is presented in the unitxt paper. by @elronbandel in #558
- Extend the function that evaluates external data with a unitxt metric to new types of data by @assaftibm in #557
- Add Kendall's tau metric by @lilacheden in #535
- Add new table operators for serialization & truncation by @csrajmohan in #567
- Unitxt should operate with no package requirements by default. This adds some tools to do so. by @elronbandel in #570
- Separate library tests and catalog preparation by @elronbandel in #572
- Add class for constants handling by @elronbandel in #575
- Add code needed for evaluating metrics as models by @lilacheden in #573
- Improved error message when using TemplateDict by @yoavkatz in #499
- Add ability to load from catalog with arguments overwrite by @elronbandel in #581
- Add Grouped instance metric inherit from InstanceMetrics by @sam-data-guy-iam in #452
- Website touch up. by @elronbandel in #597
- Add structured data operators for serializing tablerows, triples and keyvalue pairs added by @csrajmohan in #589
- Allow dicts in overwrite args of fetched artifact by @dafnapension in #598
- filter on loading rather than increase loading limit by @dafnapension in #584
- Reduce log size by printing less by @elronbandel in #605
- Add Support for execution of metrics on a remote host by @matanor in #568
- Add signatures to dataclasses init for clearer docs by @elronbandel in #624
🚨 Breaking Changes 🚨
- Rename answer_relevance to answer_reward by @assaftibm in #539
- Migrate (task-related) Instruction into Template, and introduce (task independent) SystemPrompt by @dafnapension in #565
- Rename additional_inputs to task_data and make it a simple json dumped by @elronbandel in #595
- Add shuffling to banking77 and a few more classification datasets by @ilyashnil in #603. This was necessary in order to balance the classes in those datasets.
New Contributors
- @eltociear made their first contribution in #540
- @csrajmohan made their first contribution in #544
- @yifanmai made their first contribution in #591
- @sam-data-guy-iam made their first contribution in #452
Full Changelog: 1.6.1...1.7.0