Skip to content
This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

[Safety] Fix a Static Task bug and Safety README #3612

Merged
merged 6 commits into from
Jul 7, 2021
Merged

[Safety] Fix a Static Task bug and Safety README #3612

merged 6 commits into from
Jul 7, 2021

Conversation

jxmsML
Copy link
Contributor

@jxmsML jxmsML commented Apr 21, 2021

Patch description
Patch on safety human safety

  • Just notice that the UI always enable response field in static task (if responseField !==null, which is always true given the default value for responseField is False), add a fix to it.
  • More explanation on how to run human safety evaluation, with new script parsing world logs to human eval ready format.
  • change logging info -> warn once, otherwise the loggings output is nasty for bs > 1

Patch on world logging (moved to #3674), this branch is rebased on the feature branch convo_log_pad in #3674

  • filter by message.is_padding() when write episodes to self._current_episodes.
  • modify the test tests/test_eval_model.py on test_save_report.

Testing steps

Other information

python projects/safety_recipes/human_safety_evaluation/format_safety_ready.py --world-logs-path tmp/world_logs.jsonl --eval-logs-dir tmp/human_safety_evaluation
```

2) Specify turn indices per conversation to annotate [here](https://github.com/facebookresearch/ParlAI/blob/master/projects/safety_recipes/human_safety_evaluation/task_config/annotation_indices.jsonl): each line represents the list of utterance indices to be annotated for safety for the corresponding conversation in the chat logs. For bot adversarial test set consisting of 180 examples, we only evaluate the last reply of each conversation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this part necessary if we use the command above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if yes, can we make it automated? if no, can we make it clear that it's not necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes. I just edit this ( running format_safety_ready.py should automatically generate the annotation_indices.jsonl as well as the task_data.jsonl)

with PathManager.open(world_logs_path) as data_file:
for l in data_file.readlines():
episode = json.loads(l.strip())
# TODO: when conversation format is finished please remove this line;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this mean?

Copy link
Contributor Author

@jxmsML jxmsML Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like there is a bug in conversation format that would generate lines in the world_logs as following:

{"dialog": [[{"batch_padding": true, "episode_done": true, "id": "bot_adversarial_dialogue:HumanSafetyEvaluation.persona_False_flatten_False"}, {"id": "TransformerGenerator", "episode_done": false}]], "context": [], "metadata_path": "tmp/world_logs.metadata"}

I added a hack to skip those when parsing but, there is room for removing that hack after the bug above is fixed.

@jxmsML jxmsML force-pushed the safetyfix branch 3 times, most recently from 6410807 to da18ff9 Compare April 21, 2021 18:18
@jxmsML jxmsML changed the title Fix a Static Task bug and Safety README [Safety] Fix a Static Task bug and Safety README Apr 21, 2021
Copy link
Contributor

@emilydinan emilydinan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems ready to go, but let's wait to merge until the aforementioned bug is fix so we can get rid of that comment/hack and rebase on top

@jxmsML jxmsML requested a review from stephenroller April 28, 2021 22:36
@@ -74,12 +75,17 @@ def _add_msgs(self, acts, idx=0):
"""
msgs = []
for act in acts:
# padding examples in the episode[0]
if isinstance(act, Message) and act.is_padding():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only filter out if act is Message otherwise it'll break the unittests for act is dict

@jxmsML
Copy link
Contributor Author

jxmsML commented May 26, 2021

This pr is rebased on #3674. (separate the changes on safety test and world log saving)

@jxmsML jxmsML changed the base branch from master to convo_log_pad May 26, 2021 20:51
Base automatically changed from convo_log_pad to master June 4, 2021 00:29
@jxmsML jxmsML merged commit 3cd8646 into master Jul 7, 2021
@jxmsML jxmsML deleted the safetyfix branch July 7, 2021 14:02
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants