Evaluation: Remove parallel from composite evaluators (Azure#38168)

* Remove `parallel` from composite evaluators
* update recording
* update
* output_dir check
* fix the test recording
* fix the failed unit-test
* update changelog
* update
* fix black issue
* revert output_path related change
* Update sdk/evaluation/azure-ai-evaluation/CHANGELOG.md

Co-authored-by: Neehar Duvvuri <40341266+needuv@users.noreply.github.com>

---------

Co-authored-by: Neehar Duvvuri <40341266+needuv@users.noreply.github.com>
allenkim0129 · Nov 5, 2024 · 2c3ad69
1 parent 83dfe39

Showing 9 changed files with 19 additions and 340 deletions.
diff --git a/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md b/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md
@@ -5,6 +5,7 @@
 ### Features Added
 
 ### Breaking Changes
+- The `parallel` parameter has been removed from composite evaluators: `QAEvaluator`, `ContentSafetyChatEvaluator`, and `ContentSafetyMultimodalEvaluator`. To control evaluator parallelism, you can now use the `_parallel` keyword argument, though please note that this private parameter may change in the future.
 
 ### Bugs Fixed
 - Output of adversarial simulators are of type `JsonLineList` and the helper function `to_eval_qr_json_lines` now outputs context from both user and assistant turns along with `category` if it exists in the conversation
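
To illustrate the breaking change described in the changelog entry above, here is a minimal sketch of the keyword handling. `CompositeEvaluatorSketch` is a hypothetical stand-in and not an SDK class; it copies only the `kwargs.pop("_parallel", False)` pattern visible in this commit, and the `unused_kwargs` attribute exists purely so the example can show what is left over.

```python
from typing import Any


class CompositeEvaluatorSketch:
    """Hypothetical stand-in, not the SDK class; mirrors only the kwargs handling in the diff."""

    def __init__(self, credential: Any, azure_ai_project: Any, **kwargs: Any) -> None:
        # Same pattern as the diff: the private key is popped, defaulting to False.
        self._parallel = kwargs.pop("_parallel", False)
        # Kept only so this example can inspect the leftover keyword arguments.
        self.unused_kwargs = dict(kwargs)


new_style = CompositeEvaluatorSketch("cred", {"project_name": "demo"}, _parallel=True)
old_style = CompositeEvaluatorSketch("cred", {"project_name": "demo"}, parallel=True)

print(new_style._parallel)      # True
print(old_style._parallel)      # False: the old public name is no longer read
print(old_style.unused_kwargs)  # {'parallel': True}
```

If the real constructors behave like this sketch, a caller who still passes `parallel=True` would get no error and would silently fall back to sequential execution, so existing call sites are worth checking by hand.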

diff --git a/sdk/evaluation/azure-ai-evaluation/assets.json b/sdk/evaluation/azure-ai-evaluation/assets.json
@@ -2,5 +2,5 @@
   "AssetsRepo": "Azure/azure-sdk-assets",
   "AssetsRepoPrefixPath": "python",
   "TagPrefix": "python/evaluation/azure-ai-evaluation",
-  "Tag": "python/evaluation/azure-ai-evaluation_e3ec13551e"
+  "Tag": "python/evaluation/azure-ai-evaluation_daf1ed16fc"
 }
diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_telemetry/__init__.py b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_telemetry/__init__.py
@@ -123,7 +123,8 @@ def wrapper(*args: P.args, **kwargs: P.kwargs) -> EvaluationResult:
             user_agent=USER_AGENT,
         )
 
-        track_in_cloud = bool(pf_client._config.get_trace_destination())  # pylint: disable=protected-access
+        trace_destination = pf_client._config.get_trace_destination()  # pylint: disable=protected-access
+        track_in_cloud = bool(trace_destination) if trace_destination != "none" else False
         evaluate_target = bool(kwargs.get("target", None))
         evaluator_config = bool(kwargs.get("evaluator_config", None))
         custom_dimensions: Dict[str, Union[str, bool]] = {
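
The new `track_in_cloud` expression in the hunk above can be checked in isolation. The sketch below wraps the exact expression from the added line in a helper; the name `should_track_in_cloud` and the sample destination string `"azureml://example"` are made up for illustration.

```python
from typing import Optional


def should_track_in_cloud(trace_destination: Optional[str]) -> bool:
    # Same expression as the added line in the diff, wrapped in a function.
    return bool(trace_destination) if trace_destination != "none" else False


# Sample inputs: unset, empty, the literal "none", a differently cased "None",
# and a placeholder destination string.
for value in (None, "", "none", "None", "azureml://example"):
    print(repr(value), should_track_in_cloud(value))
```

The comparison is case-sensitive, so only the lowercase string `"none"` is treated as "no destination"; any other non-empty string, including `"None"`, still counts as cloud tracking.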

diff --git a/...on/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_content_safety/_content_safety.py b/...on/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_content_safety/_content_safety.py
@@ -71,7 +71,7 @@ class ContentSafetyEvaluator(EvaluatorBase[Union[str, float]]):
     # TODO address 3579092 to re-enabled parallel evals.
     def __init__(self, credential, azure_ai_project, eval_last_turn: bool = False, **kwargs):
         super().__init__(eval_last_turn=eval_last_turn)
-        self._parallel = kwargs.pop("parallel", False)
+        self._parallel = kwargs.pop("_parallel", False)
         self._evaluators: List[Callable[..., Dict[str, Union[str, float]]]] = [
             ViolenceEvaluator(credential, azure_ai_project),
             SexualEvaluator(credential, azure_ai_project),

diff --git a/...ure-ai-evaluation/azure/ai/evaluation/_evaluators/_content_safety/_content_safety_chat.py b/...ure-ai-evaluation/azure/ai/evaluation/_evaluators/_content_safety/_content_safety_chat.py
diff --git a/...e-ai-evaluation/azure/ai/evaluation/_evaluators/_multimodal/_content_safety_multimodal.py b/...e-ai-evaluation/azure/ai/evaluation/_evaluators/_multimodal/_content_safety_multimodal.py
@@ -28,9 +28,8 @@ class ContentSafetyMultimodalEvaluator:
     :param azure_ai_project: The scope of the Azure AI project, containing the subscription ID,
         resource group, and project name.
     :type azure_ai_project: ~azure.ai.evaluation.AzureAIProject
-    :param parallel: Specifies whether to use parallel execution for evaluators.
-        If True, evaluators execute in parallel; otherwise, they execute sequentially. Defaults to True.
-    :type parallel: bool
+    :param kwargs: Additional arguments to pass to the evaluator.
+    :type kwargs: Any
 
     :return: A function that evaluates multimodal chat messages and generates content safety metrics.
     :rtype: Callable
@@ -92,8 +91,8 @@ class ContentSafetyMultimodalEvaluator:
 
     """
 
-    def __init__(self, credential, azure_ai_project, parallel: bool = False):
-        self._parallel = parallel
+    def __init__(self, credential, azure_ai_project, **kwargs):
+        self._parallel = kwargs.pop("_parallel", False)
         self._evaluators: List[Callable[..., Dict[str, Union[str, float]]]] = [
             ViolenceMultimodalEvaluator(credential=credential, azure_ai_project=azure_ai_project),
             SexualMultimodalEvaluator(credential=credential, azure_ai_project=azure_ai_project),
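
This commit does not show how `self._parallel` is consumed. As a rough sketch under that caveat, a composite evaluator could use such a flag to choose between a thread pool and a plain loop over its sub-evaluators. Everything below is an assumption for illustration: `run_evaluators`, `fake_length_evaluator`, and `fake_word_count_evaluator` are hypothetical names, and the toy metrics are not the SDK's output format.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, List


def run_evaluators(
    evaluators: List[Callable[..., Dict[str, Any]]],
    parallel: bool,
    **inputs: Any,
) -> Dict[str, Any]:
    """Call every evaluator with the same inputs and merge their result dicts."""
    results: Dict[str, Any] = {}
    if parallel:
        # Submit all evaluators at once and merge results as they finish.
        with ThreadPoolExecutor() as executor:
            futures = [executor.submit(evaluator, **inputs) for evaluator in evaluators]
            for future in as_completed(futures):
                results.update(future.result())
    else:
        # Sequential path: one evaluator after another, in list order.
        for evaluator in evaluators:
            results.update(evaluator(**inputs))
    return results


def fake_length_evaluator(*, response: str, **kwargs: Any) -> Dict[str, Any]:
    # Hypothetical toy metric: number of characters in the response.
    return {"length": len(response)}


def fake_word_count_evaluator(*, response: str, **kwargs: Any) -> Dict[str, Any]:
    # Hypothetical toy metric: number of whitespace-separated words.
    return {"word_count": len(response.split())}


evaluators = [fake_length_evaluator, fake_word_count_evaluator]
sequential = run_evaluators(evaluators, parallel=False, response="hello evaluation world")
threaded = run_evaluators(evaluators, parallel=True, response="hello evaluation world")
print(sequential == threaded)
```

Because each toy evaluator writes a distinct key, the merged dictionary is the same regardless of completion order; evaluators that share keys would need an explicit merge rule.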