fix initialize model

confident-ai · Jan 23, 2025 · 329f102
1 parent 97a6af3
commit 329f102

Showing 3 changed files with 156 additions and 4 deletions.
diff --git a/deepeval/metrics/utils.py b/deepeval/metrics/utils.py
@@ -269,7 +269,7 @@ def initialize_model(
         return model, False
 
     # If the model is a string, we initialize a GPTModel and use as a native model
-    if isinstance(model, str):
+    if isinstance(model, str) or model is None:
         return GPTModel(model=model), True
 
     # Otherwise (the model is a wrong type), we raise an error
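
The effect of the one-line change above can be sketched in isolation. This is a minimal illustration, not the library's code: `CustomLLM` and this `GPTModel` are hypothetical stand-ins, the check that returns `model, False` is assumed to match user-supplied model objects, and the `TypeError` is an assumption since the hunk does not show which error is raised.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


class CustomLLM:
    """Hypothetical stand-in for a user-supplied model object."""


@dataclass
class GPTModel:
    """Hypothetical stand-in for the native model; None means 'use the default'."""
    model: Optional[str] = None


def initialize_model(model: Any = None) -> Tuple[Any, bool]:
    """Return (model_object, is_native), mirroring the branches visible in the hunk."""
    # Assumed earlier branch: a custom model object is passed through, not native.
    if isinstance(model, CustomLLM):
        return model, False

    # The fixed branch: a model name OR None now yields a native model.
    if isinstance(model, str) or model is None:
        return GPTModel(model=model), True

    # Anything else is the wrong type (error type is an assumption).
    raise TypeError(f"Unsupported model type: {type(model).__name__}")


native_model, is_native = initialize_model()  # no model given
print(type(native_model).__name__, is_native)
```

Before the change, calling the function with `model=None` would have skipped the string branch and fallen through to the error; with `or model is None` it takes the native branch instead.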

diff --git a/docs/confident-ai/confident-ai-introduction.mdx b/docs/confident-ai/confident-ai-introduction.mdx
@@ -1,13 +1,13 @@
 ---
 id: confident-ai-introduction
-title: Confident AI Introduction
-sidebar_label: Confident AI Introduction
+title: Confident AI QuickStart
+sidebar_label: Confident AI QuickStart
 ---
 
 import Equation from "@site/src/components/equation";
 
 :::caution
-Without best LLM evaluation practices in place, your testing results aren't really valid, and you might be iterating back and fourth between the wrong things, which means your LLM application isn't nearly as performant as they should be.
+Are you following best LLM evaluation practices? Without a serious evaluation workflow, your testing results aren't really valid, and you might be wasting a lot of time iterating on the wrong things.
 :::
 
 **Confident AI is the LLM evaluation platform for DeepEval**. It is native to DeepEval, and was designed for teams building LLM applications to maximize its performance, and to safeguard against unsatisfactory LLM outputs. Whilst DeepEval's open-source metrics are great for running evaluations, there is so much more to building a robust LLM evaluation workflow than collecting metric scores.
@@ -115,6 +115,25 @@ Confident AI solves all of your LLM evaluation problems so you can stop going ar
   />
 </div>
 
+## Installation
+
+Go to the root directory of your project and create a virtual environment (if you don't already have one). In the CLI, run:
+
+```console
+python3 -m venv venv
+source venv/bin/activate
+```
+
+In your newly created virtual environment, run:
+
+```console
+pip install -U deepeval
+```
+
+:::note
+We always recommend keeping `deepeval` updated to its latest version to use Confident AI.
+:::
+
 ## Login to Confident AI
 
 Everything in `deepeval` is already automatically integrated with Confident AI, including any [custom metrics](/docs/metrics-custom) you've built on `deepeval`. To start using Confident AI with `deepeval`, simply login in the CLI:
@@ -139,3 +158,83 @@ deepeval login --confident-api-key "your-confident-api-key"
 ```
 
 :::
+
+## Run Your First Evaluation
+
+Now that you're logged in, create a Python file, for example `experiment_llm.py`. We're going to evaluate a medical chatbot in this quickstart guide, but it can be any other LLM system that you are building.
+
+<details><summary>Click to see fake data</summary>
+<p>
+
+```python
+fake_data = [
+    {
+        "input": "I have a persistent cough and fever. Should I be worried?",
+        "actual_output": (
+            "Based on your symptoms, it could be a sign of a viral or bacterial infection. "
+            "However, if the fever persists for more than three days or you experience difficulty breathing, "
+            "please consult a doctor immediately."
+        ),
+        "retrieval_context": [
+            "Coughing that lasts more than three weeks is typically classified as a chronic cough and could indicate conditions such as asthma, chronic bronchitis, or gastroesophageal reflux disease (GERD).",
+            "A fever is the body's natural response to infections, often caused by viruses or bacteria. Persistent fevers lasting more than three days should be evaluated by a healthcare professional as they may indicate conditions like pneumonia, tuberculosis, or sepsis.",
+            "Shortness of breath associated with fever and cough can be a sign of serious respiratory issues such as pneumonia, bronchitis, or COVID-19.",
+            "Self-care tips for mild symptoms include staying hydrated, taking over-the-counter fever reducers (e.g., acetaminophen or ibuprofen), and resting. Avoid suppressing a productive cough without consulting a healthcare provider."
+        ]
+    },
+    {
+        "input": "What should I do if I accidentally cut my finger deeply?",
+        "actual_output": (
+            "If you cut your finger deeply, just rinse it with water and avoid applying any pressure. "
+            "Tetanus shots aren't necessary unless you see redness immediately."
+        ),
+        "retrieval_context": [
+            "Deep cuts that are more than 0.25 inches deep or expose fat, muscle, or bone require immediate medical attention. Such wounds may need stitches to heal properly.",
+            "To minimize the risk of infection, wash the wound thoroughly with soap and water. Avoid using alcohol or hydrogen peroxide, as these can irritate the tissue and delay healing.",
+            "If the bleeding persists for more than 10 minutes or soaks through multiple layers of cloth or bandages, seek emergency care. Continuous bleeding might indicate damage to an artery or vein.",
+            "Watch for signs of infection, including redness, swelling, warmth, pain, or pus. Infections can develop even in small cuts if not properly cleaned or if the individual is at risk (e.g., diabetic or immunocompromised).",
+            "Tetanus, a bacterial infection caused by Clostridium tetani, can enter the body through open wounds. Ensure that your tetanus vaccination is up to date, especially if the wound was caused by a rusty or dirty object."
+        ]
+    }
+]
+
+```
+
+</p>
+</details>
+
+```python title="experiment_llm.py"
+from deepeval import evaluate
+from deepeval.test_case import LLMTestCase
+from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric
+
+# See above for contents of fake data
+fake_data = [...]
+
+# Create a list of LLMTestCase
+test_cases = []
+for fake_datum in fake_data:
+  test_case = LLMTestCase(
+    input=fake_datum["input"],
+    actual_output=fake_datum["actual_output"],
+    retrieval_context=fake_datum["retrieval_context"]
+  )
+  test_cases.append(test_case)
+
+# Define metrics
+answer_relevancy = AnswerRelevancyMetric()
+faithfulness = FaithfulnessMetric()
+
+# Run evaluation
+evaluate(test_cases=test_cases, metrics=[answer_relevancy, faithfulness])
+```
+
+```console
+python experiment_llm.py
+```
+
+And that's it! All you have to do is run `experiment_llm.py`, and Confident AI will automatically display the results.
+
+:::tip
+If the results don't appear on Confident AI, you're probably not logged in. Run `deepeval login` again if that's the case.
+:::
diff --git a/g.py b/g.py
@@ -0,0 +1,53 @@
+from deepeval import evaluate
+from deepeval.test_case import LLMTestCase
+from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric
+
+# Fake data for a medical chatbot (same records as in the quickstart docs)
+fake_data = [
+    {
+        "input": "I have a persistent cough and fever. Should I be worried?",
+        "actual_output": (
+            "Based on your symptoms, it could be a sign of a viral or bacterial infection. "
+            "However, if the fever persists for more than three days or you experience difficulty breathing, "
+            "please consult a doctor immediately."
+        ),
+        "retrieval_context": [
+            "Coughing that lasts more than three weeks is typically classified as a chronic cough and could indicate conditions such as asthma, chronic bronchitis, or gastroesophageal reflux disease (GERD).",
+            "A fever is the body's natural response to infections, often caused by viruses or bacteria. Persistent fevers lasting more than three days should be evaluated by a healthcare professional as they may indicate conditions like pneumonia, tuberculosis, or sepsis.",
+            "Shortness of breath associated with fever and cough can be a sign of serious respiratory issues such as pneumonia, bronchitis, or COVID-19.",
+            "Self-care tips for mild symptoms include staying hydrated, taking over-the-counter fever reducers (e.g., acetaminophen or ibuprofen), and resting. Avoid suppressing a productive cough without consulting a healthcare provider.",
+        ],
+    },
+    {
+        "input": "What should I do if I accidentally cut my finger deeply?",
+        "actual_output": (
+            "If you cut your finger deeply, just rinse it with water and avoid applying any pressure. "
+            "Tetanus shots aren't necessary unless you see redness immediately."
+        ),
+        "retrieval_context": [
+            "Deep cuts that are more than 0.25 inches deep or expose fat, muscle, or bone require immediate medical attention. Such wounds may need stitches to heal properly.",
+            "To minimize the risk of infection, wash the wound thoroughly with soap and water. Avoid using alcohol or hydrogen peroxide, as these can irritate the tissue and delay healing.",
+            "If the bleeding persists for more than 10 minutes or soaks through multiple layers of cloth or bandages, seek emergency care. Continuous bleeding might indicate damage to an artery or vein.",
+            "Watch for signs of infection, including redness, swelling, warmth, pain, or pus. Infections can develop even in small cuts if not properly cleaned or if the individual is at risk (e.g., diabetic or immunocompromised).",
+            "Tetanus, a bacterial infection caused by Clostridium tetani, can enter the body through open wounds. Ensure that your tetanus vaccination is up to date, especially if the wound was caused by a rusty or dirty object.",
+        ],
+    },
+]
+
+
+# Create a list of LLMTestCase
+test_cases = []
+for fake_datum in fake_data:
+    test_case = LLMTestCase(
+        input=fake_datum["input"],
+        actual_output=fake_datum["actual_output"],
+        retrieval_context=fake_datum["retrieval_context"],
+    )
+    test_cases.append(test_case)
+
+# Define metrics
+answer_relevancy = AnswerRelevancyMetric()
+faithfulness = FaithfulnessMetric()
+
+# Run evaluation
+evaluate(test_cases=test_cases, metrics=[answer_relevancy, faithfulness])
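
Since the loop in `g.py` indexes every record by `"input"`, `"actual_output"`, and `"retrieval_context"`, a quick shape check before building test cases can catch a malformed record early. The sketch below is a hypothetical helper, not part of `deepeval`; the name `validate_records` and the rule that `retrieval_context` must be a list are assumptions based only on the data shown in this commit.

```python
from typing import Any, Dict, List

# Keys the loop in g.py reads from every record.
REQUIRED_KEYS = ("input", "actual_output", "retrieval_context")


def validate_records(records: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means every record looks usable."""
    problems: List[str] = []
    for index, record in enumerate(records):
        # Report each key the test-case loop would fail on.
        for key in REQUIRED_KEYS:
            if key not in record:
                problems.append(f"record {index}: missing key '{key}'")
        # The fake data stores retrieval context as a list of strings.
        context = record.get("retrieval_context")
        if context is not None and not isinstance(context, list):
            problems.append(
                f"record {index}: 'retrieval_context' should be a list of strings"
            )
    return problems


sample = [
    {"input": "q", "actual_output": "a", "retrieval_context": ["c"]},
    {"input": "q", "actual_output": "a"},  # no retrieval context
]
print(validate_records(sample))
```

Running a check like this on `fake_data` before the loop would surface a missing key as a readable message instead of a `KeyError` partway through building the test cases.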