Commit

Update results
capjamesg committed Dec 22, 2024
1 parent 2d448bf commit 6533efe
Showing 2 changed files with 121 additions and 27 deletions.
42 changes: 15 additions & 27 deletions index.html
@@ -40,7 +40,7 @@ <h1>How's GPT-4o Doing?</h1>
<p>You can contribute your own tests, too! See the <a href="https://github.com/roboflow/gpt-checkup?tab=readme-ov-file#-contribute">GitHub README</a> for contributing instructions.</p>
</div>
<div class="header_subtitle">
-<p>Tests are run every day at 1am PT. Last updated December 21, 2024.</p>
+<p>Tests are run every day at 1am PT. Last updated December 22, 2024.</p>
<p>Made with ❤️ by the team at <a href="https://roboflow.com">Roboflow</a>.</p>
</div>
<div class="header_cta">
@@ -58,12 +58,12 @@ <h1>How's GPT-4o Doing?</h1>
<div class="feature_header" style="min-height: auto">
<div class="feature_header_text" style="gap: var(--spacing-sizing-4)">
<h2>Response Time</h2>
-<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.79 seconds</b> per request.</p>
+<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.78 seconds</b> per request.</p>
<p class="subtitle">This number only accounts for requests made by this application.</p>
</div>
<div class="chart">
<div class="chart_box chart_box_green">
-<p>3.79 s</p>
+<p>3.78 s</p>
</div>
</div>
</div>
@@ -122,7 +122,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>8</pre>
+<pre>9</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -230,7 +230,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>{'x': 0.5, 'y': 0.3, 'width': 0.25, 'height': 0.3}</pre>
+<pre>{'x': 0.5, 'y': 0.4, 'width': 0.3, 'height': 0.35}</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -286,22 +286,10 @@ <h3><span class="explainer_icon far fa-image"></span>Image</h3>
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>```json
{
-"A": {
-"quantity": 20,
-"price": 10
-},
-"B": {
-"quantity": 25,
-"price": 20
-},
-"C": {
-"quantity": 30,
-"price": 30
-},
-"D": {
-"quantity": 35,
-"price": 40
-}
+"A": { "quantity": 20, "price": 10 },
+"B": { "quantity": 25, "price": 20 },
+"C": { "quantity": 30, "price": 30 },
+"D": { "quantity": 35, "price": 40 }
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
@@ -361,7 +349,7 @@ <h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
{
"R": 80,
"G": 0,
-"B": 128
+"B": 130
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
@@ -417,7 +405,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/annotationqa.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>By inspecting the image, it appears that all visible cars on the road, including the far ones, have been enclosed with red bounding boxes. Therefore, there don’t seem to be any missing annotations.
+<pre>From the image, all visible cars appear to be annotated with red bounding boxes. There don’t seem to be any missing annotations for cars. Here's the JSON response:

```json
{
@@ -477,9 +465,9 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/measurement.jpg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>Based on the shown image with the ruler for reference, the square sticker appears to be approximately **3 inches by 3 inches**.
+<pre>Based on the ruler in the image, the square sticker's sides are approximately 3.0 inches in length.

-Heres the JSON response:
+Here's the JSON:

```json
{
@@ -535,7 +523,7 @@ <h2>Zero Shot Classification</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>100%</b> of the time.</p>
-<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.006</p>
+<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.005</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
@@ -657,7 +645,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/prescription.png" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>[{'name': 'MARY THOMAS', 'time_per_day': 1, 'medication': 'Atenolol', 'dosage': 100, 'rx_number': '1234567-12345'}]</pre>
+<pre>[{'name': 'Mary Thomas', 'time_per_day': 1, 'medication': 'Atenolol', 'dosage': 100, 'rx_number': '1234567-12345'}]</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
106 changes: 106 additions & 0 deletions results/2024-12-22.json
@@ -0,0 +1,106 @@
{
"zero_shot_classification": {
"score": 1,
"success": true,
"price": 0.00481,
"pass_fail": "Pass",
"response_time": 3.394958019256592,
"result": "Toyota Camry"
},
"count_fruit": {
"score": 0,
"success": false,
"price": 0.00882,
"pass_fail": "Fail",
"response_time": 2.2324256896972656,
"result": "9"
},
"document_ocr": {
"score": 0,
"success": false,
"price": 0.00988,
"pass_fail": "Fail",
"response_time": 2.104623317718506,
"result": "I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the *Midnights* album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times."
},
"handwriting_ocr": {
"score": 1,
"success": true,
"price": 0.00974,
"pass_fail": "Pass",
"response_time": 7.65594744682312,
"result": "The words of songs on the album have been echoing in my head all week. \"Fades into the grey of my day old tea.\""
},
"extraction_ocr": {
"score": 1.0,
"success": true,
"price": 0.00876,
"pass_fail": "Pass",
"response_time": 2.7041170597076416,
"result": "[{'name': 'Mary Thomas', 'time_per_day': 1, 'medication': 'Atenolol', 'dosage': 100, 'rx_number': '1234567-12345'}]"
},
"math_ocr": {
"score": 1.0,
"success": true,
"price": 0.015070000000000002,
"pass_fail": "Pass",
"response_time": 6.009929656982422,
"result": "3x^2-6x+2"
},
"object_detection": {
"score": 0.728360309641098,
"success": false,
"price": 0.01044,
"pass_fail": "Fail",
"response_time": 2.0986928939819336,
"result": "{'x': 0.5, 'y': 0.4, 'width': 0.3, 'height': 0.35}"
},
"graph_understanding": {
"score": 0.99,
"success": false,
"price": 0.011260000000000001,
"pass_fail": "Fail",
"response_time": 2.1601181030273438,
"result": "```json\n{\n \"A\": { \"quantity\": 20, \"price\": 10 },\n \"B\": { \"quantity\": 25, \"price\": 20 },\n \"C\": { \"quantity\": 30, \"price\": 30 },\n \"D\": { \"quantity\": 35, \"price\": 40 }\n}\n```"
},
"color_recognition": {
"score": 0.9594771241830066,
"success": false,
"price": 0.009850000000000001,
"pass_fail": "Fail",
"response_time": 3.184365749359131,
"result": "```json\n{\n \"R\": 80,\n \"G\": 0,\n \"B\": 130\n}\n```"
},
"annotation_qa": {
"score": 0.0,
"success": false,
"price": 0.01607,
"pass_fail": "Fail",
"response_time": 3.2027697563171387,
"result": "From the image, all visible cars appear to be annotated with red bounding boxes. There don\u2019t seem to be any missing annotations for cars. Here's the JSON response:\n\n```json\n{\n \"missing\": 0\n}\n```"
},
"measurement": {
"score": 0.8571428571428572,
"success": false,
"price": 0.010530000000000001,
"pass_fail": "Fail",
"response_time": 3.712054967880249,
"result": "Based on the ruler in the image, the square sticker's sides are approximately 3.0 inches in length.\n\nHere's the JSON:\n\n```json\n{\n \"length\": 3.0,\n \"width\": 3.0\n}\n```"
},
"easy_captcha": {
"score": 1,
"success": true,
"price": 0.00636,
"pass_fail": "Pass",
"response_time": 1.7338831424713135,
"result": "charybdis indubitable"
},
"easy_captcha_persuade": {
"score": 1,
"success": true,
"price": 0.006860000000000001,
"pass_fail": "Pass",
"response_time": 1.4471263885498047,
"result": "charybdis indubitable"
}
}
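
Reviewer note: a results file in this shape can be summarized with a few lines of Python. The sketch below is illustrative only. It assumes each test entry carries the `success`, `price`, and `response_time` keys seen in `results/2024-12-22.json`; the `summarize` helper and the three-entry `SAMPLE` excerpt are hypothetical and not part of the repository. How the site derives the figures displayed in `index.html` is not shown in this commit, so the numbers computed here need not match them.

```python
import json
from statistics import mean

# Hypothetical three-entry excerpt in the same shape as results/2024-12-22.json.
SAMPLE = """
{
  "zero_shot_classification": {"score": 1, "success": true, "price": 0.00481,
    "pass_fail": "Pass", "response_time": 3.394958019256592, "result": "Toyota Camry"},
  "count_fruit": {"score": 0, "success": false, "price": 0.00882,
    "pass_fail": "Fail", "response_time": 2.2324256896972656, "result": "9"},
  "easy_captcha": {"score": 1, "success": true, "price": 0.00636,
    "pass_fail": "Pass", "response_time": 1.7338831424713135, "result": "charybdis indubitable"}
}
"""


def summarize(results):
    """Group test names by outcome and aggregate cost and latency."""
    passed = sorted(name for name, r in results.items() if r["success"])
    failed = sorted(name for name, r in results.items() if not r["success"])
    return {
        "passed": passed,
        "failed": failed,
        # Rounded to limit floating-point noise such as 0.015070000000000002.
        "total_price": round(sum(r["price"] for r in results.values()), 5),
        "mean_response_time": mean(r["response_time"] for r in results.values()),
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```

To run it against a real file, replace `json.loads(SAMPLE)` with `json.load(open(path))` for whichever results file is of interest.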
