Pull requests: openai/evals

Pull requests list

Ice linguistic benchmark
#1561 opened Oct 1, 2024 by bjarkiarmanns
1 task
anthropic_solver.py
#1554 opened Sep 4, 2024 by iHuydang
13 tasks done
Fix a bug in examples/mmlu.ipynb when using gpt-4o or gpt-4o-mini
#1551 opened Aug 25, 2024 by RobinWitch
13 tasks done
Fix the is_chat_model function to work with gpt-4o
#1550 opened Aug 22, 2024 by LoryPack
3 tasks done
Added Icelandic QA evaluation data from news texts
#1548 opened Aug 20, 2024 by thorunna
12 of 13 tasks
Added Icelandic QA evaluation data from Wikipedia
#1547 opened Aug 20, 2024 by thorunna
12 of 13 tasks
Updating make-me-say to be compatible with Solvers
#1546 opened Aug 18, 2024 by lennart-finke
1 task done
Fix Information exposure alert through an exception #1543
#1545 opened Aug 8, 2024 by arpitjain099
13 tasks done
Fix log injection error
#1544 opened Aug 8, 2024 by arpitjain099
13 tasks done
Remove global OpenAI client initialization
#1539 opened Jul 21, 2024 by michaelAlvarino
Fix problematic sample in Schelling Point
#1534 opened May 22, 2024 by JunShern
Update README: Add Langtrace as an Eval vendor
#1531 opened May 21, 2024 by karthikscale3
5 of 13 tasks
Add support for gpt-4o
#1530 opened May 16, 2024 by androettop
show evals in wandb weave
#1522 opened Apr 19, 2024 by yogeshg Draft
13 tasks
Added Quran Eval & Simple Fact Model-Graded Definition
#1511 opened Apr 1, 2024 by sakher
13 tasks done
Add Classification Rule Articulation Eval
#1510 opened Mar 30, 2024 by danesherbs
13 tasks done
eval pattern-concat-logic
#1508 opened Mar 28, 2024 by natanaelwf
13 tasks done
Fix specifying API arguments from the CLI
#1505 opened Mar 27, 2024 by LoryPack
6 tasks done
[Evals] Add eval for Dhivehi diacritical marks
#1495 opened Mar 16, 2024 by aanaseer
11 of 12 tasks
Add **kwargs to OpenAIChatCompletionFn
#1494 opened Mar 15, 2024 by ezraporter
Extending to Azure OpenAI implementation
#1470 opened Feb 23, 2024 by pkt1583
Adding Indian Women Menstrual Health Chatbot Eval
#1430 opened Dec 11, 2023 by cranberrydeveloper
13 tasks done