CRAFT-MD

CRAFT-MD is a robust and scalable evaluation framework designed to assess the conversational reasoning capabilities of clinical Large Language Models (LLMs) in real-world scenarios, going beyond traditional accuracy metrics derived from exam-style questions. The framework simulates doctor-patient interactions, where the clinical LLM's ability to gather medical histories, synthesize information, and arrive at accurate diagnoses is evaluated through a multi-agent setup. This setup includes a patient-AI, a grader-AI, and validation by medical experts to ensure the reliability of the results.
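The multi-agent setup described above can be pictured as a simple conversation loop. The following is a minimal sketch, not the CRAFT-MD implementation: every name in it (`run_case`, `scripted_doctor`, `scripted_patient`, `exact_match_grader`, `FINAL_PREFIX`) is hypothetical, and the three agents are hard-coded stand-ins for what would be LLM calls in a real evaluation. It assumes the clinical LLM signals its answer with a fixed prefix and that the grader compares that answer against a reference diagnosis.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical names throughout; none of this is taken from the CRAFT-MD code.
Turn = Tuple[str, str]  # (speaker, text)
FINAL_PREFIX = "Final diagnosis:"  # assumed marker for the doctor's final answer


def run_case(
    doctor: Callable[[List[Turn]], str],
    patient: Callable[[List[Turn]], str],
    grader: Callable[[str, str], bool],
    reference: str,
    max_turns: int = 5,
) -> Tuple[List[Turn], Optional[str], bool]:
    """Alternate doctor and patient turns, then grade the final diagnosis."""
    history: List[Turn] = []
    diagnosis: Optional[str] = None
    for _ in range(max_turns):
        doctor_msg = doctor(history)
        history.append(("doctor", doctor_msg))
        if doctor_msg.startswith(FINAL_PREFIX):
            diagnosis = doctor_msg[len(FINAL_PREFIX):].strip()
            break
        history.append(("patient", patient(history)))
    # A case with no diagnosis within max_turns is counted as incorrect.
    correct = diagnosis is not None and grader(diagnosis, reference)
    return history, diagnosis, correct


def scripted_doctor(history: List[Turn]) -> str:
    """Stand-in for the clinical LLM: asks two questions, then answers."""
    questions = ["What symptoms do you have?", "How long has this lasted?"]
    asked = sum(1 for speaker, _ in history if speaker == "doctor")
    if asked < len(questions):
        return questions[asked]
    return FINAL_PREFIX + " migraine"


def scripted_patient(history: List[Turn]) -> str:
    """Stand-in for the patient-AI: answers from a fixed case vignette."""
    answers = {
        "What symptoms do you have?": "A throbbing headache and nausea.",
        "How long has this lasted?": "About six hours.",
    }
    last_question = history[-1][1]
    return answers.get(last_question, "I am not sure.")


def exact_match_grader(diagnosis: str, reference: str) -> bool:
    """Stand-in for the grader-AI: case-insensitive exact match."""
    return diagnosis.strip().lower() == reference.strip().lower()


history, diagnosis, correct = run_case(
    scripted_doctor, scripted_patient, exact_match_grader, reference="Migraine"
)
print(f"{len(history)} turns, diagnosis={diagnosis!r}, correct={correct}")
```

Passing the agents in as plain callables keeps the loop independent of any particular model API. The validation by medical experts is a human review step and is not modeled here.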

Visit CRAFT-MD Live Leaderboard

Contact

For any questions or suggestions, please open an issue or reach out to us at sjohri@g.harvard.edu

Folders and files (4 commits)

.github/workflows
bower_components
example_files
explore
javascripts
models
results
stylesheets
.gitignore
CNAME
LICENSE
README.md
cite.txt
favicon.ico
generate_html_from_csv.py
generate_index_html.py
generate_rank_from_csv.py
generate_seperate_html_from_csv.py
generate_seperate_html_from_csv_backup.py
google81e9c2b1e55b836b.html
index.html
logo.png
package.json
server.js
sitemap.xml

rajpurkarlab/craft-md-pages
