Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pull request #171

Open
wants to merge 80 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
80 commits
Select commit Hold shift + click to select a range
cb4709c
Simple initial implementation
jyx-su Nov 1, 2024
4228d4d
Add FHIR server yaml
jyx-su Nov 6, 2024
49b063c
script for starting the FHIR server
jyx-su Nov 6, 2024
caee1c1
Merge branch 'main' of github.com:stanfordmlgroup/MedAgentBench-dev
jyx-su Nov 8, 2024
c85e7f0
add ref_sol for task1
jyx-su Nov 8, 2024
24f7192
Move code
jyx-su Nov 8, 2024
26f7cb0
Add example of json
jyx-su Nov 8, 2024
8189592
Add files via upload
Nov 12, 2024
c255351
Add files via upload
Nov 12, 2024
e098fe9
Add files via upload
Nov 12, 2024
e055d16
Add files via upload
Nov 12, 2024
f525ca6
Add files via upload
Nov 12, 2024
e1c668b
Update start_fhir.sh
jyx-su Dec 11, 2024
b8995fe
Add files via upload
Dec 16, 2024
2c457a4
Add files via upload
jyx-su Dec 18, 2024
a4be875
Skip API key upload
jyx-su Jan 1, 2025
c4a2429
Stop tracking API key
jyx-su Jan 1, 2025
53cae92
Add 4o-mini agent
jyx-su Jan 1, 2025
6fac1df
move test data
jyx-su Jan 1, 2025
9f6900e
remove dummy implementation
jyx-su Jan 1, 2025
debfe6a
change task_worker
jyx-su Jan 1, 2025
fc6454b
modify assignment config
jyx-su Jan 1, 2025
379a775
task config
jyx-su Jan 1, 2025
698c573
Add to import
jyx-su Jan 1, 2025
d8a3dc7
add send_get_request helper func
jyx-su Jan 1, 2025
01cdf1f
add verify_fhir func
jyx-su Jan 1, 2025
d984e6c
First working version (no logic yet)
jyx-su Jan 1, 2025
b3ec85f
Add function definitions
jyx-su Jan 2, 2025
8b5a8c6
Add func definitions
jyx-su Jan 2, 2025
0ff8c61
Remove final_answer
jyx-su Jan 2, 2025
568170a
Remove final_answer
jyx-su Jan 2, 2025
bf68e6b
Add prompt
jyx-su Jan 2, 2025
c270ec0
modify prompt
jyx-su Jan 2, 2025
7199234
Catch any error
jyx-su Jan 2, 2025
10718fe
Merge branch 'main' of github.com:stanfordmlgroup/MedAgentBench-dev
jyx-su Jan 2, 2025
36812df
update prompt
jyx-su Jan 2, 2025
076ac3c
Add data read
jyx-su Jan 2, 2025
40ad402
Add implementation logic
jyx-su Jan 2, 2025
fd599b8
json for task1&2
jyx-su Jan 6, 2025
eb7d7a2
grader
jyx-su Jan 6, 2025
008b99e
reference solution
jyx-su Jan 6, 2025
a856725
update prompt
jyx-su Jan 6, 2025
acd2104
Add grader logic
jyx-su Jan 6, 2025
dd991fb
change to full dataset
jyx-su Jan 6, 2025
a075343
Add POST funcs
jyx-su Jan 7, 2025
1a3bbd5
expand to task 1-4
jyx-su Jan 7, 2025
ec6f09c
add datetime format
jyx-su Jan 7, 2025
f0d2135
Modify context for task3-4
jyx-su Jan 7, 2025
a0b6ef4
Increase parallelism
jyx-su Jan 7, 2025
5b01c9d
Update prompt
jyx-su Jan 7, 2025
f21e19d
Update medicationrequest.create
jyx-su Jan 8, 2025
8cb31ed
add 4o and o1-mini
jyx-su Jan 8, 2025
42d28f4
Add get funcs only
jyx-su Jan 8, 2025
4344181
add task3 refsol
jyx-su Jan 8, 2025
7569479
add task4 refsol
jyx-su Jan 8, 2025
986a991
add gemini 2.0 flash model
jyx-su Jan 16, 2025
b1605c8
add llama 3.3
jyx-su Jan 16, 2025
48af4b6
remove get_funcs
jyx-su Jan 17, 2025
427f056
Final v1 func defs
jyx-su Jan 17, 2025
831ba6d
Tasks for v1 release
jyx-su Jan 17, 2025
c073e76
ignore configs with api keys
jyx-su Jan 17, 2025
bf511dc
increase concurrency
jyx-su Jan 17, 2025
46ab3fd
add deepseekv3
jyx-su Jan 17, 2025
307a93c
update model list
jyx-su Jan 17, 2025
0f0069b
update config
jyx-su Jan 17, 2025
ec07ed2
disable filtering
jyx-su Jan 17, 2025
89b3205
update prompt
jyx-su Jan 17, 2025
5e2afc2
bug fix
jyx-su Jan 17, 2025
cbe6260
add sol for task5-10
jyx-su Jan 17, 2025
282aba9
Create funcs_v1.py
jyx-su Jan 17, 2025
06f2c60
Create dummy
jyx-su Jan 18, 2025
9458f40
Create folder
jyx-su Jan 18, 2025
45ae4f9
Upload results
jyx-su Jan 18, 2025
6abe76e
update parser for gemini2.0flash
jyx-su Jan 18, 2025
8545e3e
Merge branch 'main' of github.com:stanfordmlgroup/MedAgentBench-dev
jyx-su Jan 18, 2025
c51de9a
bug fix
jyx-su Jan 22, 2025
57a07ff
remove sleep
jyx-su Jan 22, 2025
67ac027
add models
jyx-su Jan 22, 2025
7429d23
add models
jyx-su Jan 22, 2025
624acfa
remove
jyx-su Jan 22, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -16,4 +16,7 @@ download
src/server/tasks/card_game/result
.dockerfile
.dockerfile-cache
analysis
analysis
configs/agents/openai-chat.yaml
configs/agents/together-chat.yaml
configs/agents/gemini-chat.yaml
329 changes: 329 additions & 0 deletions MedAgentBench/application.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,329 @@
#Uncomment the "servlet" and "context-path" lines below to make the fhir endpoint available at /example/path/fhir instead of the default value of /fhir
server:
# servlet:
# context-path: /example/path
port: 8080
#Adds the option to go to eg. http://localhost:8080/actuator/health for seeing the running configuration
#see https://docs.spring.io/spring-boot/docs/current/reference/html/actuator.html#actuator.endpoints
management:
#The following configuration will enable the actuator endpoints at /actuator/health, /actuator/info, /actuator/prometheus, /actuator/metrics. For security purposes, only /actuator/health is enabled by default.
endpoints:
enabled-by-default: false
web:
exposure:
include: 'health' # or e.g. 'info,health,prometheus,metrics' or '*' for all'
endpoint:
info:
enabled: true
metrics:
enabled: true
health:
enabled: true
probes:
enabled: true
group:
liveness:
include:
- livenessState
- readinessState
prometheus:
enabled: true
prometheus:
metrics:
export:
enabled: true
spring:
main:
allow-circular-references: true
flyway:
enabled: false
baselineOnMigrate: true
fail-on-missing-locations: false
datasource:
url: 'jdbc:h2:/data/test_db;DB_CLOSE_DELAY=-1;AUTO_SERVER=TRUE'
#url: 'jdbc:h2:file:./target/database/h2'
#url: jdbc:h2:mem:test_mem
username: sa
password: null
driverClassName: org.h2.Driver
max-active: 15

# database connection pool size
hikari:
maximum-pool-size: 10
jpa:
properties:
hibernate.format_sql: false
hibernate.show_sql: false

#Hibernate dialect is automatically detected except Postgres and H2.
#If using H2, then supply the value of ca.uhn.fhir.jpa.model.dialect.HapiFhirH2Dialect
#If using postgres, then supply the value of ca.uhn.fhir.jpa.model.dialect.HapiFhirPostgresDialect
hibernate.dialect: ca.uhn.fhir.jpa.model.dialect.HapiFhirH2Dialect
# hibernate.hbm2ddl.auto: update
# hibernate.jdbc.batch_size: 20
# hibernate.cache.use_query_cache: false
# hibernate.cache.use_second_level_cache: false
# hibernate.cache.use_structured_entries: false
# hibernate.cache.use_minimal_puts: false

### These settings will enable fulltext search with lucene or elastic
hibernate.search.enabled: false
### lucene parameters
# hibernate.search.backend.type: lucene
# hibernate.search.backend.analysis.configurer: ca.uhn.fhir.jpa.search.HapiHSearchAnalysisConfigurers$HapiLuceneAnalysisConfigurer
# hibernate.search.backend.directory.type: local-filesystem
# hibernate.search.backend.directory.root: target/lucenefiles
# hibernate.search.backend.lucene_version: lucene_current
### elastic parameters ===> see also elasticsearch section below <===
# hibernate.search.backend.type: elasticsearch
# hibernate.search.backend.analysis.configurer: ca.uhn.fhir.jpa.search.HapiHSearchAnalysisConfigurers$HapiElasticAnalysisConfigurer
hapi:
fhir:
### This flag when enabled to true, will avail evaluate measure operations from CR Module.
### Flag is false by default, can be passed as command line argument to override.
cr:
enabled: false
caregaps:
reporter: "default"
section_author: "default"
cql:
use_embedded_libraries: true
compiler:
### These are low-level compiler options.
### They are not typically needed by most users.
# validate_units: true
# verify_only: false
# compatibility_level: "1.5"
error_level: Info
signature_level: All
# analyze_data_requirements: false
# collapse_data_requirements: false
# translator_format: JSON
# enable_date_range_optimization: true
enable_annotations: true
enable_locators: true
enable_results_type: true
enable_detailed_errors: true
# disable_list_traversal: false
# disable_list_demotion: false
# enable_interval_demotion: false
# enable_interval_promotion: false
# disable_method_invocation: false
# require_from_keyword: false
# disable_default_model_info_load: false
runtime:
debug_logging_enabled: false
# enable_validation: false
# enable_expression_caching: true
terminology:
valueset_preexpansion_mode: REQUIRE # USE_IF_PRESENT, REQUIRE, IGNORE
valueset_expansion_mode: PERFORM_NAIVE_EXPANSION # AUTO, USE_EXPANSION_OPERATION, PERFORM_NAIVE_EXPANSION
valueset_membership_mode: USE_EXPANSION # AUTO, USE_VALIDATE_CODE_OPERATION, USE_EXPANSION
code_lookup_mode: USE_VALIDATE_CODE_OPERATION # AUTO, USE_VALIDATE_CODE_OPERATION, USE_CODESYSTEM_URL
data:
search_parameter_mode: FILTER_IN_MEMORY # AUTO, USE_SEARCH_PARAMETERS, FILTER_IN_MEMORY
terminology_parameter_mode: FILTER_IN_MEMORY # AUTO, USE_VALUE_SET_URL, USE_INLINE_CODES, FILTER_IN_MEMORY
profile_mode: DECLARED # ENFORCED, DECLARED, OPTIONAL, TRUST, OFF

cdshooks:
enabled: false
clientIdHeaderName: client_id

### This enables the swagger-ui at /fhir/swagger-ui/index.html as well as the /fhir/api-docs (see https://hapifhir.io/hapi-fhir/docs/server_plain/openapi.html)
openapi_enabled: true
### This is the FHIR version. Choose between, DSTU2, DSTU3, R4 or R5
fhir_version: R4
### Flag is false by default. This flag enables runtime installation of IG's.
ig_runtime_upload_enabled: false
### This flag when enabled to true, will avail evaluate measure operations from CR Module.

### enable to use the ApacheProxyAddressStrategy which uses X-Forwarded-* headers
### to determine the FHIR server address
# use_apache_address_strategy: false
### forces the use of the https:// protocol for the returned server address.
### alternatively, it may be set using the X-Forwarded-Proto header.
# use_apache_address_strategy_https: false
### enables the server to overwrite defaults on HTML, css, etc. under the url pattern of eg. /content/custom **
### Folder with custom content MUST be named custom. If omitted then default content applies
#custom_content_path: ./custom
### enables the server host custom content. If e.g. the value ./configs/app is supplied then the content
### will be served under /web/app
#app_content_path: ./configs/app
### enable to set the Server URL
# server_address: http://hapi.fhir.org/baseR4
# defer_indexing_for_codesystems_of_size: 101
# install_transitive_ig_dependencies: true
#implementationguides:
### example from registry (packages.fhir.org)
# swiss:
# name: swiss.mednet.fhir
# version: 0.8.0
# reloadExisting: false
# installMode: STORE_AND_INSTALL
# example not from registry
# ips_1_0_0:
# packageUrl: https://build.fhir.org/ig/HL7/fhir-ips/package.tgz
# name: hl7.fhir.uv.ips
# version: 1.0.0
# supported_resource_types:
# - Patient
# - Observation
##################################################
# Allowed Bundle Types for persistence (defaults are: COLLECTION,DOCUMENT,MESSAGE)
##################################################
# allowed_bundle_types: COLLECTION,DOCUMENT,MESSAGE,TRANSACTION,TRANSACTIONRESPONSE,BATCH,BATCHRESPONSE,HISTORY,SEARCHSET
# allow_cascading_deletes: true
# allow_contains_searches: true
# allow_external_references: true
# allow_multiple_delete: true
# allow_override_default_search_params: true
# auto_create_placeholder_reference_targets: false
# mass_ingestion_mode_enabled: false
### tells the server to automatically append the current version of the target resource to references at these paths
# auto_version_reference_at_paths: Device.patient, Device.location, Device.parent, DeviceMetric.parent, DeviceMetric.source, Observation.device, Observation.subject
# ips_enabled: false
# default_encoding: JSON
# default_pretty_print: true
# default_page_size: 20
# delete_expunge_enabled: true
# enable_repository_validating_interceptor: true
# enable_index_missing_fields: false
# enable_index_of_type: true
# enable_index_contained_resource: false
# upliftedRefchains_enabled: true
# resource_dbhistory_enabled: false
### !!Extended Lucene/Elasticsearch Indexing is still a experimental feature, expect some features (e.g. _total=accurate) to not work as expected!!
### more information here: https://hapifhir.io/hapi-fhir/docs/server_jpa/elastic.html
advanced_lucene_indexing: false
bulk_export_enabled: false
bulk_import_enabled: false
# language_search_parameter_enabled: true
# enforce_referential_integrity_on_delete: false
# This is an experimental feature, and does not fully support _total and other FHIR features.
# enforce_referential_integrity_on_delete: false
# enforce_referential_integrity_on_write: false
# etag_support_enabled: true
# expunge_enabled: true
# client_id_strategy: ALPHANUMERIC
# server_id_strategy: SEQUENTIAL_NUMERIC
# fhirpath_interceptor_enabled: false
# filter_search_enabled: true
# graphql_enabled: true
narrative_enabled: false
mdm_enabled: false
mdm_rules_json_location: "mdm-rules.json"
# local_base_urls:
# - https://hapi.fhir.org/baseR4
logical_urls:
- http://terminology.hl7.org/*
- https://terminology.hl7.org/*
- http://snomed.info/*
- https://snomed.info/*
- http://unitsofmeasure.org/*
- https://unitsofmeasure.org/*
- http://loinc.org/*
- https://loinc.org/*
# partitioning:
# allow_references_across_partitions: false
# partitioning_include_in_search_hashes: false
cors:
allow_Credentials: true
# These are allowed_origin patterns, see: https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/web/cors/CorsConfiguration.html#setAllowedOriginPatterns-java.util.List-
allowed_origin:
- '*'

# Search coordinator thread pool sizes
search-coord-core-pool-size: 20
search-coord-max-pool-size: 100
search-coord-queue-capacity: 200

# Search Prefetch Thresholds.

# This setting sets the number of search results to prefetch. For example, if this list
# is set to [100, 1000, -1] then the server will initially load 100 results and not
# attempt to load more. If the user requests subsequent page(s) of results and goes
# past 100 results, the system will load the next 900 (up to the following threshold of 1000).
# The system will progressively work through these thresholds.
# A threshold of -1 means to load all results. Note that if the final threshold is a
# number other than -1, the system will never prefetch more than the given number.
search_prefetch_thresholds: 13,503,2003,-1

# comma-separated package names, will be @ComponentScan'ed by Spring to allow for creating custom Spring beans
#custom-bean-packages:

# comma-separated list of fully qualified interceptor classes.
# classes listed here will be fetched from the Spring context when combined with 'custom-bean-packages',
# or will be instantiated via reflection using an no-arg contructor; then registered with the server
#custom-interceptor-classes:

# comma-separated list of fully qualified provider classes.
# classes listed here will be fetched from the Spring context when combined with 'custom-bean-packages',
# or will be instantiated via reflection using an no-arg contructor; then registered with the server
#custom-provider-classes:

# Threadpool size for BATCH'ed GETs in a bundle.
# bundle_batch_pool_size: 10
# bundle_batch_pool_max_size: 50

# logger:
# error_format: 'ERROR - ${requestVerb} ${requestUrl}'
# format: >-
# Path[${servletPath}] Source[${requestHeader.x-forwarded-for}]
# Operation[${operationType} ${operationName} ${idOrResourceName}]
# UA[${requestHeader.user-agent}] Params[${requestParameters}]
# ResponseEncoding[${responseEncodingNoDefault}]
# log_exceptions: true
# name: fhirtest.access
# max_binary_size: 104857600
# max_page_size: 200
# retain_cached_searches_mins: 60
# reuse_cached_search_results_millis: 60000
tester:
home:
name: Local Tester
server_address: 'http://localhost:8080/fhir'
refuse_to_fetch_third_party_urls: false
fhir_version: R4
global:
name: Global Tester
server_address: "http://hapi.fhir.org/baseR4"
refuse_to_fetch_third_party_urls: false
fhir_version: R4
# validation:
# requests_enabled: true
# responses_enabled: true
# binary_storage_enabled: true
inline_resource_storage_below_size: 4000
# bulk_export_enabled: true
# subscription:
# resthook_enabled: true
# websocket_enabled: false
# email:
# from: some@test.com
# host: google.com
# port:
# username:
# password:
# auth:
# startTlsEnable:
# startTlsRequired:
# quitWait:
# lastn_enabled: true
# store_resource_in_lucene_index_enabled: true
### This is configuration for normalized quantity search level default is 0
### 0: NORMALIZED_QUANTITY_SEARCH_NOT_SUPPORTED - default
### 1: NORMALIZED_QUANTITY_STORAGE_SUPPORTED
### 2: NORMALIZED_QUANTITY_SEARCH_SUPPORTED
# normalized_quantity_search_level: 2
#elasticsearch:
# debug:
# pretty_print_json_log: false
# refresh_after_write: false
# enabled: false
# password: SomePassword
# required_index_status: YELLOW
# rest_url: 'localhost:9200'
# protocol: 'http'
# schema_management_strategy: CREATE
# username: SomeUsername
Loading