v1.13.0 release PR #482

Merged
merged 100 commits into from
Aug 28, 2024
4d36f8a
Pushing changes for bert-style models for MetaCAT
shubham-s-agarwal Apr 19, 2024
da9ab06
Pushing fix for LSTM
shubham-s-agarwal Apr 19, 2024
cb65fc3
Pushing changes for flake8 and type fixes
shubham-s-agarwal Apr 19, 2024
869eeae
Pushing type fixes
shubham-s-agarwal Apr 19, 2024
3e02eed
Fixing type issue
shubham-s-agarwal Apr 19, 2024
c899c9c
Pushing changes
shubham-s-agarwal Apr 22, 2024
d1321b8
Pushing change and type fixes
shubham-s-agarwal Apr 22, 2024
9091d9b
Fixing flake8 issues
shubham-s-agarwal Apr 22, 2024
c57dcfe
Pushing flake8 fixes
shubham-s-agarwal Apr 23, 2024
364fdd4
Pushing fixes for flake8
shubham-s-agarwal Apr 23, 2024
7272168
Pushing flake8 fix
shubham-s-agarwal Apr 23, 2024
619c565
Adding peft to list of libraries
shubham-s-agarwal Apr 23, 2024
2a546c3
Pushing changes with load and train workflow and type fixes
shubham-s-agarwal Apr 30, 2024
8efd2a9
Pushing changes with type hints and new documentation
shubham-s-agarwal May 7, 2024
aa5044e
Pushing type fix
shubham-s-agarwal May 7, 2024
fcdc867
Fixing type issue
shubham-s-agarwal May 7, 2024
88ee8e7
Adding test case for BERT and reverting config changes
shubham-s-agarwal May 7, 2024
917dca2
Merging changes from master to metacat_bert branch (#431)
shubham-s-agarwal May 8, 2024
563c3d4
Merge branch 'master' into metacat_bert
mart-r May 8, 2024
decfbfb
Pushing changed tests and removing empty change
shubham-s-agarwal May 8, 2024
fbcdb70
Pushing change for logging
shubham-s-agarwal May 9, 2024
2657515
Revert "Pushing change for logging"
shubham-s-agarwal May 14, 2024
fbe9745
Merge pull request #419 from CogStack/metacat_bert
shubham-s-agarwal May 16, 2024
e46dca8
CU-8694hukwm: Document the materialising of generator when multiproce…
mart-r May 22, 2024
2872d5e
CU-8694fk90t (almost) only primitive config (#425)
mart-r May 22, 2024
8e7c77b
CU-8694gza88: Create codeql.yml (#434)
mart-r May 22, 2024
0c8f5a8
CU-8694mbn03: Remove the web app (#441)
mart-r May 29, 2024
20d2bce
CU-8694n48uw better deprecation (#443)
mart-r May 29, 2024
9d6a4e0
CU-8694pey4u: extract cdb load to cls method, to be used in trainer f…
tomolopolis May 29, 2024
61b5979
CU-8694pey4u: extract meta cat loading also to a cls method
tomolopolis May 29, 2024
6f8db03
Merge branch 'master' into load-cdb-cls-method
tomolopolis May 30, 2024
f836f65
CU-8694pey4u: docstrings
tomolopolis May 30, 2024
bc78a53
CU-8694pey4u: typehints and mypy issues
tomolopolis May 30, 2024
cf8a1ef
CU-8694pey4u: fix flake8
tomolopolis May 30, 2024
c6f0658
CU-8694pey4u: fix flake8
tomolopolis May 30, 2024
db7259a
Merge pull request #446 from CogStack/load-cdb-cls-method
tomolopolis May 30, 2024
e07b9d9
CU-8694pey4u: missing extra config if passed in
tomolopolis May 30, 2024
35de5ea
Merge pull request #448 from CogStack/load-cdb-cls-method
tomolopolis May 30, 2024
7a41641
CU-8694py1jr: Fix issue with reuse of opened file when loading old co…
mart-r May 31, 2024
03a5b56
CU-8694py1jr: Make old config identifier more robust
mart-r May 31, 2024
8e3c3c2
CU-8694py1jr: Add doc string to old config identifier
mart-r May 31, 2024
730cc14
CU-8694py1jr: Add test for old style MetaCAT config load
mart-r May 31, 2024
65a0c60
CU-8694py1jr: Add test for old style main config load (functional)
mart-r May 31, 2024
1a9df4d
CU-8694py1jr: Refactor config utils load tests for more flexibility
mart-r May 31, 2024
0a10265
CU-8694py1jr: Add config utils load tests for NER and Rel CAT configs
mart-r May 31, 2024
03c6881
Merge pull request #449 from CogStack/CU-8694py1jr-fix-other-configs-…
tomolopolis Jun 3, 2024
91ae2dd
CU-8694vcvz7: Trust remote code when loading transfomers NER dataset …
mart-r Jun 19, 2024
e11c1da
CU-8694gzbn3 k fold metrics (#432)
mart-r Jun 19, 2024
e4715ae
CU-8693n892x environment/dependency snapshots (#438)
mart-r Jun 19, 2024
df9f225
CU-8694p8y0k deprecation GHA check (#445)
mart-r Jun 19, 2024
1c3628d
CU-8694u3yd2 cleanup name removal (#450)
mart-r Jun 19, 2024
7058ec4
CU-8694vte2g 1.12 depr removal (#454)
mart-r Jun 19, 2024
3603dd2
Resync master with Production after 1.12 release (#457)
mart-r Jun 20, 2024
97389b4
CU-86951923u: Add option for simplified hash along with a few tests (…
mart-r Jul 17, 2024
018fd7a
CU-8694vbw6y k-fold stats Standard Deviation (#459)
mart-r Jul 18, 2024
96706c8
CU-8694wh3d5 track usage (#458)
mart-r Jul 23, 2024
396910f
Documentation update for chunking
shubham-s-agarwal Jul 24, 2024
132efcb
Updating the logging message
shubham-s-agarwal Jul 24, 2024
1eb008d
Pushing change for lazy evaluation
shubham-s-agarwal Jul 24, 2024
d20bf25
Merge pull request #466 from CogStack/deid_logging_update
shubham-s-agarwal Jul 29, 2024
cf285d0
CU-869588fdc: Comment on blis 1.0.0 (to show GHA failure)
mart-r Jul 29, 2024
7b764fe
CU-869588fdc: Pin blis to below 1.0.0
mart-r Jul 29, 2024
bf4cf63
CU-8694vv985 transitive deps (#463)
mart-r Jul 29, 2024
484c244
Merge pull request #471 from CogStack/CU-869588fdc-pin-blis
tomolopolis Jul 31, 2024
aee6bf6
Changes to documentation for metacat
shubham-s-agarwal Aug 2, 2024
1619f1c
Update config_meta_cat.py
shubham-s-agarwal Aug 2, 2024
838370e
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
f3f3a6a
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
51f65fb
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
f56116b
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
042931a
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
1e087ef
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
5969e11
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
2c20864
Pushing formatting changes
shubham-s-agarwal Aug 6, 2024
7b01c1a
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
4b3c024
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
2541cae
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
9eb6376
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
2737ced
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
f3289e1
Revert "Update meta_cat.py"
shubham-s-agarwal Aug 6, 2024
18fa925
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
9a9ca71
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
9e52002
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
71dfbae
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
d6a3bab
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
4d73b1a
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
5328cba
Fixing flake8 issues
shubham-s-agarwal Aug 7, 2024
289e68d
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
33e32fd
Merge pull request #472 from CogStack/metacat_documentation_upd
shubham-s-agarwal Aug 8, 2024
267cd4f
Fixing bug for metacat
shubham-s-agarwal Aug 8, 2024
b7658ee
Merge pull request #474 from CogStack/metacat_bug_resolve
shubham-s-agarwal Aug 9, 2024
005796a
CU-86956duhb: Add method to backport a model pack from 1.12 to previo…
mart-r Aug 12, 2024
76c2fa2
CU-8694cd9t2: Allow merging config into model pack config before init…
mart-r Aug 12, 2024
c82ad4b
CU-8694fwyje: Update all configs with pre-load parts documented (#473)
mart-r Aug 12, 2024
62e603a
Use the loaded model hash for usage monitor instead of recalculating it
mart-r Aug 22, 2024
c907c97
Merge pull request #477 from CogStack/usageMonitorHashRecalcFix
tomolopolis Aug 22, 2024
209c5e4
fixed issue where the key name has not been declared in name2cuis2sta…
adam-sutton-1992 Aug 27, 2024
7862182
CU-86956du3q revisit regression (#470)
mart-r Aug 28, 2024
6d1247a
CU-8695hydt9: Fix various typos (#480)
mart-r Aug 28, 2024
540224c
CU-8695j1be2: Remove deprecated method on CDB (#481)
mart-r Aug 28, 2024
95 changes: 95 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,95 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"

on:
push:
branches: [ "master" ]
pull_request:
branches: [ "master" ]
schedule:
- cron: '36 14 * * 0'

jobs:
analyze:
name: Analyze (${{ matrix.language }})
# Runner size impacts CodeQL analysis time. To learn more, please see:
# - https://gh.io/recommended-hardware-resources-for-running-codeql
# - https://gh.io/supported-runners-and-hardware-resources
# - https://gh.io/using-larger-runners (GitHub.com only)
# Consider using larger runners or machines with greater resources for possible analysis time improvements.
runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
timeout-minutes: ${{ (matrix.language == 'swift' && 120) || 360 }}
permissions:
# required for all workflows
security-events: write

# required to fetch internal or private CodeQL packs
packages: read

# only required for workflows in private repositories
actions: read
contents: read

strategy:
fail-fast: false
matrix:
include:
- language: javascript-typescript
build-mode: none
- language: python
build-mode: none
# CodeQL supports the following keywords for 'language': 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift'
# Use `c-cpp` to analyze code written in C, C++ or both
# Use 'java-kotlin' to analyze code written in Java, Kotlin or both
# Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both
# To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,
# see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.
# If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how
# your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
steps:
- name: Checkout repository
uses: actions/checkout@v4

# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.

# For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
# queries: security-extended,security-and-quality

# If the analyze step fails for one of the languages you are analyzing with
# "We were unable to automatically build your code", modify the matrix above
# to set the build mode to "manual" for that language. Then modify this step
# to build your code.
# ℹ️ Command-line programs to run using the OS shell.
# 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
- if: matrix.build-mode == 'manual'
shell: bash
run: |
echo 'If you are using a "manual" build mode for one or more of the' \
'languages you are analyzing, replace this with the commands to build' \
'your code, for example:'
echo ' make bootstrap'
echo ' make release'
exit 1

- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"
30 changes: 28 additions & 2 deletions .github/workflows/main.yml
@@ -33,7 +33,33 @@ jobs:
flake8 medcat
- name: Test
run: |
timeout 17m python -m unittest discover
all_files=$(git ls-files | grep '^tests/.*\.py$' | grep -v '/__init__\.py$' | sed 's/\.py$//' | sed 's/\//./g')
num_files=$(echo "$all_files" | wc -l)
midpoint=$((num_files / 2))
first_half_nl=$(echo "$all_files" | head -n $midpoint)
second_half_nl=$(echo "$all_files" | tail -n +$(($midpoint + 1)))
timeout 25m python -m unittest ${first_half_nl[@]}
timeout 25m python -m unittest ${second_half_nl[@]}
- name: Regression
run: source tests/resources/regression/run_regression.sh
- name: Get the latest release version
id: get_latest_release
uses: actions/github-script@v6
with:
script: |
const latestRelease = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
core.setOutput('latest_version', latestRelease.data.tag_name);

- name: Make sure there's no deprecated methods that should be removed.
# only run this for master -> production PR. I.e just before doing a release.
if: github.event.pull_request.base.ref == 'main' && github.event.pull_request.head.ref == 'production'
env:
VERSION: ${{ steps.get_latest_release.outputs.latest_version }}
run: |
python tests/check_deprecations.py "$VERSION" --next-version --remove-prefix
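
The deprecation step above passes the latest release tag to `tests/check_deprecations.py` with `--next-version` and `--remove-prefix`. The script itself is not shown in this diff, so the following is only a sketch of one plausible reading of those flags: strip a leading `v` from the tag and check against the next minor version. The function name `next_minor_version` is hypothetical and is not taken from the repository.

```python
def next_minor_version(tag: str, remove_prefix: bool = True) -> str:
    """Return the next minor version for a release tag such as 'v1.12.0'.

    Assumption: tags look like '[v]MAJOR.MINOR.PATCH'. This mirrors what the
    flag names suggest, not the actual logic of check_deprecations.py.
    """
    # Drop the leading 'v' if requested (the assumed effect of --remove-prefix).
    version = tag[1:] if remove_prefix and tag.startswith("v") else tag
    major, minor, _patch = (int(part) for part in version.split("."))
    # Bump the minor version and reset the patch (assumed effect of --next-version).
    return f"{major}.{minor + 1}.0"


if __name__ == "__main__":
    print(next_minor_version("v1.12.0"))
```

Under this reading, a deprecated method scheduled for removal in the upcoming release would fail the check before the release PR is merged.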

publish-to-test-pypi:

@@ -43,7 +69,7 @@ jobs:
github.event_name == 'push' &&
startsWith(github.ref, 'refs/tags') != true
runs-on: ubuntu-20.04
timeout-minutes: 20
timeout-minutes: 45
concurrency: publish-to-test-pypi
needs: [build]

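
The new `Test` step replaces `unittest discover` with two explicit `python -m unittest` runs, each over half of the tracked test modules. As a rough illustration of what the shell pipeline computes, here is a Python sketch of the same filtering and splitting. The function name `split_test_modules` is hypothetical; the workflow itself uses `git ls-files`, `grep`, `sed`, `head` and `tail`.

```python
from typing import List, Tuple


def split_test_modules(paths: List[str]) -> Tuple[List[str], List[str]]:
    """Turn tracked file paths into dotted test module names and halve them."""
    modules = [
        # 'tests/utils/test_x.py' -> 'tests.utils.test_x'
        path[: -len(".py")].replace("/", ".")
        for path in paths
        if path.startswith("tests/")
        and path.endswith(".py")
        and not path.endswith("/__init__.py")
    ]
    # Integer division, as in the shell arithmetic $((num_files / 2)).
    midpoint = len(modules) // 2
    return modules[:midpoint], modules[midpoint:]


if __name__ == "__main__":
    tracked = [
        "tests/test_a.py",
        "tests/__init__.py",
        "tests/utils/test_b.py",
        "medcat/cat.py",
        "tests/test_c.py",
    ]
    first, second = split_test_modules(tracked)
    print(first, second)
```

With an odd number of modules the second half is one module larger, which matches `head -n $midpoint` followed by `tail -n +$(($midpoint + 1))`.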
8 changes: 7 additions & 1 deletion .github/workflows/production.yml
@@ -28,7 +28,13 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install -r requirements-dev.txt
python -m unittest discover
all_files=$(git ls-files | grep '^tests/.*\.py$' | grep -v '/__init__\.py$' | sed 's/\.py$//' | sed 's/\//./g')
num_files=$(echo "$all_files" | wc -l)
midpoint=$((num_files / 2))
first_half_nl=$(echo "$all_files" | head -n $midpoint)
second_half_nl=$(echo "$all_files" | tail -n +$(($midpoint + 1)))
timeout 25m python -m unittest ${first_half_nl[@]}
timeout 25m python -m unittest ${second_half_nl[@]}

- name: Install pypa/build
run: >-
215 changes: 139 additions & 76 deletions configs/default_regression_tests.yml
@@ -1,79 +1,142 @@
# # Example of some test cases
# # They will try to cover as many possible use cases as possible
# # The idea is that the CUI corresponding to the name is expected to be
# # obtained by MedCAT
# # Only the 'filters' under 'targeting' and the 'phrases' under
# # the test case are the two required sections, the rest is optional
#
# test-case-name-1: # name of this test case
# targeting: # info regarding targets of this test case
# strategy: "ALL" # the strategy for dealing with the filters below
# # so "ALL" means the targets need to match all the below filters
# # and "ANY" means that the targets need to match at least one of the filters
# # if only one type of target is specified, this is irrelevant
# # the default value is "ALL" if not specified
# prefname-only: False # set to True if only preferred names should be checked (defaults to False)
# targfiltersets: # the filters for this specific test case
# # there has to be one type of target, but multiple can be specified
# # if multiple types are targeted, the strategy defined above takes effect
# # each type can specify one or multiple values
# # this example has one value
# # the next example (below) will have multiple values
# type_id: "0123" # type_id or type_ids
# cui: "01230" # the target CUI (or list of CUIS)
# name: "name0" # the target names
# # all specified names need to exist within the CDB
# phrases: "The quick brown %s jumped over the lazy cat" # the phrases to go through
# # for each phrase, '%s' is replaced
# # by each name that is to be tested
# test-case-name-2: # name of this test case
# targeting:
# filters:
# type_id: # multiple target type IDs
# - "123"
# - "223"
# cui: # multiple target CUI
# - "1234"
# - "2234"
# name: # multiple names
# - "name1"
# - "name2"
# cui_and_children: # an example with CUI and children
# cui: '111' # the CUI (or CUIs)
# depth: 2 # and the depth of children
# phrases:
# - "The %s was measured"
# - "The %s was not measured"
#
# # The following example was (rather arbitrarily) created and should work for
# # the included SNOMED models
test-case-1:
targeting:
strategy: "ALL"
filters:
type_id: "2680757"
phrases:
- "The %s was measured"
# this is an example test case
# it is based on SNOMED-CT
test-case-1: # The (somewhat) arbitrary name of the test case
targeting: # the description of the replacement targets in the phrase(s)
placeholders: # the placeholders to replace in the phrase(s)
# Note that only 1 concept will be tested for at one time.
# So if the phrase(s) has/have more than 1 placeholder, the
# rest of them will be substituted in without care for whether
# or how accurately the model is able to recognise them.
# For the concepts that are not under test at a given time
# the "first" name is used (because the implementation has
# names in a set, there is possibility for run-to-run variance
# because of different names being used).
#
# There are 2 modes for the placeholders:
# 1. any-combination: false
# In this mode, only the concepts in the same position
# in the various lists are used in conjunction with one another.
# Though this also means that it is expected that all of the
# placeholders have the same number of CUIs to use.
# Assuming each of the N placeholders defines M replacement
# cuis, this approach produces M*N cases.
# 2. any-combination: true
# In this mode, any combination of the replacement CUIs is
# allowed. This means that quite a few different combinations
# will be generated and used. It also means that different
# placeholders can have different number of concepts suitable
# for them.
# Assuming each of the N placeholders defines M replacement
# cuis, this approach produces N * M^N (where `^` is power)
# cases. But for a more complicated set up (i.e where different
# placeholders have a different number of swappable CUIs)
# this calculation is not as straightforward.
#
# NOTE: The above description does not take into account different
# number of names associated with different concepts. For each
# of the "primary" concepts, each possible name is attempted.
- placeholder: '[DISORDER]' # the placeholder that will be substituted in the phrase(s)
cuis: ['4473006', # Intracerebral hemorrhage
'85189001', # Acute appendicitis
'186738001', # vestibular neuritis
'186738001', # vestibular neuritis
]
- placeholder: '[FINDING1]'
cuis: ['162300006', # unilateral headache
'21522001', # abdominal pain
'103298005', # severe vertigo
'103298005', # severe vertigo
]
prefname-only: false # this is an optional keyword for each placeholder
# if set to true, only the preferred name will be used for
# this concept. Otherwise, all names will be used as
# different sub-cases
- placeholder: '[FINDING2]'
cuis: ['409668002', # photophobia
'422587007', # nausea
'422587007', # nausea
'422587007', # nausea
]
- placeholder: '[FINDING3]'
cuis: ['2228002', # scintillating scotoma
'386661006', # fever
'81756001', # horizontal nystagmus
'81756001', # horizontal nystagmus
]
- placeholder: '[NEGFINDING]'
cuis: ['386661006', # fever
'62315008', # diarrhea
'15188001', # hearing loss
'60862001', # tinnitus
]
any-combination: false # if set to false, same length of CUIs is expected
# for each placeholder and only a combination is used
phrases: # The list of phrases
- >
Description: [DISORDER]

CC: [FINDING1] on presentation; then developed [FINDING3]

HX: On the day of presentation, this 32 y/o RHM suddenly developed [FINDING1] and [FINDING2].
Four hours later he experienced sudden [FINDING3] lasting two hours.
There were no other associated symptoms except for the [FINDING1] and [FINDING2].
He denied [NEGFINDING].
test-case-2:
targeting:
filters:
type_id: "9090192"
phrases:
- "Patient presented with %s"
- "No %s was present"
test-case-3:
targeting:
filters:
type_id: "67667581"
phrases:
- "The patient has been diagnosed with %s"
- "There are no signs of %s"
test-case-4:
targeting:
strategy: "ALL"
filters:
cui_and_children:
cui: "364075005" # 'heart rate'
depth: 4 # and children 4 deep
placeholders:
- placeholder: '[FINDING1]'
cuis: ['49727002', # cough
'29857009', # chest pain
'21522001', # abdominal pain
'57676002', # joint pain
'25064002', # headache
'271807003', # fever
'162397003', # hematuria (blood in urine)
'271757001', # fatigue
'386661006', # weight loss
'62315008', # dysuria (painful urination)
]
- placeholder: '[FINDING2]'
cuis: ['267036007', # shortness of breath
'68962001', # palpitations
'422587007', # nausea
'182888003', # swelling
'404640003', # dizziness
'422400008', # sore throat
'267036007', # shortness of breath
'267064002', # night sweats
'162607003', # back pain
'267102003', # urinary frequency
]
- placeholder: '[DISORDER]'
cuis: ['195967001', # asthma
'194828000', # angina pectoris
'25374005', # gastroenteritis
'69896004', # rheumatoid arthritis
'37796009', # migraine
'186747009', # influenza
'106063007', # urinary tract infection
'444814009', # chronic fatigue syndrome
'95281007', # tuberculosis
'431855005', # cystitis
]
any-combination: false
phrases:
- "The patient's %s was 82 bps"
- >
The patient presents with [FINDING1] and [FINDING2]. These findings are suggestive of [DISORDER].
Further diagnostic evaluation and investigations are required to confirm the diagnosis.
- >
The patient reports [FINDING1] and has also been experiencing [FINDING2]. These symptoms are consistent with a clinical presentation of [DISORDER].
Further assessment and diagnostic tests are required to establish the underlying cause.
- >
Upon evaluation, the patient exhibits [FINDING1] along with [FINDING2]. This combination of findings raises suspicion for [DISORDER].
Comprehensive diagnostic workup is advised to confirm the diagnosis and plan appropriate management.
- >
During the consultation, the patient described [FINDING1] and noted a recent history of [FINDING2]. These clinical features are suggestive of [DISORDER].
Further investigation is necessary to verify the diagnosis and rule out other potential causes.
- >
The patient's symptoms include [FINDING1] and [FINDING2], which are commonly associated with [DISORDER].
It is recommended that additional diagnostic procedures be performed to confirm this working diagnosis.
- >
The clinical presentation of [FINDING1] and [FINDING2] is indicative of [DISORDER].
To ensure accurate diagnosis, further clinical evaluation and diagnostic tests are required.
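
The comments in `test-case-1` describe two placeholder modes. A small Python sketch can make the resulting case counts concrete. It assumes that every combination of CUIs is tested once per placeholder (the concept "under test"), ignores the per-concept name expansion, and treats repeated CUIs in a list as distinct entries. The function name `count_cases` is hypothetical and this is not the regression suite's implementation.

```python
from itertools import product
from typing import Dict, List


def count_cases(placeholder_cuis: Dict[str, List[str]], any_combination: bool) -> int:
    """Count (CUI combination, placeholder under test) pairs."""
    cui_lists = list(placeholder_cuis.values())
    if any_combination:
        # Every CUI of one placeholder may be paired with every CUI of the others.
        combinations = list(product(*cui_lists))
    else:
        # Only CUIs in the same list position are used together,
        # so all lists must have the same length.
        if len({len(cuis) for cuis in cui_lists}) != 1:
            raise ValueError("all placeholders need the same number of CUIs")
        combinations = list(zip(*cui_lists))
    # Each combination is tested once per placeholder.
    return len(combinations) * len(cui_lists)


if __name__ == "__main__":
    example = {
        "[DISORDER]": ["4473006", "85189001"],
        "[FINDING1]": ["162300006", "21522001"],
        "[FINDING2]": ["409668002", "422587007"],
    }
    print(count_cases(example, any_combination=False))
    print(count_cases(example, any_combination=True))
```

For N = 3 placeholders with M = 2 CUIs each, the positional mode gives M * N = 6 cases and the any-combination mode gives N * M^N = 24 cases, before names are expanded.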
24 changes: 24 additions & 0 deletions install_requires.txt
@@ -0,0 +1,24 @@
'numpy>=1.22.0,<1.26.0' # 1.22.0 is first to support python 3.11; post 1.26.0 there's issues with scipy
'pandas>=1.4.2' # first to support 3.11
'gensim>=4.3.0,<5.0.0' # 4.3.0 is first to support 3.11; avoid major version bump
'spacy>=3.6.0,<4.0.0' # Some later model packs (e.g HPO) are made with 3.6.0 spacy model; avoid major version bump
'scipy~=1.9.2' # 1.9.2 is first to support 3.11
'transformers>=4.34.0,<5.0.0' # avoid major version bump
'accelerate>=0.23.0' # required by Trainer class in de-id
'torch>=1.13.0,<3.0.0' # 1.13 is first to support 3.11; 2.1.2 has been compatible, but avoid major 3.0.0 for now
'tqdm>=4.27'
'scikit-learn>=1.1.3,<2.0.0' # 1.1.3 is first to support 3.11; avoid major version bump
'dill>=0.3.6,<1.0.0' # stuff saved in 0.3.6/0.3.7 is not always compatible with 0.3.4/0.3.5; avoid major bump
'datasets>=2.2.2,<3.0.0' # avoid major bump
'jsonpickle>=2.0.0' # allow later versions, tested with 3.0.0
'psutil>=5.8.0'
# 0.70.12 uses older version of dill (i.e less than 0.3.5) which is required for datasets
'multiprocess~=0.70.12' # 0.70.14 seemed to work just fine
'aiofiles>=0.8.0' # allow later versions, tested with 22.1.0
'ipywidgets>=7.6.5' # allow later versions, tested with 0.8.0
'xxhash>=3.0.0' # allow later versions, tested with 3.1.0
'blis>=0.7.5,<1.0.0' # allow later versions, tested with 0.7.9, avoid 1.0.0 (depends on numpy 2)
'click>=8.0.4' # allow later versions, tested with 8.1.3
'pydantic>=1.10.0,<2.0' # for spacy compatibility; avoid 2.0 due to breaking changes
"humanfriendly~=10.0" # for human readable file / RAM sizes
"peft>=0.8.2"
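
Each line of `install_requires.txt` holds a quoted requirement string followed by an optional `#` comment, so the file cannot be passed to `pip install -r` as-is. The sketch below shows one way such a file could be turned into a plain list of requirement specifiers, for example for `install_requires` in `setup.py`. How the repository actually consumes the file is not shown in this diff, and the function name `parse_requirements` is hypothetical.

```python
from typing import List


def parse_requirements(text: str) -> List[str]:
    """Extract requirement specifiers from quoted, commented lines."""
    requirements = []
    for raw_line in text.splitlines():
        # Drop trailing comments; requirement specifiers never contain '#'.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            # Skip blank lines and comment-only lines.
            continue
        # Remove the surrounding single or double quotes.
        requirements.append(line.strip("'\""))
    return requirements


if __name__ == "__main__":
    sample = (
        "'numpy>=1.22.0,<1.26.0' # 1.22.0 is first to support python 3.11\n"
        "# 0.70.12 uses older version of dill\n"
        "'multiprocess~=0.70.12' # 0.70.14 seemed to work just fine\n"
        '"peft>=0.8.2"\n'
    )
    print(parse_requirements(sample))
```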