[pull] master from CogStack:master #93

Open
wants to merge 1,109 commits into
base: master
Commits
1109 commits
296a0cd
Docstring updated to google-style docstring
antsh3k Jan 30, 2023
903d088
CU-2e77a2k Remove unused utility modules
mart-r Feb 1, 2023
1b9dcd3
CU-2e77a2k Remove deprecated utils
mart-r Feb 1, 2023
d1af700
Merge pull request #294 from CogStack/icd10update
antsh3k Feb 1, 2023
e36ace7
Bump django from 3.2.16 to 3.2.17 in /webapp/webapp
dependabot[bot] Feb 3, 2023
9a53b97
Merge pull request #302 from CogStack/dependabot/pip/webapp/webapp/dj…
tomolopolis Feb 7, 2023
4f08145
CU-33g0f3w Read the docs build failures (#306)
mart-r Feb 10, 2023
c983a11
Add options for loading meta models and additional NERs (#300)
baixiac Feb 10, 2023
e5761a7
Style fix
jamesbrandreth Feb 10, 2023
02ba9e7
NO-TICKET reduce the false positives on pushing to test pypi (#307)
baixiac Feb 10, 2023
141059d
Merge pull request #299 from mart-r/cleanupUtils
antsh3k Feb 10, 2023
abc5488
CU-862j5by9q Regression touchup - metadata and ability to split suite…
mart-r Feb 13, 2023
e4a59bb
CU-8677craqe make transformer_ner continue processing other entities …
baixiac Feb 14, 2023
341f92a
Bump django from 3.2.17 to 3.2.18 in /webapp/webapp
dependabot[bot] Feb 15, 2023
c35bc84
Merge pull request #310 from CogStack/dependabot/pip/webapp/webapp/dj…
tomolopolis Feb 16, 2023
7d01d9e
Merge pull request #296 from uclh-criu/meta-eval-confusion
tomolopolis Feb 16, 2023
da3a39c
CU-862j7b9jc Mypy full release - 1.0.0 (#308)
mart-r Feb 20, 2023
4a11191
CU-862j7b9jc Mypy abc hotfix (#311)
mart-r Feb 20, 2023
8de6af5
Merge pull request #309 from CogStack/CU-8677craqe
tomolopolis Feb 20, 2023
a18194e
CU-8677ge6j8 Version identification and updating (#313)
mart-r Mar 6, 2023
642411d
Pin down transformers for the de-identification model (#314)
baixiac Mar 13, 2023
1fc032e
Added function to remove CUI from cdb (#316)
antsh3k Apr 4, 2023
9ea0675
CU-862jjprjw Fix github actions failures (#317)
mart-r Apr 17, 2023
e259b0c
CU-862jr8wkk Pin pydantic dependency to avoid conflicts with v2.0 (#318)
mart-r May 4, 2023
9625ec0
Bump django from 3.2.18 to 3.2.19 in /webapp/webapp
dependabot[bot] May 9, 2023
868bf9e
Merge pull request #320 from CogStack/dependabot/pip/webapp/webapp/dj…
tomolopolis May 15, 2023
9d8d31e
Merge pull request #319 from CogStack/readTheDocsBuild
tomolopolis May 15, 2023
564d15c
CU-863gntc58 Umlspt2ch (#322)
mart-r Jun 5, 2023
5610ec5
Fix for Issue 325 (#326)
mart-r Jun 26, 2023
65645b6
CU-86783u6d9 Add wrapper to simplify De-ID model usage (#324)
mart-r Jun 26, 2023
221b61d
CU-862k1tt90 Fix circular imports by moving raw deid method back to h…
mart-r Jul 5, 2023
4573312
Cu 863h30jyb separate train from data load (#329)
mart-r Jul 5, 2023
9a53faa
CU-86785yhfk Add method to populate cui2snames with data from cui2nam…
mart-r Jul 5, 2023
7862819
Bump django from 3.2.19 to 3.2.20 in /webapp/webapp
dependabot[bot] Jul 5, 2023
c1455e2
Merge pull request #330 from CogStack/dependabot/pip/webapp/webapp/dj…
tomolopolis Jul 6, 2023
8631ae3
CU-346mpwz Improving memory usage of MedCAT models (#323)
mart-r Jul 6, 2023
9711554
Documentation fixes (#332)
mart-r Jul 7, 2023
a1dccf4
Bump aiohttp from 3.8.3 to 3.8.5 (#333)
dependabot[bot] Jul 26, 2023
8fe9dfc
CU-862k77jjj: changes needed for Trainer metrics page
tomolopolis Jul 31, 2023
1e500c5
Merge pull request #336 from CogStack/print_stats-change
tomolopolis Jul 31, 2023
9f9b25b
remove bad merge <p> element
tomolopolis Aug 14, 2023
eec6b5a
CU-8692kpchc Fix for Rosalind link not working (#342)
mart-r Sep 4, 2023
54d8a6d
Add missing self argument (#343)
mart-r Sep 4, 2023
3aaef44
CU-8692kn0yv Fix issue with fake dict in identifier based config
mart-r Sep 4, 2023
e0c6456
CU-8692mevx8 Fix issue with filters not taking effect in train_superv…
mart-r Sep 21, 2023
dd895a9
Bump urllib3 from 1.26.5 to 1.26.17 in /webapp/webapp
dependabot[bot] Oct 3, 2023
4daceb2
CU-8692wb8gf: 'tokenizers>=0.12.0', # 0.13.1 doesn't seem to build
tomolopolis Oct 9, 2023
7f798c2
CU-8692wb8gf: pin to pre 0.12, so rust compiler install reliably work…
tomolopolis Oct 9, 2023
f7c285c
Merge pull request #353 from CogStack/pin-tokenizers-0.12
tomolopolis Oct 9, 2023
158ef58
CU-8692wcmp7: update transformers to the latest version
tomolopolis Oct 9, 2023
a5cdb8a
CU-8692wcmp7: include accelerate as required by the de-id test.
tomolopolis Oct 10, 2023
ab07daa
CU-8692wgmkm: Remove py2neo dependency and the code that used it (#356)
mart-r Oct 10, 2023
128d9ea
CU-8692wgmkm: Remove py2neo dependency and the code that used it (#356)
mart-r Oct 10, 2023
dbe3833
Merge pull request #355 from CogStack/upgrade-transformers
tomolopolis Oct 10, 2023
bbbd79a
Merge pull request #351 from CogStack/dependabot/pip/webapp/webapp/ur…
tomolopolis Oct 10, 2023
b3210f7
Cu 8692wbcq5 docs builds (#359)
mart-r Oct 20, 2023
ed840d0
Bump urllib3 from 1.26.17 to 1.26.18 in /webapp/webapp
dependabot[bot] Oct 20, 2023
d377f0b
CU-8692uznvd: Allow empty-dict config.linking.filters.cuis and conver…
mart-r Oct 30, 2023
ad67048
CU-8692t3fdf separate config on save (#350)
mart-r Oct 30, 2023
e52bda3
CU-2cdpd4t: Unify default addl_info in different methdos. (#363)
mart-r Oct 31, 2023
b6ab62c
Bump django from 3.2.20 to 3.2.23 in /webapp/webapp
dependabot[bot] Nov 2, 2023
94827bb
Changing cdb.add_concept to a protected method
Nov 3, 2023
26b5120
Re-added deprecated method with deprecated flag and addtional comments
Nov 6, 2023
81ba0bf
Initial commit for merge_cdb method
Nov 22, 2023
379a0db
Added indentation to make merge_cdb a class method
Nov 22, 2023
e64b2e0
fixed syntax issues
Nov 23, 2023
eefb010
more lint fixes
Nov 23, 2023
ff48a2a
more lint fixes
Nov 23, 2023
f299677
bug fixes of merge_cdb
Nov 23, 2023
abb68b5
removed print statements
Nov 23, 2023
b6b023b
CU-86931prq4: Update GHA versions (checkout and setup-python) to v4 (…
mart-r Nov 27, 2023
6a5103c
Cu 1yn0v9e duplicate multiprocessing methods (#364)
mart-r Nov 27, 2023
b0ecd83
Merge pull request #370 from CogStack/protected-add_concept
adam-sutton-1992 Nov 27, 2023
72f7dda
869377m3u: Add comment regarding demo link load times to README (#376)
mart-r Nov 29, 2023
900439a
intermediate changes of merge_cdb and testing function
Nov 29, 2023
d8473d9
Added README.md documentation for CPU only installations (#365)
adam-sutton-1992 Nov 30, 2023
76b75cc
Cu 8692zguyq no preferred name (#367)
mart-r Nov 30, 2023
7fddac0
Add trainer callbacks for Transformer NER (#377)
baixiac Dec 5, 2023
6a820f0
changes to merge_cdb and adding unit tests for method
Dec 11, 2023
3f9bc68
resolves merge conflict of imports
Dec 11, 2023
f96758a
fixing lint issues
Dec 12, 2023
1975b1c
fixing flake8 linting
Dec 12, 2023
6a22727
Merge pull request #369 from CogStack/dependabot/pip/webapp/webapp/dj…
tomolopolis Dec 13, 2023
22e4aec
Merge pull request #360 from CogStack/dependabot/pip/webapp/webapp/ur…
tomolopolis Dec 13, 2023
6f752c8
bug fixes, additional tests, and more documentation
Dec 13, 2023
7d694f2
moved set up of cdbs to be merged to tests.helper
Dec 13, 2023
7cdd208
moved merge_cdb to utils and created test_cdb_utils
Dec 14, 2023
fe9ef66
removed class wrapper in cdb utils and fixed class set up in tests
Dec 15, 2023
f70e61d
changed test object setup to class setup
Dec 15, 2023
c74fe1f
removed erroneous new line
Dec 15, 2023
45cef2b
CU-2e77a31 improve print stats (#366)
mart-r Dec 18, 2023
90bf65e
Load stopwords in Defaults before spacy model
Dec 18, 2023
70305f4
Merge pull request #373 from CogStack/CU2e77a5x-cdb-merge-function
adam-sutton-1992 Dec 18, 2023
9e5fca1
CU-8693az82g Remove cdb tests side effects (#380)
mart-r Dec 18, 2023
72ac8d7
Added tests
Dec 18, 2023
22e2255
CU-8693bpq82 fallback spacy model (#384)
mart-r Dec 21, 2023
37a9d92
Remove tests of internals where possible
mart-r Dec 22, 2023
392f80b
Add test for skipping of stopwords
mart-r Dec 22, 2023
276bcf1
Avoid supporting only English for stopwords
mart-r Dec 22, 2023
f0572ee
Merge branch 'master' into stopwords-loading-fix
mart-r Dec 22, 2023
69c2393
Remove debug output
mart-r Dec 22, 2023
80b4387
Make sure stopwords language getter works for file-path spacy models
mart-r Dec 22, 2023
45fa0e2
Merge pull request #1 from CogStack/stopwords-loading-fix
jenniferjiangkells Jan 2, 2024
f0ef8cd
Merge pull request #383 from uclh-criu/stopwords-loading-fix
tomolopolis Jan 3, 2024
abfb1e7
CU-8693cv3w0 Fix fallback spacy model existance on pip installs (#386)
mart-r Jan 8, 2024
d9a1fac
CU-8693b0a61 Add method to get spacy model version (#381)
mart-r Jan 8, 2024
4de8931
CU-8693kp0gw: Pin more recent versions for major dependencies; Avoid …
mart-r Jan 29, 2024
85cbe77
add: metacat can predict on spans in arbitrary spangroups (#391)
jkgenser Jan 30, 2024
0a9a615
CU-8693ruk7p: Bump mypy version in dev-requirements (#396)
mart-r Feb 8, 2024
df74f32
Bump django from 3.2.23 to 3.2.24 in /webapp/webapp (#395)
dependabot[bot] Feb 12, 2024
e8658c4
CU-8693t24ed: Add workaround for older DeID models in newer MedCAT (#…
mart-r Feb 12, 2024
08570eb
CU-2hz5ump deid mulitprocessing (#393)
mart-r Feb 12, 2024
d01084c
Cu 8693u6b4u tests continue on fail (#400)
mart-r Feb 13, 2024
a3138a6
CU-8693v3tt6 SOMED opcs refset selection (#402)
mart-r Feb 16, 2024
02afddb
CU-8693v6epd: Move typing imports away from pydantic (#403)
mart-r Feb 19, 2024
67f1126
CU-8693qx9yp Deid chunking - hugging face pipeline approach (#405)
shubham-s-agarwal Feb 28, 2024
731e20f
Bump django from 3.2.24 to 3.2.25 in /webapp/webapp (#408)
dependabot[bot] Apr 8, 2024
9c69aa9
CU-86947ja9y dill old weights (#411)
mart-r Apr 9, 2024
8c107d6
CU-86949yar7: Add logged warning for when multiprocessing fails due t…
mart-r Apr 17, 2024
a56e5ab
CU-86949zjg9 mp progress (#416)
mart-r Apr 17, 2024
91e2bc8
CU-86948uv4g docstring signature consistency (#413)
mart-r Apr 18, 2024
4bda687
CU-86948uv4g docstring signature consistency (#417)
mart-r Apr 18, 2024
4d36f8a
Pushing changes for bert-style models for MetaCAT
shubham-s-agarwal Apr 19, 2024
da9ab06
Pushing fix for LSTM
shubham-s-agarwal Apr 19, 2024
cb65fc3
Pushing changes for flake8 and type fixes
shubham-s-agarwal Apr 19, 2024
869eeae
Pushing type fixes
shubham-s-agarwal Apr 19, 2024
3e02eed
Fixing type issue
shubham-s-agarwal Apr 19, 2024
c899c9c
Pushing changes
shubham-s-agarwal Apr 22, 2024
d1321b8
Pushing change and type fixes
shubham-s-agarwal Apr 22, 2024
9091d9b
Fixing flake8 issues
shubham-s-agarwal Apr 22, 2024
c57dcfe
Pushing flake8 fixes
shubham-s-agarwal Apr 23, 2024
364fdd4
Pushing fixes for flake8
shubham-s-agarwal Apr 23, 2024
7272168
Pushing flake8 fix
shubham-s-agarwal Apr 23, 2024
619c565
Adding peft to list of libraries
shubham-s-agarwal Apr 23, 2024
e319b61
Small addition to contribution guidelines (#420)
mart-r Apr 23, 2024
4efc16d
CU-8694cbcpu: Allow specifying an AU Snomed when preprocessing (#421)
mart-r Apr 25, 2024
1caa187
CU-8694dpy1c: Return empty generator upon empty stream (#423)
mart-r Apr 29, 2024
2a546c3
Pushing changes with load and train workflow and type fixes
shubham-s-agarwal Apr 30, 2024
abc97fb
Relation extraction (#173)
vladd-bit May 1, 2024
1d78bd0
CU-8694fae3r: Avoid publishing PyPI release when doing GH pre-release…
mart-r May 3, 2024
8efd2a9
Pushing changes with type hints and new documentation
shubham-s-agarwal May 7, 2024
aa5044e
Pushing type fix
shubham-s-agarwal May 7, 2024
fcdc867
Fixing type issue
shubham-s-agarwal May 7, 2024
88ee8e7
Adding test case for BERT and reverting config changes
shubham-s-agarwal May 7, 2024
917dca2
Merging changes from master to metacat_bert branch (#431)
shubham-s-agarwal May 8, 2024
563c3d4
Merge branch 'master' into metacat_bert
mart-r May 8, 2024
decfbfb
Pushing changed tests and removing empty change
shubham-s-agarwal May 8, 2024
fbcdb70
Pushing change for logging
shubham-s-agarwal May 9, 2024
2657515
Revert "Pushing change for logging"
shubham-s-agarwal May 14, 2024
fbe9745
Merge pull request #419 from CogStack/metacat_bert
shubham-s-agarwal May 16, 2024
e46dca8
CU-8694hukwm: Document the materialising of generator when multiproce…
mart-r May 22, 2024
2872d5e
CU-8694fk90t (almost) only primitive config (#425)
mart-r May 22, 2024
8e7c77b
CU-8694gza88: Create codeql.yml (#434)
mart-r May 22, 2024
0c8f5a8
CU-8694mbn03: Remove the web app (#441)
mart-r May 29, 2024
20d2bce
CU-8694n48uw better deprecation (#443)
mart-r May 29, 2024
9d6a4e0
CU-8694pey4u: extract cdb load to cls method, to be used in trainer f…
tomolopolis May 29, 2024
61b5979
CU-8694pey4u: extract meta cat loading also to a cls method
tomolopolis May 29, 2024
6f8db03
Merge branch 'master' into load-cdb-cls-method
tomolopolis May 30, 2024
f836f65
CU-8694pey4u: docstrings
tomolopolis May 30, 2024
bc78a53
CU-8694pey4u: typehints and mypy issues
tomolopolis May 30, 2024
cf8a1ef
CU-8694pey4u: fix flake8
tomolopolis May 30, 2024
c6f0658
CU-8694pey4u: fix flake8
tomolopolis May 30, 2024
db7259a
Merge pull request #446 from CogStack/load-cdb-cls-method
tomolopolis May 30, 2024
e07b9d9
CU-8694pey4u: missing extra config if passed in
tomolopolis May 30, 2024
35de5ea
Merge pull request #448 from CogStack/load-cdb-cls-method
tomolopolis May 30, 2024
7a41641
CU-8694py1jr: Fix issue with reuse of opened file when loading old co…
mart-r May 31, 2024
03a5b56
CU-8694py1jr: Make old config identifier more robust
mart-r May 31, 2024
8e3c3c2
CU-8694py1jr: Add doc string to old config identifier
mart-r May 31, 2024
730cc14
CU-8694py1jr: Add test for old style MetaCAT config load
mart-r May 31, 2024
65a0c60
CU-8694py1jr: Add test for old style main config load (functional)
mart-r May 31, 2024
1a9df4d
CU-8694py1jr: Refactor config utils load tests for more flexibility
mart-r May 31, 2024
0a10265
CU-8694py1jr: Add config utils load tests for NER and Rel CAT configs
mart-r May 31, 2024
03c6881
Merge pull request #449 from CogStack/CU-8694py1jr-fix-other-configs-…
tomolopolis Jun 3, 2024
91ae2dd
CU-8694vcvz7: Trust remote code when loading transfomers NER dataset …
mart-r Jun 19, 2024
e11c1da
CU-8694gzbn3 k fold metrics (#432)
mart-r Jun 19, 2024
e4715ae
CU-8693n892x environment/dependency snapshots (#438)
mart-r Jun 19, 2024
df9f225
CU-8694p8y0k deprecation GHA check (#445)
mart-r Jun 19, 2024
1c3628d
CU-8694u3yd2 cleanup name removal (#450)
mart-r Jun 19, 2024
7058ec4
CU-8694vte2g 1.12 depr removal (#454)
mart-r Jun 19, 2024
3603dd2
Resync master with Production after 1.12 release (#457)
mart-r Jun 20, 2024
97389b4
CU-86951923u: Add option for simplified hash along with a few tests (…
mart-r Jul 17, 2024
018fd7a
CU-8694vbw6y k-fold stats Standard Deviation (#459)
mart-r Jul 18, 2024
96706c8
CU-8694wh3d5 track usage (#458)
mart-r Jul 23, 2024
396910f
Documentation update for chunking
shubham-s-agarwal Jul 24, 2024
132efcb
Updating the logging message
shubham-s-agarwal Jul 24, 2024
1eb008d
Pushing change for lazy evaluation
shubham-s-agarwal Jul 24, 2024
d20bf25
Merge pull request #466 from CogStack/deid_logging_update
shubham-s-agarwal Jul 29, 2024
cf285d0
CU-869588fdc: Comment on blis 1.0.0 (to show GHA failure)
mart-r Jul 29, 2024
7b764fe
CU-869588fdc: Pin blis to below 1.0.0
mart-r Jul 29, 2024
bf4cf63
CU-8694vv985 transitive deps (#463)
mart-r Jul 29, 2024
484c244
Merge pull request #471 from CogStack/CU-869588fdc-pin-blis
tomolopolis Jul 31, 2024
aee6bf6
Changes to documentation for metacat
shubham-s-agarwal Aug 2, 2024
1619f1c
Update config_meta_cat.py
shubham-s-agarwal Aug 2, 2024
838370e
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
f3f3a6a
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
51f65fb
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
f56116b
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
042931a
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
1e087ef
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
5969e11
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
2c20864
Pushing formatting changes
shubham-s-agarwal Aug 6, 2024
7b01c1a
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
4b3c024
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
2541cae
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
9eb6376
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
2737ced
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
f3289e1
Revert "Update meta_cat.py"
shubham-s-agarwal Aug 6, 2024
18fa925
Update meta_cat.py
shubham-s-agarwal Aug 6, 2024
9a9ca71
Update config_meta_cat.py
shubham-s-agarwal Aug 6, 2024
9e52002
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
71dfbae
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
d6a3bab
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
4d73b1a
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
5328cba
Fixing flake8 issues
shubham-s-agarwal Aug 7, 2024
289e68d
Update config_meta_cat.py
shubham-s-agarwal Aug 7, 2024
33e32fd
Merge pull request #472 from CogStack/metacat_documentation_upd
shubham-s-agarwal Aug 8, 2024
267cd4f
Fixing bug for metacat
shubham-s-agarwal Aug 8, 2024
b7658ee
Merge pull request #474 from CogStack/metacat_bug_resolve
shubham-s-agarwal Aug 9, 2024
005796a
CU-86956duhb: Add method to backport a model pack from 1.12 to previo…
mart-r Aug 12, 2024
76c2fa2
CU-8694cd9t2: Allow merging config into model pack config before init…
mart-r Aug 12, 2024
c82ad4b
CU-8694fwyje: Update all configs with pre-load parts documented (#473)
mart-r Aug 12, 2024
62e603a
Use the loaded model hash for usage monitor instead of recalculating it
mart-r Aug 22, 2024
c907c97
Merge pull request #477 from CogStack/usageMonitorHashRecalcFix
tomolopolis Aug 22, 2024
209c5e4
fixed issue where the key name has not been declared in name2cuis2sta…
adam-sutton-1992 Aug 27, 2024
7862182
CU-86956du3q revisit regression (#470)
mart-r Aug 28, 2024
6d1247a
CU-8695hydt9: Fix various typos (#480)
mart-r Aug 28, 2024
540224c
CU-8695j1be2: Remove deprecated method on CDB (#481)
mart-r Aug 28, 2024
b8bb4e3
Production/master sync (#483)
mart-r Aug 30, 2024
b28fa05
CU-8695jwnjk: Fix description of an argument for --help to work in CL…
mart-r Sep 2, 2024
6127f77
Pushing bug fix for metacat (#487)
shubham-s-agarwal Sep 5, 2024
2588670
CU-8695q21f6: Replace rosalind links with S3 ones in docs (#489)
mart-r Sep 16, 2024
56a2856
CU-8695uhe5n: Update docs dependency pins (#491)
mart-r Sep 16, 2024
394e17b
CU-8695m5q4x: Fix issues detecting 1-token concepts (#485)
mart-r Sep 17, 2024
eb912d6
CU-8695pvhfe fix usage monitoring for multiprocessing (#488)
mart-r Sep 17, 2024
cbae5b3
CU-8695knfbg Add name edits to regression suite (#486)
mart-r Sep 30, 2024
a9544f7
CU-869574kvp update snomed preprocessing naming (#469)
mart-r Sep 30, 2024
b433195
CU-8695vu71q: Make report identical run to run in identical cases (#492)
mart-r Sep 30, 2024
44db08b
CU-8695ucw9b deid transformers fix (#490)
mart-r Oct 7, 2024
909cfad
CU-869637yfx: Pin spacy dependency to lower than 3.8 (#494)
mart-r Oct 9, 2024
3e01747
MetaCAT fixes and upgrades (#495)
shubham-s-agarwal Oct 14, 2024
976adc2
CU-869671bn4: Update requirements to fix workflow issue due to mypy …
mart-r Oct 28, 2024
04efda5
CU-86967nnra Drop python 3.8 support (EoL) (#498)
mart-r Oct 28, 2024
e924798
CU-86964zm4d fix preprocessing (#496)
mart-r Nov 1, 2024
c0082ef
CU-8695hghww backwards compatibility workflow (#478)
mart-r Nov 1, 2024
df3df66
CU-8696n7w95: Remove commented code to fix DeID (oversight in PR 490)…
mart-r Nov 15, 2024
37a8a63
CU-8696m1mch: Remove versioning utility since all its parts were depr…
mart-r Nov 19, 2024
fa09b95
Update README.md - fixed typo (#507)
spiros Nov 27, 2024
b96310b
CU-8696nbm9j: Add module to convert vocab vectors and a few simple te…
mart-r Nov 27, 2024
3c44dcb
CU-8696nbm03: Remove unigram table (#503)
mart-r Nov 27, 2024
bb41955
CU-8695d4www pydantic 2 (#476)
mart-r Nov 27, 2024
32 changes: 32 additions & 0 deletions .flake8
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
[flake8]
extend-ignore =
E124,
; closing bracket does not match visual indentation
E127,
; continuation line over-indented for visual indent
E128,
; continuation line under-indented for visual indent
E221,
; multiple spaces before operator
E225,
; missing whitespace around operator
E231,
; missing whitespace after ',' and ':'
E252,
; missing whitespace around parameter equal
E261,
; at least two spaces before inline comment
E265,
; block comment should start with '# '
E272,
; multiple spaces before keyword
E303,
; too many blank lines
E501,
; line too long
W291,
; trailing whitespace
W605,
; invalid escape sequence

per-file-ignores = __init__.py:F401
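To inspect this configuration programmatically, a minimal Python sketch using the standard-library `configparser` could look like the following. It assumes the continuation lines under `extend-ignore` are indented in the real file (the diff view above has lost the indentation). The `FLAKE8_TEXT` constant and the `ignored_codes` helper are hypothetical names used only for illustration; they are not part of MedCAT or flake8, and the file content is abridged to three codes.

```python
import configparser
import re
from typing import List

# Abridged copy of the .flake8 file above, with the indentation that
# INI continuation lines need (assumed; the diff view flattened it).
FLAKE8_TEXT = """\
[flake8]
extend-ignore =
    E124,
    ; closing bracket does not match visual indentation
    E501,
    ; line too long
    W605,
    ; invalid escape sequence

per-file-ignores = __init__.py:F401
"""


def ignored_codes(text: str) -> List[str]:
    """Return the codes listed under extend-ignore (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser["flake8"]["extend-ignore"]
    # configparser skips full-line ';' comments, so only the codes remain.
    return re.findall(r"[A-Z]\d+", raw)


codes = ignored_codes(FLAKE8_TEXT)
print(codes)
```

Interleaving `;` comment lines with the codes keeps each explanation next to the code it describes, at the cost of relying on the INI parser dropping full-line comments inside a multi-line value.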
95 changes: 95 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,95 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"

on:
push:
branches: [ "master" ]
pull_request:
branches: [ "master" ]
schedule:
- cron: '36 14 * * 0'

jobs:
analyze:
name: Analyze (${{ matrix.language }})
# Runner size impacts CodeQL analysis time. To learn more, please see:
# - https://gh.io/recommended-hardware-resources-for-running-codeql
# - https://gh.io/supported-runners-and-hardware-resources
# - https://gh.io/using-larger-runners (GitHub.com only)
# Consider using larger runners or machines with greater resources for possible analysis time improvements.
runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
timeout-minutes: ${{ (matrix.language == 'swift' && 120) || 360 }}
permissions:
# required for all workflows
security-events: write

# required to fetch internal or private CodeQL packs
packages: read

# only required for workflows in private repositories
actions: read
contents: read

strategy:
fail-fast: false
matrix:
include:
- language: javascript-typescript
build-mode: none
- language: python
build-mode: none
# CodeQL supports the following values for 'language': 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift'
# Use `c-cpp` to analyze code written in C, C++ or both
# Use 'java-kotlin' to analyze code written in Java, Kotlin or both
# Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both
# To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,
# see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.
# If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how
# your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
steps:
- name: Checkout repository
uses: actions/checkout@v4

# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.

# For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
# queries: security-extended,security-and-quality

# If the analyze step fails for one of the languages you are analyzing with
# "We were unable to automatically build your code", modify the matrix above
# to set the build mode to "manual" for that language. Then modify this step
# to build your code.
# ℹ️ Command-line programs to run using the OS shell.
# 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
- if: matrix.build-mode == 'manual'
shell: bash
run: |
echo 'If you are using a "manual" build mode for one or more of the' \
'languages you are analyzing, replace this with the commands to build' \
'your code, for example:'
echo ' make bootstrap'
echo ' make release'
exit 1

- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"
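The `runs-on` and `timeout-minutes` expressions in this workflow use the `(condition && a) || b` idiom as a ternary. Python's `and`/`or` also return one of their operands, so the selection can be sketched as below. The function names `pick_runner` and `pick_timeout_minutes` are hypothetical and exist only for this illustration.

```python
def pick_runner(language: str) -> str:
    # Mirrors: (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest'
    return (language == "swift" and "macos-latest") or "ubuntu-latest"


def pick_timeout_minutes(language: str) -> int:
    # Mirrors: (matrix.language == 'swift' && 120) || 360
    return (language == "swift" and 120) or 360


# The two languages configured in the matrix above.
for language in ("javascript-typescript", "python"):
    print(language, pick_runner(language), pick_timeout_minutes(language))
```

The idiom only behaves like a ternary when the "true" branch is itself truthy; an empty string or `0` there would fall through to the fallback value.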
123 changes: 123 additions & 0 deletions .github/workflows/main.yml
@@ -0,0 +1,123 @@
name: build

on:
push:
branches: [ master ]
pull_request:
branches: [ master ]

jobs:
build:

runs-on: ubuntu-latest
strategy:
matrix:
python-version: [ '3.9', '3.10', '3.11' ]
max-parallel: 4

steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements-dev.txt
- name: Check types
run: |
python -m mypy --follow-imports=normal medcat
- name: Lint
run: |
flake8 medcat
- name: Pydantic 1 check
# NOTE: the following will look for use of pydantic1-specific .dict() method and .__fields__ attribute
# if there are some (that are not annotated for pydantic1 backwards compatibility) a non-zero exit
# code is returned, which will halt the workflow and print out the offending parts
run: |
grep "\.__fields__" medcat -rI | grep -v "# 4pydantic1 - backwards compatibility" | tee /dev/stderr | test $(wc -l) -eq 0
grep "\.dict(" medcat -rI | grep -v "# 4pydantic1 - backwards compatibility" | tee /dev/stderr | test $(wc -l) -eq 0
- name: Test
run: |
all_files=$(git ls-files | grep '^tests/.*\.py$' | grep -v '/__init__\.py$' | sed 's/\.py$//' | sed 's/\//./g')
num_files=$(echo "$all_files" | wc -l)
midpoint=$((num_files / 2))
first_half_nl=$(echo "$all_files" | head -n $midpoint)
second_half_nl=$(echo "$all_files" | tail -n +$(($midpoint + 1)))
timeout 25m python -m unittest ${first_half_nl[@]}
timeout 25m python -m unittest ${second_half_nl[@]}
- name: Regression
run: source tests/resources/regression/run_regression.sh
- name: Model backwards compatibility
run: source tests/resources/model_compatibility/check_backwards_compatibility.sh
- name: Get the latest release version
id: get_latest_release
uses: actions/github-script@v6
with:
script: |
const latestRelease = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
core.setOutput('latest_version', latestRelease.data.tag_name);
- name: Make sure there's no deprecated methods that should be removed.
# only run this for master -> production PR. I.e just before doing a release.
if: github.event.pull_request.base.ref == 'production' && github.event.pull_request.head.ref == 'master'
env:
VERSION: ${{ steps.get_latest_release.outputs.latest_version }}
run: |
python tests/check_deprecations.py "$VERSION" --next-version --remove-prefix

publish-to-test-pypi:

if: |
github.repository == 'CogStack/MedCAT' &&
github.ref == 'refs/heads/master' &&
github.event_name == 'push' &&
startsWith(github.ref, 'refs/tags') != true
runs-on: ubuntu-20.04
timeout-minutes: 45
concurrency: publish-to-test-pypi
needs: [build]

steps:
- name: Checkout master
uses: actions/checkout@v4
with:
ref: 'master'
fetch-depth: 0

- name: Set up Python 3.9
uses: actions/setup-python@v4
with:
python-version: 3.9

- name: Install pypa/build
run: >-
python -m
pip install
build
--user

- name: Configure the version
run: >-
sed --in-place
"s/node-and-date/no-local-version/g"
setup.py

- name: Build a binary wheel and a source tarball
run: >-
python -m
build
--sdist
--wheel
--outdir dist/
.

- name: Publish dev distribution to Test PyPI
uses: pypa/gh-action-pypi-publish@v1.4.2
with:
password: ${{ secrets.TEST_PYPI_API_TOKEN }}
repository_url: https://test.pypi.org/legacy/
continue-on-error: true
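The "Pydantic 1 check" step in this workflow greps for `.__fields__` and `.dict(` and ignores lines carrying the marker comment `# 4pydantic1 - backwards compatibility`. A rough Python equivalent of that filter, as a sketch only, is shown below. The workflow runs two separate grep pipelines over the `medcat` directory; this sketch merges them into one pass over a list of lines. `find_offending_lines` and the sample lines are hypothetical.

```python
import re
from typing import Iterable, List

# Marker comment the workflow treats as an approved exception.
MARKER = "# 4pydantic1 - backwards compatibility"
# The same two patterns the grep commands look for.
PYDANTIC1_PATTERNS = (re.compile(r"\.__fields__"), re.compile(r"\.dict\("))


def find_offending_lines(lines: Iterable[str]) -> List[str]:
    """Return lines that use pydantic-1-only API without the marker comment."""
    offending = []
    for line in lines:
        if MARKER in line:
            continue  # explicitly annotated, so allowed
        if any(pattern.search(line) for pattern in PYDANTIC1_PATTERNS):
            offending.append(line)
    return offending


# Made-up sample lines for illustration.
sample = [
    "data = config.dict()",
    "data = config.dict()  # 4pydantic1 - backwards compatibility",
    "data = config.model_dump()",
    "names = list(model.__fields__)",
]
print(find_offending_lines(sample))
```

In the workflow, a non-empty result makes the step exit with a non-zero code; in this sketch the caller would decide what to do with the returned list.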
59 changes: 59 additions & 0 deletions .github/workflows/production.yml
@@ -0,0 +1,59 @@
name: production

on:
push:
branches: [ production, "v[0-9]+.[0-9]+.post" ]
release:
types: [ published , edited ]

jobs:
build-n-publish-to-pypi:
runs-on: ubuntu-20.04
concurrency: build-n-publish-to-pypi
if: github.repository == 'CogStack/MedCAT'

steps:
- name: Checkout production
uses: actions/checkout@v4
with:
ref: ${{ github.event.release.target_commitish }}
fetch-depth: 0

- name: Set up Python 3.9
uses: actions/setup-python@v4
with:
python-version: 3.9

- name: Run UATs
run: |
python -m pip install --upgrade pip
pip install -r requirements-dev.txt
all_files=$(git ls-files | grep '^tests/.*\.py$' | grep -v '/__init__\.py$' | sed 's/\.py$//' | sed 's/\//./g')
num_files=$(echo "$all_files" | wc -l)
midpoint=$((num_files / 2))
first_half_nl=$(echo "$all_files" | head -n $midpoint)
second_half_nl=$(echo "$all_files" | tail -n +$(($midpoint + 1)))
timeout 25m python -m unittest ${first_half_nl[@]}
timeout 25m python -m unittest ${second_half_nl[@]}

- name: Install pypa/build
run: >-
python -m
pip install
build
--user

- name: Build a binary wheel and a source tarball
run: >-
python -m
build
--sdist
--wheel
--outdir dist/
.

- name: Publish production distribution to PyPI
if: startsWith(github.ref, 'refs/tags') && ! github.event.release.prerelease
uses: pypa/gh-action-pypi-publish@v1.4.2
with:
password: ${{ secrets.PYPI_API_TOKEN }}
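The "Run UATs" step here (and the "Test" step in main.yml) turns the tracked test files into dotted module names and runs them in two halves, each under its own 25-minute timeout. The shell pipeline can be sketched in Python as follows. `to_module_names`, `split_in_half`, and the example paths are hypothetical and serve only to illustrate the transformation.

```python
from typing import List, Tuple


def to_module_names(paths: List[str]) -> List[str]:
    """Convert test file paths to dotted module names, as the pipeline does."""
    modules = []
    for path in paths:
        # grep '^tests/.*\.py$'
        if not (path.startswith("tests/") and path.endswith(".py")):
            continue
        # grep -v '/__init__\.py$'
        if path.endswith("/__init__.py"):
            continue
        # sed 's/\.py$//' | sed 's/\//./g'
        modules.append(path[: -len(".py")].replace("/", "."))
    return modules


def split_in_half(modules: List[str]) -> Tuple[List[str], List[str]]:
    """Split like `head -n $midpoint` and `tail -n +$((midpoint + 1))`."""
    midpoint = len(modules) // 2
    return modules[:midpoint], modules[midpoint:]


# Made-up paths standing in for the output of `git ls-files`.
example_paths = [
    "tests/__init__.py",
    "tests/test_cat.py",
    "tests/utils/test_helpers.py",
    "tests/test_cdb.py",
    "medcat/cat.py",
    "tests/resources/readme.md",
]
first_half, second_half = split_in_half(to_module_names(example_paths))
print(first_half)
print(second_half)
```

With an odd number of modules the integer division puts the extra module in the second half, which matches the shell arithmetic `$((num_files / 2))`.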
20 changes: 20 additions & 0 deletions .gitignore
@@ -5,6 +5,9 @@
/output/
/graphics/
/models/*.dat
/notebooks/wandb/
/notebooks/logs/
/notebooks/results/
dist/
tmp/
*_tmp/
@@ -13,9 +16,14 @@ build/
.idea
venv
db.sqlite3
.ipynb_checkpoints

# vscode
.vscode

#tmp and similar files
.nfs*
*.log
*.pyc
*.out
*.swp
@@ -32,6 +40,18 @@ tmp_*
nohup.out
tmp.py
.DS_Store
*.lock
*.egg*

# models files
*.dat
!examples/*.dat
./checkpoints/

# Test output
tests/model_creator/output/*

# docs outputs
docs/auto/
docs/_build

19 changes: 19 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,19 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

version: 2

build:
os: ubuntu-20.04
tools:
python: "3.10"

sphinx:
configuration: docs/conf.py

python:
install:
- requirements: docs/requirements.txt
- method: setuptools
path: .