Integration of MedCAT #101
Comments
Was merged, but there is no release yet. Add GitHub tag with python-poetry/poetry#313
I think that free text should not be part of X in its raw form. I'd add the free text to obs only and allow for an embedding column/matrix to be appended to X after MedCAT has been applied.
Yes, I will take care of implementing an "autodetect" feature for this, so that users are not forced to pass every free-text column as obs-only when creating the MuData/AnnData object.
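For illustration, one way such an "autodetect" step could work is a simple heuristic over column values. The sketch below is not the ehrapy implementation: the function name `detect_free_text_columns`, both thresholds, and the plain-dict input are assumptions made for this example. It flags a column as free text when its string values are, on average, several tokens long and mostly unique.

```python
def detect_free_text_columns(columns, min_avg_tokens=5.0, min_unique_ratio=0.5):
    """Return the names of columns that look like free text.

    Hypothetical helper: `columns` maps a column name to a list of values.
    Both thresholds are illustrative defaults, not tuned values.
    """
    detected = []
    for name, values in columns.items():
        # Only non-empty strings are considered; other values are skipped.
        texts = [v for v in values if isinstance(v, str) and v.strip()]
        if not texts:
            continue
        avg_tokens = sum(len(t.split()) for t in texts) / len(texts)
        unique_ratio = len(set(texts)) / len(texts)
        # Free text tends to be long and rarely repeated; categories are short.
        if avg_tokens >= min_avg_tokens and unique_ratio >= min_unique_ratio:
            detected.append(name)
    return detected


columns = {
    "age": [54, 61, 47],
    "sex": ["f", "m", "f"],
    "notes": [
        "Patient reports shortness of breath and a persistent cough.",
        "No acute distress, follow-up for hypertension in six weeks.",
        "History of asthma, currently using an inhaler twice daily.",
    ],
}
free_text_columns = detect_free_text_columns(columns)
```

Columns detected this way could then be routed to obs instead of X, while still letting users override the result explicitly.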
1.2.6 was released and we should be ready to implement this now.
Rewrote our current implementation to work with the latest MedCAT. I think this still requires a redesign.
As discussed:
- Keep a "main" MedCat object, so we do not lose any results.
- Add a function to nicely display such an object, for easier navigation by the user.
- Add pp functions to filter the object for specific values (like tui, cui, type of disease, symptoms, etc.).
- Add a function that can return a binary column based on user filtering (e.g. which rows contain pulmonary diseases). This way the actual values (which might be multiple values in one row) never need to be stored in the AnnData object, only indicators when they are needed.
- Add a decorator for, or overwrite, plotting functions like umap, pca, etc., for example when coloring by pulmonary disease (y/n).
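The binary-column idea could be sketched roughly as follows. This is only an assumption-laden illustration, not the planned API: `concept_indicator` is a hypothetical name, and each row is assumed to carry a list of entity dicts with `cui` and `type_ids` keys, which mirrors the kind of fields MedCAT reports per entity but is simplified here. The CUI and TUI values are illustrative.

```python
def concept_indicator(row_annotations, cuis=None, type_ids=None):
    """Return one boolean per row: True if any entity matches the filter.

    Hypothetical helper. `row_annotations` is a list with one entry per row,
    each entry being a list of entity dicts (assumed keys: "cui", "type_ids").
    """
    cuis = set(cuis or ())
    type_ids = set(type_ids or ())
    indicator = []
    for entities in row_annotations:
        hit = any(
            ent.get("cui") in cuis
            or bool(type_ids.intersection(ent.get("type_ids", ())))
            for ent in entities
        )
        indicator.append(hit)
    return indicator


# Illustrative annotation results for three rows (values are examples only).
rows = [
    [
        {"pretty_name": "Asthma", "cui": "C0004096", "type_ids": ["T047"]},
        {"pretty_name": "Cough", "cui": "C0010200", "type_ids": ["T184"]},
    ],
    [{"pretty_name": "Hypertension", "cui": "C0020538", "type_ids": ["T047"]}],
    [],  # a row without any detected entities
]

has_asthma = concept_indicator(rows, cuis={"C0004096"})
has_symptom = concept_indicator(rows, type_ids={"T184"})
```

Only the resulting boolean list would need to be written to the AnnData object (for example as an obs column used for coloring a plot), while the full annotation results stay in the "main" object.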
Did I miss something, @Zethson?
No, sounds great. Feel free to show early drafts so that we can evaluate our approach before doubling down. Thank you!
- refactored existing code
- removed most unused functions
- MedCat object now only serves for cdb and vocab (not for actual processing functions)
- processing functions are now part of the public API rather than static object methods
- downstream analysis WIP (prepare annotation results etc., see issue)
MedCat [#101]: extract biomedical concepts/entities from (free) text and analyse them with ehrapy
To extract keywords from free text notes we will be integrating MedCAT.
The goals are as follows:
Tasks in somewhat reasonable order