Xueqing - Novelty map #9

Open
huanhe4096 opened this issue Sep 20, 2024 · 6 comments

@huanhe4096
Contributor

No description provided.

huanhe4096 self-assigned this Oct 1, 2024
@huanhe4096
Contributor Author

Image

We only use a few columns, so make a notebook to process the data.

@huanhe4096
Contributor Author

Need to convert the columns and save only those used in the system.
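A minimal sketch of that column-trimming step, using only the standard library. Only `pid` is named anywhere in this issue (as the annotator's point identifier); the other entries in `KEEP_COLUMNS`, and the helper `keep_columns` itself, are hypothetical placeholders to be replaced with the columns the system actually reads.

```python
import csv
import io

# Hypothetical column list: only `pid` appears in this issue; the rest are placeholders.
KEEP_COLUMNS = ["pid", "x", "y", "title"]


def keep_columns(src, dst, columns):
    """Copy a TSV stream, keeping only the named columns in the given order."""
    reader = csv.DictReader(src, delimiter="\t")
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found in input: {missing}")
    writer = csv.DictWriter(
        dst,
        fieldnames=columns,
        delimiter="\t",
        extrasaction="ignore",  # silently drop every column not listed
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Tiny in-memory demo; a real run would pass open file handles instead.
demo_in = io.StringIO("pid\tx\ty\ttitle\tunused\n1\t0.1\t0.2\tA\tz\n")
demo_out = io.StringIO()
keep_columns(demo_in, demo_out, KEEP_COLUMNS)
```

Failing loudly on a missing column is a deliberate choice here, so a renamed header in the source file is caught before the trimmed file reaches the system.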

@huanhe4096
Contributor Author

huanhe4096 commented Oct 1, 2024

Convert colors to legend:

{ "color": "#037758", "label": "Regulatory T Cell Induction and Immune Modulation" },
{ "color": "#882507", "label": "Antibody Binding and Protein Imaging Studies" },
{ "color": "#0c637f", "label": "Immune Biomarkers and Disease Correlations" },
{ "color": "#6acff1", "label": "Rare Case Reports of Malignant Lesions, Such as Lymphoma and Brain Tumors" },
{ "color": "#ffd166", "label": "Immunotherapies and Combination Treatments for Cancer" },
{ "color": "#83d483", "label": "Cancer Immunotherapy Response (ICIs) and Survival Outcomes" },
{ "color": "#ef476f", "label": "Cellular Immune Responses and Immune Exhaustion in Viral Infections" },
{ "color": "#5b0269", "label": "Genomic Alterations and Tumor Mutational Burden in Colorectal Cancer" },
{ "color": "#40f2e9", "label": "Tumor Expression Profiles and Survival Prediction" },
{ "color": "#020e13", "label": "Gene Expression Signatures and Immune Pathways in Cancer, Especially HCC" },
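One possible way to use these entries is a color-to-label lookup. The sketch below assumes the entries are stored as a JSON array (the fragment above has no enclosing brackets) and shows only two of the ten for brevity; `build_color_lookup` is a hypothetical helper, not part of the project.

```python
import json

# Two entries copied from the legend above, wrapped in a JSON array (an assumption
# about storage); the remaining eight follow the same shape.
LEGEND_JSON = """[
  {"color": "#037758", "label": "Regulatory T Cell Induction and Immune Modulation"},
  {"color": "#ffd166", "label": "Immunotherapies and Combination Treatments for Cancer"}
]"""


def build_color_lookup(entries):
    """Map a lower-cased hex color to its legend label, rejecting duplicates."""
    lookup = {}
    for entry in entries:
        color = entry["color"].lower()
        if color in lookup:
            raise ValueError(f"duplicate color in legend: {color}")
        lookup[color] = entry["label"]
    return lookup


color_to_label = build_color_lookup(json.loads(LEGEND_JSON))
```

Lower-casing the key guards against the same color appearing as `#FFD166` in the data and `#ffd166` in the legend.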

@huanhe4096
Contributor Author

Using the following to create topics:

python run_annotator.py ~/Desktop/XP-33K.tsv --topic-configs "0,3,3;50,10,5;200,25,10;8000,50,50"
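Judging by the configuration echoed in the run log in this thread, each semicolon-separated group in `--topic-configs` appears to be `min_length,min_cluster_size,min_samples` for one cluster level, starting at level 0. The sketch below parses the string under that assumption; `parse_topic_configs` is a hypothetical helper written for illustration, not the parser in `run_annotator.py`.

```python
def parse_topic_configs(spec):
    """Parse 'min_length,min_cluster_size,min_samples;...' into one dict per level.

    The field order is inferred from the logged configuration, not from the
    annotator's source code.
    """
    levels = []
    for level, chunk in enumerate(spec.split(";")):
        parts = chunk.split(",")
        if len(parts) != 3:
            raise ValueError(f"level {level}: expected 3 values, got {chunk!r}")
        min_length, min_cluster_size, min_samples = (int(p) for p in parts)
        levels.append(
            {
                "level": level,
                "min_length": min_length,
                "min_cluster_size": min_cluster_size,
                "min_samples": min_samples,
            }
        )
    return levels


configs = parse_topic_configs("0,3,3;50,10,5;200,25,10;8000,50,50")
for cfg in configs:
    print(cfg)
```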

@huanhe4096
Contributor Author

* topic_cluster_configs: TopicAnnotatorConfig(point_identifier='pid', stopwords=set(), topic_cluster_configs={0: TopicClusterConfig(prereq_condition={'min_length': 0}, clustering_configs={'min_cluster_size': 3, 'min_samples': 3}, topic_determination_configs={'strategy': 'K_NEIGHBOR_TFIDF', 'k': 5}, labeling_configs={'terms_per_topic': 2, 'term_concat_delimiter': ', '}), 1: TopicClusterConfig(prereq_condition={'min_length': 50}, clustering_configs={'min_cluster_size': 10, 'min_samples': 5}, topic_determination_configs={'strategy': 'TFIDF'}, labeling_configs={'terms_per_topic': 2, 'term_concat_delimiter': ', '}), 2: TopicClusterConfig(prereq_condition={'min_length': 200}, clustering_configs={'min_cluster_size': 25, 'min_samples': 10}, topic_determination_configs={'strategy': 'TFIDF'}, labeling_configs={'terms_per_topic': 2, 'term_concat_delimiter': ', '}), 3: TopicClusterConfig(prereq_condition={'min_length': 8000}, clustering_configs={'min_cluster_size': 50, 'min_samples': 50}, topic_determination_configs={'strategy': 'TFIDF'}, labeling_configs={'terms_per_topic': 2, 'term_concat_delimiter': ', '})})
* loaded points dataframe from /Users/hh667/Desktop/XP-33K.tsv
* generated output path: /Users/hh667/Desktop/XP-33K.topics.json
******************** cluster level 3 ********************
* begin level 3 clustering: {'min_cluster_size': 50, 'min_samples': 50}
* done level 3 clustering successful
* updated level 3 topic tree
* begin level 3 topic modeling: {'strategy': 'TFIDF'}
/Users/hh667/miniconda3/envs/bike-py311/lib/python3.11/site-packages/sklearn/feature_extraction/text.py:525: UserWarning: The parameter 'token_pattern' will not be used since 'tokenizer' is not None'
  warnings.warn(
* done level 3 topic modeling successful
* found 99 topics
******************** cluster level 2 ********************
* begin level 2 clustering: {'min_cluster_size': 25, 'min_samples': 10}
* done level 2 clustering successful
* updated level 2 topic tree
* begin level 2 topic modeling: {'strategy': 'TFIDF'}
/Users/hh667/miniconda3/envs/bike-py311/lib/python3.11/site-packages/sklearn/feature_extraction/text.py:525: UserWarning: The parameter 'token_pattern' will not be used since 'tokenizer' is not None'
  warnings.warn(
* done level 2 topic modeling successful
* found 287 topics
******************** cluster level 1 ********************
* begin level 1 clustering: {'min_cluster_size': 10, 'min_samples': 5}
* done level 1 clustering successful
* updated level 1 topic tree
* begin level 1 topic modeling: {'strategy': 'TFIDF'}
/Users/hh667/miniconda3/envs/bike-py311/lib/python3.11/site-packages/sklearn/feature_extraction/text.py:525: UserWarning: The parameter 'token_pattern' will not be used since 'tokenizer' is not None'
  warnings.warn(
* done level 1 topic modeling successful
* found 821 topics
******************** cluster level 0 ********************
* begin level 0 clustering: {'min_cluster_size': 3, 'min_samples': 3}
* done level 0 clustering successful
* updated level 0 topic tree
* begin level 0 topic modeling: {'strategy': 'K_NEIGHBOR_TFIDF', 'k': 5}
/Users/hh667/miniconda3/envs/bike-py311/lib/python3.11/site-packages/sklearn/feature_extraction/text.py:525: UserWarning: The parameter 'token_pattern' will not be used since 'tokenizer' is not None'
  warnings.warn(
* done level 0 topic modeling successful
* found 3075 topics

Too many topics in the last level ... more than 3000??
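To compare levels at a glance, the topic counts can be pulled out of the log text. This is a rough sketch: `LOG_EXCERPT` holds only the header and "found" lines copied from the run above, and `topics_per_level` is a hypothetical helper.

```python
import re

# Header and "found" lines copied from the run log above.
LOG_EXCERPT = """\
******************** cluster level 3 ********************
* found 99 topics
******************** cluster level 2 ********************
* found 287 topics
******************** cluster level 1 ********************
* found 821 topics
******************** cluster level 0 ********************
* found 3075 topics
"""


def topics_per_level(log_text):
    """Return {level: topic_count} by pairing each level header with its count line."""
    counts = {}
    level = None
    for line in log_text.splitlines():
        header = re.search(r"cluster level (\d+)", line)
        if header:
            level = int(header.group(1))
            continue
        found = re.search(r"found (\d+) topics", line)
        if found and level is not None:
            counts[level] = int(found.group(1))
    return counts


counts = topics_per_level(LOG_EXCERPT)
for level in sorted(counts, reverse=True):
    print(f"level {level}: {counts[level]} topics")
```

Level 0 runs with the smallest `min_cluster_size` and `min_samples` (3 and 3), so raising the first group in `--topic-configs` is one thing to try; whether that brings the count down to a usable number is untested here.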
