
Big gap in experimental results #1

Closed
tikboaHIT opened this issue Jun 2, 2021 · 13 comments
@tikboaHIT

Hello,

When using your pre-trained models, I cannot reproduce the reported accuracy and F1 score.
The results of the model using HACT are as follows:

[screenshot: HACT results]

The results of the model using CG-GNN are as follows:

[screenshot: CG-GNN results]

These results are quite different from yours. I'm confused and don't know which phase went wrong.

@guillaumejaume
Contributor

Hi,

What version of the BRACS dataset are you using?

@guillaumejaume
Contributor

I assume the 566 samples are from the new version, which we haven't used for this work. Are you able to reproduce with the previous version?

@llvy21

llvy21 commented Oct 28, 2021

I ran the pre-trained model on the previous_version data with bracs_hact_7_classes_pna.yml, but got similarly degraded results.
The results of the model using HACT are as follows:

[screenshot: HACT results]

I don't know which phase went wrong.

@guillaumejaume
Contributor

guillaumejaume commented Oct 28, 2021

It may be something to do with the preprocessing. You can download the preprocessed cell, tissue, and HACT graphs for the BRACS dataset here:

https://ibm.box.com/s/6v4sasavltjzi91gmohswz2ek6i8i3lp

Or by downloading this zip file that includes the test cell graphs:

https://ibm.box.com/shared/static/412lfz992djt8u6bgu13y9cj9qsurwui.zip

Let me know if you can reproduce with these.

@llvy21

llvy21 commented Oct 29, 2021

I downloaded the preprocessed cell, tissue, and HACT graphs, and it works. Thank you. I will check my preprocessing files later.
The results of the model using HACT are as follows:

[screenshot: HACT results]

By the way, I see you divided the BRACS dataset into four test folds. How did you divide them, and why? Are they partitioned randomly?

[screenshot: test fold listing]
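The thread doesn't answer how the four test folds were built. For reference, one common way (hypothetical here, not confirmed as what the authors did) is a random stratified partition, which keeps the class proportions roughly constant in every fold:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels for 20 ROIs with two classes; this only
# illustrates a random stratified split, not the actual BRACS folds.
labels = np.array([0, 1] * 10)
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
folds = [test_idx for _, test_idx in skf.split(np.zeros((20, 1)), labels)]

for i, test_idx in enumerate(folds):
    # Each test fold has 5 samples with both classes represented.
    print(i, len(test_idx), np.bincount(labels[test_idx]))
```

With 10 samples per class and 4 splits, each class is spread 3/3/2/2 across the test folds, so no fold is ever missing a class.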

@llvy21

llvy21 commented Oct 29, 2021

I can't get correct results with the cell graph files in the second link.

[screenshot: cell graph results]

.../python3.7/site-packages/sklearn/metrics/_classification.py:1248: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use zero_division parameter to control this behavior.

But I can get the correct CG-GNN results using the preprocessed files you provided in the first link:

[screenshot: CG-GNN results]
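For context on that warning: scikit-learn emits UndefinedMetricWarning when a class appears in the predictions (or label set) but never in the ground truth, since recall is then 0/0. A minimal reproduction with toy labels (not the BRACS data) and the `zero_division` parameter the warning mentions:

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy labels: class 2 never appears in y_true, which is exactly the
# situation that makes recall/F-score "ill-defined" for that class.
y_true = np.array([0, 0, 1, 1, 1])
y_pred = np.array([0, 2, 1, 1, 0])

# zero_division=0 sets the affected per-class scores to 0.0 explicitly
# and suppresses the warning.
scores = f1_score(y_true, y_pred, average=None, labels=[0, 1, 2], zero_division=0)
print(scores)  # [0.5 0.8 0. ] -- class 2 gets F1 = 0.0
```

Seeing this warning on a 7-class test set usually means the model never received (or never predicted correctly against) samples of some class, which is itself a sign the input graphs are off.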

@llvy21

llvy21 commented Oct 29, 2021

When preprocessing images, some trigger a warning like this while others don't:

[screenshot: preprocessing warning]

I checked the original images, and they're not empty pictures like #2 mentioned.
Will this disturb the preprocessing? Thank you for your help! :)

@guillaumejaume
Contributor

The warnings should not be an issue for running the preprocessing.

@huihhui

huihhui commented Jun 10, 2023

Hello, may I ask where I can download the pre-trained models?

@CarlinLiao

@guillaumejaume Would you know why a model trained on graphs preprocessed and created locally doesn't reach the same performance as one trained on the graphs you provide in the download link?

@CarlinLiao

CarlinLiao commented Jul 10, 2024

I trained my own HACTNet, cell graph, and tissue graph models to see whether it's the cell or the tissue graphs causing the performance gap. Based on my results, it appears that it's mainly the tissue graphs generated by this repo under stock settings that aren't as good for model performance as those uploaded by the histocartography team. Maybe there's something about ColorMergedSuperpixelExtractor or RAGGraphBuilder that differs between the paper code and what I ran? I used histocartography version 0.2.0 for this.

Here are my weighted F1 scores on the test set for models trained on the IBM Box graph sets compared to those I created locally from the BRACS ROI previous version using generate_hact_graphs.py and the pretrained checkpoint scores provided in the README:

| Model    | README | Trained on uploaded | Trained on locally generated |
|----------|--------|---------------------|------------------------------|
| CG Model | 56.7   | 57.5                | 56.7                         |
| TG Model | 57.8   | 57.1                | 53.8                         |
| HACTNet  | 61.5   | 61.4                | 56.5                         |

EDIT: Noticed that my script failed to create the last two dozen training graphs because of a corrupted RoI download. I finished creating my graphs, retrained the models, and updated my findings.
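For anyone comparing against these numbers: the scores above are weighted F1, i.e. per-class F1 averaged with weights equal to each class's support in the ground truth. A tiny 3-class toy example (not BRACS data) showing the computation:

```python
from sklearn.metrics import f1_score

# Toy 3-class labels standing in for test-set predictions.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

# Per-class F1: class 0 -> 1.0, class 1 -> 2/3, class 2 -> 0.8.
# Equal supports, so the weighted average is their mean.
wf1 = f1_score(y_true, y_pred, average="weighted")
print(round(wf1, 4))  # 0.8222
```

With imbalanced classes (as in BRACS), the weighting matters: a model that fails on a rare class loses less weighted F1 than macro F1.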

@JingnaQiu

> I trained my own HACTNet, cell, and tissue graph models to see if it's cell or tissue graphs that's causing the performance gap. […]

Hi, I'm not able to reproduce the "Trained on uploaded" results; mine stay around 55. Did you do hyperparameter tuning, or did you simply follow the settings (learning rate, epochs, batch size) provided in the README?

@CarlinLiao

I didn't do any hyperparameter tuning; I just used the config and settings as shown. I've noticed that test-set accuracy tends to fluctuate up and down a few percentage points even with the same settings, so I think 55 is close enough.
