Vahadane numpy backend #28

andreped · 2022-12-22T21:14:54Z

I have reproduced the Vahadane implementation from StainTools. The implementation is with numpy backend, but there are two steps that remains for this to work with pure numpy.

Hence, this is a draft, where we can discuss where to go next.

The two challenges are:

Directionary learning step
LARS-LASSO step

Scikit-learn offer similar implementations as the one used by spams, but when swapping them, I was unable to get the same results. Hence, right now, the current implementation depends on spams.

This will also be a problem with both TensorFlow and PyTorch backends, as it might be challenging to implement these from scratch in an optimized manner with both backends.

Adding spams itself is not that challenging. However, pip-installing the original project has deemed challenging in several projects, which I have used spams. Luckily, there is someone who has built precompiled wheels which makes installation seemless:
https://github.com/samuelstjean/spams-python/releases/tag/v2.6.4

I have not yet added spams as a dependency, and will wait for further instructions.

andreped · 2023-01-29T12:26:59Z

Can also be noted that @Leengit and the devs at @Kitware, have done a great job of adding support for the Vahadane algorithm to itk-spcn. Tutorial on how to use it can be found here.

If we ever get some kind of backend which supports it, we should benchmark against their implementation.

liupei101 · 2023-02-06T13:47:23Z

Hi, andreped. I also tried Vahadane for stain norm using staintools. But it seems that the current implementation of Vahadane cannot work well with torch's DataLoader when setting num_works>0 (use multi-threads to load data).

I watched the code of Vahadane implementation, and found that the problem lies in that the Spams which Vahadane implementation requires also uses multi-threads.

I also noticed that this problem can be solved by using DataLoader with num_works=0 or setting numThreads=1 in every spams invoking (please see Issue #43 in staintools), but this leads to an extremely-slow data loading for large WSIs.

So, I wonder, if there is any appropriate solution to this problem. Thank you so much.

andreped · 2023-02-06T15:05:23Z

Hello, @liupei101!

As I guess you saw from my comment above, modifying this implementation such that it only depends on PyTorch (hence, removing the spams backend), is not trivial. Right now, it is outside the scope of what I have time for, as I have mostly participated in this project in my spare time.

As you are aware, Vahadane is not a fast algorithm. It is made faster through parallel computing. That you observed runtime degradation by removing the threading within a single patch to enable more patches to be normalized in parallel, is not really that surprising, as I believe the main bottleneck comes within the spams computation itself, which is done per image. Having threads on top of threads also rarely works well in practice. Some times doing multiprocessing instead to process patches in parallel works better, but in general that tend to run into deadlocks, especially if the goal is to normalize batches of patches on the fly during training. Hence, I'm not sure how one could further optimize the Vahadane implementation, unless the algorithm itself worked directly on batches, and thus supported a batch dimension. But due to the complex nature of the algorithm, I don't see how to do that easily.

If I were you, I would try out the Vahadane implementation in itk-spcn (for tutorial see here).

AFAIK, they have implemented everything from scratch using only ITK in C++ without the spams backend. The code has then been wrapped to be accessible from Python. If that's the case, it might be that you are able to take advantage of PyTorch's multithreaded dataloader, to normalize patches in parallel. But then again, might be that you run into the exact same issue. Might also be that it works, but that the normalization runtime does not linearly decrease with more workers.

If you were to make an attempt, it would be very interesting for me as well. Would be great if you could share a gist, such that I or others could debug this further.

EDIT: BTW, did you try to run our macenko implementation in a similar batch processing pipeline? If you have a gist that demonstrates this, it could be an idea to add that to the README, to show people how to do the same easily.

eyedealism · 2024-01-04T20:23:40Z

Regarding multiprocessing, the LARS-LASSO part of the Vahadane algorithm can work directly on patches. The dictionary learning step requires the whole image, but this step is relatively faster than the LARS-LASSO.

andreped · 2024-01-05T08:03:14Z

Regarding multiprocessing, the LARS-LASSO part of the Vahadane algorithm can work directly on patches. The dictionary learning step requires the whole image, but this step is relatively faster than the LARS-LASSO.

@Robot-Eyes Thats probably true. But the main issue here is that implementing the entire Vahadane algorithm in pure pytorch/tf is much more difficult and tedious than we thought. I made an attempt, but I failed to achieve the expected results in some of the early steps of the algorithm already, so I did not venture any further.

One issue was that I tried to go away from the old LARS LASSO step which was not performed in pure python by using scipy's implementation, or something like that. But the results were very different. If they were the same or similar, I would base the pytorch implementation on it by converting submodules.

As this did not work, we did not merge this solution as part of torchstain, as we do not wish to support this legacy backend.

eyedealism · 2024-01-05T15:48:35Z

Thanks for your reply.

I saw in your commits you printed the lasso results from scipy and spams to compare. I used to use staintools and I know the staintools implementation has its own problem here: the default dictionary learning in spams takes all available threads and runs for 1s no matter how large the image is and how fast your CPUs are. If you run it on the same image, the results will be different every time. Instead, in scipy by default it takes 1000 iterations and a single thread, which I believe is more reasonable. If you have tested it on a small image and a fast computer, then spams probably ran many more iterations than 1000. Maybe one could fix the iteration steps and see whether it helps.

andreped · 2024-01-06T15:34:00Z

Yes, I also used staintools and had great success with Vahadane, so finding ways to optimize it would be of great value. On way is to go to pytorch/tensorflow to enable GPU-compute, but this can further be accelerated, and greatly so with GPUs, if compute can be performed in parallel, where relevant.

I stopped implementing support for Vahadane with the various backbones, as I failed to see how I could get the same or similar results using the implementation available in the other Python library (not spams). There are also a lot of other stuff that needs translating to torch/tf, hence, through a fast-fail principle, I chose to stop there.

But maybe you are interested in making a try? If you manage to get torch implementation working, making an equivalent tf/numpy-implementation is trivial. Are you interested, @Robot-Eyes? You can start from my branch, if you'd like.

eyedealism · 2024-01-07T02:48:51Z

Yes, I could give it a try later.

andreped added 14 commits December 21, 2022 17:58

added vahadane to API

306e032

draft vahadane implementation

82fedba

lasso numpy implementation

a35f27e

OD2RGB numpy converter

c1c860e

RGB2OD numpy converter

e7fe993

numpy standardize brightness

a0f7cf0

almost finished vahadane -DL and LarsLasso remains

cdc6431

updated README regarding vahadane support

603e4b0

refactored README

6bd3b44

fixed typo in README [no ci]

1b11f11

minor refactor in README [no ci]

4e6d792

references typo in redirect

62d9405

reproduced StainTools vahadane, but dep spams

2f018f7

sklearn and spams mismatch

e7a6746

andreped marked this pull request as draft December 22, 2022 21:15

andreped mentioned this pull request Dec 22, 2022

Support for Vahadane normalization #10

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Vahadane numpy backend #28

Vahadane numpy backend #28

andreped commented Dec 22, 2022

andreped commented Jan 29, 2023

liupei101 commented Feb 6, 2023 •

edited

Loading

andreped commented Feb 6, 2023 •

edited

Loading

eyedealism commented Jan 4, 2024 •

edited

Loading

andreped commented Jan 5, 2024

eyedealism commented Jan 5, 2024

andreped commented Jan 6, 2024

eyedealism commented Jan 7, 2024

Vahadane numpy backend #28

Are you sure you want to change the base?

Vahadane numpy backend #28

Conversation

andreped commented Dec 22, 2022

andreped commented Jan 29, 2023

liupei101 commented Feb 6, 2023 • edited Loading

andreped commented Feb 6, 2023 • edited Loading

eyedealism commented Jan 4, 2024 • edited Loading

andreped commented Jan 5, 2024

eyedealism commented Jan 5, 2024

andreped commented Jan 6, 2024

eyedealism commented Jan 7, 2024

liupei101 commented Feb 6, 2023 •

edited

Loading

andreped commented Feb 6, 2023 •

edited

Loading

eyedealism commented Jan 4, 2024 •

edited

Loading