
KeyError when running models #11

Closed · carsen-stringer opened this issue Jan 18, 2022 · 4 comments

@carsen-stringer

Thanks for the great data release and model release!

Unfortunately, I receive the error described in chongruo/detectron2-ResNeSt#54 when trying to run both the anchor-free and anchor-based models. I'm on Ubuntu 18.04, torch v1.10, cudatoolkit=11.3.

My apologies if I did not understand the install directions. I installed detectron2 using the instructions from the facebook page using python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html. Then I cloned the two repos as suggested and cd'ed into them to run their train_net.py script.

I used the LIVECell config file with the only modification being the path to the images (I was not sure if there was another script I needed to run). Then I ran the following command in the detectron2-ResNeSt folder:

python ./tools/train_net.py --config-file ../LIVECell/model/anchor_based/livecell_config.yaml  --eval-only MODEL.WEIGHTS ../LIVECell_anchor_based_model.pth

and received the error:

Command Line Args: Namespace(config_file='../LIVECell/model/anchor_based/livecell_config.yaml', dist_url='tcp://127.0.0.1:50152', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', '../LIVECell_anchor_based_model.pth'], resume=False)
Traceback (most recent call last):
  File "./tools/train_net.py", line 157, in <module>
    launch(
  File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "./tools/train_net.py", line 127, in main
    cfg = setup(args)
  File "./tools/train_net.py", line 119, in setup
    cfg.merge_from_file(args.config_file)
  File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/detectron2/config/config.py", line 69, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/fvcore/common/config.py", line 132, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.RESNETS.RADIX'

I also tried using the author's new repo and then received a different KeyError. When trying to directly pip install -e . in their repo, I hit several build errors that I could not figure out how to resolve (I tried their suggestions but failed). Do you perhaps have more detailed instructions for installing detectron2 from source from their repo, if that's what's required?

Thanks!

@ChristofferEdlund
Contributor

Dear @carsen-stringer,
thank you for your interest in our data and models, and for taking the time to raise this issue to help us make our tools more accessible to others.

We think we have narrowed down the problem: the ResNeSt code uses its own fork of detectron2, in which a defaults.py file has been changed to include the RADIX variable that is controlled from the config file.

They still use the package name detectron2, so when they later import their custom code it is done as:

from detectron2.config import get_cfg

Since the default detectron2 does not have the ResNeSt changes, you get the KeyError you posted.
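To illustrate, the fork's config defaults declare the extra key roughly like this (a sketch from memory; the exact default value may differ):

# in the detectron2-ResNeSt fork's detectron2/config/defaults.py
_C.MODEL.RESNETS.RADIX = 1  # ResNeSt split-attention radix; not present in stock detectron2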

Solution
This unfortunately means installing the original detectron2 package when using the centermask2 (anchor-free) models, and the detectron2-ResNeSt fork of detectron2 when using the anchor-based models. The latter can be done with:

git clone https://github.com/zhanghang1989/detectron2-ResNeSt.git
python -m pip install -e detectron2-ResNeSt

Then you should be able to run the code via the command you wrote.
When you want to use the centermask code, stick with our install guide:

git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2
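Either way, a quick sanity check for which variant is currently active (the RADIX key should only exist in the ResNeSt fork, so this prints True for the fork and False for stock detectron2):

python -c "from detectron2.config import get_cfg; print('RADIX' in get_cfg().MODEL.RESNETS)"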

Please confirm whether this fixes the problem; if so, we will update our install instructions accordingly.

Kindly,
Christoffer

@ChristofferEdlund ChristofferEdlund self-assigned this Jan 19, 2022
@carsen-stringer
Author

Thank you so much for the fast and thorough response. My bad, I thought installing a premade detectron2 wheel was fine; maybe a warning for lazy people like me would be helpful :) I've gone ahead with the anchor-free model since the detectron2 install worked without issues!

One note: the dataset registration instructions are also a little confusing. I found that I had to put all the test images into a single folder, but that is not the format in which you provide the data (after unzipping images.zip). Again, maybe I did something wrong?

register_coco_instances('TEST', {}, '/media/carsen/DATA2/livecell/livecell_coco_test.json', '/media/carsen/DATA2/livecell/images/livecell_test_images/')

Also, I found that when I added the import for your COCOEvaluator I received this error:

Traceback (most recent call last):
  File "train_net.py", line 157, in <module>
    launch(
  File "/home/carsen/anaconda3/envs/livecell/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "train_net.py", line 134, in main
    res = Trainer.test(cfg, model)
  File "/home/carsen/anaconda3/envs/livecell/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 609, in test
    evaluator = cls.build_evaluator(cfg, dataset_name)
  File "train_net.py", line 68, in build_evaluator
    evaluator_list.append(COCOEvaluator(dataset_name, output_dir=output_folder))
TypeError: __init__() missing 2 required positional arguments: 'cfg' and 'distributed'
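For reference, here is roughly the change I made (a sketch; the actual build_evaluator classmethod in train_net.py differs a bit, and the import of your COCOEvaluator depends on how LIVECell/code ends up on the path):

import os
from detectron2.evaluation import DatasetEvaluators
from coco_evaluation import COCOEvaluator  # LIVECell's custom evaluator

def build_evaluator(cfg, dataset_name, output_folder=None):
    if output_folder is None:
        output_folder = os.path.join(cfg.OUTPUT_DIR, "inference")
    evaluator_list = []
    # The stock-style call fails because this evaluator follows the older API
    # and expects cfg and distributed as positional arguments:
    #   COCOEvaluator(dataset_name, output_dir=output_folder)   # TypeError above
    evaluator_list.append(COCOEvaluator(dataset_name, cfg, True, output_dir=output_folder))
    return DatasetEvaluators(evaluator_list)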

With the call changed that way (passing cfg and True), inference ran again (like it did when I didn't import the specific evaluator). It took a while, but after the AP printout I received this error:

[01/19 13:36:19 d2.evaluation.evaluator]: Total inference pure compute time: 0:03:35 (0.138517 s / iter per device, on 1 devices)
evaluate
_eval_predictions
use_fast_impl: False
Loading and preparing results...
DONE (t=0.60s)
creating index...
index created!
Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
Running per image evaluation...
Evaluate annotation type *bbox*

DONE (t=6736.20s).
Accumulating evaluation results...
DONE (t=18.25s).
In method
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.477
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.817
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.497
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.471
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.489
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.505
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.212
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.480
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.569
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.531
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.602
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.672
Traceback (most recent call last):
  File "train_net.py", line 157, in <module>
    launch(
  File "/home/carsen/anaconda3/envs/livecell/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "train_net.py", line 134, in main
    res = Trainer.test(cfg, model)
  File "/home/carsen/anaconda3/envs/livecell/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 617, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "/home/carsen/anaconda3/envs/livecell/lib/python3.8/site-packages/detectron2/evaluation/evaluator.py", line 204, in inference_on_dataset
    results = evaluator.evaluate()
  File "/media/carsen/DATA2/livecell/LIVECell/code/coco_evaluation.py", line 155, in evaluate
    self._eval_predictions(set(self._tasks), predictions)
  File "/media/carsen/DATA2/livecell/LIVECell/code/coco_evaluation.py", line 199, in _eval_predictions
    _evaluate_predictions_on_coco(
  File "/media/carsen/DATA2/livecell/LIVECell/code/coco_evaluation.py", line 656, in _evaluate_predictions_on_coco
    pre_per_iou = [precisions[iou_idx, :, :, 0, -1].mean() for iou_idx in precisions.shape[0]]
TypeError: 'int' object is not iterable
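Side note: the failing list comprehension iterates over an int (precisions.shape[0]); presumably something like this was intended (my guess, assuming precisions is the usual COCOeval precision array indexed [iou, recall, category, area, maxDets]):

# iterate over the IoU-threshold indices rather than the integer shape itself
pre_per_iou = [precisions[iou_idx, :, :, 0, -1].mean() for iou_idx in range(precisions.shape[0])]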

The AP values look good though.

I am not sure how to process the instance outputs: they look like only boxes rather than full masks, and they are not in the same format as the livecell_coco_test.json file. I wrote some code to run the model myself outside of this Trainer.test mode (rough sketch below). Perhaps in the future you could provide a script to get masks from your model, so that users don't have to write it themselves and so that we know we aren't making any mistakes? Thanks!
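For reference, roughly what I ended up doing to get masks out (a sketch; paths are placeholders, and the anchor-free config may need the config helper from the centermask2 repo rather than detectron2's plain get_cfg, since it adds centermask-specific keys):

import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/livecell_config.yaml")   # placeholder path
cfg.MODEL.WEIGHTS = "path/to/LIVECell_model.pth"      # placeholder path

predictor = DefaultPredictor(cfg)
image = cv2.imread("path/to/livecell_test_images/some_image.tif")
outputs = predictor(image)

instances = outputs["instances"].to("cpu")
masks = instances.pred_masks.numpy()           # (N, H, W) boolean masks
boxes = instances.pred_boxes.tensor.numpy()    # (N, 4) boxes in XYXY format
scores = instances.scores.numpy()              # (N,) confidence scores
print(f"{len(instances)} cells detected")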

@ChristofferEdlund
Contributor

Thank you for all the great feedback @carsen-stringer, it is very helpful.

Installation

We have now updated the install documentation to reflect the two different install instructions.

Data

You are correct that the cell images are packaged by cell type and not in a flat structure (as is needed when training). I will see if we can change this to make our resource easier to use.

Eval

Regarding the evaluation, it seems that API changes since we wrote the code are generating that error. I would recommend using the default evaluation from centermask2 instead of our script for now. The downside is that you will not get all the metrics that we calculate, but it should work and you will get the AP metrics.

Using the instances

Regarding the model output, it is in the detectron2 output format that is explained in their documentation. I am afraid it is not a trivial format to work with, and I agree that a script to convert it to COCO would be beneficial. We have also played with the thought of hosting an API (similar to a certain cellpose model ;) ) but have not decided on anything yet. For now, I am afraid the detectron2 documentation is the fastest route. To visualize the results of the anchor-free model, I would recommend the demo code in the centermask2 repo; it is based on the detectron2 demo code and usage is described here. I will add this info to our documentation page and update this issue when it is done.
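In the meantime, a minimal sketch of such a conversion (my quick take, not a polished script; it assumes pycocotools is installed, that masks come from outputs["instances"].pred_masks, and that category id 1 corresponds to the single cell category in our annotations):

import numpy as np
from pycocotools import mask as mask_util

def instances_to_coco_results(instances, image_id):
    # Convert one image's detectron2 Instances into COCO-style result dicts
    # with RLE-encoded segmentations.
    results = []
    masks = instances.pred_masks.cpu().numpy()
    scores = instances.scores.cpu().numpy()
    for m, s in zip(masks, scores):
        rle = mask_util.encode(np.asfortranarray(m.astype(np.uint8)))
        rle["counts"] = rle["counts"].decode("ascii")  # make it JSON-serializable
        results.append({
            "image_id": image_id,
            "category_id": 1,      # assumption: single "cell" category
            "segmentation": rle,
            "score": float(s),
        })
    return results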

Once again, thank you for bringing all of this to our attention. Let me know if there is anything more I can support you with.

@ChristofferEdlund
Contributor

Short update:

Data

The downloaded images are now in a flat structure to support a more plug-and-play approach.

Eval

The coco_evaluation.py script has been updated to work with the latest centermask2 and detectron2. It still supports up to 2000 detections per image and prints the mean values at the different IoU thresholds for both precision (AP) and recall (AR). I would appreciate knowing whether it works for you as well.
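For reference, the 2000-detection limit corresponds to the maxDets parameter of pycocotools' COCOeval; roughly how that is typically set (a sketch, not necessarily our exact code; coco_gt and coco_dt are the ground-truth COCO object and the loaded detections):

from pycocotools.cocoeval import COCOeval

coco_eval = COCOeval(coco_gt, coco_dt, "segm")       # or "bbox"
coco_eval.params.maxDets = [100, 500, 2000]          # matches the AR maxDets printed above
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()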
