Skip to content
This repository has been archived by the owner on Dec 5, 2024. It is now read-only.

RCN example #96

Merged
merged 17 commits into from
Dec 3, 2021
Merged

RCN example #96

merged 17 commits into from
Dec 3, 2021

Conversation

shrinuKushagra
Copy link
Contributor

@shrinuKushagra shrinuKushagra commented Nov 19, 2021

This PR contains an example implementation of RCN using the pgmax package.
We load a pre-trained RCN model on a very small subset of mnist (20 examples) and test on a small subset of mnist (20 examples).
The reported accuracy = 0.80.

@codecov-commenter
Copy link

codecov-commenter commented Nov 19, 2021

Codecov Report

Merging #96 (49c5ca0) into master (2c81ff8) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##            master       #96   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            8         8           
  Lines          633       633           
=========================================
  Hits           633       633           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c81ff8...49c5ca0. Read the comment docs.

Copy link
Contributor

@StannisZhou StannisZhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several major issues:

  1. We should aim for a single notebook file (with one data file containing the trained models in examples_data)
  2. There seems to be many redundant/random pieces
  3. Since this is intended as a tutorial, please make sure this can actually be run by someone. That means:
    • Make sure the data is accessible
    • Make sure you have properly updated the relevant dependicies
    • Make sure you have detailed docstrings/type hints for all your functions, and relevant comments when needed

examples/rcn/load_data.py Outdated Show resolved Hide resolved
examples/rcn/load_data.py Outdated Show resolved Hide resolved
examples/rcn/load_data.py Outdated Show resolved Hide resolved
examples/rcn/load_data.py Outdated Show resolved Hide resolved
examples/rcn/load_data.py Outdated Show resolved Hide resolved
pgmax/fg/graph.py Outdated Show resolved Hide resolved
examples/rcn/preproc.py Outdated Show resolved Hide resolved
examples/rcn/mnist_small.py Outdated Show resolved Hide resolved
examples/rcn/mnist_small.py Outdated Show resolved Hide resolved
examples/rcn/mnist_small.py Outdated Show resolved Hide resolved
@shrinuKushagra
Copy link
Contributor Author

Addressed the comments.

However, I could not find a principled solution to the Hashable int problem.
Seems like there might be some issues with the library itself.
python/mypy#2412

Copy link
Contributor

@StannisZhou StannisZhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some initially comments after trying to run it. Will make more later

examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
pgmax/fg/graph.py Outdated Show resolved Hide resolved
Copy link
Contributor

@StannisZhou StannisZhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made more comments

examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Show resolved Hide resolved
examples/rcn.py Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
@shrinuKushagra
Copy link
Contributor Author

All the comments are now addressed.
Let me know once you run it and other comments you have.

@StannisZhou
Copy link
Contributor

Want to mention that not all comments are addressed. There were 3 unaddressed comments (with reminders), eg. #96 (comment)

But I addressed them in my commits. Left a few more comments from playing with the notebook. Once these are addressed we can merge the PR.

Copy link
Contributor

@StannisZhou StannisZhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully the final round of comments

examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
Copy link
Contributor

@swaroopgj swaroopgj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding. I have left comments regarding readability since these are intended for understanding.
If possible I would add a short text at the top to describe RCN model with reference to the paper and the Science implementation available.

examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
# # 6. Compute metrics (accuracy)

# %%
best_model_idx = np.argmax(scores, axis=1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we have two classes that score the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argmax will chose the one with smaller value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know but that is wrongly reducing the accuracy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If two templates have the same score, I am not sure what we can do in this case. It amounts to an arbitrarily selecting between the two (or more) of them.

@shrinuKushagra
Copy link
Contributor Author

Want to mention that not all comments are addressed. There were 3 unaddressed comments (with reminders), eg. #96 (comment)

But I addressed them in my commits. Left a few more comments from playing with the notebook. Once these are addressed we can merge the PR.

Hi Stannis,
Sorry about this. I went back and checked which of the comments were not addressed. It seems that the following comments were missed.

  1. Can we instead use np.unravel_index/ravel_multi_index for these? -- In the context of the valid_configs function.
  2. Changing the for loop in valid_config construction
    3 This can be omitted (it would be automatically initialized to all 0 if omitted) -- In the context of valid_configs

I went over the Conversations tab multiple times and these comments were not there. But looking back at the github emails, I do see that you had made these comments and also reminders for this. The only possible explanation seems that, these comments were present in the Files tab and not in the conversations tab. I just realized that sometimes there are comments that only show in the Files tab and not in Conversations tab. (this is in the context of another pr).

To conclude, I want to apologize for missing them. It was genuine over-sight (from not knowing about a github feature) and I will be more careful in future. Also, thanks for addressing the comments.

examples/rcn.py Outdated
# %% [markdown]
# RCN is a neuroscience based, data-efficient, interpretable object detection/vision system. It is a probabilistic graphical model (PGM) that trains in minutes and generalizes well (for certain traits). It can reason about, and is robust to, occlusion. It can also imagine and interpret scenes.
#
# Please refer to the [Science](https://www.cs.jhu.edu/~ayuille/JHUcourses/ProbabilisticModelsOfVisualCognition2020/Lec21/A%20generative%20vision%20model%20that%20trains%20with%20high%20data%20efficiency%20and%20breaks%20text-based%20CAPTCHAs.pdf) paper for more details. In this notebook, we implement a two-level RCN model on a small subset of the MNIST dataset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙄
Can you use this link please?
https://www.science.org/doi/10.1126/science.aag2612

@lazarox do we have open-access link to share?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also link to the earlier repo https://github.com/vicariousinc/science_rcn since you used that to train the model I assume.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the link.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And added new link where the model is loaded.

examples/rcn.py Outdated


# %%
def fetch_mnist_dataset(test_size: int, seed: int = 5) -> Tuple[np.ndarray, np.ndarray]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def fetch_mnist_dataset(test_size: int, seed: int = 5) -> Tuple[np.ndarray, np.ndarray]:
def fetch_mnist_dataset(test_size: int, seed: int = 5) -> tuple[np.ndarray, np.ndarray]:

And remove import

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that using tuple was enabled in python 3.9?
https://www.python.org/dev/peps/pep-0585/

I was trying to use the suggestion but keep getting TypeError: 'type' object is not subscriptable. The project is using python version of 3.7.5, maybe thats why?

Copy link
Contributor

@wlehrach wlehrach Dec 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh - my bad. You need to add:

from __future__ import annotations

to the top of the file - it'll be default in future python version
You often want that for other reasons, as various other typing things will fail otherwise.

Feel free to ignore (or fix), as you prefer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed :D

data["suppression_masks"],
data["filters"],
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here to indicate these are RCN model params

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added detailed description of the parameters.
Moved max_perturb_radius to the relevant cell as it not central to the model and only used to save computation.

examples/rcn.py Outdated
Comment on lines 128 to 131
hps, vps = 12, 12 # Horizontal and vertical pool sizes respectively for RCN models.
num_orients = filters.shape[0]
brightness_diff_threshold = 40.0
max_perturb_radius = 22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would re-group them and explain every variable defined here, with either an inline comment or a comment above it, like you did for pool sizes.
num_orients and bdt are related to preproc to run edge detection.
different from the core PGM model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also won't hurt to expand the pool size to indicate what it exactly is, is it half the window size, is it radius or full size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@StannisZhou StannisZhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some minor comments. LGTM for the most part. Will take a final look after the comments are addressed. Thanks for patiently addressing all the comments!

examples/rcn.py Outdated Show resolved Hide resolved
# %% [markdown]
# We load the following variables (that have been pre-computed).
# - train_set and train_labels - A sample of MNIST train dataset containing 100 train images and their labels.
# - frcs and edges - Used to represent the learned rcn graphical models.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to the learning module in science RCN repo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated


# %% [markdown]
# ## 5.2 Run maximum a-posteriori probability (map) inference on all test images
Copy link
Contributor

@StannisZhou StannisZhou Dec 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the word probability is redundant here, and map is usually capitalized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My earlier comment was intended for potential typo in usage of "map product inference".
But you can just use MAP inference here without having to expand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to say MAP inference.

pyproject.toml Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
examples/rcn.py Outdated Show resolved Hide resolved
Copy link
Contributor

@StannisZhou StannisZhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Excited to have this example ready!

examples/rcn.py Outdated Show resolved Hide resolved
Co-authored-by: Swaroop Guntupalli <swaroopgj@gmail.com>
@StannisZhou StannisZhou merged commit d8f80b8 into vicariousinc:master Dec 3, 2021
@StannisZhou StannisZhou linked an issue Dec 3, 2021 that may be closed by this pull request
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Experiments on RCN
5 participants