All sampled images in image list #88

PaulHax · 2024-08-06T20:28:46Z

Now, the "image list" shows all images sampled from the dataset. As you scroll down the table, images are loaded, transformed, and object detected. To know what images are in view within the table, the quasar.QTable component has some "virtual scrolling" features that trigger a callback with the "in view" row numbers.

When you select images from the embeddings plot, the app filters the image list. Before, you had to select images from the embeddings plot to show them in the image list.

images.py caches loaded images, transformed images, and their object detection results. images.py uses a least recently used cache strategy.

To seperate caching method from downstream processing, embeddings_extractor.py and object_detector.py now take only loaded PIL.Images as arguments.

PaulHax · 2024-08-21T16:40:02Z

I looked at using functools.lru_cache for the images.get_image function. images.py code runs some side-effects when a image is dropped from the cache (deletes image from the trame state). I did not see a "item ejected" callback for lru_cache.

Grid view does not do virtual scrolling

vicentebolea · 2024-08-27T01:02:53Z

@PaulHax it seems that the current BufferCache does not remove the item from the trame cache, in any case I believe that we should detach the cache from the trame state, the evicting instruction is self.cache.popitem(last=False). lru_cache decorator should be able to substitute this.

vicentebolea · 2024-08-27T01:07:30Z

@PaulHax looks good, I want to better understand this, we went through the changes a couple of weeks ago but I Want to review this in more detail. I wonder if you could edit the description to add a brief note of the feature, the reason for adding the proposed new modules and why we want to change the existing "library" modules.

PaulHax · 2024-08-29T16:21:33Z

it seems that the current BufferCache does not remove the item from the trame cache,

Holy mem leak Batman! Good catch. I fix in PR cuz i can't push to this branch anymore =)

If you have an idea on how to use lru_cache as user scrolls through long image list i'm all ears. I still don't know how that would work.

vicentebolea · 2024-08-29T20:52:16Z

If you have an idea on how to use lru_cache as user scrolls through long image list i'm all ears. I still don't know how that would work.

How about wrapping the image into a class that implements the del method, the class also has a ref to the state. In this manner the imagewrapper class can deallocate itself from the state just before being destroyed.

ScatterPlot could be used for non images data.

PaulHax · 2024-09-04T15:50:54Z

Refactored images.py to use @lru_cache for loading and tranforming images with a wrapper and the __del__ trick. 🎉

vicentebolea

The changes looks good to me, however, I am skeptical of moving data structures to singleton. I wrote a more detailed analysis in the comments among other comments.

I can see how this feature could be implemented without moving the ImageManager to a singleton implemented in image.py and I do not see any drawback other than having to instantiate in each application class when is it is not in a standalone mode.

I am also skeptical of having another annotations cache singleton too.

Other than that great work in this feature.

vicentebolea · 2024-09-05T01:18:41Z

src/nrtk_explorer/app/images/images.py

+class BufferCache:
+    """Least recently accessed item is removed when the cache is full."""
+
+    def __init__(self, max_size: int, on_clear_item: Callable[[str], None]):
+        self.cache: OrderedDict[str, Any] = OrderedDict()
+        self.max_size = max_size
+        self.on_clear_item = on_clear_item
+
+    def add_item(self, key: str, item):
+        """Add an item to the cache."""
+        self.cache[key] = item
+        self.cache.move_to_end(key)
+        if len(self.cache) > self.max_size:
+            oldest = next(iter(self.cache))
+            self.clear_item(oldest)
+
+    def get_item(self, key: str):
+        """Retrieve an item from the cache."""
+        if key in self.cache:
+            self.cache.move_to_end(key)
+            return self.cache[key]
+        return None
+
+    def clear_item(self, key: str):
+        """Remove a specific item from the cache."""
+        if key in self.cache:
+            self.on_clear_item(key)
+            del self.cache[key]
+
+    def clear(self):
+        """Clear the cache."""
+        for key in self.cache.keys():
+            self.on_clear_item(key)
+        self.cache.clear()


Can this LRU be substutited as well by the functool lru_cache?

How to use @lru_cache with the object detection function:

def get_annotations(detector: ObjectDetector, id_to_image: Dict[str, Image.Image]):

vicentebolea · 2024-09-05T01:37:31Z

src/nrtk_explorer/app/images/images.py

+@state.change("current_dataset")
+def clear_all(**kwargs):
+    get_image.cache_clear()
+    get_cached_transformed_image.cache_clear()
+    annotation_cache.clear()
+
+
+@change_checker(state, "dataset_ids")
+def init_state(old, new):
+    if old is not None:
+        # clean old ids that are not in new
+        old_ids = set(old)
+        new_ids = set(new)
+        to_clean = old_ids - new_ids
+        for id in to_clean:
+            annotation_cache.clear_item(id)  # ground truth
+            annotation_cache.clear_item(dataset_id_to_image_id(id))  # original image detection
+            keys = get_image_state_keys(id)
+            annotation_cache.clear_item(keys["transformed_image"])
+            delete_state(state, keys["meta_id"])
+
+    # create reactive annotation variables so ImageDetection component has live Refs
+    for id in new:
+        keys = get_image_state_keys(id)
+        for key in keys.values():
+            if not state.has(key):
+                state[key] = None
+    state.hovered_id = None


This new module contains a variety of functions and classes which do related things however different, this hurt readability.

I believe that things related to the state should ideal stay in their corresponding application (applet) classes. Common code can be factored out here.

Annotations related things could go into another module. lru cache pattern could reduce the LOC of it.

In this commit: d3bd004
I've factored out the annotation/obejct-detection code from images.py. annotations.py computes/retreaves the annotations and manages the caches. stateful_annotations.py copies the annotations to the trame state.

vicentebolea · 2024-09-05T01:55:00Z

src/nrtk_explorer/library/embeddings_extractor.py

-        if len(paths) == 0:
-            return None
+    def extract(self, images, batch_size=32):
+        """Extract features from images"""


There are benefits on keeping the imageManager here, it hides from the caller code the imaging loading while it allows to possibly having different ImageManager by extending the imageManager classes/ducktyping. Also eases testing and debugging.

Overall I take caution with global Objects/Variables, it is a well known anti-pattern to abuse them. I can see that we are moving our source code to that direction (I am did this with the dataset.py) and I am afraid that this will make our code harder to maintain in the future.

we'll address that on a followup PR

* feat: show all images in list * feat(embeddings): selection of points filters list * refactor(object_detector): remove images_manager dependency * refactor: move image modules into images folder * refactor: use BufferCache for images * refactor: use BufferCache for annotations * refactor: clean dead code * fix(embeddings): add transformed img point as computed * chore(ci): run tests without depending on linters * fix(image_list): respect client side sorting and filtering * fix(ScatterPlot): correct is_transformed in hover * fix(image_list): paginate grid view Grid view does not do virtual scrolling * perf(object_detector): reuse last successful batch size * refactor: remove images_manager * fix: flush transformed images to state before detection * fix(images): actually call on_clear callback in BufferCache * feat(image_list): hide dependant columns when transforms disabled * refactor(images): use lru_cache for image functions * refactor(ScatterPlot): rename props to use points ScatterPlot could be used for non images data. * refactor: reorder images.py and doc object_detector * refactor(annotations): move logic from images.py

PaulHax marked this pull request as draft August 6, 2024 20:29

PaulHax force-pushed the all-images branch 2 times, most recently from da0bb37 to 993a697 Compare August 6, 2024 20:40

PaulHax marked this pull request as ready for review August 6, 2024 21:57

PaulHax requested a review from vicentebolea August 7, 2024 15:18

PaulHax force-pushed the all-images branch from f42f790 to c629860 Compare August 7, 2024 15:26

PaulHax force-pushed the all-images branch 2 times, most recently from e65f08e to 98b0b6c Compare August 21, 2024 15:30

This was referenced Aug 21, 2024

Crash Report #92

Closed

Initial Thoughts #93

Open

PaulHax force-pushed the all-images branch 2 times, most recently from 022d09b to 1418c18 Compare August 21, 2024 17:58

PaulHax added 15 commits August 23, 2024 14:54

feat: show all images in list

016eb26

feat(embeddings): selection of points filters list

f1e4126

refactor(object_detector): remove images_manager dependency

e270bf3

refactor: move image modules into images folder

fcf285f

refactor: use BufferCache for images

ae799ea

refactor: use BufferCache for annotations

1ea5b3e

refactor: clean dead code

0d069b8

fix(embeddings): add transformed img point as computed

b0cdc75

chore(ci): run tests without depending on linters

db06424

fix(image_list): respect client side sorting and filtering

43fb47b

fix(ScatterPlot): correct is_transformed in hover

e134291

fix(image_list): paginate grid view

212b3e1

Grid view does not do virtual scrolling

perf(object_detector): reuse last successful batch size

cd40813

refactor: remove images_manager

ff39067

fix: flush transformed images to state before detection

c6714bf

PaulHax force-pushed the all-images branch from 1418c18 to c6714bf Compare August 23, 2024 18:54

PaulHax mentioned this pull request Aug 29, 2024

fix(images): actually call on_clear callback in BufferCache #100

Closed

PaulHax added 4 commits August 30, 2024 11:27

fix(images): actually call on_clear callback in BufferCache

da6ed4b

Merge branch 'main' into all-images

eca1556

feat(image_list): hide dependant columns when transforms disabled

71759f5

refactor(images): use lru_cache for image functions

30b700f

PaulHax force-pushed the all-images branch from 8d64bfe to 30b700f Compare September 4, 2024 13:54

PaulHax added 2 commits September 4, 2024 10:47

refactor(ScatterPlot): rename props to use points

220a07e

ScatterPlot could be used for non images data.

refactor: reorder images.py and doc object_detector

f2ee488

vicentebolea previously requested changes Sep 5, 2024

View reviewed changes

PaulHax force-pushed the all-images branch 3 times, most recently from b0a42ee to d54172f Compare September 6, 2024 22:50

PaulHax mentioned this pull request Sep 10, 2024

Code cleanup #104

Merged

PaulHax force-pushed the all-images branch 3 times, most recently from d47fb47 to b52b183 Compare September 11, 2024 15:16

refactor(annotations): move logic from images.py

d3bd004

PaulHax force-pushed the all-images branch from b52b183 to d3bd004 Compare September 11, 2024 15:18

jourdain self-requested a review September 11, 2024 15:44

jourdain approved these changes Sep 11, 2024

View reviewed changes

PaulHax merged commit 421ca9b into main Sep 11, 2024
12 checks passed

PaulHax deleted the all-images branch September 11, 2024 16:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

All sampled images in image list #88

All sampled images in image list #88

PaulHax commented Aug 6, 2024 •

edited

Loading

PaulHax commented Aug 21, 2024 •

edited

Loading

vicentebolea commented Aug 27, 2024

vicentebolea commented Aug 27, 2024

PaulHax commented Aug 29, 2024

vicentebolea commented Aug 29, 2024

PaulHax commented Sep 4, 2024 •

edited

Loading

vicentebolea left a comment

vicentebolea Sep 5, 2024

PaulHax Sep 5, 2024

vicentebolea Sep 5, 2024

PaulHax Sep 11, 2024

vicentebolea Sep 5, 2024

All sampled images in image list #88

All sampled images in image list #88

Conversation

PaulHax commented Aug 6, 2024 • edited Loading

PaulHax commented Aug 21, 2024 • edited Loading

vicentebolea commented Aug 27, 2024

vicentebolea commented Aug 27, 2024

PaulHax commented Aug 29, 2024

vicentebolea commented Aug 29, 2024

PaulHax commented Sep 4, 2024 • edited Loading

vicentebolea left a comment

Choose a reason for hiding this comment

vicentebolea Sep 5, 2024

Choose a reason for hiding this comment

PaulHax Sep 5, 2024

Choose a reason for hiding this comment

vicentebolea Sep 5, 2024

Choose a reason for hiding this comment

PaulHax Sep 11, 2024

Choose a reason for hiding this comment

vicentebolea Sep 5, 2024

Choose a reason for hiding this comment

PaulHax commented Aug 6, 2024 •

edited

Loading

PaulHax commented Aug 21, 2024 •

edited

Loading

PaulHax commented Sep 4, 2024 •

edited

Loading