The plot methods in Datasets should be flexible #116

calebrob6 · 2021-09-09T19:24:15Z

As a user I will frequently want to plot images, predictions, masks, etc. The current definition of plot in RasterDataset is limited.

The text was updated successfully, but these errors were encountered:

calebrob6 · 2021-11-11T17:42:13Z

As a user I'll want to call plot either with a single sample from the Dataset object (e.g. in a Jupyter Notebook while doing EDA), or with a single sample + predictions (e.g. from a val_step in a LightningModule). In my mind the signature looks something like:

@staticmethod
def plot(sample: Dict[str, Tensor], titles=False, suptitle=None) --> plt.Figure??? (not sure what this type is):
    """Plots a single data sample from the dataset.

     Uses the "image", "mask", and optionally, "prediction" keys in sample
     to create a matplotlib Figure. The "prediction" key is optional, and, if included,
     will add another subplot to the Figure with a rendering of the predictions.

     Args:
         sample: a sample returned by `__getitem__` with an optional additional "prediction" key
         titles: flag indicating whether to render the titles above each subplot
         suptitle: optional super title

    Returns:
         matplotlib Figure object that can be immediately rendered or further customized
    """

@isaaccorley @adamjstewart thoughts?

adamjstewart · 2021-11-11T19:12:39Z

I agree with making the (optional) prediction be an entry in the sample dict.

As for titles and suptitle, I'm not sure if these are necessary. Since the figure is returned, users can always override these themselves.

What I would add is a **kwargs entry that gets passed to imshow (or whatever plotting function is used) that allows users to control exactly how it gets plotted, since that part can't be customized after the fact. Not sure if we need to have a different kwargs for each call to imshow or whether we can use the same kwargs for all.

Originally I was thinking about how to make plot handle both individual samples (returned by Dataset) and batches of data (returned by DataLoader), but I think we should just stick to the first for now. The latter is too complicated, and we can always add a function that reverses the collation to get just the first sample.

calebrob6 · 2021-11-11T20:12:03Z

Agree on the batching -- we could just run plot(...) in a loop for batch. Another thing to consider, if we are calling this method from val_step(...) then the samples will already be transformed (e.g. per channel standardization). How do we best deal with this? (I think an inverse preprocess makes sense)

Mild disagree on removing the titles/suptitle -- those are nice convenience features. Users could do those by themselves, but the point of a plot is to make it easy to get a quick rendering of the data.

What sort of **kwargs to imshow do you think would need to be passed. All that I can think of are the ones that should be hardcoded to properly display the imagery or labels (e.g. cmap, interpolation, vmin, vmax).

adamjstewart · 2021-11-11T20:16:09Z

To clarify, I think we should always add titles/suptitles, I don't think we need a parameter to control whether or not they get added.

For kwargs, I could see someone wanting a custom cmap maybe. imshow might not be the best example, but I'm not sure if there are any other plotting methods we might need.

calebrob6 · 2021-11-11T20:27:49Z

(I'm not understanding you) So you think title and suptitle should be non optional?

For kwargs, I can see where your coming from -- but now, how do they get routed to the correct imshow(...). E.g. for a plot(...) for a land cover dataset you will have an imshow(imagery) and also an imshow(mask, cmap=default_cmap, vmin=0, vmax=7, interpolation="none"). You would need to have separate kwargs routed to both of them. Then, for other datasets maybe you only have a single imshow(...).

adamjstewart · 2021-11-11T20:50:46Z

Yes, titles should be non-optional. I wonder if there is a way to remove them after they've been added...

We could require that kwargs has the same dict structure as sample such that **kwargs['mask'] returns the kwargs for mask plotting. Or we could just ignore kwargs for now until someone actually needs it.

calebrob6 · 2021-11-11T21:42:05Z

Gotcha, would renaming what I have as titles to show_titles that defaults to True (instead of False) work in this case?

I'm still a fan of having an optional suptitle argument that is just a shortcut for writing out a super title in the correct place/size.

Or we could just ignore kwargs for now until someone actually needs it.

Seems good!

isaaccorley · 2021-11-11T21:42:38Z

I don't really have too many opinions, but a couple thoughts:

kornia has a denormalize which can perform the opposite of Normalize given channel means and stds
torchvision.utils.make_grid is nice when you have a batch of tensors and want to create an array to plot them in a grid fashion. The output of this is a RGB tensor but can be used in plt.imshow.

adamjstewart · 2021-11-11T21:48:50Z

@calebrob6 my only complaint about that is that we'll need to add this arg to every plot method, copy the docs, and add an extra if-statement for it. It might be simpler to tell users to clear the title themselves if they really don't like it.

calebrob6 · 2021-11-11T21:59:53Z

Yep makes sense! I just see that as part of the normal work (and not like an extra burden).

I think we roughly agree (that plotting should happen in the dataset, that it should assume the values it gets are the same values that are returned by a call to getitem, and that we should return a figure).

The only tricky part I see is "denormalizing" samples in val_step(...) (or anywhere in a LightningModule). If a LightningModule knows about its DataModule then this should be fixed!

adamjstewart · 2021-11-13T18:52:15Z

I propose that we skip title, suptitle, kwargs, etc. and just have a single way to plot a sample from each dataset. If someone asks for more customization in the future, we can always add it later. Since these values are all optional, it won't be a breaking API change to add it later. I think it's better to keep it simple and add complexity only when we need to.

calebrob6 · 2021-11-13T20:39:32Z

Skipping titles is something I use often (which is why I added it -- e.g. I often create figures in draw.io where I stack several samples and add the titles / other annotation manually). suptitles is especially useful for adding the sample id of an image when debugging (e.g. if validation dataloader is on shuffle, it is useful to do this so you can find an interesting sample again).

adamjstewart · 2021-11-19T20:33:27Z

I think we've decided on a basic API for now, tracking of which datasets still need a new/updated plot method can happen in #253.

adamjstewart added the datasets Geospatial or benchmark datasets label Sep 9, 2021

adamjstewart self-assigned this Nov 3, 2021

adamjstewart mentioned this issue Nov 10, 2021

Adding ETCI2021 datamodule and trainer #234

Merged

adamjstewart closed this as completed Nov 19, 2021

adamjstewart added this to the 0.2.0 milestone Nov 20, 2021

adamjstewart added utilities Utilities for working with geospatial data and removed utilities Utilities for working with geospatial data labels Jan 2, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The plot methods in Datasets should be flexible #116

The plot methods in Datasets should be flexible #116

calebrob6 commented Sep 9, 2021

calebrob6 commented Nov 11, 2021

adamjstewart commented Nov 11, 2021

calebrob6 commented Nov 11, 2021

adamjstewart commented Nov 11, 2021

calebrob6 commented Nov 11, 2021 •

edited

Loading

adamjstewart commented Nov 11, 2021 •

edited

Loading

calebrob6 commented Nov 11, 2021 •

edited

Loading

isaaccorley commented Nov 11, 2021

adamjstewart commented Nov 11, 2021

calebrob6 commented Nov 11, 2021

adamjstewart commented Nov 13, 2021

calebrob6 commented Nov 13, 2021

adamjstewart commented Nov 19, 2021

The plot methods in Datasets should be flexible #116

The plot methods in Datasets should be flexible #116

Comments

calebrob6 commented Sep 9, 2021

calebrob6 commented Nov 11, 2021

adamjstewart commented Nov 11, 2021

calebrob6 commented Nov 11, 2021

adamjstewart commented Nov 11, 2021

calebrob6 commented Nov 11, 2021 • edited Loading

adamjstewart commented Nov 11, 2021 • edited Loading

calebrob6 commented Nov 11, 2021 • edited Loading

isaaccorley commented Nov 11, 2021

adamjstewart commented Nov 11, 2021

calebrob6 commented Nov 11, 2021

adamjstewart commented Nov 13, 2021

calebrob6 commented Nov 13, 2021

adamjstewart commented Nov 19, 2021

calebrob6 commented Nov 11, 2021 •

edited

Loading

adamjstewart commented Nov 11, 2021 •

edited

Loading

calebrob6 commented Nov 11, 2021 •

edited

Loading