[Experiment] Distinctive MemoryDataset view on flowchart #1706

rashidakanchwala · 2024-01-16T20:56:19Z

Description

Suggested for quick experimentation by @datajoely: MemoryDatasets are non-persistent datasets, meaning they are not saved after a run. Some flowcharts contain numerous MemoryDatasets, and Kedro-viz currently lacks a way to differentiate them from persistent datasets. This PR renders MemoryDatasets with reduced opacity, making them easily distinguishable on the flowchart. The transparency also signifies their non-persistent nature.

Currently this feature is under an experiment flag.

Development notes

QA notes

Checklist

Read the contributing guidelines
Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Added new entries to the RELEASE.md file
Added tests to cover my changes

datajoely · 2024-01-17T02:53:24Z

It's beautiful! I'd suggest this should be default behaviour already - but it should be really quick to get some user validation of that

inigohidalgo · 2024-01-17T10:56:07Z

I think this is very nice functionality, speaking as someone who has a lot of MemoryDatasets in my pipeline. In my case it's not relevant, but I know some users use "dummy" inputs/outputs to force node execution order; I'm not sure if they tend to use MemoryDatasets or another type.

Maybe a user-configurable toggle per dataset type to fade/not fade?
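
A per-type toggle like this could be sketched roughly as follows. This is only an illustration: `defaultFadeSettings` and `shouldFade` are hypothetical names that do not exist in Kedro-viz, and the sketch assumes `node.dataset` holds a dotted type path such as `io.memory_dataset.MemoryDataset`.

```javascript
// Hypothetical per-type fade settings; these names are illustrative
// and are not part of Kedro-viz.
const defaultFadeSettings = {
  MemoryDataset: true, // fade non-persistent datasets by default
};

// Returns true when the node's dataset class is configured to fade.
// Assumes node.dataset is a dotted type path (an assumption).
function shouldFade(node, fadeSettings = defaultFadeSettings) {
  if (node?.type !== 'data' || typeof node?.dataset !== 'string') {
    return false;
  }
  // Use the last segment of the path as the class name.
  const className = node.dataset.split('.').pop();
  return fadeSettings[className] === true;
}
```

A user who relies on MemoryDatasets as "dummy" inputs/outputs could then pass `{ MemoryDataset: false }` to keep them fully visible.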

ravi-kumar-pilla · 2024-01-17T11:19:26Z

src/components/flowchart/draw.js

@@ -171,6 +172,10 @@ export const drawNodes = function (changed) {
      )
      .classed('pipeline-node--data', (node) => node.type === 'data')
      .classed('pipeline-node--task', (node) => node.type === 'task')
+      .classed(
+        'pipeline-node--memory-data',
+        (node) => flags?.diffMemoryDatasets && node?.dataset?.includes('memory')


This looks good. Can we have a DatasetType check instead of string matching?

Yes, I think we will eventually refactor and do it properly. For now this is just a quick and dirty experiment.
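
For illustration, a stricter check might look something like the sketch below. It is not the refactor discussed here: `isMemoryDataset` is a hypothetical helper, and it assumes `node.dataset` is a dotted type path whose last segment is the class name. Unlike `includes('memory')`, it would not match an unrelated type whose path merely contains the substring "memory".

```javascript
// Hypothetical helper; not part of Kedro-viz.
// Assumes node.dataset is a dotted type path such as
// 'io.memory_dataset.MemoryDataset' (an assumption for illustration).
const MEMORY_DATASET_CLASS = 'MemoryDataset';

function isMemoryDataset(node) {
  const typePath = node?.dataset;
  if (typeof typePath !== 'string') {
    return false;
  }
  // Compare the class name exactly instead of substring matching.
  const className = typePath.split('.').pop();
  return className === MEMORY_DATASET_CLASS;
}
```

The `.classed()` predicate in the diff could then combine the flag with this helper, for example `flags?.diffMemoryDatasets && isMemoryDataset(node)`.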

ravi-kumar-pilla · 2024-01-17T11:54:25Z

This already looks great for distinguishing MemoryDatasets. Maybe we can extend this into an opt-in feature that color-codes datasets (we color datasets based on type, for easier debugging). Thank you!
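
As a rough sketch of what type-based coloring could involve, assuming each data node carries its type in `node.dataset`: the palette values and the `buildTypeColorMap` helper below are hypothetical and not part of Kedro-viz.

```javascript
// Hypothetical palette; the colors are arbitrary placeholders.
const PALETTE = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728'];

// Assigns one palette color per distinct dataset type, in order of
// first appearance. Colors repeat once the types outnumber the palette.
function buildTypeColorMap(nodes, palette = PALETTE) {
  const colors = new Map();
  for (const node of nodes) {
    if (node?.type !== 'data' || typeof node?.dataset !== 'string') {
      continue; // skip task nodes and nodes without a type
    }
    if (!colors.has(node.dataset)) {
      colors.set(node.dataset, palette[colors.size % palette.length]);
    }
  }
  return colors;
}
```

Because the palette is finite, colors are reused when a pipeline has more dataset types than palette entries, so the scheme may become harder to read as the number of types grows.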

astrojuanlu · 2024-01-17T18:45:13Z

I really like this idea (more than #1707).

As I said there, I'm wondering if eventually we could provide a more generic customisation feature that would allow users to have more control over this, instead of adding a toggle.

rashidakanchwala · 2024-01-17T21:45:16Z

@astrojuanlu -- I agree. After speaking to a few people, I do realise that while some people might like this, others might not find it so usable. They might want to differentiate other datasets instead. Perhaps we need to build some sort of customisability, as you mentioned in the other PR, to allow users to distinguish datasets rather than us doing it for them.

Ravi also mentioned color-coding datasets, but as we have so many datasets, giving each a color code might end up being more confusing and less visually appealing. Well, this is subjective, isn't it?

Hence, providing some sort of customisability where users can decide how they want to highlight or distinguish certain datasets over others would be better.

I will try and get more feedback on this short experiment. But keep the comments coming :)

rashidakanchwala · 2024-01-19T10:59:05Z

Closing this experiment PR for now -- allowing users to differentiate datasets on kedro-viz seems like a good idea. We are going to try and do #1148 first and see how that picks up with users.

done

1edbda5

rashidakanchwala marked this pull request as ready for review January 16, 2024 20:56

rashidakanchwala requested a review from tynandebold as a code owner January 16, 2024 20:57

rashidakanchwala requested review from astrojuanlu, datajoely, yetudada, merelcht, NeroOkwa and noklam and removed request for tynandebold and yetudada January 16, 2024 20:57

rashidakanchwala mentioned this pull request Jan 16, 2024

[Experiment] Show/Hide Memory Datasets on the flowchart #1707

Closed

5 tasks

ravi-kumar-pilla reviewed Jan 17, 2024

View reviewed changes

NeroOkwa mentioned this pull request Jan 17, 2024

[Experiment] Memory Datasets on the flowchart #1709

Closed

2 tasks

rashidakanchwala closed this Jan 19, 2024

astrojuanlu mentioned this pull request Jan 19, 2024

Allow customization of flowchart icons #1148

Open

rashidakanchwala deleted the memorydatasetdiff branch May 30, 2024 10:44
