[Experiment] Memory Datasets on the flowchart #1709

Closed
2 tasks
NeroOkwa opened this issue Jan 17, 2024 · 2 comments
Labels: Design: UX · Issue: Feature Request · Quick Win (Low/Medium priorities but quick to do)

Comments

@NeroOkwa
Contributor

NeroOkwa commented Jan 17, 2024

Description

MemoryDatasets are non-persistent datasets that are not saved after a run. Some users have flowcharts containing numerous MemoryDatasets, and Kedro-Viz currently lacks a way to show them and differentiate them from persistent datasets.
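
To make the distinction concrete, here is a minimal sketch, not how Kedro-Viz actually classifies datasets. It assumes that any dataset a pipeline uses without a catalog entry is held in memory, which is Kedro's default behaviour; catalog entries explicitly typed as MemoryDataset are ignored for simplicity. The function `split_datasets` and all dataset names are hypothetical.

```python
def split_datasets(pipeline_datasets, catalog_entries):
    """Partition dataset names into (persistent, memory) lists.

    Assumption: a name with no catalog entry is treated as an
    in-memory (non-persistent) dataset.
    """
    persistent = sorted(n for n in pipeline_datasets if n in catalog_entries)
    memory = sorted(n for n in pipeline_datasets if n not in catalog_entries)
    return persistent, memory


# Hypothetical project: only the first and last datasets are persisted.
pipeline_datasets = {
    "raw_companies",
    "preprocessed_companies",
    "model_input",
    "regressor",
}
catalog_entries = {"raw_companies", "regressor"}

persistent, memory = split_datasets(pipeline_datasets, catalog_entries)
print("persistent:", persistent)
print("memory:", memory)
```

In a project like this, half of the nodes on the flowchart would be MemoryDatasets, which is why a visual distinction helps.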

This was suggested for quick experimentation by @datajoely, and related questions have been asked by Slack users:

"Does anyone know how layers are inferred for datasets which are not tagged in the catalog and just exist as memory datasets.
For example: If you have a pipeline where the only elements you persist are lets say till primary layer, and then you jump to model output in the end. How will the layers be inferred in that case ?"

Context

This is particularly useful for larger and more complex pipelines, where there is a greater need to track datasets. Another benefit is that it supports the debugging of the pipeline.

Possible Implementation

The first step would be to show MemoryDatasets on the Kedro-Viz flowchart; the second would be to let users filter them out of the view as required. Implementation would require:

  • Design
  • Engineering

@rashidakanchwala has created two experimental PRs to address this:

  1. [Experiment] Distinctive MemoryDataset view on flowchart #1706 - This PR renders MemoryDatasets with reduced opacity, making them easy to distinguish on the flowchart. The transparency also signals their non-persistent nature.
  2. [Experiment] Show/Hide Memory Datasets on the flowchart #1707 - This PR adds an experimental toggle to show or hide MemoryDatasets, working similarly to the show/hide dataset toggle.
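
The combined behaviour of the two PRs can be sketched roughly as below. This is an illustration only and is not taken from either PR: the `FlowchartNode` class, the `render_nodes` function, and the `MEMORY_OPACITY` value of 0.5 are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed opacity for MemoryDataset nodes; the real value is a design decision.
MEMORY_OPACITY = 0.5


@dataclass
class FlowchartNode:
    name: str
    is_memory: bool


def render_nodes(nodes, show_memory=True):
    """Return (name, opacity) pairs for the nodes that should be drawn.

    MemoryDataset nodes are dropped when show_memory is False and are
    drawn semi-transparent otherwise; all other nodes are fully opaque.
    """
    rendered = []
    for node in nodes:
        if node.is_memory and not show_memory:
            continue  # the toggle hides MemoryDatasets entirely
        opacity = MEMORY_OPACITY if node.is_memory else 1.0
        rendered.append((node.name, opacity))
    return rendered


nodes = [
    FlowchartNode("raw_companies", is_memory=False),
    FlowchartNode("model_input", is_memory=True),
]
print(render_nodes(nodes))
print(render_nodes(nodes, show_memory=False))
```

The sketch leaves out edge handling: when a MemoryDataset is hidden, the flowchart still has to connect the nodes on either side of it.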

Acceptance Criteria

  • The user launches Kedro-Viz and sees their project in the flowchart UI.
  • They go to the settings panel, select the toggle 'Show MemoryDatasets in flowchart', and save their choice.
  • A semi-transparent (reduced-opacity) version of each MemoryDataset is shown in the flowchart.
  • When a MemoryDataset is selected in the flowchart, the side panel opens with details of the dataset.
  • Users can also go to the filter panel and select Memory Datasets. This shows or hides them in the flowchart view.
@datajoely
Contributor

Are we going to revisit the UI side of things?

@rashidakanchwala
Contributor

Yes, we have prioritised this under ticket #1148. We will explore the concept of allowing users to customise their datasets (icons/colour) through the catalog.

Projects
Status: Done