Defining the scope: features and target audiences #4

Open
krassowski opened this issue Oct 4, 2021 · 0 comments

krassowski commented Oct 4, 2021

It would be good to start by agreeing on a list of features that we believe a repository exploration tool should include, and on how those features may benefit different target audiences. Here is a basic list of features I think are worth considering:

Exploring the directory structure

  • For exploring studies: with more files and directories than we would like to see
  • For audience: anyone lost in the forest of folders (everyone)
  • Context: it is not very common for analysis software to have dozens of nested directories, but it can happen, and it might be more frequent for specific languages. As a rule, I think that more experienced software engineers tend to commit more tooling configuration to version control (think .github, .vscode, binder, dist, docker), which may leave novices and non-programmers lost and unable to easily find the information relevant to them.

See https://next.github.com/projects/repo-visualization for GitHub's take on this one.
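To make the "tooling directories" point concrete, here is a minimal sketch of how a tool might summarize the top level of a repository and flag directories that are likely tooling configuration. The function name `summarize_top_level` and the `TOOLING_DIRS` set are hypothetical; the set only reuses the examples named above and is not an agreed-upon list.

```python
from pathlib import Path

# Directory names mentioned above as typical tooling configuration; treating
# exactly this set as "tooling" is an illustrative assumption.
TOOLING_DIRS = {".github", ".vscode", "binder", "dist", "docker"}


def summarize_top_level(root):
    """Return {top-level directory name: (file count, is_tooling)}."""
    summary = {}
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            continue
        # Count all files below this directory, at any depth.
        file_count = sum(1 for path in entry.rglob("*") if path.is_file())
        summary[entry.name] = (file_count, entry.name in TOOLING_DIRS)
    return summary
```

A viewer could use such a summary to collapse or de-emphasize the flagged directories by default, so that readers land on the analysis code first.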

Exploring the flow of the data

  • For exploring studies: using multi-step processes scattered across multiple files
  • For audience: interested in finding out more about a specific step in the process
  • Context: more complex studies usually include a flowchart describing the data wrangling process; however, it lacks links to the corresponding places in the codebase

Exploring the code structure

  • For exploring studies: providing code for re-use, or having a substantial chunk of analysis code attached
  • For audience: end-users (researchers, possibly with good computer literacy but with programming knowledge limited strictly to what their analysis requires), scientific software developers trying to contribute to or extend someone else's work, and students trying to understand or dissect an algorithm or method
  • Context: code navigation on GitHub became easier in recent years after the introduction of code-jumping and integration with the in-browser IDE; still, there is no tool providing an integrated overview where relevant parts of the code can be annotated with their relevance to specific sections of the associated method/paper, and where a diagram of function/class dependencies could be explored interactively.
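As a rough illustration of where the data for a function dependency diagram could come from, the sketch below uses Python's standard `ast` module to map each top-level function in a source string to the other top-level functions it calls by name. The helper `function_dependencies` is hypothetical, and the approach is deliberately simplistic: it ignores methods, imports, and calls made through attributes.

```python
import ast


def function_dependencies(source):
    """Map each top-level function to the top-level functions it calls."""
    tree = ast.parse(source)
    functions = {
        node.name: node for node in tree.body if isinstance(node, ast.FunctionDef)
    }
    dependencies = {}
    for name, node in functions.items():
        # Collect names of plain calls such as `clean(...)`; attribute calls
        # such as `df.merge(...)` are skipped in this sketch.
        called = {
            call.func.id
            for call in ast.walk(node)
            if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
        }
        dependencies[name] = sorted(called & set(functions))
    return dependencies
```

The resulting mapping is an adjacency list, so it could be handed to any graph layout or rendering step.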

The flow of data is the main focus of the initial proposal. While a general-purpose tool could be developed with similar features, I would very much like to focus on supporting the academic community, and on making it possible to create output which can be used independently of the tool, e.g. an SVG graph with embedded links which the code author could embed in their repository (like this one generated with https://github.com/krassowski/nbpipeline).
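To illustrate what a tool-independent output could look like, here is a minimal sketch that renders a list of pipeline steps as an SVG in which every box links to a file. This is not how nbpipeline produces its graphs; the function `pipeline_svg`, the layout constants, and the file paths in the usage note are all assumptions made for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def pipeline_svg(steps):
    """Render (label, url) pairs as a vertical chain of linked boxes in SVG."""
    box_w, box_h, gap = 220, 40, 30
    mid_x = box_w // 2
    parts = []
    for i, (label, url) in enumerate(steps):
        y = i * (box_h + gap)
        if i > 0:
            # Connector from the bottom of the previous box to this one.
            parts.append(
                f'<line x1="{mid_x}" y1="{y - gap}" x2="{mid_x}" y2="{y}" stroke="black"/>'
            )
        # Wrapping the box in <a> makes the whole step clickable.
        parts.append(
            f"<a href={quoteattr(url)}>"
            f'<rect x="0" y="{y}" width="{box_w}" height="{box_h}" fill="white" stroke="black"/>'
            f'<text x="{mid_x}" y="{y + 25}" text-anchor="middle">{escape(label)}</text>'
            f"</a>"
        )
    height = len(steps) * (box_h + gap) - gap if steps else 0
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{box_w}" height="{height}">'
        + "".join(parts)
        + "</svg>"
    )
```

For example, `pipeline_svg([("Load data", "notebooks/01_load.ipynb"), ("Clean & merge", "notebooks/02_clean.ipynb")])` returns a standalone SVG string with two linked boxes, where the notebook paths are made up. Whether the links stay clickable depends on how the SVG is embedded in the hosting page.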
