Overall purpose/point... WiP #1

yarikoptic · 2023-05-19T21:21:51Z

NB WiP -- yet to finish dumping ideas...

Somewhat in the spirit of automations like con/tinuous, datalad/git-annex, datalad-extensions testing, etc.

An automation to provide "tuned" forks of multiple git repos, possibly without offering "tune ups" back to original locations (since they might not want them). Use cases:

NIDM-improved OpenNeuro datasets. Ref: https://github.com/OpenNeuroDatasets-NIDM/nidm-openneuro-doc/issues has an outline of some initial ideas/TODOs
neurobagel - related but different ;) annotation of some of the openneuro datasets

I see the following high-level configuration structure:

sources: list of original locations where to clone/fork from
- organization, e.g. https://github.com/OpenNeuroDatasets (starting with github but later might be extended to gitlab)
- repos-regex (optional), to what repos within organization to limit, .* by default
- name-tuneup (optional, later): how to possibly rename fork in case of multiple sources possibly colliding, e.g. "s,^,openneuro-,g" to add an openneuro- prefix
destination: e.g. https://github.com/OpenNeuroDatasets-NIDM (starting with github but who knows - might later want to support gitlab)
transformations: list of commands to do
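
A minimal sketch of how that configuration could be modelled, purely for illustration. The class and method names (`Source`, `Config`, `selects`, `fork_name`) are hypothetical and not part of any existing tool. It assumes that repos-regex is applied with `re.search` and that name-tuneup is a sed-like `s,PATTERN,REPLACEMENT,FLAGS` expression with no commas inside the pattern or replacement.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Source:
    """One original location to fork from (hypothetical model)."""
    organization: str                  # e.g. "https://github.com/OpenNeuroDatasets"
    repos_regex: str = ".*"            # which repos within the organization to include
    name_tuneup: Optional[str] = None  # sed-like "s,PATTERN,REPLACEMENT,FLAGS"

    def selects(self, repo_name: str) -> bool:
        # Assumption: the regex is searched for, not required to match the full name.
        return re.search(self.repos_regex, repo_name) is not None

    def fork_name(self, repo_name: str) -> str:
        # Without a tune-up the fork keeps the original name.
        if not self.name_tuneup:
            return repo_name
        command, pattern, replacement, flags = self.name_tuneup.split(",")
        if command != "s":
            raise ValueError(f"unsupported name-tuneup: {self.name_tuneup!r}")
        # "g" means replace every match; otherwise only the first one.
        count = 0 if "g" in flags else 1
        return re.sub(pattern, replacement, repo_name, count=count)


@dataclass
class Config:
    sources: List[Source]
    destination: str                   # e.g. "https://github.com/OpenNeuroDatasets-NIDM"
    transformations: List[str] = field(default_factory=list)


config = Config(
    sources=[
        Source(
            organization="https://github.com/OpenNeuroDatasets",
            repos_regex=r"^ds\d+$",
            name_tuneup="s,^,openneuro-,g",
        )
    ],
    destination="https://github.com/OpenNeuroDatasets-NIDM",
    transformations=["BIDS2NIDM"],
)
```

With this sketch, `config.sources[0].fork_name("ds000001")` gives `"openneuro-ds000001"`, and a repo such as `.github` is skipped by `selects`.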

conjob init https://github.com/OpenNeuroDatasets https://github.com/OpenNeuroDatasets-NIDM/ BIDS2NIDM

which would

initiate OpenNeuroDatasets-NIDM organization
populate it with forks of all (default) repos from OpenNeuroDatasets
run BIDS2NIDM on the default branch
save the results in its own default branch
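
The steps above can be sketched as a dry-run planner that only lists what would happen, without touching GitHub. The function name `plan_init` and the wording of the step descriptions are hypothetical, and the list of repos is assumed to have been obtained elsewhere (e.g. from the hosting service's API).

```python
from typing import List


def plan_init(source_org: str, destination_org: str,
              transformation: str, repos: List[str]) -> List[str]:
    """Return a human-readable list of actions for a hypothetical `conjob init`."""
    steps = [f"initiate organization {destination_org}"]
    for repo in repos:
        fork = f"{destination_org.rstrip('/')}/{repo}"
        steps.append(f"fork {source_org.rstrip('/')}/{repo} to {fork}")
        steps.append(f"run {transformation} on the default branch of {fork}")
        steps.append(f"save the results in the default branch of {fork}")
    return steps


for step in plan_init(
    "https://github.com/OpenNeuroDatasets",
    "https://github.com/OpenNeuroDatasets-NIDM/",
    "BIDS2NIDM",
    ["ds000001", "ds000002"],
):
    print(step)
```

Keeping the plan separate from its execution would make it easy to review what an `init` is about to do across many repos before anything is created.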

Possible additional features:

initiation/update of PRs against original repos to "offer" changes introduced
rerender command to update everything produced from templates

Aspects which come to mind

To scale up, we had better make every repo monitor its original location in its own CI, but that would also require a .github/workflows change in that repo in some branch. That could be done by operating "out of branch" (e.g. some action runs in a conjob/ branch while working on the main branch of the repo)
so it would b

surchs · 2023-05-19T23:49:31Z

This sounds very cool @yarikoptic. For Neurobagel, I think this would be the rough workflow for the first (retrospective) annotation of e.g. OpenNeuro:

We create a participants.json (example, schema) for all (most) OpenNeuro datasets: Annotate the OpenNeuro datasets neurobagel/bulk_annotations#2
We put this augmented participants.json file in a fork / branch of the corresponding OpenNeuro datalad dataset, replacing the previous participants.json (?)
When that fork / branch gets updated (e.g. because of some clever bot watching the upstream dataset, or because we added the augmented .json),
- a job fires that runs the Neurobagel CLI container on the BIDS dataset in the fork, to bundle the participants.tsv with the augmented participants.json into a graph-ready everything.jsonld
- a second job fires to run the CLI on the BIDS dataset, adding the imaging metadata to the everything.jsonld file. This works with just the datalad clone form of the data (i.e. symlinks for the big files)
- a third job/step uploads / pushes the everything.jsonld file somewhere it can get picked up and put in the Neurobagel OpenNeuro graph
The most recent version of metadata can be searched at https://query.neurobagel.org/ (and would probably link back to the fork for datalad get purposes ?)

Remi-Gau · 2023-05-20T01:05:42Z

Not sure if that's relevant, but the all-repos package may come in handy.
I have used it for the bids app organization maintenance; it is meant for when you want to perform the same operation on a whole bunch of repos.

https://github.com/bids-apps/maintenance-tools

I'd be curious to see if it plays well with datalad

Remi-Gau · 2023-05-21T17:21:33Z

possibly relevant: having "patch" datasets

bids-standard/bids-specification#814

if possible this would prevent having to create a sibling of each dataset we want to annotate

we would still want ways to make sure annotations are not obsolete and stay in sync with upstream

yarikoptic · 2023-06-09T01:07:58Z

in reply to @surchs above - 3 activities about annotations:

add retrospective annotations (neurobagel team did them) -- activity done once
add prospective annotations (user or neurobagel or openneuro team do) -- done for each new dataset without annotations. To be done using e.g. https://annotate.neurobagel.org/
update existing annotations upon dataset updates (by openneuro). con-job should automate this: keep the master (or annotated) branch in https://github.com/OpenNeuroDatasets-JSONLD up-to-date with https://github.com/OpenNeuroDatasets
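
For the third activity, one possible way to decide whether a fork needs refreshing is to compare the upstream HEAD commit with the last commit that was processed. The sketch below relies on the output format of `git ls-remote <url> HEAD` (SHA, whitespace, ref name). The function names are hypothetical, and where the "last processed" SHA is stored is left open as an assumption.

```python
import subprocess
from typing import Optional


def parse_ls_remote_head(output: str) -> Optional[str]:
    """Extract the HEAD commit SHA from `git ls-remote <url> HEAD` output."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == "HEAD":
            return parts[0]
    return None


def upstream_head(url: str) -> Optional[str]:
    """Ask the remote for its current HEAD commit (requires git and network access)."""
    result = subprocess.run(
        ["git", "ls-remote", url, "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return parse_ls_remote_head(result.stdout)


def needs_update(upstream_sha: Optional[str],
                 last_processed_sha: Optional[str]) -> bool:
    """True if upstream has a HEAD that differs from what was last processed."""
    return upstream_sha is not None and upstream_sha != last_processed_sha
```

A periodic job could call `upstream_head` on each original dataset URL and re-run the annotation update only for forks where `needs_update` returns `True`.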

A rough prototype bash script that does everything for a given dataset is at https://github.com/OpenNeuroDatasets-JSONLD/.github/blob/main/code/prototype-neurobagel.sh and is now being run to populate that organization with adjusted forks. The body of the script is fairly generic; the only specific parts are the cloning of openneuro-annotations and the bottom, where we invoke update_json.

surchs mentioned this issue May 23, 2023

Create and then store Neurobagel data dictionaries from bulk annotation neurobagel/bulk_annotations#3

Closed

9 tasks

surchs mentioned this issue Jun 7, 2023

Create process to automatically process annotated BIDS datalad dataset into graph ready file neurobagel/bulk_annotations#4

Closed

7 tasks
