Imagine some lab wanted control over their own data, but wanted their projects/datasets to be searchable on other instances. This would require some sort of federation like Mastodon.
So if someone wanted to import a dataset from another instance, they could run something like
If someone doesn't use a prefix, we assume it's https://calkit.io.
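A minimal sketch of how the default-instance rule could work. The reference format used here (an optional `scheme://host/` prefix followed by a path) and the name `split_instance` are assumptions for illustration, not an existing Calkit API.

```python
from urllib.parse import urlsplit

# From the text above: references without a prefix resolve to calkit.io.
DEFAULT_INSTANCE = "https://calkit.io"


def split_instance(ref: str) -> tuple:
    """Split a reference into (instance base URL, remaining path).

    Hypothetical format: an optional "scheme://host/" prefix, then a path.
    """
    if "://" in ref:
        parts = urlsplit(ref)
        return f"{parts.scheme}://{parts.netloc}", parts.path.lstrip("/")
    # No prefix given, so fall back to the default instance.
    return DEFAULT_INSTANCE, ref.lstrip("/")
```

With this sketch, `split_instance("owner/project")` falls back to the default instance, while a reference starting with `https://lab.example.org/` keeps its own host.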
If/when we support this, we can open source this software.
We would want the Python package to be able to manage different tokens for different servers, and projects should know which server they belong to. Should we not allow the same project on multiple instances? That could be confusing as to which one is the "true" project.
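One way the per-server token handling might look, as a rough sketch. `TokenStore` and its methods are hypothetical names, the store is in-memory only, and the bearer-token header is an assumption about how an instance would expect credentials.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TokenStore:
    """In-memory map from server base URL to API token (hypothetical)."""

    tokens: Dict[str, str] = field(default_factory=dict)

    @staticmethod
    def _key(server: str) -> str:
        # Normalize so "https://calkit.io/" and "https://calkit.io" match.
        return server.rstrip("/").lower()

    def set(self, server: str, token: str) -> None:
        self.tokens[self._key(server)] = token

    def get(self, server: str) -> Optional[str]:
        return self.tokens.get(self._key(server))

    def auth_headers(self, server: str) -> Dict[str, str]:
        """Return request headers for a server; empty if no token is stored."""
        token = self.get(server)
        return {"Authorization": f"Bearer {token}"} if token else {}
```

A project could then record its server's base URL in its own metadata, and the client would look up the matching token before each request.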
One very simple way to enable this federation is to hard-code a list of instance domains. If another instance wants to join, its maintainers can submit a PR adding it to that list. Then, to fetch a list of projects, a client makes a request to every instance and merges the results.
However, is there any way to have shared user accounts? Maybe that's not desirable. Users can join whichever instances they want separately. This means that in order to use GitHub authentication, each satellite will need to create its own GitHub app. However, if a user or web app makes a request to some other instance, they can send a token that can be verified against its issuer, and the user's email can be read from that token, which will allow them to be authorized on the other instance.
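The hard-coded list approach could be sketched roughly as follows. The second domain in `INSTANCES`, the `fetch` callable, and the shape of a project record are all assumptions; a real client would pass a function that calls each instance's HTTP API.

```python
from typing import Callable, Dict, Iterable, List

# Hard-coded list of known instances; new instances would be added via PR.
# The second entry is a made-up example domain.
INSTANCES = ["https://calkit.io", "https://lab.example.org"]


def list_all_projects(
    instances: Iterable[str],
    fetch: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Query every instance and merge the results into one list.

    `fetch` takes an instance base URL and returns its public projects.
    Each merged record is tagged with the instance it came from.
    """
    merged = []
    for base in instances:
        try:
            projects = fetch(base)
        except Exception:
            # Skip instances that are down rather than failing the whole query.
            continue
        for project in projects:
            merged.append({**project, "instance": base})
    return merged
```

Tagging each record with its source instance keeps it clear which server a project belongs to, which also bears on the "true project" question raised earlier.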
Maybe we can also have aggregator servers that periodically fetch and cache public data from all of the satellites, so that actions like searching for datasets can be done with one request instead of many. The satellites would then need to notify the aggregators of any relevant events, e.g., public data being created or destroyed.
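A rough sketch of the aggregator idea, assuming satellites push "created" and "deleted" events. The class name, event names, and record fields (`id`, `title`) are hypothetical, and the index lives in memory only.

```python
from typing import Dict, List, Tuple


class AggregatorIndex:
    """Cache of public items from all satellites, keyed by (instance, id)."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], Dict] = {}

    def handle_event(self, instance: str, event: str, item: Dict) -> None:
        """Apply a satellite notification to the cache."""
        key = (instance, item["id"])
        if event == "created":
            self._items[key] = item
        elif event == "deleted":
            self._items.pop(key, None)

    def search(self, query: str) -> List[Dict]:
        """Case-insensitive substring match on titles, in one local lookup."""
        q = query.lower()
        return [
            {**item, "instance": instance}
            for (instance, _), item in self._items.items()
            if q in item.get("title", "").lower()
        ]
```

A search then hits only the aggregator's cache rather than fanning out to every satellite.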
User stories
As an admin, I want to run my own Calkit instance and for the projects and datasets to be searchable from any Calkit instance. This way, I can have control over my infrastructure and costs, while still allowing the research to be part of the overall web of knowledge and artifacts.
As a researcher, I want to be able to search for projects, datasets, publications, figures, and functions across all instances and be able to make use of them in my own work. I also want others to be able to find my work without my needing to put it on the centralized server.
Things that can be unique about each instance
The domain
Cloud storage bucket
Subscription plans -- whether or not to charge
Whether or not it should make queries to the federated network
What is on the home page? If a lab set up their own instance, maybe they'd want to show something about their lab
Some title with the instance name, probably up in the nav bar
GitHub app
Zenodo app
Stripe app
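The per-instance items above could be gathered into a single settings object, sketched here. Every field name and default is an assumption chosen for illustration; none comes from an existing Calkit configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSettings:
    """Hypothetical per-instance configuration mirroring the list above."""

    domain: str
    title: str  # instance name shown in the nav bar
    storage_bucket: str  # cloud storage bucket for this instance
    charge_for_plans: bool = False  # whether subscription plans cost money
    query_federation: bool = True  # whether to query the federated network
    home_page_content: Optional[str] = None  # e.g., text about the lab
    github_app_id: Optional[str] = None
    zenodo_app_id: Optional[str] = None
    stripe_app_id: Optional[str] = None
```

For example, a lab instance might be configured with `InstanceSettings(domain="lab.example.org", title="Example Lab", storage_bucket="example-lab-data")` and leave the remaining fields at their defaults.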