Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register externally-managed files via the client #661

Merged
merged 34 commits into from
Feb 27, 2024

Conversation

danielballan
Copy link
Member

@danielballan danielballan commented Feb 21, 2024

Tiled enables registration of externally-managed files with a catalog. In main, this is a server-side activity where the user interacts directly with the SQL database via Python, and thus requires the database credentials. In this PR, this becomes a client-side activity where the user interacts with a new HTTP endpoint, POST /register/{path} and must hold a new scope, register.

The module tiled.catalog.register becomes tiled.client.register.

The CLI tiled catalog register <DATABASE_URI> becomes tiled register <TILED_URL> --api-key <API_KEY>.

Why didn't we do this in the first place?

We originally considered doing this client-side from the start. I had two concerns:

  1. Security. This PR enables clients to ask the server to expose files on the server. This is naturally a security concern. However, giving the file-registering program direct access to the database is also a security concern, arguably a worse one. And we have the following protections:

    • In multiple-user deployments, the required scope, register, is given only to Tiled admins and service accounts that they elect to grant it to.
    • The server configuration must explicitly list areas of its local storage where files may be registered, via its readable_storage configuration. Anything outside of that will be rejected. This is enforced at read time as well, to be sure that even with a corrupted database there is no way that Tiled would serve an arbitrary file off of the system.
  2. Latency. When scanning large directories, this may be slow. But there is a path to making it faster. In a future PR, we can add POST /nodes/metadata/{path} and POST /nodes/register/{path} for bulk creation of nodes, with internally- and externally-managed data respectively.

I have stacked @dylanmcreynolds commits from #654 on top of my commits here. My intent is to update and ideally simplify it before merging, but I'll need some input from @dylanmcreynolds and @Wiebke on that.

@padraic-shafer
Copy link
Contributor

TODO: After #660 gets merged, will need to apply these analogous updates to test_container_fields.py.

await register(client, tmpdir)

@dylanmcreynolds
Copy link
Contributor

Excellent!

Should we consider adding cleint-side register text to the how-to document rather than replacing it? Even with this, are there cases where we would want to support both?

@danielballan
Copy link
Member Author

Ideally I'd like to add whatever convenience APIs we need to reduce that chunk of Python to something pithier. I am aware of these use cases:

  • Walk a directory. We have that covered.
  • Register a specific file manually providing details (e.g. custom specs...) instead of taking whatever the automatic discovery generates. We need to document and perhaps streamline this.

Can you go into more detail about what needs to be addressed?

@dylanmcreynolds
Copy link
Contributor

I think I was trying to ask a different question. My doc descrbed a labor-intensive but effective method for registering files in code that has direct access to the database, while this PR adds functionality to do the same through the client. Do we want to keep the documentation of the more involved server-side register, adding the client side as an alternative? Or do we want to abandon my documentation in favor of new client-side register text?

Copy link
Contributor

@padraic-shafer padraic-shafer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this plan. At a high level the code changes look good so far.

@dylanmcreynolds
Copy link
Contributor

After talking to @danielballan and @Wiebke , I'm pretty excited about this PR. To clarify my previous understanding was incorrect, and this moves the register function from the server to the client.

### Complex cases

Sometimes it is necessary to take more manual control of this registration
process. Using the Python client,
Copy link
Contributor

@dylanmcreynolds dylanmcreynolds Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add sentence here?

Sometimes it is necessary to take more manual control of this registration
process. For example, if you want to add a `spec` or custom `metadata` to the data set, you will want to do that programmatically usign the Tiled client.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this specific suggest but I'm glad I capture the gist of it.

Copy link
Contributor

@dylanmcreynolds dylanmcreynolds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick suggestion for the register.md file.

Copy link
Contributor

@dylanmcreynolds dylanmcreynolds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great

Copy link
Contributor

@dylanmcreynolds dylanmcreynolds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great

danielballan and others added 2 commits February 27, 2024 11:43
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
Copy link
Contributor

@padraic-shafer padraic-shafer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. One small change noted in the comment.

docs/source/how-to/register.md Outdated Show resolved Hide resolved
danielballan and others added 3 commits February 27, 2024 12:04
Co-authored-by: Padraic Shafer <76011594+padraic-shafer@users.noreply.github.com>
@danielballan
Copy link
Member Author

An embarrassing number of commits later, this docs section is finally right :-D

image

@danielballan
Copy link
Member Author

....Plus one last commit for s/demonstrate/demonstrates/.

Copy link
Contributor

@padraic-shafer padraic-shafer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the docs!

@padraic-shafer padraic-shafer merged commit 89f7106 into bluesky:main Feb 27, 2024
8 checks passed
@danielballan danielballan deleted the client-register branch February 27, 2024 18:19
Copy link
Contributor

@Wiebke Wiebke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am late to the party, but this is great!

@padraic-shafer
Copy link
Contributor

@Wiebke Thanks for adding your support! The stacked PRs were starting to pile up, so with one approval from ALS + another from NSLS-II, I made a call that we were ok to merge this without further review. I'll aim to not make a habit of that though. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants