Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GDCC/7715 Signed Urls for external tools #9001

Conversation

qqmyers
Copy link
Member

@qqmyers qqmyers commented Sep 27, 2022

What this PR does / why we need it: This PR eliminates the need to share a user's apiToken with external tools, reducing the risk in trusting tools and the privacy of the user's local machine. It adds a new set of allowedApiCalls to the tool manifest which allow Dataverse to provide a defined set of signedUrls to the tool.

Two options are available
POST - where the queryparams and allowedApiCall signed URLs are sent in a post body to the tool URL, after which Dataverse redirects to a redirect URL provided in the response (as OpenDP works now)
GET - where a base64 encoded callback param is sent to the tool as a query param in the URL. The decoded token is a signed URL to retrieve all the rest of the query params and allowedApiCalls.

In all cases, URLs are only signed if the dataset/datafile in question is not public.

Which issue(s) this PR closes:

Closes #7715
Closes #8999

Special notes for your reviewer: There are tests for the major functionality. Nominally some IT tests could be added but the unit tests cover the same functionlity (i.e. creating/validating a sighed URL, assuring the json sent back in the dataset and datafile callback URLs is valid/correct.

@rtreacy - note that I dropped the internal "apis" object in the tool manifest. If you run this branch, you'll need to moidy the tool manifest for openDP

Re: POST - I think openDP wanted this because we didn't have a way to do a GET and get many signed URLs (they wouldn't all fit on the GET URL). With the new callback mechanism, that is now possible and OpenDP might want to just use a GET. My guess is that GET will be easier for most external tools to use.

Suggestions on how to test this: The API calls themselves can be tested. One can also register a tool with the new signedURL mechanism and manually verify the callback works:
Tool Manfiest: {
"displayName": "Read Text",
"toolName": "textPreviewer",
"description": "Read the text file.",
"types": [
"preview"
],
"scope": "file",
"toolUrl": "https://gdcc.github.io/dataverse-previewers/previewers/v1.3/TextPreview.html",
"toolParameters": {
"httpMethod": "GET",
"queryParameters": [
{
"fileid": "{fileId}"
},
{
"siteUrl": "{siteUrl}"
},
{
"datasetid": "{datasetId}"
},
{
"datasetversion": "{datasetVersion}"
},
{
"locale": "{localeCode}"
}
]
},
"contentType": "text/plain",
"allowedApiCalls": [
{
"name": "retrieveDataFile",
"httpMethod": "GET",
"urlTemplate": "/api/v1/access/datafile/{fileId}",
"timeOut": 270
}
]
}

Test steps:

  1. add a text file
  2. hit Preview
  3. select this previewer if more than one is registered
  4. On a test instance on http, it won't show on the datafile page (since the tool is at an https address) but clicking open in new page will pop up a new tab
  5. In that tab, cut/paste the callback param into a Base 64 online decoder
  6. Use the decoded URL
  7. Verify that a) it works, and b) that the contents includes all the query params and the signed URL requested.
  8. Try the signed URL included to verify it works.

Could/should repeat this with a non-null dataverse.api.signature-secret jvm/microProfile setting.

It should not be too hard to create updated Previewers so we may be able to test real ones once they exist.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?: included, doc updates as well

Additional documentation:

rtreacy and others added 30 commits July 29, 2021 17:35
impements POST-redirect-GET for DP Creator tool
POST is currently done on server, gets a redirect response, and GETs the new location in the browser
Need to change the way the base context is gotten for POST, as in the GET code, it always uses the extenal tool url as provided in the configuration - the redirect use be a different context than the configured tool url.
…/dataverse into 7715-signed-urls-for-external-tools
…ar DPCreator

WIP - still need to handle use of signed Url to access resource on dataverse
…7715-signed-urls-for-external-tools

Validation fix and API call
…7715-signed-urls-for-external-tools

Define/use an additional secret key, Refactor token replacement for signed urls
…7715-signed-urls-for-external-tools

7715 signed urls for external tools
…7715-signed-urls-for-external-tools

Minor tweaks from IQSS#7325 review
7715-signed-urls-for-external-tools
- use the user if supplied
- require superuser
@scolapasta scolapasta self-assigned this Nov 14, 2022
@scolapasta scolapasta removed their assignment Nov 17, 2022
@kcondon kcondon self-assigned this Nov 21, 2022
@kcondon
Copy link
Contributor

kcondon commented Nov 22, 2022

Issues found:

  1. Calling get external tool params endpoint throws null ptr in logs, returns {} on command line
    [2022-11-22T20:05:58.962+0000] [Payara 5.2022.3] [SEVERE] [] [] [tid: _ThreadID=76 _ThreadName=http-thread-pool::http-listener-1(4)] [timeMillis: 1669147558962] [levelValue: 1000] [[
    java.lang.NullPointerException
    at edu.harvard.iq.dataverse.util.URLTokenUtil.getTokenValue(URLTokenUtil.java:147)
    at edu.harvard.iq.dataverse.util.URLTokenUtil.getParam(URLTokenUtil.java:104)
    at edu.harvard.iq.dataverse.externaltools.ExternalToolHandler.lambda$getParams$0(ExternalToolHandler.java:161)
  2. Not sure what FILEMETADATA_ID is in get tool params for file:
    '''curl -H "X-Dataverse-key: $API_TOKEN" -H "Accept:application/json" "$SERVER_URL/api/files/$FILE_ID/metadata/$FILEMETADATA_ID/toolparams/$TOOL_ID
    '''
  3. get tools params for file doesn't appear to understand :persistentId key used elsewhere in place of db id, says endpoint does not exist.
  4. Separate question: having trouble visualizing the operational steps and how one might test the apis directly, first, with tool, second. You mention a. configuring signedURL mechanism b. POST and GET methods of accessing preconfigured signed URL? c. applying resultant signedURL as auth to a restricted file or dataset api endpoint request. Are these steps documented?

@qqmyers
Copy link
Member Author

qqmyers commented Nov 23, 2022

  1. The code was not checking for the case when you call the dataset endpoint for a tool with file scope or vice versa. These now give a bad request response.
    [Kevin] This catches an error when the dataset is unpublished but still throws the null ptr on a published ds with file tool. It also throws a null ptr on file endpoint using file tool and proper metadata id.
    2)Filemetadata_id is the id of the filemetadata object, required to know which version you're interested in. To use this call manually, you can call the /api/files//metadata call - the "id" in the json is the right value. I've added a note about this in the guides. (It can also be sent in the URL when launching a tool if the tool requests that param).
    [Kevin] Thanks
  2. The :persistentId syntax won't result in an error now.
    [Kevin] This works now, thanks.
  3. The easiest round trip is to
  • launch a tool in a new page:

image

Calling the appropriate file or dataset scope toolparam endpoint directly is basically starting at the fourth bullet above.

There is also a separate workflow to directly request a signed URL, e.g. calling
curl -H 'Content-type:application/json' -H 'X-Dataverse-key:6b0342b9-c546-4e9d-a497-2f9cf668e24e' http://localhost:8080/api/admin/requestSignedUrl -d '{"url":"http://localhost:8080/api/v1/datasets/:persistentId/?persistentId=doi:10.5072/FK2/GMLPBN","timeOut":5,"user":"dataverseAdmin"}'
and then testing the signedUrl returned in the json response.

@mreekie
Copy link

mreekie commented Nov 30, 2022

Daily:

  • There's a separate test fix PR going in.
  • Issues found during QA

7715-signed-urls-for-external-tools
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include Signed-urls in the External Tools Framework
9 participants