Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

access_url behaviour change to raise a deprecation rather than error #340

Closed
bsipocz opened this issue Jun 8, 2022 · 5 comments
Closed
Milestone

Comments

@bsipocz
Copy link
Member

bsipocz commented Jun 8, 2022

I think it should be considered to use deprecations for the change of behavior in access_url, keeping the old behavior of returning the first element from access_urls along with the warnings, rather than raising the exception straight away.

I milestone this issue is only relevant before the 1.4 release.

@bsipocz bsipocz added this to the v1.4 milestone Jun 8, 2022
msdemlei added a commit that referenced this issue Jun 9, 2022
There is a warning now, though, that ought to tip people off when things
go wrong.  See also #340; it does
not directly address the problem outlined there.

Relatedly, we now properly parse the arrays-in-strings we get back from
TAP.  So far, an empty result yielded [""] in the array of, e.g., access_url.
We ought to have proper string arrays in VOTable one day.
@msdemlei
Copy link
Contributor

msdemlei commented Jun 9, 2022

PR #341 addresses using access_url on a resource directly.

It also adds an access_url column in the result of get_summary.

However, we can't really bring back access_url to to_table -- that would require an access_url column in the query result, and there's no sane way to have that. I'm afraid the legacy behaviour was based on the wrong assumption that there is one access URL per resource, which is simply not true.

I note in passing that the use in the NAVO notebook (I've looked at UseCase_II) is purely informational, that is, it just shows the results for demonstration and does not use the access_url any more.

This kind of use is what get_summary is for. That, however, doesn't have ivoid (I don't think that's human-readable and hence I'd argue it doesn't belong there), though, so we'd have to ask the authors to change to:

uv_services.get_summary()[
    np.array(['GALEX' in u.short_name for u in uv_services])
    ]['short_name', 'access_url']

But frankly, get_summary() is there exactly for displaying a human-readable table, so I'd say they should use get_summary() directly. There is no filtering facility on RegistryResults yet. Do we want to support that? Or can we get away just saying "Well, run a new query with short_name ILIKE '%GALEX%'"?

@bsipocz
Copy link
Member Author

bsipocz commented Jun 9, 2022

We'll be changing the notebooks themselves and bump the version number to require pyvo 1.4, so I was mentioning them mostly as a demonstration that users may use the previous API in similar ways we did and thus end with errors.
Given the recent discussions about how to notify downstream users, or recognise their incompatibility ahead of a release, I felt this may be a situation we are actually in. We can easily do the changes (especially if we call it 2.0 rather than 1.4), and point out fixed for those who report failures, or give some warnings in the code. (And I totally agree that rephrasing the navo notebooks is certainly something we should do)

@bsipocz
Copy link
Member Author

bsipocz commented Jun 9, 2022

(The one minor issue with get_summary is that it's not available in prior versions (as far as I see). So while everything is indeed much nicer and logical with the refactor it's somewhat difficult to do come up with an example that works for all versions. Which may be OK, it was just something I felt better to be raised as a concern).

@tomdonaldson
Copy link
Contributor

#341 doesn't address the fact that access_url is now missing in the output of to_table(), but it does help users deal with unexpected multi-value access_urls.

As far as I can tell, there was never any documentation guaranteeing the columns that would result from to_table(). It's certainly possible that some users empirically relied on the access_url column, but the suggested uses for these results keep the access_url hidden. I think the updated documentation is thorough and sufficient enough that no additional fixes should be needed for this issue.

@tomdonaldson
Copy link
Contributor

Based on the discussion here and offline, I think we can proceed without raising deprecation regarding the access_url changes. After releasing v1.4, we can respond to issues if or when they arise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants