-
Notifications
You must be signed in to change notification settings - Fork 914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APIDataSet reduce number of arguments and move to Load args #1633
APIDataSet reduce number of arguments and move to Load args #1633
Conversation
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
…tantiate authenticator objects from config. Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
…eaming, certs etc. Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
b8ba671
to
82584e6
Compare
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for working on this @daBlesr! Really nice that someone has picked this up since it had got a bit stuck.
Moving arguments to load_args
definitely makes sense to me here given the API of requests.request
that you posted in the issue. Like you suggest, we aim to keep the interface as close to the underlying library as possible so this seems like a very good improvement to our APIDataSet
👍
Your changes to auth
also very much make sense and I totally see where you're coming from here, but I think they should not be part of this PR. Let's limit the scope here to just be the load_args
part. The auth
question is more of an open question (as are credentials in kedro in general) - I will post more about this in #711.
…dded migration docs. Changed test for simple auth, to include `load_args` Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Hi @AntonyMilneQB, thanks for reviewing. I have removed the auth changes. You've seen a working example, so it might come in handy in the future when generalising for other Datasets 😉 . Also I have updated the docs as per your comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work, especially on the tests!! Thank you so much for this contribution. I have lots of small comments and questions but nothing big so very happy to approve this 😀
One potential problem I foresee here is that it seems that some requests
arguments expect tuple
rather than just a general Iterable, notably auth
and cert
. This is a problem because when a user specifies these arguments in yaml, they always come out as a list. This was one of the original motivations for #1101.
The only fix I can think for this is for us to manually go through load_args
and convert anything that is a list
to a tuple
(checking for Iterable
won't work for at least cert
because it could also be a string, which we don't want to convert to tuple!). That seems kind of ugly though. Interested what you think! Possibly our great configuration overhaul will provide a better solution here (like a native way of putting Python lists into yaml e.g. #1011).
Also interested in seeing you have any comments on this issue on credentials, partly inspired by the problems you've uncovered here! 🙏 #1646
… converted to tuple before sending the request. Changed docs. Added tests. Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion from lists to tuples seems good, but I'm slightly confused about what you ended up doing with the types for credentials
/auth
- feels like you opted to go for something non-breaking here but didn't revert everything you intended to? I don't mind where we end up here (your original solution vs. this new one) so long as all the code, docstrings, release notes are consistent on it. Please do feel free to merge whenever you're happy with it 🙂
Thank you again for all your work on this!
auth = tuple(auth) | ||
self._auth = credentials or self._load_args_auth | ||
self._cert = self._load_args.pop("cert", None) | ||
self._timeout = self._load_args.pop("timeout", None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: when the user hasn't specified a timeout
, it will end up as timeout: None
in self._request_args
and hence the _describe
method. Might be slightly neater to only pass these arguments to _request_args
if we really want to - something like:
if "cert" is in self._load_args:
self._load_args["cert"] = self._convert_type(self._load_args["cert"])
# just pass to self._request_args using **self._load_args
timeout: int = 60, | ||
credentials: Union[Iterable[str], AuthBase] = None, | ||
load_args: Dict[str, Any] = None, | ||
credentials: Union[Tuple[str, str], List[str]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to revert the type hint and docstring here if it can still be AuthBase
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still doubting about this, since technically it works. However, I think it might be confusing to the reader, since APIDataSet
are automatically constructed from the catalog, and AuthBases
cannot be inferred this way. So I am wondering, in what cases would this constructor type be used when using Kedro? At this moment, for a user that wants to do a request using an AuthBase
instance, should imo take a different approach, such as extending APIDataSet
. However, if you have other ideas, I'll put AuthBase
in the typing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very slow reply I know, sorry!! 😅 The use case here would be that the APIDataSet
is constructed directly using the Python API rather than from the data catalog. In this case you can easily pass arbitrary Python objects to the constructor. This is unusual but does occur and is something we support. So I've put AuthBase
back in the type hint 🙁
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
…ture/api-dataset-load-args � Conflicts: � kedro/extras/datasets/api/api_dataset.py Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Signed-off-by: Niels Drost <codingdutchman@gmail.com>
Hi, what is the status of this PR? Do you need more time working on this? |
Signed-off-by: Antony Milne <antony.milne@quantumblack.com>
Signed-off-by: Antony Milne <antony.milne@quantumblack.com>
Hi @afuetterer, thanks for the push here and sorry for the huge delay - this was 99% ready to merge and then seemed to have stalled. I've just polished it off now and I think we should be ready to merge. Thank you @daBlesr for all your hard work on this, it's much appreciated 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Thanks for the contribution @daBlesr ⭐ and for finishing the PR @AntonyMilneQB ⭐
Description
Resolves #711
Development notes
I have changed the implementation of APIDataSet by removing all the specific configuration parameters, and generally passing
load_args
to a request instance as keyword arguments. Also, users are now able to construct different kind of Authenticator objects to use with requests.Checklist
RELEASE.md
file