[FSTORE-1185] Real-time retrieval of feature vector via RonDB REST API #1228

Closed
wants to merge 50 commits into from

Contributor

@vatj vatj commented Feb 19, 2024

This PR adds/fixes/changes...

  • please summarize your changes to the code
  • and make sure to include all changes to user-facing APIs

JIRA Issue: -

Priority for Review: -

Related PRs: -

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Tests on VM

Checklist For The Assigned Reviewer:

- [ ] Checked if merge conflicts with master exist
- [ ] Checked if stylechecks for Java and Python pass
- [ ] Checked if all docstrings were added and/or updated appropriately
- [ ] Ran spellcheck on docstring
- [ ] Checked if guides & concepts need to be updated
- [ ] Checked if naming conventions for parameters and variables were followed
- [ ] Checked if private methods are properly declared and used
- [ ] Checked if hard-to-understand areas of code are commented
- [ ] Checked if tests are effective
- [ ] Built and deployed changes on dev VM and tested manually
- [x] (Checked if all type annotations were added and/or updated appropriately)

@vatj vatj requested a review from kennethmhc February 19, 2024 11:04
python/hsfs/feature_view.py (outdated, resolved)
python/hsfs/feature_view.py (outdated, resolved)
python/hsfs/feature_view.py (outdated, resolved)
python/hsfs/core/vector_server.py (outdated, resolved)
python/hsfs/client/rondb_rest_client.py (outdated, resolved)
python/hsfs/client/rondb_rest_client.py (outdated, resolved)
python/hsfs/core/rondb_engine.py (outdated, resolved)
python/hsfs/core/rondb_engine.py (outdated, resolved)
entry=entry,
passed_features=passed_features,
return_type=self._rondb_engine.RETURN_TYPE_FEATURE_VECTOR,
)
Contributor

A note for future improvement: since the result returned from the REST API is already a vector, it could be used directly if _apply_transformation looked up features by index instead of by name. That would avoid converting the list to a dict and then back to a list.

Contributor Author

That's a good point. I will add it to the list of possible future improvements in the design log.
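
A minimal sketch of the index-based idea discussed in this thread, not the hsfs implementation. `apply_transformations_by_index` and the `transformations` mapping are hypothetical names, and the sketch assumes the REST API returns the vector in a known, fixed feature order.

```python
from typing import Any, Callable, Dict, List


def apply_transformations_by_index(
    vector: List[Any],
    transformations: Dict[int, Callable[[Any], Any]],
) -> List[Any]:
    """Apply per-feature transformations keyed by position in the vector.

    Hypothetical helper: because functions are looked up by index, the
    list returned by the REST API never has to be converted to a dict
    keyed by feature name and back to a list.
    """
    result = list(vector)  # copy so the caller's vector is not mutated
    for index, transform in transformations.items():
        result[index] = transform(result[index])
    return result


# Assumed feature order: ["age", "income"]; only "income" (index 1) is scaled.
raw_vector = [42, 50000.0]
transformed = apply_transformations_by_index(raw_vector, {1: lambda v: v / 1000.0})
```

The trade-off is that the index mapping has to be built once from the feature order and kept in sync with it, whereas name-based lookup is robust to reordering.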

python/hsfs/core/rondb_engine.py (outdated, resolved)
Contributor

@kennethmhc kennethmhc left a comment

We should discuss with the others whether we should support both clients in a single fv.

@@ -295,6 +295,9 @@ def replace_public_host(self, url):
"""no need to replace as we are already in external client"""
return url

def _is_external(self):
Contributor

Why make this private if it is used by other classes?

else:
warn(
"Online Store Rest Client is already initialised. To reset connection or/and override configuration, "
+ "use reset_online_store_rest_client or get_instance methods with optional configuration"
Contributor

get_instance below does not take any arguments.

@@ -78,7 +78,7 @@ def __init__(
         self._query = query
         self._featurestore_id = featurestore_id
         self._feature_store_id = featurestore_id  # for consistency with feature group
-        self._feature_store_name = featurestore_name
+        self._feature_store_name = util.strip_feature_store_suffix(featurestore_name)
Contributor

This could be a breaking change. Maybe create another variable instead?
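
One way the "another variable" suggestion could look, as a sketch only. `FeatureViewSketch` and `_feature_store_project_name` are hypothetical names, and the local `strip_feature_store_suffix` is an illustrative stand-in that assumes the suffix is `_featurestore`; the real behaviour of `util.strip_feature_store_suffix` is not shown in this PR.

```python
SUFFIX = "_featurestore"  # assumed suffix; not confirmed by this PR


def strip_feature_store_suffix(name: str) -> str:
    """Illustrative stand-in for util.strip_feature_store_suffix."""
    if name.endswith(SUFFIX):
        return name[: -len(SUFFIX)]
    return name


class FeatureViewSketch:
    def __init__(self, featurestore_name: str) -> None:
        # Existing attribute keeps its original value, so current callers are unaffected.
        self._feature_store_name = featurestore_name
        # New, hypothetical attribute carries the stripped name.
        self._feature_store_project_name = strip_feature_store_suffix(featurestore_name)


fv = FeatureViewSketch("demo_featurestore")
```

Code that already reads `_feature_store_name` sees no change, and only new code paths need to opt in to the stripped value.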

@@ -211,9 +215,19 @@ def init_serving(
training_dataset_version: Optional[int] = None,
external: Optional[bool] = None,
options: Optional[dict] = None,
init_online_store_sql_client: Optional[bool] = None,
Contributor

What is the use case for supporting both clients in a single feature view instance? The motivation for the REST client seems to be that the SQL client is not available. Alternatively, users can create two fv instances and call init_serving with a different client on each.

options: Additional options as key/value pairs for configuring online serving engine.
* key: kwargs of SqlAlchemy engine creation (See: https://docs.sqlalchemy.org/en/20/core/engines.html#sqlalchemy.create_engine).
For example: `{"pool_size": 10}`
* key: "config_online_store_rest_client" - dict, optional. Optional configuration options to override defaults for the Online Store REST Client.
* key: "reset_online_store_rest_client" - bool, optional. If set to True, the Online Store REST Client will be reset. Provide
Contributor

What is the purpose of this reset_online_store_rest_client flag?

Contributor Author

It forces the existing connection to be overwritten with the new configuration details.
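
A rough sketch of the reset semantics described in this reply, assuming the client is a module-level singleton. `RestClientSketch`, `init_or_reset_client`, and the default config keys are all made up for illustration and are not the hsfs API.

```python
import warnings
from typing import Any, Dict, Optional


class RestClientSketch:
    """Hypothetical stand-in for the Online Store REST Client."""

    def __init__(self, config: Optional[Dict[str, Any]] = None) -> None:
        defaults = {"timeout": 1, "verify_certs": True}  # made-up defaults
        self.config = {**defaults, **(config or {})}


_client: Optional[RestClientSketch] = None  # module-level singleton


def init_or_reset_client(
    config: Optional[Dict[str, Any]] = None, reset: bool = False
) -> RestClientSketch:
    """Create the singleton, or replace it when reset=True."""
    global _client
    if _client is None or reset:
        _client = RestClientSketch(config)
    elif config is not None:
        # Existing connection is kept; the new config is ignored.
        warnings.warn("Client already initialised; pass reset=True to apply a new config.")
    return _client


first = init_or_reset_client({"timeout": 2})
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    same = init_or_reset_client({"timeout": 5})  # ignored without reset
replaced = init_or_reset_client({"timeout": 5}, reset=True)
```

Without the flag, a second call with a different config silently keeps the first connection apart from the warning; the flag makes the override explicit.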

options: Additional options as key/value pairs for configuring online serving engine.
* key: kwargs of SqlAlchemy engine creation (See: https://docs.sqlalchemy.org/en/20/core/engines.html#sqlalchemy.create_engine).
For example: `{"pool_size": 10}`
* key: "config_online_store_rest_client" - dict, optional. Optional configuration options to override defaults for the Online Store REST Client.
Contributor

I'm missing which options users can provide in this dictionary.

python/tests/core/test_vector_server.py (resolved)
@@ -0,0 +1 @@
{"type": "service_account", "project_id": "test"}
Contributor

This file is generated by one of the storage connector tests. We should add it to .gitignore.

Contributor Author

Done

@vatj vatj closed this Mar 28, 2024
3 participants