Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Start Offline-to-online ingestion" method in Python SDK #1051

Merged
merged 9 commits into from
Oct 15, 2020

Conversation

pyalex
Copy link
Collaborator

@pyalex pyalex commented Oct 14, 2020

What this PR does / why we need it:

This PR aim to glue together various backend implementations of job submission and Feast SDK by introducing some cleaner structure with SparkJob, SparkJobParameters interfaces and separate launch modules.
Also offline_to_online_ingestion is added to JobLauncher interface. Implementations for standalone & dataproc launchers provided as well.

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:


Copy link
Collaborator

@oavdeev oavdeev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a way to structure this without the diamond inheritance? After re-reading this a couple times, subjectively i have a pretty hard time following what implements what methods, and what gets called when.

One way I can think of is to maybe split SparkJob interface into two:

  • SparkJobParams that deals with preparing the job to launch (so it provides file path / class name / args)
  • SparkJob that deals purely with the job lifecycle after it has been launched (so, id/name/status)

move common job properties to job classes

Signed-off-by: Oleksii Moskalenko <moskalenko.alexey@gmail.com>

universal test
@pyalex
Copy link
Collaborator Author

pyalex commented Oct 15, 2020

@oavdeev that makes sense

Signed-off-by: Oleksii Moskalenko <moskalenko.alexey@gmail.com>
Signed-off-by: Oleksii Moskalenko <moskalenko.alexey@gmail.com>
Signed-off-by: Oleksii Moskalenko <moskalenko.alexey@gmail.com>
Signed-off-by: Oleksii Moskalenko <moskalenko.alexey@gmail.com>
@pyalex pyalex changed the title WIP Offline-to-online method in SDK "Start Offline-to-online ingestion" method in Python SDK Oct 15, 2020
@pyalex pyalex added the kind/feature New feature or request label Oct 15, 2020
Signed-off-by: Oleksii Moskalenko <moskalenko.alexey@gmail.com>
sdk/python/feast/pyspark/abc.py Outdated Show resolved Hide resolved
self._entity_source = entity_source
self._destination = destination

def get_name(self) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a chance that we'll run into limits on name length on different platforms, on EMR it is 256 chars max

start: datetime,
end: datetime,
jar: str,
**kwargs,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't look like we use kwargs

@@ -123,22 +129,45 @@ def start_historical_feature_retrieval_job(
job_id: str,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it would make sense to remove job_id here as well

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the job_id

@feast-ci-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: khorshuheng, oavdeev, pyalex

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Co-authored-by: Oleg Avdeev <oleg.v.avdeev@gmail.com>
Signed-off-by: Khor Shu Heng <khor.heng@gojek.com>
Signed-off-by: Khor Shu Heng <khor.heng@gojek.com>
Signed-off-by: Khor Shu Heng <khor.heng@gojek.com>
@khorshuheng
Copy link
Collaborator

/lgtm

@feast-ci-bot feast-ci-bot merged commit 065b310 into feast-dev:master Oct 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants