-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor submission method and add command API as defualt #442
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-spark contributing guide. |
@ueshin Can you also review this? I somehow can't add you to reviewer |
DEFAULT_POLLING_INTERVAL = 3 | ||
|
||
|
||
class BasePythonJobHelper: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid getting these massive .py files how do folks feel about putting classes in separate files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean base class in one place then the ones inheritance it in separate ones? I don't really feel like a 300 line file is massive. Is there any advantage of breaking it down to 3 100 line file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh definitely not massive as of now but if the logic in these classes grows we this could get large quite quickly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense! But the logic here will likely stay the same, and we probably looking at refactoring it into multi adapter format in the longer term, or starting to adopt dbt-databricks
for databricks specific submission. So I am inclined to leave it as is for now to avoid over-optimizing.
json={ | ||
"path": path, | ||
"content": b64_encoded_content, | ||
"language": "PYTHON", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure it matters but looks like 'language' is uppercase here and lowercase elsewhere, maybe put this in a static variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put the rest into a static variable but this one needs to be upper cased as the API is a different one and actually require uppercase.
self.polling_interval = DEFAULT_POLLING_INTERVAL | ||
|
||
def get_timeout(self): | ||
timeout = self.parsed_model["config"].get("timeout", 60 * 60 * 24) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe move the timeout into a DEFAULT_TIMEOUT var?
context.destroy(context_id) | ||
|
||
|
||
python_submission_helpers = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: upper case as this is a "global" var?
Incremental tests failing in integration test is being fixed at #445 |
self.polling_interval = DEFAULT_POLLING_INTERVAL | ||
|
||
def get_timeout(self): | ||
timeout = self.parsed_model["config"].get("timeout", DEFAULT_TIMEOUT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we not use timeout
passed to submit_python_job
anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably cleaner way to set a timeout in the end since it pulls it from config. Really just a placeholder now. Any thoughts, suggestions?
And we don't really want user to call submit_python_job
, see dbt-labs/dbt-core#5596
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am gonna just merge this, feel free to open an issue if we feel like other ways to do timeout is better!
Merging since it only fails on one test that's being fixed in #445 |
Sorry, now I'm wondering why
nvm, maybe I'm missing something in |
resolves #424 #419
Description
Use Command API as default python model submission method. User can still create the notebook by adding
submission_method:'notebook'
in config for the modelChecklist
changie new
to create a changelog entry