-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RPC server (#1274) #1301
RPC server (#1274) #1301
Conversation
@beckjake this looks really good! (and yeah, there's more work to do) can you rebase this against also, sometime between now and Weds can you spend some time taking your bulleted todo list and writing up some issues? this is a big project, and I think it'd be too much to try to tackle all of that in this branch. |
26131d6
to
8b48bf5
Compare
- make tasks all have a "from_args" that handles initializing correct config type, etc - make it possible to process a single node's refs at a time - Make remote run/compile tasks + rpc server task, wire them up - add ref() and source() support, and vestigial doc() support - refactor results a bit to support new result behavior - don't write to the filesystem on requests - handle uniqueness issues
8b48bf5
to
1a0df17
Compare
Oops, I finished them all instead! I still have to write tests but this seems to work ok |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks so good! I'm going to spend some time playing with it. You indicated that there may be some tweaks to make around error handling / formatting. Can check that out, and will also report back with any other initial thoughts too.
This is really, very exciting!
core/dbt/config/errors.py
Outdated
@@ -0,0 +1,8 @@ | |||
from contextlib import contextmanager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's up with this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is what happens when you git add core
and have dead code lying around...
core/dbt/main.py
Outdated
run_sub = _build_run_subparser(subs, base_subparser) | ||
compile_sub = _build_compile_subparser(subs, base_subparser) | ||
generate_sub = _build_docs_generate_subparser(docs_subs, base_subparser) | ||
_add_common_arguments(run_sub, compile_sub, generate_sub) | ||
_add_common_arguments(run_sub, compile_sub, generate_sub, rpc_sub) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the common args like --models
, --exclude
, --non-destructive
, and --full-refresh
applicable for the rpc task? I imagine that --threads
and --no-version-check
might be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, that's what this does!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! I think we probably don't want --models
, --exclude
, --non-destructive
, and --full-refresh
to be acceptable flags to this task. I don't think they're sensible in the context of a client <> server paradigm. Can we remove these?
I could buy the use case for --threads
and --no-version-check
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, and since --models
, --exclude
are both ignored, removing them won't change anything.
It's worth pointing out that although I'm not sure of the ultimate consequences, --full-refresh
and --non-destructive
do have impacts on the execution of code in the dbt rpc server model! It flips a global flag value stored in dbt.flags
that's then accessible to macros and such, and some adapter methods also inspect them. I think the solution here is probably that flags
should be an object generated at runtime based on config/args and passed around as needed, rather than a module with attributes that are set during argument parsing. And then we should accept them in RPC calls, probably.
I think --no-version-check
should probably be available everywhere that loads project files (which the server does), and we do actually use the profile's threads
in the server, so it makes sense to keep the cli override around.
} | ||
unique_id, node = self.parser.parse_sql_node(node_dict) | ||
|
||
self.manifest = ParserUtils.add_new_refs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this addresses my comment above about refs missing from parsing
core/dbt/task/compile.py
Outdated
self.job_queue = self.linker.as_graph_queue(self.manifest, | ||
selected_uids) | ||
|
||
# TODO: how can we get a timeout in here? Spin off a subprocess? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good question, and one which I think @cmcarthur would be well suited to help think through
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, need to remove this TODO.
@@ -320,3 +333,73 @@ def get_result(self, results, elapsed_time, generated_at): | |||
|
|||
def task_end_messages(self, results): | |||
dbt.ui.printer.print_run_end_messages(results) | |||
|
|||
|
|||
class RemoteCallable(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i need to come back and give this class another look
error = str(exc) | ||
conn.send([result, error]) | ||
|
||
def safe_handle_request(self, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How happy are you with this implementation of timeouts? It doesn't look unreasonable to me, but curious if you view this as something close to production-ready, or if your TODO above indicates that we need to give this some more thought
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this this is probably how it will shake out, I'm fine with it, I think. I mostly anticipate issues on Windows, potentially.
I do want to make sure we're close()
ing everything that needs it though.
General thought: making requests against this server from a webapp is difficult because the webapp will necessarily be running on a different port. While Chrome shows me:
I think the server hits a snag before that -- it looks like the OPTIONS request is handled ungracefully by the server. Generally, is this something we should handle in dbt core? Or should it be handled by some layer in front of dbt-core? I don't feel strongly either way -- curious what you guys think. |
@drewbanin IMO this should be handled by cloud or whatever layer people put in front of this server, not by core |
this looks great to me. the only comment i have is i could foresee wanting a separate log handler for access logs. that would make it easy to divert just the access logs or just the general logs when this is included as a library |
Yes. You don't want your python programs handling stuff like this. Toss it behind nginx or apache. |
3dcc570
to
a6b68ab
Compare
Fix python 2.7 remove TODO remove useless file I added by accident close Pipe members Give RPC its own logger, include remote addrs
a6b68ab
to
a90ef25
Compare
I believe with the last push I've addressed all the feedback! I would really like to get windows tests fully working again for this PR. |
argument parsing fixes change the table to a list of columns + list of rows
The RPC server does not work on windows. I bet it's a simple change (change some parameter to werkzeug's server) but per our discussion at sprint planning I'm not going to get into it right now. |
sounds good, thanks for the update @beckjake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel really satisfied with the results of this code review, and i'm trying though unable to break the server locally :)
I imagine there will be a batch of minor updates to make here, but this is a phenomenal starting point. ship it.
Edit: after resolving merge conflicts haha
c1ae4c7
to
02e88a3
Compare
dbt RPC server - resolves #1274
This implements a basic json-rpc server using werkzeug for routing. You can compile/run sql with the
compile
/run
RPC commands, both of which accept a mandatoryname
parameter, a mandatory base64-encodedsql
parameter, and an optional integertimeout
parameter.ref
,source
, and all jinja things work like you'd expect, the created pseudo-nodes go through the regular compilation model.