[Spark load][Fe 3/5] Fe create job #3715
Conversation
private static final Logger LOG = LogManager.getLogger(BulkLoadJob.class);

// input params
protected BrokerDesc brokerDesc;
I think brokerDesc should be in a subclass of BulkLoadJob.
Although both broker load and spark load currently need a broker, for spark load it may not be required in the future.
After discussing this with @morningman, I will improve it later, including persistence with JSON.
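A minimal sketch of the suggested refactoring, assuming the class names from the diff; the bodies and extra fields are illustrative stubs, not the actual Doris code (in the repository each class lives in its own file under org.apache.doris.load.loadv2):

```java
// Illustrative only: push broker-specific state out of the base class and into
// the subclass that actually requires it.
class BrokerDesc { /* stub for the broker connection info */ }

abstract class BulkLoadJob {
    // shared state only: no broker-specific members here
    protected long id;
    protected String label;
}

class BrokerLoadJob extends BulkLoadJob {
    // broker load always reads through a broker, so it owns the descriptor
    protected BrokerDesc brokerDesc;
}

class SparkLoadJob extends BulkLoadJob {
    // spark load may drop the broker dependency later, so nothing broker-specific here
}
```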
private static final Logger LOG = LogManager.getLogger(SparkLoadJob.class);

// for global dict
public static final String BITMAP_DATA_PROPERTY = "bitmap_data";
This property is hard to understand and is coupled with the implementation details of the global dict.
How about changing it to a more abstract name?
This is for temporary use. I am investigating loading from a Hive table, and I will update it soon.
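As a purely hypothetical illustration of the reviewer's suggestion, the constant could be named after what the data is for rather than how it is stored; the name below is invented for this sketch, not what was eventually committed:

```java
// hypothetical replacement name, decoupled from the bitmap implementation detail
public static final String GLOBAL_DICT_SOURCE_PROPERTY = "global_dict_source";
```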
fe/src/main/java/org/apache/doris/load/loadv2/SparkLoadJob.java
// TODO(wyb): spark-load
//handler.killEtlJob(sparkAppHandle, appId, id, sparkResource);
} catch (Exception e) {
    LOG.warn("kill etl job failed. id: {}, state: {}", id, state, e);
Should we save the error message somewhere for the user to retrieve?
I think it's not necessary, because the clear job just tries to kill the ETL job on a best-effort basis.
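For reference, a hedged sketch of that best-effort cleanup; killEtlJob() below stands in for the commented-out handler.killEtlJob(...) call in the diff, and the id/state fields are illustrative, not the real SparkLoadJob members:

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

class EtlJobCleanupSketch {
    private static final Logger LOG = LogManager.getLogger(EtlJobCleanupSketch.class);

    private final long id = 1001L;       // assumed job id, for illustration
    private final String state = "ETL";  // assumed job state, for illustration

    void clearJob() {
        try {
            killEtlJob();
        } catch (Exception e) {
            // best effort: log a warning and continue; the error is not surfaced to
            // the user, which matches the reasoning in the reply above
            LOG.warn("kill etl job failed. id: {}, state: {}", id, state, e);
        }
    }

    private void killEtlJob() throws Exception {
        // placeholder for asking the Spark app handle / ETL cluster to stop the job
    }
}
```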
Force-pushed from 0d78b0c to edfa668
morningman left a comment:
LGTM
:}
;

opt_cluster ::=
Did you just remove this grammar rule?
Hadoop load uses opt_system; opt_cluster is no longer used.
Spark load interface #3010 (comment)
The Spark configurations in the load statement can override the existing configuration in the resource for temporary use.
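One way to read that override rule, assuming both the resource and the load statement expose their Spark settings as plain string maps; the method and map names below are illustrative, not the Doris API:

```java
import java.util.HashMap;
import java.util.Map;

class SparkConfigMergeSketch {
    static Map<String, String> effectiveSparkConf(Map<String, String> resourceConf,
                                                  Map<String, String> loadStmtConf) {
        // start from the configuration stored in the Spark resource ...
        Map<String, String> merged = new HashMap<>(resourceConf);
        // ... then let the per-statement properties override it for this job only
        merged.putAll(loadStmtConf);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> resourceConf = new HashMap<>();
        resourceConf.put("spark.executor.memory", "2g");
        resourceConf.put("spark.master", "yarn");

        Map<String, String> loadStmtConf = new HashMap<>();
        loadStmtConf.put("spark.executor.memory", "4g"); // temporary override for this load

        // prints both keys, with spark.executor.memory=4g taking effect
        System.out.println(effectiveSparkConf(resourceConf, loadStmtConf));
    }
}
```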
FE analyzes the LoadStmt and creates a SparkLoadJob in LoadManager.
Abstract a base class BulkLoadJob (sketched below) that contains the code shared between BrokerLoadJob and SparkLoadJob.
Users can cancel a Spark load job through the MySQL client.
#3433
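A condensed, illustrative sketch of the structure summarized above: LoadManager inspects the analyzed load statement and builds either a BrokerLoadJob or a SparkLoadJob, both sharing code through BulkLoadJob. The enum values, constructors, and method names are simplified stand-ins for the real FE classes:

```java
// Hypothetical sketch of job creation in the FE, not the actual Doris code.
enum EtlJobType { BROKER, SPARK }

abstract class BulkLoadJobSketch {
    protected final String label;

    protected BulkLoadJobSketch(String label) {
        this.label = label;
    }

    // shared lifecycle hooks: analysis, state transitions, cancel, ...
    abstract void beginEtl();
}

class BrokerLoadJobSketch extends BulkLoadJobSketch {
    BrokerLoadJobSketch(String label) { super(label); }
    @Override void beginEtl() { /* submit broker-scan-based ETL */ }
}

class SparkLoadJobSketch extends BulkLoadJobSketch {
    SparkLoadJobSketch(String label) { super(label); }
    @Override void beginEtl() { /* submit the Spark ETL application */ }
}

class LoadManagerSketch {
    // dispatch on the job type carried by the analyzed load statement
    BulkLoadJobSketch createLoadJobFromStmt(EtlJobType type, String label) {
        switch (type) {
            case SPARK:
                return new SparkLoadJobSketch(label);
            case BROKER:
            default:
                return new BrokerLoadJobSketch(label);
        }
    }
}
```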