[Spark load][Fe 3/5] Fe create job #3715

wyb · 2020-05-28T12:23:59Z

Users create a Spark load job through the MySQL client.
Spark load interface: #3010 (comment)

LOAD LABEL db_name.label_name 
(
  DATA INFILE ("/tmp/file1") INTO TABLE table_name, ...
)
WITH RESOURCE resource_name
[(key1 = value1, ...)]
[PROPERTIES (key2 = value2, ... )]

Spark configurations specified in the load statement temporarily override the existing configuration in the resource.
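
A minimal sketch of the override behaviour described above, assuming the resource and the load statement each carry a flat key/value map. The class `SparkConfigOverride` and the method `effectiveConfigs` are hypothetical names used for illustration, not code from this PR. It only shows that statement-level values win for one job while the resource's stored configuration stays unchanged.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the Doris implementation.
public class SparkConfigOverride {

    // Returns the configuration a single job would run with: the resource's
    // settings, with any key repeated in the load statement replaced.
    // Neither input map is modified, so the override is temporary.
    public static Map<String, String> effectiveConfigs(Map<String, String> resourceConfigs,
                                                       Map<String, String> stmtConfigs) {
        Map<String, String> merged = new HashMap<>(resourceConfigs);
        merged.putAll(stmtConfigs);
        return Collections.unmodifiableMap(merged);
    }

    public static void main(String[] args) {
        Map<String, String> resource = new HashMap<>();
        resource.put("spark.executor.memory", "1g");
        resource.put("spark.master", "yarn");

        Map<String, String> stmt = new HashMap<>();
        stmt.put("spark.executor.memory", "2g");

        Map<String, String> effective = effectiveConfigs(resource, stmt);
        System.out.println(effective.get("spark.executor.memory")); // value from the statement
        System.out.println(resource.get("spark.executor.memory"));  // resource left untouched
    }
}
```

Copying into a new map, rather than mutating the resource's map, is what keeps the override scoped to one job in this sketch.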

Fe analyzes the LoadStmt and creates a SparkLoadJob in LoadManager.
Abstract a base class, BulkLoadJob, that holds the code shared by BrokerLoadJob and SparkLoadJob.
Users cancel a Spark load job through the MySQL client.

CANCEL LOAD WHERE LABEL = 'label0'
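
The base-class split and the cancel-by-label flow described above can be sketched roughly as follows. This is a heavily simplified, hypothetical illustration: the nested classes reuse the names `BulkLoadJob`, `BrokerLoadJob`, `SparkLoadJob`, and `LoadManager` only to mirror the description, while every field and method shown (`cancel`, `releaseResources`, `cancelByLabel`, `etlJobKilled`) is an assumption and not the code in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real classes in org.apache.doris.load.loadv2
// carry far more state and logic.
public class BulkLoadSketch {

    enum State { PENDING, CANCELLED }

    // Behaviour shared by both job types lives in the base class.
    abstract static class BulkLoadJob {
        protected final String label;
        protected State state = State.PENDING;

        BulkLoadJob(String label) {
            this.label = label;
        }

        // Common cancel path; subclasses only release their own resources.
        final void cancel() {
            if (state == State.CANCELLED) {
                return;
            }
            releaseResources();
            state = State.CANCELLED;
        }

        abstract void releaseResources();
    }

    static class BrokerLoadJob extends BulkLoadJob {
        BrokerLoadJob(String label) {
            super(label);
        }

        @Override
        void releaseResources() {
            // Nothing extra to release in this sketch.
        }
    }

    static class SparkLoadJob extends BulkLoadJob {
        boolean etlJobKilled = false;

        SparkLoadJob(String label) {
            super(label);
        }

        @Override
        void releaseResources() {
            etlJobKilled = true; // stands in for killing the Spark ETL job
        }
    }

    // Stand-in for the manager that finds jobs by label.
    static class LoadManager {
        private final List<BulkLoadJob> jobs = new ArrayList<>();

        void add(BulkLoadJob job) {
            jobs.add(job);
        }

        // Mirrors the intent of: CANCEL LOAD WHERE LABEL = '...'
        // Returns how many jobs were cancelled by this call.
        int cancelByLabel(String label) {
            int cancelled = 0;
            for (BulkLoadJob job : jobs) {
                if (job.label.equals(label) && job.state != State.CANCELLED) {
                    job.cancel();
                    cancelled++;
                }
            }
            return cancelled;
        }
    }

    public static void main(String[] args) {
        LoadManager manager = new LoadManager();
        manager.add(new SparkLoadJob("label0"));
        manager.add(new BrokerLoadJob("label1"));
        System.out.println(manager.cancelByLabel("label0"));
    }
}
```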

#3433

fe/src/main/java/org/apache/doris/load/loadv2/JobState.java

fe/src/main/java/org/apache/doris/analysis/ResourceDesc.java

fe/src/main/java/org/apache/doris/analysis/BrokerDesc.java

fe/src/main/java/org/apache/doris/common/Config.java

morningman · 2020-05-30T01:46:29Z

fe/src/main/java/org/apache/doris/load/loadv2/BulkLoadJob.java

+    private static final Logger LOG = LogManager.getLogger(BulkLoadJob.class);
+
+    // input params
+    protected BrokerDesc brokerDesc;


I think brokerDesc should be in the subclass of BulkLoadJob.
Although both broker load and spark load currently need a broker, spark load may not require one in the future.

wyb · 2020-06-01

After discussion with @morningman, I will improve this later, including persistence with JSON.

morningman · 2020-05-30T01:51:39Z

fe/src/main/java/org/apache/doris/load/loadv2/SparkLoadJob.java

+    private static final Logger LOG = LogManager.getLogger(SparkLoadJob.class);
+
+    // for global dict
+    public static final String BITMAP_DATA_PROPERTY = "bitmap_data";


This property is hard to understand and is coupled to the implementation details of the global dict.
How about renaming it to something more abstract?

wyb · 2020-06-01

This is for temporary use. I am investigating loading from a Hive table, and I will update it soon.

fe/src/main/java/org/apache/doris/load/loadv2/SparkLoadJob.java

morningman · 2020-05-30T01:54:41Z

fe/src/main/java/org/apache/doris/load/loadv2/SparkLoadJob.java

+                    // TODO(wyb): spark-load
+                    //handler.killEtlJob(sparkAppHandle, appId, id, sparkResource);
+                } catch (Exception e) {
+                    LOG.warn("kill etl job failed. id: {}, state: {}", id, state, e);


Should we save the error message somewhere so the user can retrieve it?

wyb · 2020-06-01

I don't think it's necessary, because clearing the job only makes a best-effort attempt to kill the ETL job.
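
The best-effort behaviour discussed in this thread can be sketched as below. This is a hypothetical illustration, not the code in this PR: the `EtlKiller` interface and the `clearJob` method are invented for the example, and it uses `java.util.logging` only to stay self-contained, whereas the diff above uses a log4j-style `LOG.warn`. The failure is logged and swallowed so that clearing the job never fails because the kill failed.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of a best-effort cleanup step.
public class BestEffortCleanup {
    private static final Logger LOG = Logger.getLogger(BestEffortCleanup.class.getName());

    // Hypothetical stand-in for whatever actually kills the ETL job.
    interface EtlKiller {
        void kill(String appId) throws Exception;
    }

    // Tries to kill the ETL job. Returns true if the kill call succeeded and
    // false if it threw; the exception is logged and never propagated.
    static boolean clearJob(EtlKiller killer, String appId, long jobId) {
        try {
            killer.kill(appId);
            return true;
        } catch (Exception e) {
            LOG.log(Level.WARNING, "kill etl job failed. id: " + jobId, e);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = clearJob(appId -> { }, "app-1", 1L);
        boolean failed = clearJob(appId -> {
            throw new IllegalStateException("cluster unreachable");
        }, "app-1", 1L);
        System.out.println(ok + " " + failed);
    }
}
```

Returning a boolean is one way to let a caller record the outcome if the error message ever needs to be surfaced to the user, as the review comment suggests.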

morningman

LGTM

morningman · 2020-06-07T02:31:10Z

fe/src/main/cup/sql_parser.cup

    :}
    ;

-opt_cluster ::=


Did you just remove this grammar rule?

wyb · 2020-06-09

Hadoop uses opt_system, so opt_cluster is no longer used.

wyb mentioned this pull request May 28, 2020

[Spark load] Doris support Spark load #3433

Closed

wyb changed the title ~~[Spark load] fe create job~~ [Spark load] Fe create job May 28, 2020

imay added the area/load, kind/feature, and api-review labels May 29, 2020

imay requested changes May 29, 2020

morningman requested changes May 30, 2020

wyb changed the title ~~[Spark load] Fe create job~~ [Spark load][Fe 3/5] Fe create job May 30, 2020

Add create spark load job

edfa668

wyb force-pushed the spark_load_fe_create_job branch from 0d78b0c to edfa668 June 3, 2020 13:29

Remove unused import

7f6a7c6

morningman approved these changes Jun 8, 2020

imay approved these changes Jun 9, 2020

imay added the approved label Jun 9, 2020

morningman merged commit 4fa9d8c into apache:master Jun 9, 2020

EmmyMiao87 mentioned this pull request Sep 1, 2020

Release Notes 0.13.0 #4370

Closed
