[Spark load][Fe 5/6] Fe submit spark etl job #3716

wyb · 2020-05-28T16:55:45Z

After a user creates a spark load job, whose status is PENDING, Fe schedules it and submits the spark etl job.

1. Begin transaction
2. Create a SparkLoadPendingTask for submitting etl job
   2.1 Create etl job configuration according to Spark load interface #3010 (comment)
   2.2 Upload the configuration file and job jar to HDFS with broker
   2.3 Submit etl job to spark cluster
   2.4 Wait for etl job submission result
3. Update job state to ETL and log job update info if etl job is submitted successfully
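
For readers who want the flow above in code form, here is a minimal sketch. It is not the code in this PR: the class `SparkLoadSubmitSketch`, the `EtlSubmitter` interface, the remote path, and the in-memory `editLog` are hypothetical stand-ins. Only the PENDING and ETL state names come from the description. The description does not say what happens when submission fails, so the failure branch (stay in PENDING and record the error) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the steps listed in the description; not the Doris classes.
class SparkLoadSubmitSketch {
    enum JobState { PENDING, ETL }

    /** Stand-in for the broker upload and the Spark submission client. */
    interface EtlSubmitter {
        void upload(String remotePath, String content) throws Exception;
        /** Submits the job and blocks until an application id is known. */
        String submit(String configPath) throws Exception;
    }

    JobState state = JobState.PENDING;
    String appId;
    final List<String> editLog = new ArrayList<>();

    void execute(EtlSubmitter submitter) {
        editLog.add("begin transaction");                        // step 1
        try {
            String config = "{\"label\":\"demo\"}";              // step 2.1 (placeholder config)
            String path = "/hypothetical/configs/jobconfig.json";
            submitter.upload(path, config);                      // step 2.2
            appId = submitter.submit(path);                      // steps 2.3 and 2.4
            state = JobState.ETL;                                // step 3
            editLog.add("job state -> ETL, appId=" + appId);
        } catch (Exception e) {
            // Assumption: the job stays PENDING and the failure is recorded.
            editLog.add("submit failed: " + e.getMessage());
        }
    }
}
```

The state only moves to ETL after `submit` returns, which mirrors the "if etl job is submitted successfully" condition in the last step.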

#3433

fe/src/main/java/org/apache/doris/common/util/BrokerUtil.java

fe/src/main/java/org/apache/doris/catalog/OlapTable.java

morningman · 2020-05-30T02:01:13Z

fe/src/main/java/org/apache/doris/common/Pair.java

 public class Pair<F, S> {
    public static PairComparator<Pair<?, Comparable>> PAIR_VALUE_COMPARATOR = new PairComparator<>();

+    @SerializedName(value = "first")


I'm not sure this is OK, because there is no guarantee that the F and S objects can also be serialized by GSON.

Could users guarantee this themselves when they use the Pair class, as with Map and List?

I checked; this does not work.

I added a comment:
When using Pair for persistence, users need to guarantee that F and S can be serialized through Gson.
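
The concern in this thread is that a generic container cannot promise that its type arguments are serializable. The sketch below shows the same problem with the JDK's built-in `java.io` serialization, swapped in for Gson so the example needs no external library. The simplified `Pair`, the `Opaque` class, and `canSerialize` are hypothetical and are not the Doris code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class PairSerializationSketch {
    /** Simplified stand-in for Pair<F, S>; F and S are unconstrained, as in the thread. */
    static class Pair<F, S> implements Serializable {
        private static final long serialVersionUID = 1L;
        final F first;
        final S second;

        Pair(F first, S second) {
            this.first = first;
            this.second = second;
        }
    }

    /** A member type that deliberately does not implement Serializable. */
    static class Opaque { }

    /** Returns true if the pair and both members can be written, false on any I/O failure. */
    static boolean canSerialize(Pair<?, ?> pair) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(pair);
            return true;
        } catch (IOException e) {
            // NotSerializableException (a subclass of IOException) lands here.
            return false;
        }
    }
}
```

The pair compiles with any F and S, and the failure only appears at write time, which is why the guarantee ends up documented as the caller's responsibility.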

fe/src/main/java/org/apache/doris/common/util/BrokerUtil.java

morningman · 2020-05-30T02:19:19Z

fe/src/main/java/org/apache/doris/load/loadv2/SparkEtlJobHandler.java

+    private static final String CONFIG_FILE_NAME = "jobconfig.json";
+    private static final String APP_RESOURCE_LOCAL_PATH = PaloFe.DORIS_HOME_DIR + "/lib/" + APP_RESOURCE_NAME;
+    private static final String JOB_CONFIG_DIR = "configs";
+    private static final String MAIN_CLASS = "org.apache.doris.load.loadv2.etl.SparkEtlJob";


How about getting it from SparkEtlJob.class.getXXX()?

OK, I added a comment and will replace it once the SparkEtlJob class is merged.
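
The reviewer left the accessor unspecified as `getXXX()`. One candidate is `Class.getName()`, which returns the fully qualified binary name. A minimal sketch, where `MainClassNameSketch` and `mainClassOf` are hypothetical names and `String.class` stands in for the not-yet-merged SparkEtlJob class:

```java
class MainClassNameSketch {
    /**
     * Derives the main-class string from a class literal instead of a
     * hard-coded constant, so a rename or package move is caught by the compiler.
     */
    static String mainClassOf(Class<?> clazz) {
        return clazz.getName();
    }
}
```

For example, `MainClassNameSketch.mainClassOf(String.class)` returns `"java.lang.String"`.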

morningman · 2020-05-30T02:21:16Z

fe/src/main/java/org/apache/doris/load/loadv2/SparkEtlJobHandler.java

+                throw new LoadException(errMsg + "spark app state: " + state.toString());
+            }
+            if (retry >= GET_APPID_MAX_RETRY_TIMES) {
+                throw new LoadException(errMsg + "wait too much time for getting appid. spark app state: "


Suggested change

throw new LoadException(errMsg + "wait too much time for getting appid. spark app state: "

throw new LoadException(errMsg + " wait too much time for getting appi d. spark app state: "

errMsg already has a space at the end.
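
To make the point about the trailing space concrete, here is a minimal sketch of a bounded wait for the application id. It is not the handler in this PR: `AppIdWaitSketch`, `waitForAppId`, the `Supplier`-based poll, the retry limit of 3, and the unchecked stand-in `LoadException` are assumptions. A real loop would also sleep between polls.

```java
import java.util.function.Supplier;

class AppIdWaitSketch {
    // Name taken from the diff; the value here is an assumption.
    static final int GET_APPID_MAX_RETRY_TIMES = 3;

    /** Unchecked stand-in for the load exception type, to keep the sketch short. */
    static class LoadException extends RuntimeException {
        LoadException(String msg) {
            super(msg);
        }
    }

    /** Polls until an app id appears or the retry budget is spent. */
    static String waitForAppId(Supplier<String> poll, String errMsg) {
        int retry = 0;
        while (true) {
            String appId = poll.get();
            if (appId != null) {
                return appId;
            }
            if (retry >= GET_APPID_MAX_RETRY_TIMES) {
                // errMsg is expected to end with a space, so none is added here.
                throw new LoadException(errMsg + "wait too much time for getting appid.");
            }
            retry++;
        }
    }
}
```

If `errMsg` ends with a space, adding a leading space to the appended text would produce a double space in the final message.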

morningman · 2020-06-10T05:50:08Z

fe/src/main/java/org/apache/doris/common/Pair.java

 public class Pair<F, S> {
    public static PairComparator<Pair<?, Comparable>> PAIR_VALUE_COMPARATOR = new PairComparator<>();

+    @SerializedName(value = "first")


I checked; this does not work.

morningman · 2020-06-17T13:44:39Z

fe/src/main/java/org/apache/doris/common/util/BrokerUtil.java

+                                                + ", msg=" + tReadResponse.getOpStatus().getMessage());
+            }
+            failed = false;
+            return tReadResponse.getData();


The broker's pread() method currently does not guarantee that it reads the specified length of data.
#3881 is trying to solve this problem; this is just a reminder.
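
A common way to cope with short reads is to loop until the requested length has arrived or the source is exhausted. The sketch below illustrates that general pattern only. It is not the broker API and not necessarily what #3881 does. `ReadFullySketch`, `PositionalReader`, and the convention that an empty array means end of file are all assumptions.

```java
import java.io.ByteArrayOutputStream;

class ReadFullySketch {
    /** Hypothetical positional reader that may return fewer bytes than requested. */
    interface PositionalReader {
        /** Returns up to length bytes starting at offset; an empty array means end of file. */
        byte[] pread(long offset, int length);
    }

    /** Keeps calling pread until length bytes are collected or end of file is reached. */
    static byte[] readFully(PositionalReader reader, long offset, int length) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long pos = offset;
        int remaining = length;
        while (remaining > 0) {
            byte[] chunk = reader.pread(pos, remaining);
            if (chunk == null || chunk.length == 0) {
                break; // end of file: return what was read so far
            }
            out.write(chunk, 0, chunk.length);
            pos += chunk.length;
            remaining -= chunk.length;
        }
        return out.toByteArray();
    }
}
```

A caller that needs exactly `length` bytes should compare the returned array's length with the requested length and treat a shortfall as an error.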

fe/src/main/java/org/apache/doris/load/loadv2/SparkLoadJob.java

morningman

LGTM

imay

LGTM

morningman

LGTM

wyb mentioned this pull request May 28, 2020

[Spark load] Doris support Spark load #3433

Closed

imay added the area/load and kind/feature labels May 29, 2020

imay self-assigned this May 29, 2020

imay requested changes May 29, 2020


morningman requested changes May 30, 2020


wyb changed the title ~~[Spark load] Fe submit spark etl job~~ [Spark load] [Fe 4/5] Fe submit spark etl job May 30, 2020

wyb changed the title ~~[Spark load] [Fe 4/5] Fe submit spark etl job~~ [Spark load][Fe 4/5] Fe submit spark etl job May 30, 2020

wyb changed the title ~~[Spark load][Fe 4/5] Fe submit spark etl job~~ [Spark load][Fe 5/6] Fe submit spark etl job Jun 10, 2020

wyb force-pushed the spark_load_fe_submit_etl_job branch 2 times, most recently from 6d0ef4e to 20df6f2 Compare June 13, 2020 13:06

morningman reviewed Jun 17, 2020


morningman previously approved these changes Jun 18, 2020


imay previously approved these changes Jun 18, 2020


wyb dismissed stale reviews from imay and morningman via ee27b34 June 18, 2020 08:51

imay previously approved these changes Jun 18, 2020


Fe submit spark etl job

efcc796

wyb dismissed imay’s stale review via efcc796 June 19, 2020 02:44

wyb force-pushed the spark_load_fe_submit_etl_job branch from ee27b34 to efcc796 Compare June 19, 2020 02:44

morningman approved these changes Jun 19, 2020


morningman added the approved Indicates a PR has been approved by one committer. label Jun 19, 2020

morningman merged commit 532d15d into apache:master Jun 19, 2020

EmmyMiao87 mentioned this pull request Sep 1, 2020

Release Notes 0.13.0 #4370

Closed
