lw-lin commented Feb 14, 2016

When run in a Hadoop 2 environment with the log level set to DEBUG, ParquetLoader invokes job.toString() in several methods, which can stop the whole application with:

java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING

    at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:283)
    at org.apache.hadoop.mapreduce.Job.toString(Job.java:452)
    at java.lang.String.valueOf(String.java:2847)
    at java.lang.StringBuilder.append(StringBuilder.java:128)
    at org.apache.parquet.pig.ParquetLoader.getSchema(ParquetLoader.java:260)
    at org.apache.parquet.pig.TestParquetLoader.testSchema(TestParquetLoader.java:54)
    ...

The reason is that in the Hadoop 2.x branch, org.apache.hadoop.mapreduce.Job.toString() added an ensureState(JobState.RUNNING) check; see map-reduce: Job.java#452. The Hadoop 1.x branch has no such check, so ParquetLoader works there.

This PR simply avoids invoking job.toString() in ParquetLoader.
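
To make the failure mode concrete, here is a minimal sketch that reproduces it without Hadoop on the classpath. `StateCheckedJob` and `ToStringGuardDemo` are hypothetical stand-ins written for illustration; they only imitate the state check described above and are not the real `org.apache.hadoop.mapreduce.Job`.

```java
// Hypothetical stand-in for a job whose toString() is guarded by a state check.
// This is NOT the Hadoop class; it only imitates the behavior described above.
class StateCheckedJob {
    enum JobState { DEFINE, RUNNING }

    private JobState state = JobState.DEFINE;

    void start() {
        state = JobState.RUNNING;
    }

    // Fails unless the job is in the expected state.
    private void ensureState(JobState expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + expected);
        }
    }

    @Override
    public String toString() {
        ensureState(JobState.RUNNING);
        return "StateCheckedJob(running)";
    }
}

class ToStringGuardDemo {
    // Builds a debug message by string concatenation, which implicitly
    // calls job.toString() through String.valueOf(job).
    static String debugMessage(String location, StateCheckedJob job) {
        return "LoadFunc.setLocation(" + location + ", " + job + ")";
    }

    public static void main(String[] args) {
        StateCheckedJob job = new StateCheckedJob();
        try {
            debugMessage("/tmp/in", job);
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
        job.start();
        System.out.println(debugMessage("/tmp/in", job));
    }
}
```

The point of the sketch is that the exception comes from building the log message, not from the logging call itself, so it is thrown whenever the concatenation is evaluated.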

lw-lin commented Feb 14, 2016

@julienledem @rdblue @liancheng @danielcweeks @aniket486 would you mind taking a look at this when you have time? This has been blocking [Parquet-401: Deprecate Log and move to SLF4J Logger][PR#319]. Thanks!


@Override
public void setLocation(String location, Job job) throws IOException {
  if (DEBUG) LOG.debug("LoadFunc.setLocation(" + location + ", " + job + ")");
Member

job.getId or something?

Contributor Author

Sure. Added getJobId() and getJobName().

Would you mind taking another look at this?
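
As a rough illustration of the approach discussed in this thread, a helper can build the log string from the job's ID and name instead of calling `toString()`. This is a sketch only, not the merged code: `NamedJob`, `JobLogging`, and the output format are hypothetical, and the accessor names simply follow the reply above.

```java
// Hypothetical stand-in exposing only an ID and a name.
// Its toString() always throws, to show that the helper never calls it.
class NamedJob {
    private final String id;
    private final String name;

    NamedJob(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getJobId() {
        return id;
    }

    String getJobName() {
        return name;
    }

    @Override
    public String toString() {
        throw new IllegalStateException("Job in state DEFINE instead of RUNNING");
    }
}

class JobLogging {
    // Describes the job for debug output using only accessors that are
    // safe to call regardless of the job's state (an assumption of this sketch).
    static String jobToString(NamedJob job) {
        return String.format("job[id=%s, name=%s]", job.getJobId(), job.getJobName());
    }

    public static void main(String[] args) {
        NamedJob job = new NamedJob("job_0001", "load-parquet");
        System.out.println("LoadFunc.setLocation(/tmp/in, " + jobToString(job) + ")");
    }
}
```

A debug line would then concatenate `jobToString(job)` rather than `job`, so the state-checked `toString()` is never reached.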

rdblue commented Feb 22, 2016

+1, thanks @proflin!

@asfgit asfgit closed this in c44f982 Feb 22, 2016
@lw-lin lw-lin deleted the PARQUET-529--Avoid-evoking-job.toString()-in-ParquetLoader branch March 8, 2016 05:05
piyushnarang pushed a commit to piyushnarang/parquet-mr that referenced this pull request Jun 15, 2016
When run in a Hadoop 2 environment with the log level set to `DEBUG`, `ParquetLoader` invokes `job.toString()` in several methods, which can stop the whole application with:

```
java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING

	at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:283)
	at org.apache.hadoop.mapreduce.Job.toString(Job.java:452)
	at java.lang.String.valueOf(String.java:2847)
	at java.lang.StringBuilder.append(StringBuilder.java:128)
	at org.apache.parquet.pig.ParquetLoader.getSchema(ParquetLoader.java:260)
	at org.apache.parquet.pig.TestParquetLoader.testSchema(TestParquetLoader.java:54)
    ...
```

The reason is that in the Hadoop 2.x branch, `org.apache.hadoop.mapreduce.Job.toString()` added an `ensureState(JobState.RUNNING)` check; see [map-reduce: Job.java#452](http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-mapreduce-client-core/2.3.0/org/apache/hadoop/mapreduce/Job.java#452). The Hadoop 1.x branch has no such check, so `ParquetLoader` works there.

This PR simply avoids invoking `job.toString()` in `ParquetLoader`.

Author: proflin <proflin.me@gmail.com>
Author: Liwei Lin <proflin.me@gmail.com>

Closes apache#326 from proflin/PARQUET-529--Avoid-evoking-job.toString()-in-ParquetLoader and squashes the following commits:

f464c7b [proflin] Add jobToString
5d4c750 [proflin] PARQUET-529: Avoid evoking job.toString() in ParquetLoader.java
bb4283a [Liwei Lin] Merge branch 'master' of https://github.com/proflin/parquet-mr
839b458 [proflin] Merge remote-tracking branch 'refs/remotes/apache/master'
rdblue pushed a commit to rdblue/parquet-mr that referenced this pull request Jul 13, 2016
rdblue pushed a commit to rdblue/parquet-mr that referenced this pull request Jan 6, 2017
rdblue pushed a commit to rdblue/parquet-mr that referenced this pull request Jan 10, 2017