-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-11051][Core] Do not allow local checkpointing after the RDD is materialized and checkpointed #9072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… and checkpointed.
|
Test build #43560 has finished for PR 9072 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this is a good restriction, but cc @tdas regarding the semantics.
I feel like this code construct is getting hard to read. I don't feel strongly, but is if (checkpointData.isDefined && isCheckpointed) not simpler? then there are just two branches, one return point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to checkpointData.isDefined && isCheckpointed, we still need to check isLocallyCheckpointed, because when checkpointData.isDefined is true, it is possibly a local checkpointing.
|
retest this please. |
|
Test build #43565 has finished for PR 9072 at commit
|
|
retest this please. |
|
Test build #43566 has finished for PR 9072 at commit
|
|
Actually, best for @andrewor14 to take a look, who implemented local checkpointing and has more context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rename this to isCheckpointedAndMaterialized and update the java docs? Right now it's a little confusing because it can be interpreted as "marked for checkpointing" only, which is actually what the existing java docs say.
|
@viirya thanks for the fix. I suggested an alternative which I think is a little clearer. Could you also add a regression test for this in |
|
@andrewor14 Thanks. I will update this patch and add the test later today. |
|
retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be private[spark]
|
Test build #43813 has finished for PR 9072 at commit
|
|
Test build #43829 has finished for PR 9072 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why add this? I don't think this is used except in tests
|
@viirya this looks pretty close. I don't think we need to add the method in Java. Once you remove that and update the java docs I will merge this. Thanks for fixing this issue. |
|
@andrewor14 Thanks. I've updated this. Please see if there is any problem. |
|
Test build #43873 has finished for PR 9072 at commit
|
|
ping @andrewor14 any other comments? |
|
Yup LGTM I'm merging this into master 1.5. Thanks for fixing this @viirya ! |
… materialized and checkpointed JIRA: https://issues.apache.org/jira/browse/SPARK-11051 When a `RDD` is materialized and checkpointed, its partitions and dependencies are cleared. If we allow local checkpointing on it and assign `LocalRDDCheckpointData` to its `checkpointData`. Next time when the RDD is materialized again, the error will be thrown. Author: Liang-Chi Hsieh <viirya@appier.com> Closes #9072 from viirya/no-localcheckpoint-after-checkpoint. (cherry picked from commit a1413b3) Signed-off-by: Andrew Or <andrew@databricks.com>
JIRA: https://issues.apache.org/jira/browse/SPARK-11051
When a
RDDis materialized and checkpointed, its partitions and dependencies are cleared. If we allow local checkpointing on it and assignLocalRDDCheckpointDatato itscheckpointData. Next time when the RDD is materialized again, the error will be thrown.