Skip to content

Conversation

@HeartSaVioR
Copy link
Contributor

@HeartSaVioR HeartSaVioR commented Feb 8, 2023

What changes were proposed in this pull request?

This PR proposes to use FileSystem.exists in FileSystemBasedCheckpointFileManager.exists, which is consistent with other methods in FileSystemBasedCheckpointFileManager.

This PR also removes the test case QueryExecutionErrorsSuite.FAILED_RENAME_PATH: rename when destination path already exists because the test relies on incorrect custom file system instance with non-symmetric implementation between FileSystemBasedCheckpointFileManager.exists vs FileSystem.exists.
(See detailed explanation from #39936 (comment))

Why are the changes needed?

Other methods in FileSystemBasedCheckpointFileManager already uses FileSystem.exists for all cases checking existence of the path.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

@HeartSaVioR HeartSaVioR changed the title [MINOR][SS] Use fs.exists in FileSystemBasedCheckpointFileManager.exists [MINOR][SS] Use FileSystem.exists in FileSystemBasedCheckpointFileManager.exists Feb 8, 2023
@HeartSaVioR
Copy link
Contributor Author

cc. @cloud-fan @viirya

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it's a small change, could you file a JIRA for trace-ability, please, @HeartSaVioR ?

@HeartSaVioR HeartSaVioR changed the title [MINOR][SS] Use FileSystem.exists in FileSystemBasedCheckpointFileManager.exists [SPARK-42379][SS] Use FileSystem.exists in FileSystemBasedCheckpointFileManager.exists Feb 8, 2023
@HeartSaVioR
Copy link
Contributor Author

Done.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check the UT failure?

Screenshot 2023-02-07 at 9 38 25 PM

  test("FAILED_RENAME_PATH: rename when destination path already exists") {
    withTempPath { p =>
      withSQLConf(
        "spark.sql.streaming.checkpointFileManagerClass" ->
          classOf[FileSystemBasedCheckpointFileManager].getName,

@HeartSaVioR
Copy link
Contributor Author

I'd argue that the test case itself is incorrect. It's heavily bound to the internal implementation - it relies on custom filesystem instance which returns true for all exists() calls, but it does not apply to the getFileStatus() hence it will end up with "file does not exist".

That said, test only passes if filesystem behaves incorrectly. If we change the custom filesystem instance to also provide dummy status for all getFileStatus() calls, the test case will fail.

I don't know how we can test the exception in E2E query. As the exception class denotes, it mainly occurs from concurrent writes for same offset/commit log file. We should have created offset log file concurrently, between "determining start microbatch" and "writing offset log after planning", which is super hard from outside of the query.

I'd suggest just remove the test. WDYT?

@HyukjinKwon
Copy link
Member

I am fine w/ removing it.

@cloud-fan
Copy link
Contributor

+1 to remove it.

@HeartSaVioR
Copy link
Contributor Author

Test removed. Will merge once the build passes.

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM

@viirya
Copy link
Member

viirya commented Feb 8, 2023

Sounds reasonable to remove it.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 with the final commit.

@HeartSaVioR
Copy link
Contributor Author

Thanks, merging to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants