HDFS-17801. EC: Reading support retryCurrentNode to avoid transient errors cause application level failures. #7762

hfutatzhanghb · 2025-06-25T07:54:04Z

Description of PR

Refer to HDFS-17801.
Under the 3-replication read implementation, when an IOException occurs, the retryCurrentNode mechanism can retry the same node.
This is very useful for avoiding application-level failures caused by transient errors (e.g. the Datanode could have closed the connection because the client was idle for too long). Please refer to the code below:

hadoop/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java

Lines 824 to 828 in 6eae158

    
          /* possibly retry the same node so that transient errors don't
           * result in application level failures (e.g. Datanode could have
           * closed the connection because the client is idle for too long).
           */
          sourceFound = seekToBlockSource(pos);

EC reads should support this mechanism as well.
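
A minimal sketch of the retry-the-same-node idea, not the actual DFSInputStream code: on the first IOException the read is attempted once more against the same source, and only a second failure moves on to the next source. `BlockSource`, `readWithRetry`, and the list of sources are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.util.List;

/** Illustrative only: the names here are hypothetical, not Hadoop APIs. */
final class RetryCurrentNodeSketch {

  /** A readable source, standing in for a connection to one datanode. */
  interface BlockSource {
    int read() throws IOException;
  }

  /**
   * Tries each source in order. A failing source is retried once before
   * the next source is tried, so a single transient error (for example a
   * connection closed for being idle) does not fail the whole read.
   */
  static int readWithRetry(List<BlockSource> sources) throws IOException {
    IOException last = null;
    for (BlockSource source : sources) {
      boolean retryCurrentNode = true;
      while (true) {
        try {
          return source.read();
        } catch (IOException e) {
          last = e;
          if (retryCurrentNode) {
            retryCurrentNode = false; // one more attempt on the same source
            continue;
          }
          break; // second failure: move on to the next source
        }
      }
    }
    throw last != null ? last : new IOException("no sources available");
  }
}
```

With a source that fails once and then succeeds, `readWithRetry` returns the value from the second attempt instead of surfacing the first exception to the caller.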

BTW, this issue is motivated by application failures on our cluster after we changed the data from 3-replication to an EC policy.

How was this patch tested?

Added a unit test.

hadoop-yetus · 2025-06-25T10:41:15Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 20s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+0 🆗	detsecrets	0m 1s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	6m 0s		Maven dependency ordering for branch
+1 💚	mvninstall	26m 7s		trunk passed
+1 💚	compile	3m 1s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 40s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 44s		trunk passed
+1 💚	mvnsite	1m 17s		trunk passed
+1 💚	javadoc	1m 16s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 36s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 1s		trunk passed
+1 💚	shadedclient	22m 19s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 21s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 4s		the patch passed
+1 💚	compile	2m 51s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 51s		the patch passed
+1 💚	compile	2m 37s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 37s		the patch passed
-1 ❌	blanks	0m 0s	/blanks-eol.txt	The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️	checkstyle	0m 37s	/results-checkstyle-hadoop-hdfs-project.txt	hadoop-hdfs-project: The patch generated 2 new + 29 unchanged - 0 fixed = 31 total (was 29)
+1 💚	mvnsite	1m 5s		the patch passed
+1 💚	javadoc	1m 0s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 25s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 5s		the patch passed
+1 💚	shadedclient	22m 22s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 6s		hadoop-hdfs-client in the patch passed.
+1 💚	unit	58m 44s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 29s		The patch does not generate ASF License warnings.
		166m 5s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/1/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux f5a19e6c3c97 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `b91d1b9`
Default Java	Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/1/testReport/
Max. process+thread count	3558 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/1/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus · 2025-06-25T11:06:29Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 23s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	5m 46s		Maven dependency ordering for branch
+1 💚	mvninstall	19m 12s		trunk passed
+1 💚	compile	2m 58s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 45s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 46s		trunk passed
+1 💚	mvnsite	1m 18s		trunk passed
+1 💚	javadoc	1m 10s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 32s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 2s		trunk passed
+1 💚	shadedclient	22m 10s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 23s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 3s		the patch passed
+1 💚	compile	2m 52s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 52s		the patch passed
+1 💚	compile	2m 38s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 38s		the patch passed
-1 ❌	blanks	0m 0s	/blanks-eol.txt	The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️	checkstyle	0m 34s	/results-checkstyle-hadoop-hdfs-project.txt	hadoop-hdfs-project: The patch generated 2 new + 29 unchanged - 0 fixed = 31 total (was 29)
+1 💚	mvnsite	1m 5s		the patch passed
+1 💚	javadoc	0m 56s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 25s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 0s		the patch passed
+1 💚	shadedclient	21m 44s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 6s		hadoop-hdfs-client in the patch passed.
+1 💚	unit	56m 14s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 29s		The patch does not generate ASF License warnings.
		155m 31s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/2/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux 6e05c81f298f 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `6c144bb`
Default Java	Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/2/testReport/
Max. process+thread count	3561 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/2/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

hadoop-yetus · 2025-06-25T15:33:08Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 21s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+0 🆗	detsecrets	0m 1s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	6m 5s		Maven dependency ordering for branch
+1 💚	mvninstall	19m 32s		trunk passed
+1 💚	compile	3m 0s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 44s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 42s		trunk passed
+1 💚	mvnsite	1m 14s		trunk passed
+1 💚	javadoc	1m 10s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 34s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 1s		trunk passed
+1 💚	shadedclient	21m 58s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 21s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 7s		the patch passed
+1 💚	compile	2m 52s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 52s		the patch passed
+1 💚	compile	2m 40s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 40s		the patch passed
-1 ❌	blanks	0m 0s	/blanks-eol.txt	The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️	checkstyle	0m 36s	/results-checkstyle-hadoop-hdfs-project.txt	hadoop-hdfs-project: The patch generated 1 new + 29 unchanged - 0 fixed = 30 total (was 29)
+1 💚	mvnsite	1m 7s		the patch passed
+1 💚	javadoc	0m 56s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 27s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
-1 ❌	spotbugs	1m 22s	/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.html	hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1 💚	shadedclient	21m 48s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 6s		hadoop-hdfs-client in the patch passed.
-1 ❌	unit	55m 36s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 29s		The patch does not generate ASF License warnings.
		155m 35s

Reason	Tests
SpotBugs	module:hadoop-hdfs-project/hadoop-hdfs-client
	org.apache.hadoop.hdfs.DFSStripedInputStream.getRetryCurrentReaderFlags() may expose internal representation by returning DFSStripedInputStream.retryCurrentReaderFlags At DFSStripedInputStream.java:[line 592]
Failed junit tests	hadoop.hdfs.tools.TestDFSAdmin
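
The SpotBugs finding above is the common "may expose internal representation" pattern: a getter returns a reference to a mutable array field, so callers can change the object's state from outside. One usual remedy, sketched here with a hypothetical `RetryFlagsHolder` class rather than the actual DFSStripedInputStream code, is to return a copy and route changes through an explicit method. Whether the patch resolved the warning this way is not stated in this thread.

```java
import java.util.Arrays;

/** Illustrative only: a hypothetical holder, not the Hadoop class. */
final class RetryFlagsHolder {
  private final boolean[] retryCurrentReaderFlags;

  RetryFlagsHolder(int numReaders) {
    retryCurrentReaderFlags = new boolean[numReaders];
    Arrays.fill(retryCurrentReaderFlags, true);
  }

  /** Returns a copy, so callers cannot mutate the internal array. */
  boolean[] getRetryCurrentReaderFlags() {
    return retryCurrentReaderFlags.clone();
  }

  /** State changes go through an explicit method instead. */
  void clearRetryFlag(int readerIndex) {
    retryCurrentReaderFlags[readerIndex] = false;
  }
}
```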

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/3/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux f7dc824c520d 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `f52be54`
Default Java	Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/3/testReport/
Max. process+thread count	3655 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/3/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

hadoop-yetus · 2025-06-25T18:44:14Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 21s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	6m 47s		Maven dependency ordering for branch
+1 💚	mvninstall	20m 55s		trunk passed
+1 💚	compile	3m 0s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 41s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 45s		trunk passed
+1 💚	mvnsite	1m 17s		trunk passed
+1 💚	javadoc	1m 10s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 33s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 2s		trunk passed
+1 💚	shadedclient	21m 42s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 23s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 4s		the patch passed
+1 💚	compile	2m 49s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 49s		the patch passed
+1 💚	compile	2m 38s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 38s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 37s		the patch passed
+1 💚	mvnsite	1m 9s		the patch passed
+1 💚	javadoc	0m 58s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 27s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 3s		the patch passed
+1 💚	shadedclient	24m 34s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 9s		hadoop-hdfs-client in the patch passed.
+1 💚	unit	52m 29s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 28s		The patch does not generate ASF License warnings.
		157m 11s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/4/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux d500f27620c2 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `bd73e6c`
Default Java	Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/4/testReport/
Max. process+thread count	1226 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/4/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

hfutatzhanghb · 2025-06-26T01:35:41Z

Hi @Hexiaoqiao @KeeProMise @zhangshuyan0, could you please help review this PR when you are free? Thanks a lot.

hadoop-yetus · 2025-06-26T05:21:22Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 20s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	6m 53s		Maven dependency ordering for branch
+1 💚	mvninstall	20m 21s		trunk passed
+1 💚	compile	3m 0s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 43s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 41s		trunk passed
+1 💚	mvnsite	1m 20s		trunk passed
+1 💚	javadoc	1m 12s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 38s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 4s		trunk passed
+1 💚	shadedclient	21m 45s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 22s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 5s		the patch passed
+1 💚	compile	2m 49s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 49s		the patch passed
+1 💚	compile	2m 38s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 38s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 36s	/results-checkstyle-hadoop-hdfs-project.txt	hadoop-hdfs-project: The patch generated 2 new + 29 unchanged - 0 fixed = 31 total (was 29)
+1 💚	mvnsite	1m 7s		the patch passed
+1 💚	javadoc	0m 59s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 24s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 3s		the patch passed
+1 💚	shadedclient	21m 58s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 4s		hadoop-hdfs-client in the patch passed.
+1 💚	unit	55m 26s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 30s		The patch does not generate ASF License warnings.
		156m 55s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/5/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux c64a1d5a99bb 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `b89b1d0`
Default Java	Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/5/testReport/
Max. process+thread count	4012 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/5/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

hadoop-yetus · 2025-06-26T08:14:56Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 20s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	6m 13s		Maven dependency ordering for branch
+1 💚	mvninstall	19m 13s		trunk passed
+1 💚	compile	2m 56s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 50s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 44s		trunk passed
+1 💚	mvnsite	1m 20s		trunk passed
+1 💚	javadoc	1m 13s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 29s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 3s		trunk passed
+1 💚	shadedclient	21m 59s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 23s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 3s		the patch passed
+1 💚	compile	2m 51s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 51s		the patch passed
+1 💚	compile	2m 37s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 37s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 35s	/results-checkstyle-hadoop-hdfs-project.txt	hadoop-hdfs-project: The patch generated 2 new + 29 unchanged - 0 fixed = 31 total (was 29)
+1 💚	mvnsite	1m 3s		the patch passed
+1 💚	javadoc	0m 59s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 26s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 6s		the patch passed
+1 💚	shadedclient	22m 4s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 8s		hadoop-hdfs-client in the patch passed.
+1 💚	unit	56m 22s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 29s		The patch does not generate ASF License warnings.
		156m 26s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/6/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux 9ecc81046355 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `a44f887`
Default Java	Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/6/testReport/
Max. process+thread count	3726 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/6/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

Copilot

Pull Request Overview

This PR introduces transient-error resilience to EC read paths by adding a retry mechanism modeled after the 3-replication read implementation. Key changes include:

Adding unit tests to simulate long idle periods and validate the retry behavior.
Updating StripeReader logic to conditionally reset chunk states and use additional reader info checks.
Introducing and integrating a retryCurrentReaderFlags mechanism in DFSStripedInputStream to control reader retries.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File	Description
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedInputStream.java	Added new tests for EC read retry behavior.
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/StripeReader.java	Enhanced handling of missing blocks and integrated logic to account for transient errors.
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedInputStream.java	Introduced retryCurrentReaderFlags for controlling reader retry behavior in case of transient errors.

Comments suppressed due to low confidence (2)

hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/StripeReader.java:179

Consider adding a comment explaining the rationale behind comparing countOfNullReaderInfos with parityBlkNum, clarifying how this check differentiates between transient errors and genuine missing blocks.

      if (countOfNullReaderInfos(readerInfos) < parityBlkNum) {

hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedInputStream.java:233

Add an inline comment to explain why a reader is skipped only when its corresponding retry flag is false, which will help improve code clarity and maintainability.

      if (!retryCurrentReaderFlags[readerIndex]) {
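
The two checks quoted in the suppressed comments can be read roughly as follows. This is a sketch under stated assumptions, not the code under review: it assumes a null reader info entry marks a reader that is currently unavailable, and that each reader gets one extra attempt while its retry flag is still set. The class `StripedRetrySketch` and the method `shouldRetryReader` are hypothetical; what the real code does when the parity comparison holds is not shown in this thread.

```java
/** Illustrative only: hypothetical names, not the code under review. */
final class StripedRetrySketch {

  /** Counts entries that are null, i.e. readers assumed unavailable. */
  static int countOfNullReaderInfos(Object[] readerInfos) {
    int count = 0;
    for (Object info : readerInfos) {
      if (info == null) {
        count++;
      }
    }
    return count;
  }

  /** Mirrors the quoted comparison against the number of parity blocks. */
  static boolean fewerNullsThanParity(Object[] readerInfos, int parityBlkNum) {
    return countOfNullReaderInfos(readerInfos) < parityBlkNum;
  }

  /**
   * Assumed policy: a failed reader gets one more attempt while its retry
   * flag is still set; the flag is cleared so the next failure skips it.
   */
  static boolean shouldRetryReader(boolean[] retryFlags, int readerIndex) {
    if (!retryFlags[readerIndex]) {
      return false; // already retried once: treat the reader as failed
    }
    retryFlags[readerIndex] = false;
    return true;
  }
}
```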

hadoop-yetus · 2025-06-26T11:07:15Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 20s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+0 🆗	detsecrets	0m 1s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	6m 1s		Maven dependency ordering for branch
+1 💚	mvninstall	19m 21s		trunk passed
+1 💚	compile	3m 4s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 45s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 47s		trunk passed
+1 💚	mvnsite	1m 19s		trunk passed
+1 💚	javadoc	1m 11s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 35s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	2m 59s		trunk passed
+1 💚	shadedclient	22m 0s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 22s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 5s		the patch passed
+1 💚	compile	2m 54s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 54s		the patch passed
+1 💚	compile	2m 39s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 39s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 37s		the patch passed
+1 💚	mvnsite	1m 5s		the patch passed
+1 💚	javadoc	0m 58s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 25s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 1s		the patch passed
+1 💚	shadedclient	21m 39s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 6s		hadoop-hdfs-client in the patch passed.
+1 💚	unit	55m 57s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 40s		The patch does not generate ASF License warnings.
		155m 52s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/7/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux c712c2fff536 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `299330d`
Default Java	Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/7/testReport/
Max. process+thread count	3646 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/7/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

Hexiaoqiao

LGTM. Thanks for your contribution. cc @zhangshuyan0 Would you mind reviewing again?

haiyang1987 · 2025-06-27T13:51:31Z

Thanks @hfutatzhanghb for the report.
We may have discussed a similar issue before in PR #5829.
I suggest taking a look at it. Thanks~

hfutatzhanghb · 2025-06-28T02:10:57Z

Thanks @hfutatzhanghb for the report. We may have discussed a similar issue before in PR #5829. I suggest taking a look at it. Thanks~

@haiyang1987 Thank you for the reminder. I will check it.

hfutatzhanghb · 2025-06-30T02:05:44Z

Hi @Hexiaoqiao @haiyang1987 @zhangshuyan0, sorry for disturbing you. I will close this PR since #5829 has fixed this problem and is already used in a production environment. We can try to push #5829 forward and merge that PR into trunk. Thank you all!

hadoop-yetus · 2025-07-01T09:36:29Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 22s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	5m 57s		Maven dependency ordering for branch
+1 💚	mvninstall	19m 18s		trunk passed
+1 💚	compile	2m 59s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	compile	2m 38s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	checkstyle	0m 47s		trunk passed
+1 💚	mvnsite	1m 16s		trunk passed
+1 💚	javadoc	1m 11s		trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 33s		trunk passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	2m 58s		trunk passed
+1 💚	shadedclient	21m 39s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 23s		Maven dependency ordering for patch
+1 💚	mvninstall	1m 3s		the patch passed
+1 💚	compile	2m 52s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javac	2m 52s		the patch passed
+1 💚	compile	2m 37s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	javac	2m 37s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 35s		the patch passed
+1 💚	mvnsite	1m 11s		the patch passed
+1 💚	javadoc	0m 57s		the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚	javadoc	1m 23s		the patch passed with JDK Private Build-1.8.0_452-8u452-ga~~us1-0ubuntu1~~20.04-b09
+1 💚	spotbugs	3m 1s		the patch passed
+1 💚	shadedclient	21m 33s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	2m 6s		hadoop-hdfs-client in the patch passed.
+1 💚	unit	55m 34s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 28s		The patch does not generate ASF License warnings.
		154m 19s

Subsystem	Report/Notes
Docker	ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/8/artifact/out/Dockerfile
GITHUB PR	#7762
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux 5da6a78a303c 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `299330d`
Default Java	Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-ga~us1-0ubuntu1~20.04-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/8/testReport/
Max. process+thread count	3779 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7762/8/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

github-actions bot added HDFS trunk labels Jun 25, 2025

hfutatzhanghb force-pushed the HDFS-17801 branch from b91d1b9 to 6c144bb Compare June 25, 2025 08:29

hfutatzhanghb force-pushed the HDFS-17801 branch from 6c144bb to f52be54 Compare June 25, 2025 12:56

hfutatzhanghb force-pushed the HDFS-17801 branch from f52be54 to bd73e6c Compare June 25, 2025 16:05

hfutatzhanghb force-pushed the HDFS-17801 branch from bd73e6c to b89b1d0 Compare June 26, 2025 02:43

hfutatzhanghb force-pushed the HDFS-17801 branch from b89b1d0 to a44f887 Compare June 26, 2025 05:37

HDFS-17801. EC: Reading support retryCurrentNode to avoid transient errors cause application level failures.

5fc7a04

KeeProMise requested a review from Copilot June 26, 2025 08:16

Copilot AI reviewed Jun 26, 2025

remove unused import.

299330d

hfutatzhanghb force-pushed the HDFS-17801 branch from a44f887 to 299330d Compare June 26, 2025 08:30

Hexiaoqiao reviewed Jun 27, 2025

hfutatzhanghb closed this Jun 30, 2025

hfutatzhanghb reopened this Jul 1, 2025

hfutatzhanghb closed this Jul 1, 2025
