@billierinaldi
Contributor

So far, I have tested this patch by opening an output stream, writing, and syncing (via hflush or hsync). Then I broke the lease on the file and tried to write and sync again to the original output stream, obtaining an expected exception. After that, I closed the file and verified that the file contained the data from the first write but not the second write.

@steveloughran
Contributor

hey billie, good to have you submitting code again. Looks like it doesn't merge as we've broken things already.

Which abfs endpoint did you run all the existing tests against?

Contributor

this seems to be a recurrent merge pain point: too many patches adding more things to every REST call.

Proposed: how about adding a RestOperationContext struct which gets passed down, leaseId would go in there, and later other stuff (statistics, trace context, etc) ?
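
A minimal sketch of what such a context object might look like, assuming an immutable value that carries the lease ID today and leaves room for other fields later. The class name follows the proposal above; every method name and the `traceId` field are hypothetical and not part of the ABFS client.

```java
import java.util.Optional;

/** Hypothetical per-request context; not part of the ABFS client API. */
final class RestOperationContext {
  private final String leaseId;  // null when the operation needs no lease
  private final String traceId;  // placeholder for a future trace context

  private RestOperationContext(String leaseId, String traceId) {
    this.leaseId = leaseId;
    this.traceId = traceId;
  }

  /** Context with nothing set; callers add only what they need. */
  static RestOperationContext empty() {
    return new RestOperationContext(null, null);
  }

  /** Returns a copy with the lease ID set; the original is unchanged. */
  RestOperationContext withLeaseId(String newLeaseId) {
    return new RestOperationContext(newLeaseId, traceId);
  }

  /** Returns a copy with the trace ID set; the original is unchanged. */
  RestOperationContext withTraceId(String newTraceId) {
    return new RestOperationContext(leaseId, newTraceId);
  }

  Optional<String> leaseId() {
    return Optional.ofNullable(leaseId);
  }

  Optional<String> traceId() {
    return Optional.ofNullable(traceId);
  }
}
```

With a single context parameter, a new per-request field would change this class rather than the signature of every REST call.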

Contributor

PathIOException with path; make error string a const to use when matching in tests. Consider also a new LeaseRequiredException if that helps testing
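
One way this suggestion could look, as a sketch only. To stay free of Hadoop dependencies it extends `IOException` rather than `PathIOException`, and the message text and accessor are hypothetical.

```java
import java.io.IOException;

/** Hypothetical exception; the real patch may name or structure this differently. */
class LeaseRequiredException extends IOException {
  private static final long serialVersionUID = 1L;

  /** Shared constant so tests match on this text instead of a copied literal. */
  static final String ERR_LEASE_REQUIRED = "A lease is required to write to this path";

  private final String path;

  LeaseRequiredException(String path) {
    // Include the path so the failure is diagnosable from the message alone.
    super(ERR_LEASE_REQUIRED + ": " + path);
    this.path = path;
  }

  String getPath() {
    return path;
  }
}
```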

Contributor

we are on SLF4J now

Contributor

switch to SLF4J logging style; include full stack @ debug level

Contributor

should be tied in to the FileSystem instance lifecycle too: an FS instance should really have a weak ref to all leases created under it, and fs.close to stop them all
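
A rough sketch of the lifecycle idea, assuming a registry owned by each FileSystem instance. `TrackedLease` and `LeaseRegistry` are hypothetical names; the weak set is built from JDK classes only.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.WeakHashMap;

/** Hypothetical lease handle; stands in for whatever the patch uses. */
class TrackedLease implements Closeable {
  private volatile boolean released;

  @Override
  public void close() {
    released = true;  // a real lease would call the release operation here
  }

  boolean isReleased() {
    return released;
  }
}

/** Hypothetical registry owned by one FileSystem instance. */
class LeaseRegistry implements Closeable {
  // Weak keys: a lease nobody else references can still be garbage collected.
  private final Set<TrackedLease> leases = Collections.synchronizedSet(
      Collections.newSetFromMap(new WeakHashMap<TrackedLease, Boolean>()));

  TrackedLease register(TrackedLease lease) {
    leases.add(lease);
    return lease;
  }

  /** Intended to be called from the owning FileSystem's close(). */
  @Override
  public void close() {
    List<TrackedLease> snapshot;
    synchronized (leases) {  // required when iterating a synchronized set
      snapshot = new ArrayList<>(leases);
    }
    for (TrackedLease lease : snapshot) {
      lease.close();
    }
    leases.clear();
  }
}
```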

@billierinaldi
Contributor Author

Thanks for the review, @steveloughran! I'll work on addressing your comments.

@snvijaya
Contributor

snvijaya commented Apr 6, 2020

As there are clients using ABFS accounts both with and without HNS enabled, we usually publish test results from both types of account. [Test config details: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/site/markdown/testing_azure.md#testing-the-azure-abfs-client]

When you next update the PR, could you also please add unit tests around SelfRenewingLease and the AbfsClient append and flush methods, covering both valid and invalid lease cases?

Contributor

@steveloughran left a comment

Looking good.

  • Afraid you will need to rebase...that might make those test failures go away
  • there's the usual 'declare your test endpoint' policy
  • and someone who understands the abfs client better than me will need to comment about the low level stuff.
    Lifecycle changes seem good though.

Looking in Guava, I see a ListeningScheduledExecutorService. Do we actually have to have 1 thread per lease, or would it be better to have an executor in the ABFS client, and each lease simply schedules work in there at the given rate?
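
The shared-executor alternative could be sketched as below. This uses the JDK's `ScheduledExecutorService` rather than Guava's listening variant, and `SharedLeaseRenewer` and its methods are hypothetical names.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical: one scheduler shared by every lease a client creates. */
class SharedLeaseRenewer implements AutoCloseable {
  private final ScheduledExecutorService scheduler;

  SharedLeaseRenewer(int threads) {
    this.scheduler = Executors.newScheduledThreadPool(threads);
  }

  /**
   * Schedules a renewal task at a fixed rate. The caller cancels the
   * returned future when the lease is released.
   */
  ScheduledFuture<?> scheduleRenewal(Runnable renewTask, long period, TimeUnit unit) {
    return scheduler.scheduleAtFixedRate(renewTask, period, period, unit);
  }

  @Override
  public void close() {
    scheduler.shutdownNow();  // stops all pending renewals for this client
  }
}
```

The thread count then depends on the pool size rather than on the number of open leases.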

Contributor

Prefer you use our normal RetryPolicy if possible

@billierinaldi
Contributor Author

Thanks for reviewing again, @steveloughran! I appreciate your helpful comments and will look into those.

Contributor

@steveloughran left a comment

just had a quick look @ this and made some more comments.

shall we try and get this in in 2020?

Contributor

how about using DurationInfo in the try with resources (logging @ debug) to track how long acquire/release took. I can imagine it can take a while to acquire. Indeed, do we have to worry about timeouts, heartbeats, etc?
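
The try-with-resources timing pattern can be illustrated with a JDK-only stand-in. `TimedOperation` below is a hypothetical class written for this sketch, not Hadoop's `DurationInfo`, and its log format is an assumption.

```java
import java.util.function.Consumer;

/** Stand-in for a DurationInfo-style timer, kept JDK-only for this sketch. */
class TimedOperation implements AutoCloseable {
  private final String name;
  private final Consumer<String> log;
  private final long startNanos = System.nanoTime();

  TimedOperation(String name, Consumer<String> log) {
    this.name = name;
    this.log = log;
    log.accept("Starting: " + name);
  }

  /** Runs when the try block exits, whether normally or by exception. */
  @Override
  public void close() {
    long millis = (System.nanoTime() - startNanos) / 1_000_000;
    log.accept(name + ": duration " + millis + " ms");
  }
}

/** Hypothetical caller showing where the acquire call would sit. */
class TimedOperationDemo {
  static void acquireLease(String path, Consumer<String> debugLog) {
    try (TimedOperation ignored = new TimedOperation("acquire lease on " + path, debugLog)) {
      // the lease acquire call would go here
    }
  }
}
```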

Contributor

the close methods in o.a.h.io.IOUtils do this

Contributor

GenericTestUtils lets you assert something is in the error message. It's critical to rethrow (maybe wrapped) the exception if it is not the one you were expecting

Contributor

and add a message to raise if the condition is met.

note, it's ok to use AssertJ for your asserts, we are adopting it more broadly and enjoying its diagnostics.

@billierinaldi
Contributor Author

I have only tested with HNS so far, so I am going to try testing without HNS now.

@steveloughran added the enhancement and fs/azure (changes related to azure; submitter must declare test endpoint) labels on Jan 15, 2021

Contributor

@steveloughran left a comment

looks good. some minor nits on imports &c

  • need some documentation, including the whole lease mechanism (how to use...)
  • I worry about raising exceptions on close(). A lot of code doesn't expect it. Is it just outputstream.close() when there is data buffered to be written? If so, that's ok

Contributor

Prefer you use HadoopExecutors here.

Contributor

HadoopExecutors.shutdown has some error handling here

Contributor

move to lower group

Contributor

use LambdaTestUtils; return a string with that error message in the closure for it to be used in the exception. Ideally add out.toString() too. eg.

intercept(IOException.class, ERR_LEASE_EXPIRED, () -> {
  out.write(1);
  out.hsync();
  return "expected exception but got " + out;
});
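
To show why the closure returns a string, here is a JDK-only sketch of an intercept-style helper. It is not Hadoop's LambdaTestUtils; `InterceptSketch` and its exact messages are hypothetical, but it follows the behaviour described above: the returned value is reported when no exception is raised, and an unexpected exception is kept as the cause.

```java
import java.util.concurrent.Callable;

/** JDK-only sketch of an intercept-style helper; not Hadoop's LambdaTestUtils. */
final class InterceptSketch {
  private InterceptSketch() {
  }

  static <E extends Throwable> E intercept(
      Class<E> expected, String contained, Callable<?> eval) {
    Object result;
    try {
      result = eval.call();
    } catch (Throwable t) {
      if (expected.isInstance(t) && String.valueOf(t).contains(contained)) {
        return expected.cast(t);  // the failure the test wanted
      }
      // Unexpected failure: keep the original as the cause so its stack survives.
      throw new AssertionError("Expected " + expected.getName()
          + " containing \"" + contained + "\" but got " + t, t);
    }
    // No exception at all: the closure's return value becomes the diagnostic.
    throw new AssertionError("Expected " + expected.getName() + ": " + result);
  }
}
```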

@billierinaldi
Contributor Author

I think I have addressed the last round of review comments, including adding some documentation on single writer directories to abfs.md and adding more informative javadocs to the lease methods in AzureBlobFileSystem. Let me know if there is anything unclear or needing more explanation in the docs.

I am still working on getting my environment set up to run dev-support/testrun-scripts/runtests.sh, so I don't have test results to post yet, but hopefully I will have those soon.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 32s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 58s trunk passed
+1 💚 compile 0m 39s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+1 💚 checkstyle 0m 28s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 shadedclient 16m 26s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 31s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 javadoc 0m 30s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+0 🆗 spotbugs 1m 0s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 58s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 30s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 javac 0m 30s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+1 💚 javac 0m 26s the patch passed
-0 ⚠️ checkstyle 0m 17s /diff-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 3 new + 9 unchanged - 0 fixed = 12 total (was 9)
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 15m 0s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 27s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 javadoc 0m 26s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+1 💚 findbugs 0m 59s the patch passed
_ Other Tests _
+1 💚 unit 1m 31s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
77m 27s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/6/artifact/out/Dockerfile
GITHUB PR #1925
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint
uname Linux f37994748d0b 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 97f843d
Default Java Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/6/testReport/
Max. process+thread count 607 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/6/console
versions git=2.17.1 maven=3.6.0 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran left a comment

this is coming along nicely

Contributor

I think I'd keep that text in the superclass text, in case a deep tree causes the nested cause not to be listed.

but: use toString() (implicitly) rather than getMessage(), because some exceptions (e.g. NPE) have a null message.
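
A small illustration of the point about null messages, using a hypothetical helper named `CauseText`. String concatenation calls `toString()` on the cause, which always includes the exception class name, whereas `getMessage()` on an exception created without a message returns null.

```java
/** Builds wrapper text that stays informative when the cause has no message. */
final class CauseText {
  private CauseText() {
  }

  static String describe(String context, Throwable cause) {
    // Concatenation invokes cause.toString(), which includes the class name.
    return context + ": " + cause;
  }

  static String describeWithGetMessage(String context, Throwable cause) {
    // For comparison only: yields "...: null" when the cause has no message.
    return context + ": " + cause.getMessage();
  }
}
```

For `new NullPointerException()`, the first form ends in `java.lang.NullPointerException` and the second in `null`.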

Contributor

just rethrow it or wrap in an assertion error. we need that full stack trace

Contributor

little architecture question. Would this be better in the Store than the FS? I don't know, and it is higher level than the Rest API, isn't it? Which implies this is the right place.

Contributor

@steveloughran left a comment

looking good. production code is done; just some minor nits about the tests being informative on future failures, which means: just rethrow exceptions when not expected, assertTrue/assertFalse to include some message about what failed. thanks

Contributor

good writeup

  1. is there any validation here, that if a path in the local FS is to be leased then the executor count must be >1?
  2. what if I'm working with >1 FS? Will this configuration be per-fs? Or does it take a list of paths which can be full URIs to paths in a store? That's what we ended up doing with s3a authoritative paths.

Contributor Author

> is there any validation here, that if a path in the local FS is to be leased then the executor count must be >1?

Yes, in SelfRenewingLease it throws an exception if there are < 1 lease threads.

> what if I'm working with >1 FS? Will this configuration be per-fs? Or does it take a list of paths which can be full URIs to paths in a store?

I believe the single writer dirs config accepts a list of full URIs -- I will double check -- but they all share the same pool of lease threads.

I am also looking into whether it makes sense to make the lease duration configurable. This would allow configuration of a finite or infinite lease duration, and in the infinite lease case we could avoid frequent calls to the Azure API to renew the lease. (For an infinite lease, if the client stopped without releasing the lease, the lease would have to be explicitly broken for a different writer to obtain a new lease on the file. It's a tradeoff, and I could imagine both finite and infinite lease options being useful.)

Contributor Author

Correction, single writer dirs accepts a list of paths, not URIs.

@steveloughran
Contributor

OK, I'm happy with this; test changes are in, and the next step is to merge and see what happens to people using the feature.

Billie: +1 from me; if you are happy with it yourself then merge into trunk at your leisure.

Contributor

Background threads that renew the lease at 67% of the lease duration (i.e. every 10 seconds for a 15-second lease and every 40 seconds for a 60-second lease) will add extra cost for customers.
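
The arithmetic behind this concern can be worked through in a few lines, assuming the 67% figure means two thirds of the lease duration rounded down to whole seconds. The class and method names are hypothetical.

```java
/** Worked arithmetic for renewal frequency; names are illustrative only. */
final class LeaseRenewalCost {
  private LeaseRenewalCost() {
  }

  /** Renewal interval at two thirds of the lease duration, in whole seconds. */
  static long renewalIntervalSeconds(long leaseDurationSeconds) {
    return leaseDurationSeconds * 2 / 3;
  }

  /** Renew calls issued per hour for one lease that stays open. */
  static long renewalsPerHour(long leaseDurationSeconds) {
    return 3600 / renewalIntervalSeconds(leaseDurationSeconds);
  }
}
```

Under that assumption a 15-second lease renews every 10 seconds, or 360 calls per hour per open file, and a 60-second lease renews every 40 seconds, or 90 calls per hour.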

Contributor Author

Thanks Sneha, I didn't realize the lease renewals would be charged as write ops. It might be a poor experience for a user to have unexpected charges related to this lease configuration.

Contributor

@snehavarma, Mar 15, 2021

Hey Billie just wanted to add that it might not be charged as write ops but instead come under other ops or metadata ops. Either way it will be extra cost just not as high as write transaction charges.

Contributor

Please check if infinite lease is sufficient for your use case.

Contributor Author

I think infinite leases are sufficient for my use case. I would be okay with removing lease renewal from this patch and leaving finite leases for future work in HADOOP-17590, but I am not sure what the best way to handle the configuration properties would be. It sounds like you are proposing a boolean fs.azure.write.enforcelease that would control whether lease ops are applied for all files, and all files would have the same finite lease duration, is that right? I am wondering how to make that work together with the fs.azure.singlewriter.directories property in this patch. Would we want to specify a special set of directories that uses infinite leases? Or do we need to figure out a way to specify lease duration for each directory that supports lease ops?

Contributor

@snehavarma, Mar 15, 2021

Yes, we should specify a special set of directories that use infinite leases. By default we would keep 60 seconds as the lease duration for all files.

> Or do we need to figure out a way to specify lease duration for each directory that supports lease ops?

This will not scale.

Contributor Author

So, does the property name fs.azure.singlewriter.directories still make sense, or should it be changed to something else such as fs.azure.infinite-lease-directories?

Contributor

Agree, fs.azure.singlewriter.directories would confuse the users. We can name it something else.
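
For readers following the property discussion, this is roughly how a directory-list setting could be matched against a file path. It assumes the value is a comma-separated list of absolute directory paths; the class name and matching rules are hypothetical and may differ from what the patch does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical matcher for a comma-separated list of leased directories. */
final class InfiniteLeaseDirs {
  private final List<String> dirs = new ArrayList<>();

  InfiniteLeaseDirs(String commaSeparated) {
    for (String raw : commaSeparated.split(",")) {
      String dir = raw.trim();
      if (dir.isEmpty()) {
        continue;  // tolerate empty values and stray commas
      }
      // Normalize to a trailing slash so "/logs" does not match "/logs2/file".
      dirs.add(dir.endsWith("/") ? dir : dir + "/");
    }
  }

  /** True if the path lies under any configured directory. */
  boolean covers(String path) {
    for (String dir : dirs) {
      if (path.startsWith(dir)) {
        return true;
      }
    }
    return false;
  }
}
```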

Contributor

Error handling needs to be added for cases where an append takes longer than the lease expiry, in case there is a finite lease.

Contributor

this code should not be required

Contributor Author

I'm not sure what you mean. Do you mean that we shouldn't have an acquireLease method in the store because leases will be acquired automatically?

Contributor

@snehavarma, Mar 16, 2021

Yes if the file is being created then infinite lease can be automatically taken, for other cases yes you may need the acquire lease code till we integrate bundling of lease with append.

RenewLease code might be something you can completely get rid of

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 0s Docker mode activated.
-1 ❌ patch 0m 20s #1925 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.
Subsystem Report/Notes
GITHUB PR #1925
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/9/console
versions git=2.17.1
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 59s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 44s trunk passed
+1 💚 compile 0m 40s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 35s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 checkstyle 0m 27s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 33s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 30s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 1s trunk passed
+1 💚 shadedclient 14m 5s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 14m 24s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 30s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 30s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 javac 0m 27s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 18s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 2 new + 9 unchanged - 0 fixed = 11 total (was 9)
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 21s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 0s the patch passed
+1 💚 shadedclient 13m 56s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 56s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
73m 54s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/10/artifact/out/Dockerfile
GITHUB PR #1925
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint
uname Linux 82ac18194489 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 822615e
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/10/testReport/
Max. process+thread count 697 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/10/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

DefaultValue = DEFAULT_FS_AZURE_INFINITE_LEASE_DIRECTORIES)
private String azureInfiniteLeaseDirs;

@IntegerConfigurationValidatorAnnotation(ConfigurationKey = FS_AZURE_LEASE_THREADS,
Contributor

@snehavarma, Mar 17, 2021

Do we need these? i.e. Lease threads

Contributor Author

I think it will still be useful to issue the acquire and release operations in a thread pool for now. Possibly this could be removed if all acquire and release operations are moved into create and flush-with-close in the future.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 29s trunk passed
+1 💚 compile 0m 38s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 checkstyle 0m 28s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 32s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 29s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 0s trunk passed
+1 💚 shadedclient 14m 6s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 14m 24s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 30s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 30s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
+1 💚 checkstyle 0m 17s the patch passed
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 23s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 21s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 0s the patch passed
+1 💚 shadedclient 13m 52s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 56s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
73m 21s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/11/artifact/out/Dockerfile
GITHUB PR #1925
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint
uname Linux 5e2ba26844f8 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 9fc4f08
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/11/testReport/
Max. process+thread count 618 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/11/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

DefaultValue = DEFAULT_FS_AZURE_APPEND_BLOB_DIRECTORIES)
private String azureAppendBlobDirs;

@StringConfigurationValidatorAnnotation(ConfigurationKey = FS_AZURE_INFINITE_LEASE_KEY,
Contributor

Is the feature for both namespace and flatnamespace enabled accounts?

Contributor Author

I have run the unit test with HNS and flat namespace storage accounts, so I think it will work. I have not done extensive testing with HNS disabled, however.

@apache apache deleted a comment from hadoop-yetus Mar 21, 2021

Contributor

@steveloughran left a comment

LGTM. One question about the while() loop during lease acquisition: is a busy wait really the right approach here? I think you should use Thread.yield() if you aren't going to switch to any of the classic concurrency classes.

LEASE_ACQUIRE_MAX_RETRIES, LEASE_ACQUIRE_RETRY_INTERVAL, TimeUnit.SECONDS);
acquireLease(retryPolicy, 0, 0);

while (leaseID == null && exception == null) {
Contributor

This looks like a CPU-heavy loop. I know it makes for a more responsive app, but it's a busy wait. Any way to replace it with some concurrency class?

Contributor Author

Good point. We will have the Future at that point, so we could wait for it to complete.
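
Waiting on the Future could look roughly like this. It is a sketch under the assumption that the acquire task yields the lease ID as a `String`; `LeaseWait`, the timeout parameters, and the error messages are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper that blocks on the acquire task instead of spinning. */
final class LeaseWait {
  private LeaseWait() {
  }

  static String awaitLeaseId(Future<String> acquire, long timeout, TimeUnit unit)
      throws IOException {
    try {
      // The thread parks here until the task completes; no CPU is burned.
      return acquire.get(timeout, unit);
    } catch (ExecutionException e) {
      // Surface the task's own failure as the cause.
      throw new IOException("Failed to acquire lease: " + e.getCause(), e.getCause());
    } catch (TimeoutException e) {
      acquire.cancel(true);
      throw new IOException("Timed out waiting for lease", e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt for callers
      throw new IOException("Interrupted while waiting for lease", e);
    }
  }
}
```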

Contributor Author

I pushed a new change to address this. I am also looking into figuring out if I can mock an acquire lease failure to test this out a bit better.

try {
if (RetryPolicy.RetryAction.RetryDecision.RETRY
== retryPolicy.shouldRetry(null, numRetries, 0, true).action) {
LOG.debug("Failed acquire lease on {}, retrying: {}", path, throwable);
Contributor

Failed to

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 0s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 59s trunk passed
+1 💚 compile 0m 38s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 checkstyle 0m 27s trunk passed
+1 💚 mvnsite 0m 38s trunk passed
+1 💚 javadoc 0m 32s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 30s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 0m 58s trunk passed
+1 💚 shadedclient 14m 19s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 14m 37s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 29s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
+1 💚 mvnsite 0m 28s the patch passed
+1 💚 javadoc 0m 22s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 20s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 0m 58s the patch passed
+1 💚 shadedclient 13m 41s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 57s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
74m 0s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/12/artifact/out/Dockerfile
GITHUB PR #1925
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint
uname Linux 30dff569d868 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / b6803cf
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/12/testReport/
Max. process+thread count 681 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/12/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 2s codespell was not available.
+0 🆗 markdownlint 0m 2s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 33m 32s trunk passed
+1 💚 compile 0m 38s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 32s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 checkstyle 0m 26s trunk passed
+1 💚 mvnsite 0m 37s trunk passed
+1 💚 javadoc 0m 33s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 28s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 0s trunk passed
+1 💚 shadedclient 14m 26s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 14m 44s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 31s the patch passed
+1 💚 compile 0m 30s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 30s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 17s the patch passed
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 20s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 3s the patch passed
+1 💚 shadedclient 14m 9s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 57s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 29s The patch does not generate ASF License warnings.
75m 13s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/13/artifact/out/Dockerfile
GITHUB PR #1925
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint
uname Linux d7a6ea0529b7 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 4fdfc08
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/13/testReport/
Max. process+thread count 718 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1925/13/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

} catch (Exception e) {
  LOG.debug("Got exception waiting for acquire lease future. Checking if lease ID or "
      + "exception have been set", e);
}
Contributor

if this raises an exception, is there any way the while loop will exit?

Contributor Author

Yes. In this section it retries while the retry count is below the max retries, and otherwise sets the exception variable. So once the max retries are reached, either a lease ID or an exception has been set, and the while loop exits. In the unit test, I mocked two failures followed by a success, as well as a persistent failure, and verified the correct behavior in both cases.
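The termination argument above can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: the class `LeaseAcquireSketch`, the `Result` holder, and the helper `failingFirst` are hypothetical names, and the acquire operation runs synchronously here rather than through an executor and a future. It only shows why a loop guarded by "no lease ID and no exception yet" must exit once the retry budget is spent.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a bounded-retry acquire loop; not the ABFS lease class. */
public class LeaseAcquireSketch {

  /** Outcome holder: exactly one of leaseId / failure is non-null when the loop ends. */
  public static final class Result {
    public final String leaseId;
    public final Exception failure;
    public final int attempts;

    Result(String leaseId, Exception failure, int attempts) {
      this.leaseId = leaseId;
      this.failure = failure;
      this.attempts = attempts;
    }
  }

  /** Calls acquireOp until it yields a lease ID or the retry budget is exhausted. */
  public static Result acquire(Callable<String> acquireOp, int maxRetries) {
    String leaseId = null;
    Exception failure = null;
    int retries = 0;
    int attempts = 0;
    // The loop exits as soon as either variable has been set.
    while (leaseId == null && failure == null) {
      attempts++;
      try {
        leaseId = acquireOp.call();
        if (leaseId == null) {
          // Guard against an operation that returns nothing without throwing.
          failure = new IllegalStateException("acquire returned no lease ID");
        }
      } catch (Exception e) {
        if (retries < maxRetries) {
          retries++;      // retry budget left: go around the loop again
        } else {
          failure = e;    // budget exhausted: record the failure so the loop exits
        }
      }
    }
    return new Result(leaseId, failure, attempts);
  }

  /** Test helper: an operation that fails its first {@code failures} calls, then succeeds. */
  public static Callable<String> failingFirst(int failures) {
    AtomicInteger calls = new AtomicInteger();
    return () -> {
      if (calls.getAndIncrement() < failures) {
        throw new IOException("simulated acquire failure");
      }
      return "lease-1";
    };
  }

  public static void main(String[] args) {
    Result ok = acquire(failingFirst(2), 3);
    System.out.println(ok.leaseId + " after " + ok.attempts + " attempts");

    Result bad = acquire(failingFirst(Integer.MAX_VALUE), 3);
    System.out.println("failed after " + bad.attempts + " attempts: "
        + bad.failure.getMessage());
  }
}
```

With a budget of three retries, two failures followed by a success return a lease ID on the third attempt, while a persistent failure ends on the fourth attempt with the exception recorded. These are the same two cases the unit test described above exercises.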

Contributor

understood

@steveloughran
Contributor

@billierinaldi, I'm happy with this. There may be some surprises once you go live, but there's nothing obvious to me right now.

+1.

Merge when ready, either from the button or the terminal. If you plan to backport to the 3.3.x line, cherry-pick into branch-3.3 and do a new test run. Thanks.

billierinaldi merged commit c1fde4f into apache:trunk Apr 12, 2021
billierinaldi deleted the abfs-lease-ops branch April 12, 2021 23:48
billierinaldi added a commit that referenced this pull request Apr 20, 2021
kiran-maturi pushed a commit to kiran-maturi/hadoop that referenced this pull request Nov 24, 2021
* HADOOP-16948. Support single writer dirs.

* HADOOP-16948. Fix findbugs and checkstyle problems.

* HADOOP-16948. Fix remaining checkstyle problems.

* HADOOP-16948. Add DurationInfo, retry policy for acquiring lease, and javadocs

* HADOOP-16948. Convert ABFS client to use an executor for lease ops

* HADOOP-16948. Fix ABFS lease test for non-HNS

* HADOOP-16948. Fix checkstyle and javadoc

* HADOOP-16948. Address review comments

* HADOOP-16948. Use daemon threads for ABFS lease ops

* HADOOP-16948. Make lease duration configurable

* HADOOP-16948. Add error messages to test assertions

* HADOOP-16948. Remove extra isSingleWriterKey call

* HADOOP-16948. Use only infinite lease duration due to cost of renewal ops

* HADOOP-16948. Remove acquire/renew/release lease methods

* HADOOP-16948. Rename single writer dirs to infinite lease dirs

* HADOOP-16948. Fix checkstyle

* HADOOP-16948. Wait for acquire lease future

* HADOOP-16948. Add unit test for acquire lease failure
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
).

Contributed by Billie Rinaldi.

(cherry picked from commit c1fde4f)

Change-Id: I10bd8ff89649ced3498693c1386da29dc3cc8f40

Labels

enhancement fs/azure changes related to azure; submitter must declare test endpoint

5 participants