Add client-side async API for replication (IAsyncReplicator, ReplicationClient) #1182

NehanPathan · 2025-09-10T10:57:10Z

@paulirwin @NightOwl888

Fixes:
Fixes #1181

Summary of the changes:
Implemented IAsyncReplicator and async methods in ReplicationClient for non-blocking replication operations.

Description

This PR introduces an async Task-based API for the replication client, allowing operations such as checking for updates, obtaining files, and releasing sessions to be executed asynchronously.

Key changes:

Added IAsyncReplicator interface.
Updated HttpReplicator to implement async versions of the operations (CheckForUpdateAsync, ObtainFileAsync, ReleaseAsync, PublishAsync).
Added async helper methods in HttpReplicator to wrap HTTP requests using HttpClient.
ReplicationClient now exposes async methods that call into IAsyncReplicator.

This avoids synchronous HTTP calls that could deadlock or cause performance issues, while keeping IReplicationHandler synchronous, as Lucene.NET APIs currently do not have async equivalents.

Additional context:
This implementation has been tested and works successfully when using UpdateNowAsync in the GSoC project by referencing the Lucene.NET repository in the GSoC extensions project.

…lient-side async API, related to #XXXX)

paulirwin

Thanks for the PR! I have some changes that need to be made, and a larger one about a refactoring that is up for discussion. Please share your thoughts on the refactoring before embarking on it!

src/Lucene.Net.Replicator/Http/HttpClientBase.cs

paulirwin · 2025-09-10T12:41:21Z

src/Lucene.Net.Replicator/Http/HttpReplicator.cs

+        /// <returns>
+        /// A <see cref="SessionToken"/> if updates are available; otherwise, <c>null</c>.
+        /// </returns>
+        public async Task<SessionToken?> CheckForUpdateAsync(string currentVersion, CancellationToken cancellationToken = default)


Please add #nullable enable to the top of this file to resolve these warnings, and then if that exposes other nullability issues in the file, please fix them.

paulirwin · 2025-09-10T12:46:15Z

src/Lucene.Net.Replicator/IAsyncReplicator.cs

+        /// </summary>
+        /// <param name="currentVersion">Current version of the index.</param>
+        /// <param name="cancellationToken">Cancellation token.</param>
+        Task<SessionToken?> CheckForUpdateAsync(string currentVersion, CancellationToken cancellationToken = default);


Add #nullable enable to this file to resolve these warnings.

paulirwin · 2025-09-10T12:46:46Z

src/Lucene.Net.Replicator/ReplicationClient.cs

+            if (asyncReplicator is null)
+                throw new InvalidOperationException("AsyncReplicator not initialized.");
+
+            SessionToken? session = null;


Please add #nullable enable to this file to resolve these warnings, and fix any nullability issues that might arise as a result.

@paulirwin @NightOwl888
can you pls suggest better way to adjust whole file after adding #nullable enable
beacause once i add that and try to resolve fields
it actually 2x or 3x my warnings after every build ie from 32 -> 84 -> 153 warnings

@NehanPathan

#nullable enable can be turned off again using #nullable restore. While it is preferable to solve all of the nullable issues rather than ignore them, this at least lets us put them off until they can be considered separately from ongoing work. I would highly recommend that the new code we are adding is fenced between #nullable enable and #nullable restore blocks, which will save us the work of having to do it all later.

paulirwin · 2025-09-10T12:52:21Z

src/Lucene.Net.Replicator/ReplicationClient.cs

+        /// <param name="handler"></param>
+        /// <param name="factory"></param>
+        /// <exception cref="ArgumentNullException"></exception>
+        public ReplicationClient(IAsyncReplicator asyncReplicator, IReplicationHandler handler, ISourceDirectoryFactory factory)


So, this is a brittle design. It means that you have to know which constructor was used in order to know which methods to call and not get an exception at runtime. I think we should break this type apart.

Let me know your thoughts on this:

Refactor out a common ReplicationClientBase abstract class with any common methods/fields

Make this type inherit from the base class. Leave // LUCENENET: comments where methods were refactored into the base class, to aid future porting efforts. Move the async methods into item 3 below.

Introduce a new AsyncReplicationClient that inherits from ReplicationClientBase, and only has the async methods. This way you can know that it will have an async replicator field.

Thoughts?

I suppose it is unlikely a user will require both async and synchronous methods at the same time. And it would clarify the StartAsyncUpdateLoop vs StartUpdateThread() logic. Although, I think both of those could be potentially combined.

I am hesitant to get on board with having completely separate implementations, though. That is not typically how concrete components in the BCL evolve. Usually, the synchronous and asynchronous methods exist side by side.

agree with @NightOwl888 suggestion
I guess we should stick with one ReplicationClient that implements both IReplicator and IAsyncReplicator

initially maintaining or testing it quite seems tough
but i guess our future target is asyn only
So may be
what we can do rightnow
Mark sync APIs with [Obsolete("Prefer async methods to avoid deadlocks.")] to gently guide users toward async.
Long term: async becomes the default; sync kept only for backward compatibility.

That said, I’m also flexible — if you both feel strongly about splitting into separate sync/async clients right now to minimize maintenance and testing overhead, I can adapt to that approach too.

Mark sync APIs with [Obsolete("Prefer async methods to avoid deadlocks.")] to gently guide users toward async.
Long term: async becomes the default; sync kept only for backward compatibility.

No, synchronous programming isn't going away any time soon. It is just in the space of HTTP servers, that it is no longer considered a good practice. Also, keep in mind that when debugging we will be comparing our implementation against the synchronous Java code. So, we should try to minimize changes to these methods.

ReplicationClient is not sealed. It is designed to be inherited to provide additional functionality. So, in that regard, it is the abstraction and can provide the base implementation of both the synchronous and asynchronous sides. The question is since ReplicationClient can already be mocked by inheritance, do we need to add interfaces?

Thanks for clarifying! You’re right — marking sync methods [Obsolete] was premature. Sync APIs are still important for backward compatibility and debugging against Java. Since ReplicationClient can already be extended and mocked via inheritance, we don’t need separate interfaces for now.

paulirwin · 2025-09-10T12:58:32Z

src/Lucene.Net.Replicator/ReplicationClient.cs

+            EnsureOpen();
+
+            // Acquire the same update lock to prevent concurrent updates
+            updateLock.Lock();


Question for @NightOwl888: is this async-safe?

According to ChatGPT, no this is not. Calling async code while using a synchronous lock can lead to deadlocks.

Do note that in other places where the call is to Task.Run(), this is fine to do.

Options

Use SemaphoreSlim instead of ReentrantLock. Note that SemaphoreSlim is not re-entrant, though.

Safe across await.

Prevents reentrant calls, which may actually be a feature (forces clear lock discipline).

But if you really need recursion/reentrancy, this won’t do.

Implement an async-compatible reentrant lock

This is tricky but possible.

You’d track:

Owning Thread or Task.Id

A recursion count

A SemaphoreSlim for waiting

On EnterAsync, if the current context already owns the lock, increment recursion count and continue immediately. Otherwise, wait.

On Exit, decrement recursion count and release only when it reaches 0.

Using SemaphoreSlim is certainly easier, but I have no idea whether reentrancy is actually required by the design, or if that is just something that happened by accident because ReentrantLock was the primitive that was chosen for synchronization. It would be possible to refactor the design to provide protected methods that are not protected by SemaphoreSlim so subclasses can avoid reentrancy. It would be harder to implement subclasses, but possible. I am not sure whether that covers general callers, though.

In the long run, implementing an AsyncReentrantLock could prove to be very useful for converting other parts of Lucene to be async, and would cause less pain for places where it is unknown whether reentrancy is a requirement.

Here is what such an implementation could look like. Note that I haven't tested this, it is straight out of ChatGPT. There may be a smarter way to do it than using a lock, but if we keep it, we would need to change lock (this) to UninterruptableMonitor.Enter() and UninterruptableMonitor.Exit() in a try/catch, to ensure it doesn't throw on thread interrupt.

AsyncReentrantLock - click to expand

using System; using System.Collections.Concurrent; using System.Collections.Generic; using System.Threading; using System.Threading.Tasks; /// <summary> /// A lightweight async-compatible reentrant lock. /// </summary> public sealed class AsyncReentrantLock { private readonly SemaphoreSlim _semaphore = new(1, 1); // Owner + recursion count private int? _ownerThreadId; private int _recursionCount; // Track waiting requests private int _waitingCount; /// <summary> Number of threads/tasks currently waiting for the lock. </summary> public int WaitingCount => Volatile.Read(ref _waitingCount); /// <summary> True if the lock is currently held. </summary> public bool IsLocked => _ownerThreadId != null; /// <summary> Thread ID of the owner, or null if none. </summary> public int? OwnerThreadId => _ownerThreadId; /// <summary> Recursion depth of the current owner. </summary> public int RecursionCount => _recursionCount; // ----------------------- // Sync API // ----------------------- public void Lock(CancellationToken cancellationToken = default) { int currentId = Thread.CurrentThread.ManagedThreadId; lock (this) { if (_ownerThreadId == currentId) { _recursionCount++; return; } } Interlocked.Increment(ref _waitingCount); try { _semaphore.Wait(cancellationToken); } finally { Interlocked.Decrement(ref _waitingCount); } lock (this) { _ownerThreadId = currentId; _recursionCount = 1; } } // ----------------------- // Async API // ----------------------- public async Task LockAsync(CancellationToken cancellationToken = default) { int currentId = Thread.CurrentThread.ManagedThreadId; lock (this) { if (_ownerThreadId == currentId) { _recursionCount++; return; } } Interlocked.Increment(ref _waitingCount); try { await _semaphore.WaitAsync(cancellationToken).ConfigureAwait(false); } finally { Interlocked.Decrement(ref _waitingCount); } lock (this) { _ownerThreadId = currentId; _recursionCount = 1; } } // ----------------------- // Unlock // ----------------------- public void Unlock() { int currentId = Thread.CurrentThread.ManagedThreadId; lock (this) { if (_ownerThreadId != currentId) throw new SynchronizationLockException("Current thread does not own the lock."); _recursionCount--; if (_recursionCount == 0) { _ownerThreadId = null; _semaphore.Release(); } } } }

AsyncReentrantLockTests - click to expand

using NUnit.Framework; using System; using System.Threading; using System.Threading.Tasks; [TestFixture] public class AsyncReentrantLockTests { [Test] public void SyncLock_IsReentrant() { var l = new AsyncReentrantLock(); l.Lock(); Assert.That(l.IsLocked, Is.True); Assert.That(l.RecursionCount, Is.EqualTo(1)); l.Lock(); Assert.That(l.RecursionCount, Is.EqualTo(2)); l.Unlock(); Assert.That(l.RecursionCount, Is.EqualTo(1)); l.Unlock(); Assert.That(l.IsLocked, Is.False); } [Test] public async Task AsyncLock_IsReentrant() { var l = new AsyncReentrantLock(); await l.LockAsync(); Assert.That(l.IsLocked, Is.True); await l.LockAsync(); Assert.That(l.RecursionCount, Is.EqualTo(2)); l.Unlock(); Assert.That(l.RecursionCount, Is.EqualTo(1)); l.Unlock(); Assert.That(l.IsLocked, Is.False); } [Test] public void ThrowsIfUnlockedByOtherThread() { var l = new AsyncReentrantLock(); l.Lock(); var ex = Assert.Throws<SynchronizationLockException>(() => { var t = new Thread(() => l.Unlock()); t.Start(); t.Join(); }); Assert.That(ex.Message, Does.Contain("does not own")); } [Test] public async Task LockAsync_BlocksUntilReleased() { var l = new AsyncReentrantLock(); l.Lock(); var task = Task.Run(async () => { await Task.Delay(100); l.Unlock(); }); await l.LockAsync(); // should block until Unlock is called Assert.That(l.IsLocked, Is.True); l.Unlock(); await task; } [Test] public async Task WaitingCount_IsAccurate() { var l = new AsyncReentrantLock(); l.Lock(); var t1 = l.LockAsync(); var t2 = l.LockAsync(); await Task.Delay(50); // let them enqueue Assert.That(l.WaitingCount, Is.EqualTo(2)); l.Unlock(); // release once await Task.WhenAll(t1, t2); l.Unlock(); // release final owner } }

Yes, my bad — I was just trying to stay aligned with the sync version at that time.

But after looking more carefully, I think we don’t actually need ReentrantLock for the sync version either. From what I see, inside doUpdate or doUpdateAsync there aren’t any reentrant scenarios, so it should be safe to use SemaphoreSlim in both cases.

@NightOwl888 @paulirwin
Please correct me if I’m missing something or if there are still chances of reentrancy in the sync version.

NightOwl888 · 2025-09-10T19:19:59Z

src/Lucene.Net.Replicator/ReplicationClient.cs

+        /// <param name="handler"></param>
+        /// <param name="factory"></param>
+        /// <exception cref="ArgumentNullException"></exception>
+        public ReplicationClient(IAsyncReplicator asyncReplicator, IReplicationHandler handler, ISourceDirectoryFactory factory)


I suppose it is unlikely a user will require both async and synchronous methods at the same time. And it would clarify the StartAsyncUpdateLoop vs StartUpdateThread() logic. Although, I think both of those could be potentially combined.

I am hesitant to get on board with having completely separate implementations, though. That is not typically how concrete components in the BCL evolve. Usually, the synchronous and asynchronous methods exist side by side.

NightOwl888 · 2025-09-10T20:01:20Z

src/Lucene.Net.Replicator/ReplicationClient.cs

+            EnsureOpen();
+
+            // Acquire the same update lock to prevent concurrent updates
+            updateLock.Lock();


According to ChatGPT, no this is not. Calling async code while using a synchronous lock can lead to deadlocks.

Do note that in other places where the call is to Task.Run(), this is fine to do.

Options

Use SemaphoreSlim instead of ReentrantLock. Note that SemaphoreSlim is not re-entrant, though.

Safe across await.

Prevents reentrant calls, which may actually be a feature (forces clear lock discipline).

But if you really need recursion/reentrancy, this won’t do.

Implement an async-compatible reentrant lock

This is tricky but possible.

You’d track:

Owning Thread or Task.Id

A recursion count

A SemaphoreSlim for waiting

On EnterAsync, if the current context already owns the lock, increment recursion count and continue immediately. Otherwise, wait.

On Exit, decrement recursion count and release only when it reaches 0.

Using SemaphoreSlim is certainly easier, but I have no idea whether reentrancy is actually required by the design, or if that is just something that happened by accident because ReentrantLock was the primitive that was chosen for synchronization. It would be possible to refactor the design to provide protected methods that are not protected by SemaphoreSlim so subclasses can avoid reentrancy. It would be harder to implement subclasses, but possible. I am not sure whether that covers general callers, though.

In the long run, implementing an AsyncReentrantLock could prove to be very useful for converting other parts of Lucene to be async, and would cause less pain for places where it is unknown whether reentrancy is a requirement.

Here is what such an implementation could look like. Note that I haven't tested this, it is straight out of ChatGPT. There may be a smarter way to do it than using a lock, but if we keep it, we would need to change lock (this) to UninterruptableMonitor.Enter() and UninterruptableMonitor.Exit() in a try/catch, to ensure it doesn't throw on thread interrupt.

AsyncReentrantLock - click to expand

using System; using System.Collections.Concurrent; using System.Collections.Generic; using System.Threading; using System.Threading.Tasks; /// <summary> /// A lightweight async-compatible reentrant lock. /// </summary> public sealed class AsyncReentrantLock { private readonly SemaphoreSlim _semaphore = new(1, 1); // Owner + recursion count private int? _ownerThreadId; private int _recursionCount; // Track waiting requests private int _waitingCount; /// <summary> Number of threads/tasks currently waiting for the lock. </summary> public int WaitingCount => Volatile.Read(ref _waitingCount); /// <summary> True if the lock is currently held. </summary> public bool IsLocked => _ownerThreadId != null; /// <summary> Thread ID of the owner, or null if none. </summary> public int? OwnerThreadId => _ownerThreadId; /// <summary> Recursion depth of the current owner. </summary> public int RecursionCount => _recursionCount; // ----------------------- // Sync API // ----------------------- public void Lock(CancellationToken cancellationToken = default) { int currentId = Thread.CurrentThread.ManagedThreadId; lock (this) { if (_ownerThreadId == currentId) { _recursionCount++; return; } } Interlocked.Increment(ref _waitingCount); try { _semaphore.Wait(cancellationToken); } finally { Interlocked.Decrement(ref _waitingCount); } lock (this) { _ownerThreadId = currentId; _recursionCount = 1; } } // ----------------------- // Async API // ----------------------- public async Task LockAsync(CancellationToken cancellationToken = default) { int currentId = Thread.CurrentThread.ManagedThreadId; lock (this) { if (_ownerThreadId == currentId) { _recursionCount++; return; } } Interlocked.Increment(ref _waitingCount); try { await _semaphore.WaitAsync(cancellationToken).ConfigureAwait(false); } finally { Interlocked.Decrement(ref _waitingCount); } lock (this) { _ownerThreadId = currentId; _recursionCount = 1; } } // ----------------------- // Unlock // ----------------------- public void Unlock() { int currentId = Thread.CurrentThread.ManagedThreadId; lock (this) { if (_ownerThreadId != currentId) throw new SynchronizationLockException("Current thread does not own the lock."); _recursionCount--; if (_recursionCount == 0) { _ownerThreadId = null; _semaphore.Release(); } } } }

AsyncReentrantLockTests - click to expand

using NUnit.Framework; using System; using System.Threading; using System.Threading.Tasks; [TestFixture] public class AsyncReentrantLockTests { [Test] public void SyncLock_IsReentrant() { var l = new AsyncReentrantLock(); l.Lock(); Assert.That(l.IsLocked, Is.True); Assert.That(l.RecursionCount, Is.EqualTo(1)); l.Lock(); Assert.That(l.RecursionCount, Is.EqualTo(2)); l.Unlock(); Assert.That(l.RecursionCount, Is.EqualTo(1)); l.Unlock(); Assert.That(l.IsLocked, Is.False); } [Test] public async Task AsyncLock_IsReentrant() { var l = new AsyncReentrantLock(); await l.LockAsync(); Assert.That(l.IsLocked, Is.True); await l.LockAsync(); Assert.That(l.RecursionCount, Is.EqualTo(2)); l.Unlock(); Assert.That(l.RecursionCount, Is.EqualTo(1)); l.Unlock(); Assert.That(l.IsLocked, Is.False); } [Test] public void ThrowsIfUnlockedByOtherThread() { var l = new AsyncReentrantLock(); l.Lock(); var ex = Assert.Throws<SynchronizationLockException>(() => { var t = new Thread(() => l.Unlock()); t.Start(); t.Join(); }); Assert.That(ex.Message, Does.Contain("does not own")); } [Test] public async Task LockAsync_BlocksUntilReleased() { var l = new AsyncReentrantLock(); l.Lock(); var task = Task.Run(async () => { await Task.Delay(100); l.Unlock(); }); await l.LockAsync(); // should block until Unlock is called Assert.That(l.IsLocked, Is.True); l.Unlock(); await task; } [Test] public async Task WaitingCount_IsAccurate() { var l = new AsyncReentrantLock(); l.Lock(); var t1 = l.LockAsync(); var t2 = l.LockAsync(); await Task.Delay(50); // let them enqueue Assert.That(l.WaitingCount, Is.EqualTo(2)); l.Unlock(); // release once await Task.WhenAll(t1, t2); l.Unlock(); // release final owner } }

src/Lucene.Net.Replicator/Http/HttpClientBase.cs

…safe); fix all warnings and nullable issues

…REAM_CANCELLATIONTOKEN defines for .NET 8+ and fix all warnings

NehanPathan · 2025-09-11T10:19:46Z

@paulirwin @NightOwl888
I’ve tried to cover most of the suggested changes, and the nullable warnings have also been resolved.
Please let me know if I made any mistakes — I apologize in advance, as there were quite a few nullable warnings to address.

NehanPathan · 2025-10-06T15:26:49Z

@NightOwl888
2 weeks ago, I accidentally merged the master branch into feature/async-replicator-client.
I’ve performed a force push on this branch to remove the merge commit from apache:master.

Reason: The merge brought in changes from master that we do not want in this feature branch. All intended feature commits are still intact.

This keeps the branch history clean and focused on the feature work.

If any changes from master are important to include, please let me know, and we can merge them properly.

NehanPathan · 2025-10-06T15:46:34Z

I’m seeing the check-editorconfig CI fail due to a “final newline expected” error. Locally, git diff --check doesn’t show any issues, so I suspect it might be related to line endings (CRLF vs LF) or some subtle trailing whitespace.

Could you please guide me on the safest way to fix this, such as how to identify files with extra trailing whitespace or missing newlines, so the CI passes without affecting other files?

NightOwl888 · 2025-10-06T15:51:02Z

@NehanPathan

That error means that after all of the content in the file, there is no newline character. So, it just needs to be added.

I know it is a bit confusing - remove all trailing whitespace except at the end of the file, add a line break.

Add IAsyncReplicator interface and async support in HttpReplicator (c…

e40254c

…lient-side async API, related to #XXXX)

paulirwin self-requested a review September 10, 2025 12:41

paulirwin requested changes Sep 10, 2025

View reviewed changes

NightOwl888 requested changes Sep 10, 2025

View reviewed changes

NehanPathan added 2 commits September 11, 2025 15:38

Replace ReentrantLock with SemaphoreSlim in ReplicationClient (async-…

298f041

…safe); fix all warnings and nullable issues

Add FEATURE_HTTPCONTENT_READASSTREAM and FEATURE_HTTPCONTENT_READASST…

8135524

…REAM_CANCELLATIONTOKEN defines for .NET 8+ and fix all warnings

NehanPathan force-pushed the feature/async-replicator-client branch from 25f65ff to 8135524 Compare October 6, 2025 15:19

remove extra trailing

42e3d32

NehanPathan mentioned this pull request Oct 6, 2025

feat: Optimize SmartCn Dictionaries and Add Dictionary Loading Tests #1154

Merged

4 tasks

file fix: add final newline and remove trailing whitespace

f80c97b

paulirwin mentioned this pull request Oct 27, 2025

Change HttpClientBase.ExecutePost to match upstream better and drop Newtonsoft.Json dependency #1212

Merged

4 tasks

paulirwin self-assigned this Oct 27, 2025

paulirwin added the notes:improvement An enhancement to an existing feature label Oct 27, 2025

Add client-side async API for replication (IAsyncReplicator, ReplicationClient) #1182

Are you sure you want to change the base?

Add client-side async API for replication (IAsyncReplicator, ReplicationClient) #1182

Uh oh!

Conversation

NehanPathan commented Sep 10, 2025

Description

Uh oh!

paulirwin left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Options

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Options

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

NehanPathan commented Sep 11, 2025

Uh oh!

NehanPathan commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NehanPathan commented Oct 6, 2025

Uh oh!

NightOwl888 commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

NehanPathan commented Oct 6, 2025 •

edited

Loading

NightOwl888 commented Oct 6, 2025 •

edited

Loading