Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a GetChunks extension method for batches #230

Merged
merged 2 commits into from
Feb 8, 2022

Conversation

juchom
Copy link
Contributor

@juchom juchom commented Jan 31, 2022

Pull Request

What does this PR do?

This pull request is here to make code a little bit easier to follow for batches.

For exemple if we look at AddDocumentsInBatchesAsync we have this two piece of code.

public async Task<IEnumerable<TaskInfo>> AddDocumentsInBatchesAsync<T>(IEnumerable<T> documents, int batchSize = 1000, string primaryKey = default, CancellationToken cancellationToken = default)
{
async Task AddActionAsync(List<T> items, List<TaskInfo> tasks, CancellationToken localCancellationToken)
{
tasks.Add(await this.AddDocumentsAsync(items, primaryKey, localCancellationToken).ConfigureAwait(false));
}
var result = await BatchOperationAsync(documents, batchSize, AddActionAsync).ConfigureAwait(false);
return result;
}

private static async Task<List<TaskInfo>> BatchOperationAsync<T>(IEnumerable<T> items, int batchSize, Func<List<T>, List<TaskInfo>, CancellationToken, Task> action, CancellationToken cancellationToken = default)
{
var itemsList = new List<T>(items);
var numberOfBatches = Math.Ceiling((double)itemsList.Count / batchSize);
var result = new List<TaskInfo>();
for (var i = 0; i < numberOfBatches; i++)
{
var actualSize = Math.Min(batchSize, itemsList.Count - (i * batchSize));
var batch = itemsList.GetRange(i * batchSize, actualSize);
await action.Invoke(batch, result, cancellationToken).ConfigureAwait(false);
}
return result;
}

To follow the code we have to start 159 with a call to the second snippet. We go to the second snippet where we do so computation to get chunks, then we call a function and we have to go up to the first snippet and see what is happening with this local function.

With this refactoring the AddDocumentsInBatchesAsync is self explaining.

The second snippet is now replaced by an extension method that returns a chunk on demand.

With the new code you see what the method is doing on first read

public async Task<IEnumerable<TaskInfo>> AddDocumentsInBatchesAsync<T>(IEnumerable<T> documents, int batchSize = 1000, string primaryKey = default, CancellationToken cancellationToken = default)
{
var tasks = new List<TaskInfo>();
foreach (var chunk in documents.GetChunks(batchSize))
{
tasks.Add(await this.AddDocumentsAsync(chunk, primaryKey, cancellationToken).ConfigureAwait(false));
}
return tasks;
}

This is a part from a previous PR #213. I will make several from #213 because there is too much things in this one.

@brunoocasali brunoocasali self-requested a review January 31, 2022 18:49
@brunoocasali
Copy link
Member

Hey @juchom I'll take some time to review your PR, but I didn’t forget you!

Anyway, thanks for your contribution, I'll let you know if I need anything!

@brunoocasali brunoocasali added the enhancement New feature or request label Feb 4, 2022
brunoocasali
brunoocasali previously approved these changes Feb 4, 2022
Copy link
Member

@brunoocasali brunoocasali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @juchom🥇

Your description really helped me to understand what was going on there, the improvement made a real difference, thanks! 🎉

@brunoocasali
Copy link
Member

bors merge

bors bot added a commit that referenced this pull request Feb 4, 2022
230: Create a GetChunks extension method for batches r=brunoocasali a=juchom

# Pull Request

## What does this PR do?
This pull request is here to make code a little bit easier to follow for batches.

For exemple if we look at `AddDocumentsInBatchesAsync` we have this two piece of code.

https://github.com/meilisearch/meilisearch-dotnet/blob/3f14f432a658deb8a83ab505179b34559e9c1252/src/Meilisearch/Index.cs#L152-L161

https://github.com/meilisearch/meilisearch-dotnet/blob/3f14f432a658deb8a83ab505179b34559e9c1252/src/Meilisearch/Index.cs#L732-L745

To follow the code we have to start 159 with a call to the second snippet. We go to the second snippet where we do so computation to get chunks, then we call a function and we have to go up to the first snippet and see what is happening with this local function.

With this refactoring the `AddDocumentsInBatchesAsync` is self explaining.

The second snippet is now replaced by an extension method that returns a chunk on demand.

With the new code you see what the method is doing on first read

https://github.com/meilisearch/meilisearch-dotnet/blob/4264a71913db643356c128670de09122b6c7e7f4/src/Meilisearch/Index.cs#L152-L161

This is a part from a previous PR #213. I will make several from #213 because there is too much things in this one.

Co-authored-by: Julien Chomarat <j.chomarat@linoa.com>
@bors
Copy link
Contributor

bors bot commented Feb 4, 2022

Build failed:

@brunoocasali
Copy link
Member

bors try

bors bot added a commit that referenced this pull request Feb 8, 2022
@bors
Copy link
Contributor

bors bot commented Feb 8, 2022

try

Build failed:

@brunoocasali
Copy link
Member

bors try

bors bot added a commit that referenced this pull request Feb 8, 2022
@brunoocasali brunoocasali self-requested a review February 8, 2022 13:41
@bors
Copy link
Contributor

bors bot commented Feb 8, 2022

try

Build succeeded:

Copy link
Member

@brunoocasali brunoocasali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FINALLY! 🔥 🔥

Thanks a lot @juchom

@brunoocasali
Copy link
Member

bors merge

@bors
Copy link
Contributor

bors bot commented Feb 8, 2022

Build succeeded:

@bors bors bot merged commit 39ca0d8 into meilisearch:main Feb 8, 2022
@juchom juchom deleted the jc/chunk branch May 18, 2022 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants