Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modified code for byte_size parameter #1846

Closed

Conversation

kamalsharma9001
Copy link
Contributor

  1. Sending byte size limit parameter to bulk processor
  2. Added a condition to check if total byte size of bulk request has reached byte size limit and accordingly sending bulk request

1. Sending byte size limit parameter to bulk processor
2. Added a condition to check if total byte size of bulk request has reached byte
 size limit and accordingly sending bulk request
@dadoonet dadoonet self-assigned this Mar 27, 2024
@dadoonet dadoonet added the bug For confirmed bugs label Mar 27, 2024
@dadoonet dadoonet added this to the 2.10 milestone Mar 27, 2024
Copy link
Owner

@dadoonet dadoonet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR.
I gave a first quick look.
The most important thing to fix is the build. The project does not compile anymore.

I don't think we need to change private final List<T> operations = new ArrayList<>(); by private final List<Object> operations = new ArrayList<>();

I'll do another review when I have some spare time and might propose something.

@@ -422,9 +423,13 @@ public void index(String index, String id, Doc doc, String pipeline) {
@Override
public void indexRawJson(String index, String id, String json, String pipeline) {
logger.trace("JSon indexed : {}", json);
//modified code starts
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove this line?

bulkProcessor.add(new ElasticsearchIndexOperation(index, id, pipeline, json));
//modified code ends
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean comment written at line 428?

}



Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the 2 lines?

@@ -44,9 +44,9 @@ public ElasticsearchBulkResponse bulk(ElasticsearchBulkRequest request) {
StringBuilder bulkRequest = new StringBuilder();
// Header
bulkRequest.append("{\"")
.append(r.getOperation().toString().toLowerCase(Locale.ROOT))
.append(((ElasticsearchOperation) r).getOperation().toString().toLowerCase(Locale.ROOT))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that needed?

@@ -24,17 +24,17 @@

public abstract class FsCrawlerBulkRequest<T extends FsCrawlerOperation<T>> {

private final List<T> operations = new ArrayList<>();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should change that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fr.pilato.elasticsearch.crawler.fs.framework.bulk.FsCrawlerBulkProcessor.java, we were not able to access "getJson" method written in "ElasticsearchIndexOperation" class in "add" method written in FsCrawlerBulkProcessor.java. To achieve this, we changed data type of argument passed to "Object" and later on casted the object for respective class to access it's methods. If you can help in achieving the same in any other way without modifying data type, then it will be helpful

@dadoonet
Copy link
Owner

You don't need to create another PR for this. You just need to update your branch and push the branch to your repository. It will be automatically managed by GitHub.

Copy link

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@kamalsharma9001
Copy link
Contributor Author

made changes as per your comments. Please check

@dadoonet
Copy link
Owner

This looks overall good. Let me check later this week or next week.

@dadoonet
Copy link
Owner

Thanks a lot! I created #1852 from your branch and did then some refactoring + adding some tests.
Please have a look and feel free to comment the PR ;)

Closing this one then.

@dadoonet dadoonet closed this Mar 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug For confirmed bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants