-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added max_s3_upload_threads_per_task option to Redshift #334
Added max_s3_upload_threads_per_task option to Redshift #334
Conversation
@dmikurube @hiroyuki-sato |
24cd4e4
to
9121b60
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to know why this PR changed some part (i.e default value) from the original.
I don't have any knowledge which is better yet.
@@ -83,6 +83,10 @@ public interface RedshiftPluginTask extends AwsCredentialsTaskWithPrefix, Plugin | |||
@ConfigDefault("true") | |||
public boolean getDeleteS3TempFile(); | |||
|
|||
@Config("max_s3_upload_threads_per_task") | |||
@ConfigDefault("null") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value of the original PR is 5
. Why you change this value?
https://github.com/embulk/embulk-output-jdbc/pull/311/files#diff-3ad638974347d49a3df0afc40dc2fa00382b91105aa42eeec25468daccbfc434R54
Any update after this comment in trocco servcie?
#311
As newCachedThreadPool would create new thread for execution if and only if current thread are all occupied, it seems tenable to pass a maximum of thread to newCachedThreadPool. However, due to the execution mechanism of embulk is ansync we may have troubles in how to determine the max of it. The most simple way to resolve is just make the maximum of each thread pool as a argument. After some performance test, it seems fine to make the default value of thread_maximum as 5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've kept the original behavior by default, due to the requested in the following comment.
It changes the thread behavior even if thread_maximum is not specified.
newCachedThreadPool() and newFixedThreadPool(5) behave different.
Can you make the it Optional<Interger>, and keep the behavior when the option is not specified?
Any update after this comment in trocco servcie?
No, the maximum number of threads remains 5
in the trocco.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, Thanks. Do you think whether is is better to change 5
by default in future release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a problem with the change, but I don't think it needs to be changed.
I think users would change the configuration as needed.
this.executorService = Executors.newCachedThreadPool(); | ||
this.executorService = maxS3UploadThreadsPerTask != null | ||
? Executors.newFixedThreadPool(maxS3UploadThreadsPerTask) | ||
: Executors.newCachedThreadPool(); | ||
this.uploadAndCopyFutures = new ArrayList<Future<Void>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original PR change thread pool from newThreadPool
to Executors.newFixedThreadPool(threadMaximum)
.
Why you change this part from the origianl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the same reason as above, if max_s3_upload_threads_per_task
is not specified, continue to use newCachedThreadPool
to keep the original behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change basically looks good to me, as it keeps the original behavior, and it follows the feedbacks in the previous pull request.
Two points :
- Wanted to hear other comments from @embulk/jdbc-maintainers
- If some more change requests are expected to come from trocco, we may want to consider re-organizing the maintainer team and the maintenance policies. We now have some discussions on Zulip with @kekekenta, and let's keep going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -83,6 +83,10 @@ public interface RedshiftPluginTask extends AwsCredentialsTaskWithPrefix, Plugin | |||
@ConfigDefault("true") | |||
public boolean getDeleteS3TempFile(); | |||
|
|||
@Config("max_s3_upload_threads_per_task") | |||
@ConfigDefault("null") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, Thanks. Do you think whether is is better to change 5
by default in future release?
this.executorService = Executors.newCachedThreadPool(); | ||
this.executorService = maxS3UploadThreadsPerTask != null | ||
? Executors.newFixedThreadPool(maxS3UploadThreadsPerTask) | ||
: Executors.newCachedThreadPool(); | ||
this.uploadAndCopyFutures = new ArrayList<Future<Void>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, This PR LGTM. If it is better to change max_s3_upload_threads_per_task
default value, let's change default in future release. For preparation, adding log message this value will change in future release...
is one of choice.
otherwise, Let's keep current behavior.
@t3t5u Thanks! LGTM👍 |
Thanks for your contribution. Merging this for v0.10.5... |
The changes are essentially the same as in this PR.
I've fixed the points commented on in the above review.