STORM-2506: Print mapping between Task ID and Kafka Partitions #2109
LGTM. The dummy values in the tests seem fine. The proposed extra changes sound reasonable. The commit message is referring to another (internal?) issue. Could you update it to refer to STORM-2506? |
  public void refresh() {
  try {
- LOG.info(taskId(_taskIndex, _totalTasks) + "Refreshing partition manager connections");
+ LOG.info(taskId(_taskIndex, _totalTasks, _taskID) + " Refreshing partition manager connections");
thanks for noticing and fixing these spacing issues in the log lines!
  public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
- LOG.info("Partitions reassignment. [consumer-group={}, consumer={}, topic-partitions={}]",
- kafkaSpoutConfig.getConsumerGroupId(), kafkaConsumer, partitions);
+ LOG.info("Partitions reassignment. [task-ID={}, consumer-group={}, consumer={}, topic-partitions={}]",
Minor thing, just pointing it out, though it's probably fine this way: the "task ID" is referred to in 3 different styles in these log lines through this PR:
taskId, Task-ID, task-ID
I think each is consistent within their respective log lines. Just wondering if there's any value to it being consistent across them. Also wonder if there's any existing convention in other log lines in the code.
This is how I derived the various styles:
- taskId: Task.java names the component ID componentId, so I used taskId as the variable name for the task ID.
- Task-ID: This style is only used in the print statement and is consistent with the existing style.
- task-ID: Only used once, consistent with the other variable names in the log statement here. This can be renamed to task-Id.
- I have used taskID as the variable name in the rest of the files because taskId is the name of a function in the same set of files.
Thanks Srishty. Regarding taskId() being a function name as the justification for taskID being the variable name: in at least one case the name of the function is bad and should be changed. i.e., KafkaUtils.java's taskId() should be taskPrefix() or something else. It's not the "Task ID".
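The rename suggested in this comment could look roughly like the sketch below. This is only an illustration, not the actual KafkaUtils code: the class name `TaskLogging`, the parameter names, and the exact prefix format are assumptions. The point is that a helper named for what it returns (a log prefix) leaves `taskId` free as the name for the real task ID.

```java
// Hypothetical sketch; not the actual KafkaUtils implementation.
final class TaskLogging {
    private TaskLogging() {
    }

    /**
     * Builds a log-line prefix. Named taskPrefix rather than taskId because
     * it returns a formatted prefix, not the task ID itself.
     * The output format is an assumption made for illustration.
     */
    static String taskPrefix(int taskIndex, int totalTasks, int taskId) {
        return "Task [" + (taskIndex + 1) + "/" + totalTasks + "], Task-ID: " + taskId;
    }

    public static void main(String[] args) {
        int taskId = 17; // the real task ID can now use the natural name
        // The leading space in the message keeps prefix and text separated.
        System.out.println(taskPrefix(0, 4, taskId) + " Refreshing partition manager connections");
    }
}
```

With this naming, the variable and the helper no longer compete for the same identifier, so the `taskID` spelling is not needed as a workaround.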
|
These failed build errors should go away once #2112 (my checkstyle-fixing PR) is merged and you rebase. |
|
@erikdw @vinodkc @revans2 @srdo I am encountering the following checkstyle error while building my code: This error, I figured out, is caused by the AbbreviationAsWordInName setting in storm_checkstyle.xml. The variable taskID has 2 consecutive capital letters, which triggers a warning (treated as a violation, so the build fails). The default setting according to the checkstyle documentation is 3. Is there a specific reason we are setting it to 1? |
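To make the "2 consecutive capital letters" observation concrete, here is a small sketch that counts the longest run of upper-case letters in an identifier. It is a simplified illustration, not Checkstyle's actual AbbreviationAsWordInName algorithm, which has additional rules and options; the class and method names are hypothetical.

```java
// Simplified illustration only; Checkstyle's real AbbreviationAsWordInName
// check applies more rules and options than this run-length count.
final class CapitalRun {
    private CapitalRun() {
    }

    /** Returns the length of the longest run of consecutive upper-case letters. */
    static int longestUpperCaseRun(String identifier) {
        int longest = 0;
        int current = 0;
        for (int i = 0; i < identifier.length(); i++) {
            if (Character.isUpperCase(identifier.charAt(i))) {
                current++;
                longest = Math.max(longest, current);
            } else {
                current = 0; // a lower-case letter or digit ends the run
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        System.out.println("taskID -> " + longestUpperCaseRun("taskID"));
        System.out.println("taskId -> " + longestUpperCaseRun("taskId"));
    }
}
```

Under this simplified count, `taskID` has a run of 2 while `taskId` has a run of 1, which is the difference the comment describes.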
|
@srishtyagrawal : we didn't choose to set it to 1, we just inherited the default of the google style in google_checks.xml. I'm personally fine with 3, you can make that change too. I would note that the checkstyle thing is hitting constants ( We might want to do a comparison of all the defaults against what we've inherited from The GOOG. |
|
@erikdw and @srishtyagrawal I know that we are still working through issues with checkstyle being new, but I get a little nervous about making too many changes to the "standard". The more changes we make to our style, the harder it is to explain to people what that style is and the harder it is to set up an IDE to conform to it. As it is right now we can say (and we should update the docs for this) that we conform to the google standard except that we use 4 spaces for indentation and have a max line length of 140. But if we start adding more changes it gets harder to explain, and we end up like HBase, where some impossible-to-read file is the documentation. If it is not too much of a problem, could we rename {{taskID}} to {{taskId}}? |
|
@erikdw Thanks for pointing out the defaults in google_checks.xml. I am not sure what the reason is behind deviating from the checkstyle defaults. @revans2 variable
An alternative would be to increase the number of violations. |
|
@revans2 : agree that we should be careful in making changes away from the base "google_checks.xml", and I did already argue above that we should be using But let me make a few points:
|
|
@erikdw Isn't naming variables e.g. https://github.com/apache/storm/blob/master/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java#L61 isn't causing a violation for example. |
|
@srdo : ah, I forgot that So, in that list that I posted above of the identifiers that are violating the abbreviation standard, the following ones are all probably improperly declared as just Those seem easy enough to fix! So I withdraw my objection to |
|
@erikdw's suggestion on renaming the function |
|
technically and I think canonically, there's no problem with a method and a variable having the same exact name. What's bad in the existing code is that |
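As a quick check of the point that a method and a variable may share a name: Java keeps fields and methods in separate namespaces, so the sketch below compiles and runs. The class `TaskInfo` is hypothetical and exists only for this illustration.

```java
// Hypothetical class showing that a field and a method can both be
// called taskId without any conflict.
final class TaskInfo {
    private final int taskId;

    TaskInfo(int taskId) {
        this.taskId = taskId;
    }

    /** Accessor with the same name as the field. */
    int taskId() {
        return taskId;
    }

    public static void main(String[] args) {
        TaskInfo info = new TaskInfo(17);
        System.out.println(info.taskId());
    }
}
```

The compiler tells the two apart by syntax: `taskId` alone refers to the field, `taskId()` to the method.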
|
Thanks @erikdw for clarifying that. I was assuming that it must be a bad coding practice to have same name for a method as well as variable. Even |
|
@srishtyagrawal : thanks! So there's still a test failure: I wonder if this is a transient issue, or if it's caused by your change of the debug log line for emitting tuples? Notably, this is an integration test, so it's not run by a simple |
|
@erikdw thanks for pointing that out. Rebasing on top of the recent changes in master helped with the integration tests. |
|
Green ✔️, woot! |
|
@srdo please let me know if further changes are required. |
|
+1 |
|
@srishtyagrawal If you think it would be helpful? I don't really have an opinion about that. |
|
@srdo I was thinking of it from the perspective of consistency (between the trident KafkaSpout and normal KafkaSpout), not sure how useful it will be (don't know much about trident). |
|
@srishtyagrawal I think you are right, it probably makes sense to add. It looks like the Trident spout supports the same subscription methods as the regular spout, so it would probably be helpful to have the task id logged for the same reasons. |
|
Thanks :) |
|
@harshach can you please help with merging this PR? It has been sitting for 2 weeks now. |
|
+1 |
b286dd6 to 3ab11ea
|
@HeartSaVioR thanks for the approval. I have rebased the changes on top of the latest master branch. Can you please merge this PR? |
|
@srishtyagrawal |
|
@HeartSaVioR : thanks a lot! @srishtyagrawal is on leave for a month. Are there any 1.x releases planned before late July? |
|
Not sure, but we may need a bug fix release soon, since there are some open pull requests on storm-kafka-client that sound critical. |
|
I'll try to resolve the conflict on top of this patch when I have free time. If someone volunteers and raises a new PR, that would be great. |
|
@HeartSaVioR : I'll try to take a stab later tonight at it. I'll let you know if I find time. |
|
@erikdw Nice! Thank you. |
|
@HeartSaVioR : I didn't have time tonight. I'll try to get to it tomorrow. |
|
@HeartSaVioR : FYI, I resolved the conflicts in my local repo, I'm gonna build and then send a new PR, will link from here. |
|
@HeartSaVioR : FYI, here's the PR for the backport: |
Link to the ticket