Skip to content

Conversation

@srishtyagrawal
Copy link

@srdo
Copy link
Contributor

srdo commented May 12, 2017

LGTM. The dummy values in the tests seem fine. The proposed extra changes sound reasonable.

The commit message is referring to another (internal?) issue. Could you update it to refer to STORM-2506?

@srishtyagrawal
Copy link
Author

@erikdw @revans2 @harshach Please review and let me know if any changes are required.

public void refresh() {
try {
LOG.info(taskId(_taskIndex, _totalTasks) + "Refreshing partition manager connections");
LOG.info(taskId(_taskIndex, _totalTasks, _taskID) + " Refreshing partition manager connections");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for noticing and fixing these spacing issues in the log lines!

public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
LOG.info("Partitions reassignment. [consumer-group={}, consumer={}, topic-partitions={}]",
kafkaSpoutConfig.getConsumerGroupId(), kafkaConsumer, partitions);
LOG.info("Partitions reassignment. [task-ID={}, consumer-group={}, consumer={}, topic-partitions={}]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing, just pointing it out, though it's probably fine this way: the "task ID" is referred to in 3 different styles in these log lines through this PR:

  • taskId
  • Task-ID
  • task-ID

I think each is consistent within their respective log lines. Just wondering if there's any value to it being consistent across them. Also wonder if there's any existing convention in other log lines in the code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how I have derived the various styles :

  • taskId: Task.java has component ID named as componentId, so used taskId as the variable name for task ID.
  • Task-ID: This style is only being used in the print statement and is consistent with the existing style.
  • task-ID: Only used once, consistent with the other variable names in the log statement here. This can be renamed to task-Id.
  • I have used taskID as the variable name in rest of the files because taskId is the name of a function in same set of files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Srishty. Regarding taskId() being a function name as the justification for taskID being the variable name: in at least one case the name of the function is bad and should be changed. i.e., KafkaUtils.java's taskId() should be taskPrefix() or something else. It's not the "Task ID".

@erikdw
Copy link
Contributor

erikdw commented May 12, 2017

These failed build errors should go away once #2112 (my checkstyle-fixing PR) is merged and you rebase.

@srishtyagrawal
Copy link
Author

@erikdw @vinodkc @revans2 @srdo

I am encountering the following checkstyle error while building my code :

<error line="46" severity="warning" message="Abbreviation in name &apos;taskID&apos; must contain no more than &apos;2&apos; consecutive capital letters." source="com.puppycrawl.tools.checkstyle.checks.naming.AbbreviationAsWordInNameCheck"/>

This error, I figured is because of the AbbreviationAsWordInName setting in storm_checkstyle.xml. The variable taskID has 2 consecutive capital letters and this creates a warning (which is a violation and hence the build does not go through). The default setting according to the checkstyle documentation is 3. Is there a specific reason we are setting it to 1?

@erikdw
Copy link
Contributor

erikdw commented May 17, 2017

@srishtyagrawal : we didn't choose to set it to 1, we just inherited the default of the google style in google_checks.xml. I'm personally fine with 3, you can make that change too.

I would note that the checkstyle thing is hitting constants (final <type>) too, from my analysis of the checkstyle-result.xml files. That is a result of another deviation from the default, where ignoreFinal is false for google_checks.xml, instead of true as is the default. We should definitely change that to true!

We might want to do a comparison of all the defaults against what we've inherited from The GOOG.

@revans2
Copy link
Contributor

revans2 commented May 17, 2017

@erikdw and @srishtyagrawal I know that we are still working through issues with checkstyle being new, but I get a little nervous with making too many changes to the "standard".

The more changes we make to our style the harder it is to explain to people what that style is and the harder it is to setup an IDE to conform with it. As it is right now we can say (and we should update the docs for this) that we conform to the google standard except that we use 4 spaces for indentation and have a max line length of 140. But if we start adding in more changes it gets harder to explain and we end up being like HBase where some impossible to read file is the documentation. If it is not too much of a problem would we rename {{taskID}} to {{taskId}}?

@srishtyagrawal
Copy link
Author

@erikdw Thanks for pointing out the defaults in google_checks.xml. I am not sure what the reason is behind deviating from the checkstyle defaults.

@revans2 variable taskID has been used in the following three files:

  1. KafkaUtils.java: This file already has a function named taskId. I can rename taskID to be taskid or task-id. Although the variables in this file follow Camel case naming standards.
  2. StaticCoordinator.java: taskID can be changed to taskId in this file.
  3. ZkCoordinator.java: uses taskId function, can change the variable name to taskid or task-id.

An alternative would be to increase the number of violations.

@erikdw
Copy link
Contributor

erikdw commented May 17, 2017

@revans2 : agree that we should be careful in making changes away from the base "google_checks.xml", and I did already argue above that we should be using taskId for the variable (that was implied when I said we should rename that really badly named taskId() method to taskPrefix()). So that should unblock Srishty for now.

But let me make a few points:

  • The "standard" we are using here is just the Google standard, it's not some industry-wide thing that everyone strictly follows. I'm not sure whether we should be trying to follow it strictly, or just using it as a jumping off point for our own standard.

  • I'm pretty sure most would agree that ignoreFinal should be false (unlike the google default), because otherwise your constants won't be be allowed to be named FOO_BAR, they'd be named Foo_Bar or FooBar, neither of which stand out as being a constant. So that's at least one more deviation I strongly believe we should be making. And I'm sure there are more, the Google "standard" seems to be a bit wacky IMHO.

    • As @srdo pointed out below, this point is invalid, because constants in Java should be specified with both final and static.
  • If your statement about "documentation" and "HBase" (I'm not familiar with that issue) is about how we're obscuring what we've changed from the Google defaults, I do have sympathy for that view. But that's because it's also going to be harder to upgrade our version of checkstyle later, since we'll need our changes to be applied to a newer google_checks.xml file. Though I will note that it's quite easy to craft a command to show the changes we've made so far:

    % diff <(curl -s https://raw.githubusercontent.com/checkstyle/checkstyle/checkstyle-7.7/src/main/resources/google_checks.xml) <(curl -s https://raw.githubusercontent.com/apache/storm/master/storm-buildtools/storm_checkstyle.xml)
    1a2,19
    > 
    > <!--
    >  Licensed to the Apache Software Foundation (ASF) under one or more
    >  contributor license agreements.  See the NOTICE file distributed with
    >  this work for additional information regarding copyright ownership.
    >  The ASF licenses this file to You under the Apache License, Version 2.0
    >  (the "License"); you may not use this file except in compliance with
    >  the License.  You may obtain a copy of the License at
    > 
    >      http://www.apache.org/licenses/LICENSE-2.0
    > 
    >  Unless required by applicable law or agreed to in writing, software
    >  distributed under the License is distributed on an "AS IS" BASIS,
    >  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    >  See the License for the specific language governing permissions and
    >  limitations under the License.
    > -->
    > 
    6a25,36
    >     The original file came from here:
    >       https://raw.githubusercontent.com/checkstyle/checkstyle/checkstyle-7.7/src/main/resources/google_checks.xml
    >     It has been slightly modified for use in Apache Storm, as follows:
    >       * 4 space indents instead of 2
    >       * line-length limit is 140 instead of 100
    >     Once checkstyle has the ability to override selected configuration elements from within the Maven
    >     pom.xml file, then we can remove this file in favor of overriding the provided google_checks.xml file.
    >     See this issue to track that functionality:
    >       https://github.com/checkstyle/checkstyle/issues/2873
    >  -->
    > 
    > <!--
    43c73
    <             <property name="max" value="100"/>
    ---
    >             <property name="max" value="140"/>
    55c85
    <             <property name="maxLineLength" value="100"/>
    ---
    >             <property name="maxLineLength" value="140"/>
    158c188
    <             <property name="basicOffset" value="2"/>
    ---
    >             <property name="basicOffset" value="4"/>
    160c190
    <             <property name="caseIndent" value="2"/>
    ---
    >             <property name="caseIndent" value="4"/>
    163c193
    <             <property name="arrayInitIndent" value="2"/>
    ---
    >             <property name="arrayInitIndent" value="4"/>
    

@srdo
Copy link
Contributor

srdo commented May 17, 2017

@erikdw Isn't naming variables e.g. FOO_BAR usually reserved for constants that are both static and final? I wouldn't usually expect non-static finals to be all caps. The ignoreStatic default of true seems to catch static final constants, so I don't think we need to change ignoreFinal?

https://github.com/apache/storm/blob/master/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java#L61 isn't causing a violation for example.

@erikdw
Copy link
Contributor

erikdw commented May 17, 2017

@srdo : ah, I forgot that final was different from static + final, and that static + final is really what a constant should be.

So, in that list that I posted above of the identifiers that are violating the abbreviation standard, the following ones are all probably improperly declared as just final instead of static + final:

% (find . -name 'checkstyle-result.xml' -exec cat {} \;) | grep 'consecutive capital letters' | cut -d';' -f2 | cut -d'&' -f1 | sort | uniq -c | sort | grep -v '[a-z]'
   1 ATTEMPTS_INTERVAL_TIME
   1 BLOBSTORE_MAX_KEY_SEQUENCE_SUBTREE
   1 CHANNEL_ALIVE_INTERVAL_MS
   1 CLI
   1 COORD_STREAM
   1 DRPC
   1 INITIAL_SEQUENCE_NUMBER
   1 INT_CAPACITY
   1 MAX_NUM
   1 MAX_RETRY_ATTEMPTS
   1 MAX_ROUNDS
   1 ONE_HUNDRED
   1 OR
   1 PQ_SIZE
   1 PREPARE_ID
   1 SCM
   1 SPOUT_ID
   1 TOPOLOGY_WORKER_DEFAULT_MEMORY_ALLOCATION
   2 BLOBSTORE_SUBTREE
   2 STORM_CODE_SUFFIX
   2 STORM_CONF_SUFFIX
   2 STORM_JAR_SUFFIX
   7 LOG

Those seem easy enough to fix!

So I withdraw my objection to ignoreFinal being true, thanks for pointing out my mistake.

@srishtyagrawal
Copy link
Author

@erikdw's suggestion on renaming the function taskId() to taskPrefix() seems reasonable to me. I can then rename the variable name from taskID to taskId.

@erikdw
Copy link
Contributor

erikdw commented May 17, 2017

technically and I think canonically, there's no problem with a method and a variable having the same exact name. What's bad in the existing code is that taskId() means something entirely different from... "Task ID". It's just a bad name.

@srishtyagrawal
Copy link
Author

Thanks @erikdw for clarifying that. I was assuming that it must be a bad coding practice to have same name for a method as well as variable. Even taskPrefix is a method as well as variable name in KafkaUtils.java.

@srishtyagrawal
Copy link
Author

@erikdw @srdo fixed the code according to your suggestions.

@erikdw
Copy link
Contributor

erikdw commented May 18, 2017

@srishtyagrawal : thanks! So there's still a test failure:

Tests run: 37, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2,126.119 sec <<< FAILURE! - in TestSuite

testTumbleCount(org.apache.storm.st.tests.window.TumblingWindowTest)  Time elapsed: 222.363 sec  <<< FAILURE!

java.lang.AssertionError: Expecting min 5 bolt emits, found: 4 

I wonder if this is a transient issue, or if it's caused by your change of the debug log line for emitting tuples?

Notably, this is an integration test, so it's not run by a simple mvn test. Looking at the .travis.yml, it seems like this code and thus this script is responsible for running the integration test that is failing. I'm not sure if you can just run it on your laptop directly.

@srishtyagrawal
Copy link
Author

@erikdw thanks for pointing that out. Rebasing on top of the recent changes in master helped with the integration tests.

@erikdw
Copy link
Contributor

erikdw commented May 18, 2017

Green ✔️, woot!

@srishtyagrawal
Copy link
Author

srishtyagrawal commented May 18, 2017

@srdo please let me know if further changes are required.

@srdo
Copy link
Contributor

srdo commented May 18, 2017

+1

@srishtyagrawal
Copy link
Author

@revans2 can you please merge this PR?

@srdo do we need task-ID values in logs while printing partition to task mapping for trident? I can file another STORM ticket for that.

@srdo
Copy link
Contributor

srdo commented May 24, 2017

@srishtyagrawal If you think it would be helpful? I don't really have an opinion about that.

@srishtyagrawal
Copy link
Author

@srdo I was thinking of it from the perspective of consistency (between the trident KafkaSpout and normal KafkaSpout), not sure how useful it will be (don't know much about trident).

@srdo
Copy link
Contributor

srdo commented May 24, 2017

@srishtyagrawal I think you are right, it probably makes sense to add. It looks like the Trident spout supports the same subscription methods as the regular spout, so it would probably be helpful to have the task id logged for the same reasons.

@srishtyagrawal
Copy link
Author

@srdo
Copy link
Contributor

srdo commented May 24, 2017

Thanks :)

@srishtyagrawal
Copy link
Author

@harshach can you please help with merging this PR? It has been sitting for 2 weeks now.

@HeartSaVioR
Copy link
Contributor

+1

@srishtyagrawal
Copy link
Author

@HeartSaVioR thanks for the approval. I have rebased the changes on top of the latest master branch. Can you please merge this PR?

@asfgit asfgit merged commit 3ab11ea into apache:master Jun 21, 2017
@HeartSaVioR
Copy link
Contributor

@srishtyagrawal
I merged this to master branch, but seems like there're some conflicts against 1.x branch. If you want to adopt this to 1.x version line, you may need to raise a new PR against 1.x branch. WDYT?

@erikdw
Copy link
Contributor

erikdw commented Jun 22, 2017

@HeartSaVioR : thanks a lot! @srishtyagrawal is on leave for a month. Are there any 1.x releases planned before late July?

@HeartSaVioR
Copy link
Contributor

Not sure, but we may need to have bug fix version soon since there're some opened pull requests on storm-kafka-client which sound like critical.

@HeartSaVioR
Copy link
Contributor

I'll try to resolve the conflict on top of this patch when I have free time. If someone volunteers and raise a new PR it would be great.

@erikdw
Copy link
Contributor

erikdw commented Jun 22, 2017

@HeartSaVioR : I'll try to take a stab later tonight at it. I'll let you know if I find time.

@HeartSaVioR
Copy link
Contributor

@erikdw Nice! Thank you.

@erikdw
Copy link
Contributor

erikdw commented Jun 22, 2017

@HeartSaVioR : I didn't have time tonight. I'll try to get to it tomorrow.

@erikdw
Copy link
Contributor

erikdw commented Jun 23, 2017

@HeartSaVioR : FYI, I resolved the conflicts in my local repo, I'm gonna build and then send a new PR, will link from here.

@erikdw
Copy link
Contributor

erikdw commented Jul 5, 2017

@HeartSaVioR : FYI, here's the PR for the backport:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants