Skip to content

Conversation

@zizhong
Copy link
Member

@zizhong zizhong commented May 18, 2018

Without this the
RecSetRecordInt("proxy.node.config.draining", 1, REC_SOURCE_DEFAULT); on Main.cc:288 will fail,
because

    // Add the record but do not set the 'registered' flag, as this
    // record really hasn't been registered yet.  Also, in order to
    // add the record, we need to have a rec_type, so if the user
    // calls RecSetRecord on a record we haven't registered yet, we
    // should fail out here.
    if ((rec_type == RECT_NULL) || (data_type == RECD_NULL)) {
      err = REC_ERR_FAIL;
      goto Ldone;
    }

from P_RecCore.cc:422.

This bug is breaking the drain feature and it's needed for #2918.

@zizhong zizhong requested review from bryancall, maskit and scw00 May 18, 2018 01:37
@zwoop zwoop changed the title register proxy.node.config.draining Register proxy.node.config.draining May 18, 2018
Copy link
Contributor

@zwoop zwoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right way to do this honestly. I think we should register this setting where this feature gets initialized. Yes, there are some proxy.node metrics in RecordsConfig.cc already, but that really wasn't the way this was supposed to be done afaik.

zwoop
zwoop previously requested changes May 18, 2018
Copy link
Contributor

@zwoop zwoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we have this:

cmd/traffic_manager/traffic_manager.cc:  RecRegisterStatInt(RECT_NODE, "proxy.node.config.draining", 0, RECP_NON_PERSISTENT);

So if this is not sufficient, we should fix the above. Or at a minimum the above should be removed (we shouldn't register the metric twice).

@zwoop
Copy link
Contributor

zwoop commented May 18, 2018

@maskit something odd is going on here, can you take a look please?

@maskit
Copy link
Member

maskit commented May 18, 2018

The metric was originally just a variable but the variable was replaced with the setting on #2962, which means it worked without traffic_manager but now it requires traffic_manager (traffic_manager has to register the setting first on startup).

Can we register all the 6 settings on traffic_server side, or do these need to be registered on traffic_manager side? Only moving the one for draining doesn't make sense.

Also, we should be able to remove "proxy.node.config.restart_required.cop", right?

@zwoop
Copy link
Contributor

zwoop commented Jun 6, 2018

[approve ci autest]

@zwoop zwoop added this to the 9.0.0 milestone Jun 6, 2018
@zizhong zizhong force-pushed the register_drain_record branch 2 times, most recently from 3959755 to 1959697 Compare July 16, 2018 19:22
@zizhong
Copy link
Member Author

zizhong commented Jul 16, 2018

@zwoop @maskit Thanks for the insightful comments.
This is to fix the autest failure since autest only runs traffic_server and this stat is registered in traffic_manager. I feel the latest change makes much more sense. Can you guys take another look? Thank you!

@zizhong
Copy link
Member Author

zizhong commented Jul 16, 2018

[approve ci autest]

@zizhong
Copy link
Member Author

zizhong commented Jul 16, 2018

     - < Proxy-Connection: keep-alive
           ? ^                   ^    ^
           
     + > Proxy-Connection: Keep-Alive
           ? ^                   ^    ^

Saw this causes autest failed. @zwoop @dragon512 any insights?

@maskit
Copy link
Member

maskit commented Jul 17, 2018

I'm OK with this. However, I'd rather register all the metrics consistently on traffic_server side on startup if traffic_server wasn't managed by traffic_manager.

I'm not sure why autest is failing but the diff is really confusing. It looks like 'keep-alive' vs 'Keep-Alive' but they are a request header and a response header.

@zizhong
Copy link
Member Author

zizhong commented Jul 17, 2018

@maskit I looked into other stats registered in traffic_manager. They can only be changed via traffic_ctl and traffic_manager. Wouldn't make much sense to register them in traffic_server. This stat is different. It can be also changed by the signal.

@zizhong zizhong force-pushed the register_drain_record branch from 1959697 to ad3a5e4 Compare July 17, 2018 04:53
@maskit
Copy link
Member

maskit commented Jul 17, 2018

@zizhong

They can only be changed via traffic_ctl and traffic_manager.

Right, that's true as far as I know.

Wouldn't make much sense to register them in traffic_server.

I think differently. I see the stats as interfaces of traffic_server for other processes that manages traffic_server. If we registered them in traffic_server, we would be able to manage traffic_server with different tools, such as unit tests and homemade management tools.

Also, this is probably the only point that traffic_server depends on traffic_manager. So if we could make traffic_server their owner the relationship between traffic_server and traffic_manager would be a bit simple, though I don't think it has to be done on this PR because it wouldn't be a small change and I'm not sure whether this is really possible.

@maskit
Copy link
Member

maskit commented Aug 17, 2018

@SolidWallOfCode @vmamidi I heard you guys are working on around traffic_manager and RPC. What do you think about this from your point of view?

@SolidWallOfCode
Copy link
Member

I agree with @maskit. I think it's a good idea to make traffic_server as self contained as possible. If we get the RPC fixed up to some standard mechanism we can expect the subsequent creation of other management tools that work directly with traffic_server and use only that executable. I am in favor of moving in that direction.

@zizhong
Copy link
Member Author

zizhong commented Aug 17, 2018

@SolidWallOfCode @maskit Thanks for the suggestions! I'll make the change accordingly.

@zizhong zizhong force-pushed the register_drain_record branch from ad3a5e4 to 20187ff Compare October 22, 2018 22:09
@zizhong zizhong changed the title Register proxy.node.config.draining register stats in traffic_server if running in standalone mode Oct 22, 2018
@zizhong zizhong force-pushed the register_drain_record branch from 20187ff to 04fc4dd Compare October 22, 2018 22:11
@zizhong
Copy link
Member Author

zizhong commented Oct 22, 2018

@maskit @zwoop PR updated. can you help to take another look? Thanks!
Once this is merged, the failed test case of #2918 will pass.

Copy link
Member

@maskit maskit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

We could remove them from traffic_manager but I don't think that need to be done on this PR. This fix is the minimal and simplest fix for the draining issue.

@zizhong zizhong merged commit 7c42fdd into apache:master Oct 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants