
Errors in logs after enabling RPC limits and discovery_stale_max #4673

Closed
nickwales opened this issue Sep 13, 2018 · 3 comments · Fixed by #5683
Labels
type/bug Feature does not function as expected

Comments

@nickwales
Contributor

Since adding RPC limits and discovery_max_stale to our agent configurations we have started seeing error-level log entries like this: `error: rpc error making call: reflect: reflect.Value.SetString using unaddressable value from=172.17.0.16:54114`

We are monitoring for RPC limit exceeded errors in Datadog and are not seeing any.

Looking back we didn't see this prior to the configuration change.

It happens every few minutes on each node, so it's impacting only a fraction of total calls.

Steps to reproduce this issue, e.g.:

  1. Configure the agent with `{... "limits": { "rpc_rate": 5000 }, "discovery_max_stale": "100ms", ...}` (see the config sketch after this list)
  2. Still trying to determine exactly where the calls that trigger the error come from
  3. View the error

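A minimal sketch of how the two relevant settings fit into an agent configuration file, keeping only the values quoted in step 1 (all other keys from the real configuration are omitted):

```json
{
  "limits": {
    "rpc_rate": 5000
  },
  "discovery_max_stale": "100ms"
}
```
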
Consul info for both Client and Server

Client info
agent:
	check_monitors = 0
	check_ttls = 46
	checks = 47
	services = 47
build:
	prerelease =
	revision = fb848fc4
	version = 1.0.7
consul:
	known_servers = 3
	server = false
runtime:
	arch = amd64
	cpu_count = 16
	goroutines = 115
	max_procs = 16
	os = linux
	version = go1.10
serf_lan:
	coordinate_resets = 0
	encrypted = false
	event_queue = 0
	event_time = 562
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 51
	member_time = 30054
	members = 516
	query_queue = 0
	query_time = 1
Server info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 0
build:
	prerelease =
	revision = fb848fc4
	version = 1.0.7
consul:
	bootstrap = false
	known_datacenters = 23
	leader = true
	leader_addr = 10.124.2.250:8300
	server = true
raft:
	applied_index = 778161348
	commit_index = 778161348
	fsm_pending = 0
	last_contact = 0
	last_log_index = 778161348
	last_log_term = 1500
	last_snapshot_index = 778154840
	last_snapshot_term = 1500
	latest_configuration = [{Suffrage:Voter ID:2a8d118c-d894-fdb5-ed71-dca61f07ad78 Address:10.124.22.186:8300} {Suffrage:Voter ID:347fe2ab-4d43-1b29-688c-0dad59904637 Address:10.124.15.182:8300} {Suffrage:Voter ID:890f37a2-4c34-d74b-955a-c2d6406b47aa Address:10.124.2.250:8300}]
	latest_configuration_index = 693226915
	num_peers = 2
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Leader
	term = 1500
runtime:
	arch = amd64
	cpu_count = 8
	goroutines = 2237
	max_procs = 8
	os = linux
	version = go1.10
serf_lan:
	coordinate_resets = 0
	encrypted = false
	event_queue = 0
	event_time = 562
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 51
	member_time = 30054
	members = 516
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = false
	event_queue = 0
	event_time = 1
	failed = 2
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 5753
	members = 83
	query_queue = 0
	query_time = 1

Operating system and Environment details

Ubuntu 16.04
Client: m5.4xlarge
Server: c4.2xlarge

Log Fragments

2018/09/13 23:11:40 [ERR] consul: "Health.ServiceNodes" RPC failed to server 10.124.15.182:8300: rpc error making call: reflect: reflect.Value.SetString using unaddressable value
2018/09/13 23:11:40 [DEBUG] manager: cycled away from server "ip-10-124-15-182"
2018/09/13 23:11:40 [ERR] http: Request GET /v1/health/service/<service_name>?passing&tag=live, error: rpc error making call: reflect: reflect.Value.SetString using unaddressable value from=127.0.0.1:49734
@pierresouchay
Contributor

Where do you see this error?
Agents or servers?
Do you also see those errors with higher thresholds (for instance, discovery_max_stale=5s and rpc_limit=10000)?

Setting discovery_max_stale very low might double the RPCs if the followers are always more than 100 ms stale; still, there is probably a bug here, as the error message is strange.

@pierresouchay
Contributor

@nickwales did you put the same rpc limit on the server as well?

@nickwales
Contributor Author

This is on the agents. We are monitoring and not seeing any agents going over the 5000 limit, so I don't think it's that.

There are no RPC limits on the servers.

@pearkes added the type/bug (Feature does not function as expected) label on Oct 8, 2018
mkeeler added a commit that referenced this issue Apr 18, 2019
Fixes #4673
Supersedes: #5677

There was an error decoding `map[string]string` values because the decoder tried to set string values in place, which Go's reflection does not allow for unaddressable map elements. This was fixed in our go-msgpack fork.
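
For context only (this is not the actual go-msgpack patch, just a minimal illustration of the failure mode): map elements in Go cannot be written through reflection in place, and a decoder that tries panics with exactly the `reflect.Value.SetString using unaddressable value` message from the logs, while building the value separately and storing it with `SetMapIndex` works:

```go
package main

import (
	"fmt"
	"reflect"
)

func main() {
	m := map[string]string{"tag": "live"}
	mv := reflect.ValueOf(m)

	// Map elements are not addressable: CanSet reports false, and calling
	// SetString directly on the element would panic with
	// "reflect: reflect.Value.SetString using unaddressable value".
	elem := mv.MapIndex(reflect.ValueOf("tag"))
	fmt.Println(elem.CanSet()) // false

	// The working pattern: build the decoded value separately, then store
	// it back into the map with SetMapIndex.
	nv := reflect.New(mv.Type().Elem()).Elem()
	nv.SetString("passing")
	mv.SetMapIndex(reflect.ValueOf("tag"), nv)
	fmt.Println(m) // map[tag:passing]
}
```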