Inconsistent behavior with Consul UI #17805

Open
pvyaka01 opened this issue Jun 17, 2023 · 1 comment

pvyaka01 commented Jun 17, 2023

Overview of the Issue

The UI shows both services as down when one of the services in a service mesh is brought down, even though the health checks for the second (healthy) service are still passing. It appears to be random. This happens with both 1.15.2 and 1.15.3.
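
For what it's worth, the state the UI reports can be cross-checked against the health API; a minimal sketch, assuming the default HTTP port 8500 and a placeholder service name "web":

# List services registered in the catalog
consul catalog services

# All health checks associated with the service (placeholder name "web")
curl -s http://localhost:8500/v1/health/checks/web

# Only the instances that are passing all of their checks
curl -s "http://localhost:8500/v1/health/service/web?passing"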

Reproduction Steps

(Screenshot: Consul-UI-15 3-Problem, showing both services reported as down in the UI)

Consul info for both Client and Server

Client info

consul info
agent:
    check_monitors = 2
    check_ttls = 0
    checks = 7
    services = 5
build:
    prerelease =
    revision = 7ce982c
    version = 1.15.3
    version_metadata =
consul:
    acl = disabled
    known_servers = 1
    server = false
runtime:
    arch = amd64
    cpu_count = 2
    goroutines = 98
    max_procs = 2
    os = linux
    version = go1.20.4
serf_lan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 15
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 74
    members = 14
    query_queue = 0
    query_time = 4

Client agent HCL config

client_addr = "0.0.0.0"
datacenter = "xxxxxxxx"
enable_local_script_checks = true
data_dir = "/opt/consul"
encrypt = "xxxxxxxxxxxxx"
retry_join = ["<consul_server_ip>"]
log_level = "INFO"
ports { "grpc" = 8502 }

server = false
enable_syslog = true
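
For completeness, the client agent's own view of its checks and services can be dumped from the local agent API; a quick sketch, assuming the default HTTP port 8500 on the client node:

# Checks as seen by this client agent
curl -s http://localhost:8500/v1/agent/checks

# Services registered with this client agent
curl -s http://localhost:8500/v1/agent/services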

Server info

consul info
agent:
    check_monitors = 2
    check_ttls = 0
    checks = 5
    services = 4
build:
    prerelease =
    revision = 7ce982c
    version = 1.15.3
    version_metadata =
consul:
    acl = enabled
    bootstrap = false
    known_datacenters = 1
    leader = true
    leader_addr = :8300
    server = true
raft:
    applied_index = 23987
    commit_index = 23987
    fsm_pending = 0
    last_contact = 0
    last_log_index = 23987
    last_log_term = 8
    last_snapshot_index = 16390
    last_snapshot_term = 6
    latest_configuration = [{Suffrage:Voter ID:d0688261-1319-acd9-9d2a-80c28983f63f Address:<server_ip>:8300}]
    latest_configuration_index = 0
    num_peers = 0
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 8
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 710
    max_procs = 4
    os = linux
    version = go1.20.4
serf_lan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 15
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 74
    members = 14
    query_queue = 0
    query_time = 4
serf_wan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1
    members = 1
    query_queue = 0
    query_time = 1

Server agent HCL config

client_addr = "0.0.0.0"
datacenter = "xxxxxxxxx"
enable_local_script_checks = true
data_dir = "/opt/consul"
encrypt = "xxxxxxxxxxx"
log_level = "warn"
ports { "grpc" = 8502 }
server = true
enable_syslog = true
enable_central_service_config = true
dns_config {
  enable_truncate = true
  only_passing = true
}
connect {
  enabled = true
}

telemetry {
  disable_hostname = true,
  prometheus_retention_time = "72h"
}
ui_config {
  enabled = true
  metrics_provider = "prometheus"
  metrics_proxy {
    base_url = "http://localhost:8428"
  }
}
acl = {
  enabled = true
  default_policy = "allow"
  enable_token_persistence = true
}

Operating system and Environment details

Red Hat Enterprise Linux release 8.6 (Ootpa)

Log Fragments

Server:
2023-06-17T19:40:44.198Z [WARN] agent.server.memberlist.lan: memberlist: Refuting an alive message for 'aa2uaimscsl1004' (10.226.245.99:8301) meta:([255 222 0 17 164 114 111 108 101 166 99 111 110 115 117 108 162 105 100 218 0 36 100 48 54 56 56 50 54 49 45 49 51 49 57 45 97 99 100 57 45 57 100 50 97 45 56 48 99 50 56 57 56 51 102 54 51 102 165 98 117 105 108 100 175 49 46 49 53 46 51 58 55 99 101 57 56 50 99 101 165 102 116 95 102 115 161 49 165 102 116 95 115 105 161 49 163 118 115 110 161 50 168 114 97 102 116 95 118 115 110 161 51 167 118 115 110 95 109 105 110 161 50 167 118 115 110 95 109 97 120 161 51 164 112 111 114 116 164 56 51 48 48 173 103 114 112 99 95 116 108 115 95 112 111 114 116 164 56 53 48 51 169 98 111 111 116 115 116 114 97 112 161 49 173 119 97 110 95 106 111 105 110 95 112 111 114 116 164 56 51 48 50 162 100 99 175 105 100 110 116 121 95 97 119 115 45 97 99 112 45 50 167 115 101 103 109 101 110 116 160 169 103 114 112 99 95 112 111 114 116 164 56 53 48 50 164 97 99 108 115 161 49] VS [255 222 0 16 173 119 97 110 95 106 111 105 110 95 112 111 114 116 164 56 51 48 50 162 100 99 175 105 100 110 116 121 95 97 119 115 45 97 99 112 45 50 163 118 115 110 161 50 168 114 97 102 116 95 118 115 110 161 51 165 102 116 95 115 105 161 49 164 114 111 108 101 166 99 111 110 115 117 108 167 115 101 103 109 101 110 116 160 165 102 116 95 102 115 161 49 165 98 117 105 108 100 175 49 46 49 53 46 51 58 55 99 101 57 56 50 99 101 164 112 111 114 116 164 56 51 48 48 162 105 100 218 0 36 100 48 54 56 56 50 54 49 45 49 51 49 57 45 97 99 100 57 45 57 100 50 97 45 56 48 99 50 56 57 56 51 102 54 51 102 167 118 115 110 95 109 105 110 161 50 167 118 115 110 95 109 97 120 161 51 169 103 114 112 99 95 112 111 114 116 164 56 53 48 50 173 103 114 112 99 95 116 108 115 95 112 111 114 116 164 56 53 48 51 164 97 99 108 115 161 49]), vsn:([1 5 2 2 5 4] VS [1 5 2 2 5 4])
2023-06-17T19:40:50.700Z [WARN] agent.server.raft: heartbeat timeout reached, starting election: last-leader-addr= last-leader-id=
2023-06-17T19:51:02.044Z [WARN] agent: [core][Server #4] grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2023-06-17T19:51:25.884Z [WARN] agent: [core][Server #4] grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"

Client agent:
2023-06-17T19:45:14.199Z [ERROR] agent.envoy: Error receiving new DeltaDiscoveryRequest; closing request channel: error="rpc error: code = Canceled desc = context canceled"
2023-06-17T19:45:40.173Z [ERROR] agent.http: Request error: method=GET url=/v1/acl/token/self from=10.226.244.17:55474 error="ACL support disabled"

pvyaka01 (Author) commented

By the way, this happened even with 1.13.x, so I'm guessing I missed some breaking changes while upgrading from 1.11.2. Any thoughts?
