Loki uses wrong AdvertiseAddr for membership #5610
Comments
similar problem here - would like to get your nomad config to compare it with mine |
As far as I understand, it would be impossible to use Consul Connect here, since the membership algorithm seems to need to know all peers' addresses and Consul Connect hides them behind a single endpoint. So right now I'm just trying to make it work with bridge networking and port mapping. It works with host networking. |
are you using SSD mode or monolithic? what do you see when you access /ring? which flags are you using? your configuration looks fine, except that if all the three nodes are using the same |
I actually just started working on setting up a test Loki cluster in a Nomad environment and I am running into the exact same issue!
Config excerpt, from the Nomad Job's template stanza:
common:
  ring:
    instance_addr: {{ env "NOMAD_IP_loki_memberlist" }}
Nomad substitutes
Excerpt from Loki's config dump (the IP address is different for each instance of Loki):
common:
<snip>
ring:
<snip>
instance_interface_names:
- eth0
- en0
- lo
instance_port: 0
instance_addr: 10.x.x.21
instance_availability_zone: ""
<snip>
instance_interface_names:
- eth0
- en0
- lo
instance_port: 0
instance_addr: 10.x.x.21
querier:
query_timeout: 1m0s
tail_max_duration: 1h0m0s
<snip>
ingester:
lifecycler:
ring:
kvstore:
store: memberlist
<snip>
unregister_on_shutdown: true
readiness_check_ring_health: true
address: 10.x.x.21
port: 0
id: 04b682673488
Excerpt from Loki logs:
Example Nomad Job: https://gist.github.com/ddreier/3d9c93a555aa36058ae1cf907b98ca51 |
monolithic for PoC phase.
IPs are correct in the ring endpoint.
What do you mean by flags?
This config is from one node; instance_addr and name are different and correct on the others. The issue is that Loki is still trying to use the internal network, as shown by the debug logs I added. |
@ddreier And you can't do internal_network -> host_network -> bridge -> internal_network for yourself. At least without some iptables tuning. |
@Oloremo thanks, I was eventually able to get my POC up and running by setting the network_mode to host. Will just have to continue that practice for now until we can configure which IP address Loki advertises. |
There appears to be an undocumented instance_interface_names option in the frontend section (at least in 2.4.2). Here is a list of all instance_interface_names available when you run Loki with -print-config-stderr:
As I am not running Loki in a container or Kubernetes, this option in frontend defaults to eth0, en0, lo if not set. The only interface in this list that I am using is lo. Setting instance_interface_names in frontend to the actual NIC device name made the querier frontend work like a charm. No more delays or timeouts in SSD mode |
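To illustrate why the default list can end up on the loopback interface, here is a minimal Python sketch. It assumes the names in instance_interface_names are tried in order and the first one present on the host wins; that is a simplification for illustration, not Loki's actual code. pick_interface and host_interfaces are hypothetical helpers, and ens192 in the note below is a made-up NIC name.

```python
import socket

# Defaults reported in the comment above.
DEFAULT_NAMES = ["eth0", "en0", "lo"]


def pick_interface(preferred, available):
    """Return the first name in `preferred` that exists in `available`, else None."""
    present = set(available)
    for name in preferred:
        if name in present:
            return name
    return None


def host_interfaces():
    """Return the names of the network interfaces on this machine."""
    return [name for _, name in socket.if_nameindex()]


if __name__ == "__main__":
    # Shows which of the default names this host would match first, if any.
    print(pick_interface(DEFAULT_NAMES, host_interfaces()))
```

With a host that only has lo and ens192, pick_interface(DEFAULT_NAMES, ["lo", "ens192"]) falls through to "lo", which matches the symptom described; putting the real NIC name in the list avoids that.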
Hi! This issue has been automatically marked as stale because it has not had any We use a stalebot among other tools to help manage the state of issues in this project. Stalebots are also emotionless and cruel and can close issues which are still very relevant. If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry. We regularly sort for closed issues which have a We may also:
We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task, |
Glad to hear that you could find a work-around! But FYI, we've also added a new instance_interface_names setting directly under common. Before, you would have had
common:
  ring:
    instance_interface_names:
but now you could instead have
common:
  instance_interface_names:
The biggest difference is that the common instance_interface_names is also applied to the frontend, which doesn't happen for the configuration inside the ring (since the frontend isn't a ring). |
@kavirajk any updates? Still unsure how we could run Loki with bridged networking. |
not stale |
I would like to add here that it can easily be reproduced using Docker containers with bridge networking (not k8s) when trying to run a distributed deployment where each service (Loki target) is running on its own instance. |
I'm having the same issue. I'm trying to run Loki under Docker on two different hosts, and Loki always advertises the internal Docker IP, which is not reachable from the other member. |
I had the same issue. It was annoying, as the same setup works fine for Mimir. And then I just copied the advertise address/port configuration from Mimir into Loki ( E.g.
|
wait, Loki doesn't list Docs issue?.. |
That actually didn't fix the issue :/ |
Ok, so here the part of config with all the advertise addresses and ports replaced
and for nomad network/services I have
All those |
Can confirm that using the undocumented Basically without this |
Describe the bug
I'm trying to set up a 3-node Loki 2.4.2 cluster in a HashiCorp Nomad environment using bridge networking. So inside the Loki container there is an internal Nomad network (172.26.64.x) that is unreachable from the outside.
I map and expose ports 3100, 7946, and 9096 so they're reachable via the real node IP:port.
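A quick way to confirm that the mapped ports really are reachable from another node is a plain TCP connect test. The sketch below is only an illustration: can_connect is a hypothetical helper, and 10.0.0.21 is a placeholder to be replaced with a real node IP.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    node_ip = "10.0.0.21"  # placeholder: replace with a real node IP
    for port in (3100, 7946, 9096):  # the ports mapped in this issue
        print(port, can_connect(node_ip, port))
```

A successful TCP connect only shows that the port mapping works; it says nothing about which address Loki advertises to its peers.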
I also configured the ring config to set the right advertise addr:
Full config from one node:
https://gist.github.com/Oloremo/f64be59cea85bc9e01fe262b9b158006
But in the logs I see that Loki is trying to access the internal network:
Full logs: https://gist.github.com/Oloremo/e8de36fb505b74241b59234dccdf149b
So I think Loki ignores the ring configuration and is still trying to guess the network?
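One way to spot this symptom is to check whether any address a node tries to contact falls inside the internal bridge subnet. The following is a rough sketch: internal_members is a hypothetical helper, and the 172.26.64.0/20 subnet is an assumption chosen to cover the 172.26.64.x addresses mentioned above, so adjust it to the actual bridge configuration.

```python
import ipaddress

# Assumption: the internal bridge subnet; chosen to cover 172.26.64.x.
BRIDGE_SUBNET = ipaddress.ip_network("172.26.64.0/20")


def internal_members(addrs, subnet=BRIDGE_SUBNET):
    """Return the addresses that fall inside the internal bridge subnet."""
    return [a for a in addrs if ipaddress.ip_address(a) in subnet]


if __name__ == "__main__":
    # Example addresses only: one internal, one host-level.
    print(internal_members(["172.26.64.5", "10.0.0.21"]))
```

Any address this returns is one that other hosts cannot reach, since it only exists inside the bridge network.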
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Setting the instance_addr should remove network guessing.
Environment: