Postgres sql failure on 1.9.0 and 1.10.0 #993
Comments
Thanks for the detailed bug report @AyWa! Taking a 👀. I'm actually out of town this weekend but will try to provide an update in the next couple of days. |
Thank you, it is not really urgent. I will also try to reproduce outside our environment (maybe just docker compose or a simple k8s setup) |
Cool, in the meantime I bumped the postgres example (using docker-compose) and the integration tests to use postgres 12 instead of the old postgres 10 to see if that exposes anything |
FYI I've traced it to the migration check that automatically runs on Flipt startup: https://github.com/flipt-io/flipt/blob/main/storage/sql/migrator.go#L42 I wonder if something changed in the underlying 'github.com/golang-migrate/migrate/database/postgres' library between versions that is causing this error. Going to keep digging this coming week |
This bug seems to be a little tricky. In docker compose everything works like:
However, on k8s, I can reproduce.
I think this working example is correct, because if I set image For now I tried in my local k8s ( |
So I tried something else today: if I deploy the db and flipt in the same namespace, the db connection string can be changed to: So I think there is an issue in the docker image about DNS resolution like |
That looks promising thank you for digging in! I will be able to take a deeper dive tomorrow am. I think the problem may lie in the migrator library I linked earlier. |
Likely related to #963 as well |
👋 This is likely to do with how see: segmentio/kafka-go#285 and golang/go#35067 Meanwhile, try dropping the Not sure why the resolver is changing for flipt. I thought it was a change to static compilation. But I think I was wrong there. |
Thanks @GeorgeMac !! I think this may have broken when I switched how Flipt is built for release. Pre #927 I was building locally, using musl to cross-compile to linux (statically linked). In #927 I changed over to using GitHub Actions to build/release (see: https://github.com/flipt-io/flipt/pull/927/files#diff-42e26dc67aed8aa3edb2472b4403288c1699fb6dc47419b9a475f0f224fe4689). I wonder if I need to set the netdns flags now that I'm not using musl-cross? |
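For anyone wanting to compare the two resolver paths discussed here, a minimal Go sketch follows. It assumes the failure is specific to the libc-based (cgo) resolver, which has not been confirmed at this point in the thread. `net.Resolver` with `PreferGo: true` asks for Go's built-in DNS client, which is the same preference `GODEBUG=netdns=go` expresses for the whole process. `newGoResolver` and `lookup` are hypothetical helper names, and the default hostname is simply the one from the error message.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// newGoResolver (hypothetical helper) returns a resolver that prefers Go's
// built-in DNS client over the libc/cgo one, where the platform allows it.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

// lookup (hypothetical helper) resolves host with the given resolver,
// bounded by a 5 second timeout.
func lookup(r *net.Resolver, host string) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return r.LookupHost(ctx, host)
}

func main() {
	// Assumption: default to the hostname from the reported error.
	host := "postgres.database.svc.cluster.local"
	if len(os.Args) > 1 {
		host = os.Args[1]
	}

	resolvers := []struct {
		name string
		r    *net.Resolver
	}{
		{"default", net.DefaultResolver},
		{"pure-go", newGoResolver()},
	}

	// Try each resolver in turn and report what it returns.
	for _, c := range resolvers {
		addrs, err := lookup(c.r, host)
		if err != nil {
			fmt.Printf("%s resolver: error: %v\n", c.name, err)
			continue
		}
		fmt.Printf("%s resolver: %v\n", c.name, addrs)
	}
}
```

If this is built the same way as the release binary and run in the same base image, a failure on the default resolver only would point at the cgo/libc path rather than at the cluster DNS itself.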
I created #1001 to address. I'm actually having a hard time reproducing/verifying the fix however because of the move to building everything in CI. I need to build on a linux machine (to reproduce how GH Actions is building) and the only linux machine I have available locally is an Ubuntu VM running in UTM, however I installed the ARM Ubuntu version (I'm running on an M1 Mac) so goreleaser can't build it for x64 when running in the VM. Also, goreleaser doesn't allow me to push a snapshot build to Docker Hub, only release builds.. so I can't very easily create a test docker image to deploy to k8s/kind to test. 😠 A couple of ideas:
What do you think @GeorgeMac @AyWa ? |
I am actually running on an M1 Mac too, but I could try to build the image on a VM tomorrow and see if it works |
My suggestion would probably be in the 1/2 space. The other option might be to play with an alternative base image to replicate it. |
First things first. Here is an adjusted version of
# https://goreleaser.com/docker/
FROM golang:1.17.13-buster AS build
RUN apt clean && apt update && apt install -y gcc-x86-64-linux-gnu
RUN curl https://raw.githubusercontent.com/creationix/nvm/master/install.sh | bash
RUN \. "$HOME/.nvm/nvm.sh" && nvm install --lts
RUN \. "$HOME/.nvm/nvm.sh" && nvm use --lts
RUN go install github.com/goreleaser/goreleaser@v1.6.2
RUN go install github.com/go-task/task/v3/cmd/task@latest
WORKDIR /flipt
ADD . /flipt
RUN \. "$HOME/.nvm/nvm.sh" && task prep -f
ENV ANALYTICS_KEY=foo
ENV CC=x86_64-linux-gnu-gcc
ENV GOARCH=amd64
RUN goreleaser --rm-dist --snapshot build
FROM alpine:3.16.2
LABEL maintainer="dev@flipt.io"
LABEL org.opencontainers.image.name="flipt"
LABEL org.opencontainers.image.source="https://github.com/flipt-io/flipt"
RUN apk add --no-cache postgresql-client \
    openssl \
    ca-certificates
RUN mkdir -p /etc/flipt && \
    mkdir -p /var/opt/flipt
COPY --from=build /flipt/dist/flipt_linux_amd64/flipt /
COPY config/migrations/ /etc/flipt/config/migrations/
COPY config/*.yml /etc/flipt/config/
RUN addgroup flipt && \
    adduser -S -D -g '' -G flipt -s /bin/sh flipt && \
    chown -R flipt:flipt /etc/flipt /var/opt/flipt
EXPOSE 8080
EXPOSE 9000
USER flipt
CMD ["./flipt"]
Then I did the following to validate the error can be reproduced and is fixed by
The issue disappears for me 👍
|
Thanks @GeorgeMac for validating the fix! Great idea of building in the container. @AyWa I went ahead and backported the fix for 1.9 and created v1.9.1 as well as forward fixed for 1.10 (v1.10.1). Would you mind giving either v1.9.1 or v1.10.1 a try and see if they now work for you? |
Sure I will try to test locally and in our stg cluster tomorrow. Thx for the quick fix !! |
Can confirm the fix in 1.9.1 when updating the yaml provided by @AyWa to use v1.9.1 of the image:
|
It is perfectly working !!! thank you |
Describe the Bug
We are running Flipt in k8s in multiple environments. We are on 1.8.3 and everything is working well, but when we try to update to 1.9.0 or 1.10.0 (even on a new db), we get an error:
getting db driver for: postgres: dial tcp: lookup postgres.database.svc.cluster.local: device or resource busy
Version Info
1.8.2 -> working
1.8.3 -> working
1.9.0 -> error
1.10.0 -> error
Our PostgreSQL is postgres:12
To Reproduce
We just start the docker image with the env variable FLIPT_DB_URL set to postgres://yy@postgres.database.svc.cluster.local:5432/flipt?sslmode=disable
I went through the list of changes, but I could not see any change related to postgres. Did we miss something, or do we need to change something?
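To check which hostname the failing lookup refers to, independently of Flipt, the host and port can be pulled out of the connection string. This is a minimal sketch using only the Go standard library. `dbHostPort` is a hypothetical helper name, not something from the Flipt codebase.

```go
package main

import (
	"fmt"
	"net/url"
)

// dbHostPort is a hypothetical helper (not part of Flipt) that extracts the
// host and port from a database URL such as the FLIPT_DB_URL value above.
func dbHostPort(rawURL string) (host, port string, err error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", "", fmt.Errorf("parsing db url: %w", err)
	}
	return u.Hostname(), u.Port(), nil
}

func main() {
	// The connection string from the reproduction steps.
	host, port, err := dbHostPort("postgres://yy@postgres.database.svc.cluster.local:5432/flipt?sslmode=disable")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("host=%s port=%s\n", host, port)
}
```

The extracted host is the name that appears after `lookup` in the error, so it is the one to test with a DNS lookup from inside the pod.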
log