Flagger loadtester webhook with wrk is ignoring metrics. #366

Closed
Cobaramin opened this issue Nov 13, 2019 · 4 comments

Comments

@Cobaramin

Because of this issue in rakyll/hey, which flagger-loadtester uses, I tried running Flagger with wrk instead of rakyll/hey, using my own custom loadtester image (just an old image with wrk installed).

Then I followed the Linkerd canary deployment tutorial
with the Canary webhook command wrk -d 1m -t 1 -c 1 http://podinfo-canary.test:9898/ and the request-success-rate metric. It works perfectly, but after I run the command below in the flagger-loadtester-xxxx pod, the rolling update still proceeds :(

curl http://podinfo-canary.test:9898/status/500

I just wonder why the canary deployment still proceeds despite the 500 status responses when I use my custom loadtester image.

This is my Dockerfile:

FROM alpine:3.10.3 as build
RUN apk add --update alpine-sdk perl
RUN cd /tmp \
    && git clone -b 4.0.2 https://github.com/wg/wrk
RUN cd /tmp/wrk \
    && make

FROM alpine:3.10.3

RUN addgroup -S app \
    && adduser -S -g app app \
    && apk --no-cache add ca-certificates curl jq coreutils dpkg-dev dpkg make bash libgcc

COPY --from=build /tmp/wrk/wrk /usr/local/bin/

WORKDIR /home/app

RUN curl -sSLo hey "https://storage.googleapis.com/hey-release/hey_linux_amd64" && \
chmod +x hey && mv hey /usr/local/bin/hey

# verify hey works
RUN hey -n 1 -c 1 https://flagger.app > /dev/null && echo $? | grep 0

RUN curl -sSL "https://get.helm.sh/helm-v2.15.1-linux-amd64.tar.gz" | tar xvz && \
chmod +x linux-amd64/helm && mv linux-amd64/helm /usr/local/bin/helm && \
chmod +x linux-amd64/tiller && mv linux-amd64/tiller /usr/local/bin/tiller && \
rm -rf linux-amd64

RUN curl -sSL "https://get.helm.sh/helm-v3.0.0-rc.2-linux-amd64.tar.gz" | tar xvz && \
chmod +x linux-amd64/helm && mv linux-amd64/helm /usr/local/bin/helmv3 && \
rm -rf linux-amd64

RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
wget -qO /usr/local/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64 && \
chmod +x /usr/local/bin/grpc_health_probe

RUN curl -sSL "https://github.com/bojand/ghz/releases/download/v0.39.0/ghz_0.39.0_Linux_x86_64.tar.gz" | tar xz -C /tmp && \
mv /tmp/ghz /usr/local/bin && chmod +x /usr/local/bin/ghz && rm -rf /tmp/ghz-web

ADD https://raw.githubusercontent.com/grpc/grpc-proto/master/grpc/health/v1/health.proto /tmp/ghz/health.proto


RUN ls /tmp

COPY ./scripts/loadtester .
RUN chown -R app:app ./

USER app

RUN curl -sSL "https://github.com/rimusz/helm-tiller/archive/v0.9.3.tar.gz" | tar xvz && \
helm init --client-only && helm plugin install helm-tiller-0.9.3 && helm plugin list

ENTRYPOINT ["./loadtester"]

This is Flagger's Canary configuration:

apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  service:
    # ClusterIP port number
    port: 9898
    # container port number or name (optional)
    targetPort: 9898
  canaryAnalysis:
    # schedule interval (default 60s)
    interval: 30s
    # max number of failed metric checks before rollback
    threshold: 5
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 5
    # Linkerd Prometheus checks
    metrics:
    - name: request-success-rate
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      threshold: 99
      interval: 1m
    - name: request-duration
      # maximum req duration P99
      # milliseconds
      threshold: 500
      interval: 30s
    # testing (optional)
    webhooks:
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.flagger/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sd 'test' http://podinfo-canary.test:9898/token | grep token"
      - name: load-test
        type: rollout
        url: http://flagger-loadtester.flagger/
        metadata:
          # cmd: "hey -z 2m -q 10 -c 2 http://podinfo-canary.test:9898/"
          cmd: "wrk -d 1m -t 1 -c 1 http://podinfo-canary.test:9898/"
@stefanprodan
Member

A single call to the 500 endpoint will not bring the success rate below 99%. Use wrk to hit that endpoint, or run watch -n 1 curl.

@Cobaramin
Author

Cobaramin commented Nov 15, 2019

@stefanprodan Actually, I have already tried watch -n 1 curl, but it's still not working. I haven't tried #368 yet.

@stefanprodan
Member

stefanprodan commented Nov 15, 2019

It all depends on how many requests/sec wrk generates. I guess one error per second doesn’t amount to one percent of the total traffic. Use wrk from inside the pod to generate errors. You could also set the success rate threshold to 100 and test it like that.
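
To make the arithmetic concrete, here is a rough sketch of how the success rate relates to request volume. It is a simplified model, not Flagger's actual Prometheus query, and the throughput figures (1000 and 10 requests/sec for wrk) are hypothetical numbers chosen for illustration, not measurements.

```python
def success_rate(total_requests: int, failed_requests: int) -> float:
    """Return the percentage of requests that did not fail (0-100).

    Simplified model: every request counts equally within one metric window.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 100.0 * (total_requests - failed_requests) / total_requests


WINDOW_SECONDS = 60      # matches the 1m interval of request-success-rate
ERRORS_PER_SECOND = 1    # one call per second to /status/500 (watch -n 1 curl)


def rate_for(wrk_rps: int) -> float:
    """Success rate over one window for a hypothetical wrk throughput."""
    ok = wrk_rps * WINDOW_SECONDS            # successful requests from wrk
    failed = ERRORS_PER_SECOND * WINDOW_SECONDS
    return success_rate(ok + failed, failed)


if __name__ == "__main__":
    for rps in (1000, 10):  # hypothetical throughputs
        print(f"wrk at {rps} req/s -> success rate {rate_for(rps):.2f}%")
```

Under these assumptions, one error per second stays above a threshold of 99 when wrk is fast and drops below it only when wrk is slow. With the threshold set to 100, any error in the window is enough to fail the check.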

@stefanprodan
Member

Going to close this. Please open a new issue if rollback doesn’t work for you.
