Flagger loadtester webhook with wrk is ignoring metrics. #366

Cobaramin · 2019-11-13T05:39:55Z

Because of this issue in rakyll/hey, which is used in flagger-loadtester, I am trying to use Flagger with wrk instead of rakyll/hey, via my own custom loadtester image (just the old image with wrk installed).

I then followed the Linkerd canary deployment tutorial, with the Canary's webhook command set to wrk -d 1m -t 1 -c 1 http://podinfo-canary.test:9898/ and the request-success-rate metric. It works perfectly, but after I run the command below in the flagger-loadtester-xxxx pod, the rolling update still proceeds :(

curl http://podinfo-canary.test:9898/status/500

I just wonder why the canary deployment still proceeds despite the 500 status when using my custom loadtester image.

This is my Dockerfile:

FROM alpine:3.10.3 as build
RUN apk add --update alpine-sdk perl
RUN cd /tmp \
    && git clone -b 4.0.2 https://github.com/wg/wrk
RUN cd /tmp/wrk \
    && make

FROM alpine:3.10.3

RUN addgroup -S app \
    && adduser -S -g app app \
    && apk --no-cache add ca-certificates curl jq coreutils dpkg-dev dpkg make bash libgcc

COPY --from=build /tmp/wrk/wrk /usr/local/bin/

WORKDIR /home/app

RUN curl -sSLo hey "https://storage.googleapis.com/hey-release/hey_linux_amd64" && \
chmod +x hey && mv hey /usr/local/bin/hey

# verify hey works
RUN hey -n 1 -c 1 https://flagger.app > /dev/null && echo $? | grep 0

RUN curl -sSL "https://get.helm.sh/helm-v2.15.1-linux-amd64.tar.gz" | tar xvz && \
chmod +x linux-amd64/helm && mv linux-amd64/helm /usr/local/bin/helm && \
chmod +x linux-amd64/tiller && mv linux-amd64/tiller /usr/local/bin/tiller && \
rm -rf linux-amd64

RUN curl -sSL "https://get.helm.sh/helm-v3.0.0-rc.2-linux-amd64.tar.gz" | tar xvz && \
chmod +x linux-amd64/helm && mv linux-amd64/helm /usr/local/bin/helmv3 && \
rm -rf linux-amd64

RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
wget -qO /usr/local/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64 && \
chmod +x /usr/local/bin/grpc_health_probe

RUN curl -sSL "https://github.com/bojand/ghz/releases/download/v0.39.0/ghz_0.39.0_Linux_x86_64.tar.gz" | tar xz -C /tmp && \
mv /tmp/ghz /usr/local/bin && chmod +x /usr/local/bin/ghz && rm -rf /tmp/ghz-web

ADD https://raw.githubusercontent.com/grpc/grpc-proto/master/grpc/health/v1/health.proto /tmp/ghz/health.proto


RUN ls /tmp

COPY ./scripts/loadtester .
RUN chown -R app:app ./

USER app

RUN curl -sSL "https://github.com/rimusz/helm-tiller/archive/v0.9.3.tar.gz" | tar xvz && \
helm init --client-only && helm plugin install helm-tiller-0.9.3 && helm plugin list

ENTRYPOINT ["./loadtester"]

This is Flagger's Canary configuration:

apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  service:
    # ClusterIP port number
    port: 9898
    # container port number or name (optional)
    targetPort: 9898
  canaryAnalysis:
    # schedule interval (default 60s)
    interval: 30s
    # max number of failed metric checks before rollback
    threshold: 5
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 5
    # Linkerd Prometheus checks
    metrics:
    - name: request-success-rate
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      threshold: 99
      interval: 1m
    - name: request-duration
      # maximum req duration P99
      # milliseconds
      threshold: 500
      interval: 30s
    # testing (optional)
    webhooks:
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.flagger/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sd 'test' http://podinfo-canary.test:9898/token | grep token"
      - name: load-test
        type: rollout
        url: http://flagger-loadtester.flagger/
        metadata:
          # cmd: "hey -z 2m -q 10 -c 2 http://podinfo-canary.test:9898/"
          cmd: "wrk -d 1m -t 1 -c 1 http://podinfo-canary.test:9898/"

stefanprodan · 2019-11-13T07:34:21Z

A single call to the 500 endpoint will not bring the success rate percentage under 99%. Use wrk to hit that endpoint, or run watch -n 1 curl.
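To make the arithmetic concrete, here is a minimal Python sketch. The request counts are assumptions chosen for illustration (the actual throughput of wrk in this setup is not given in the thread), and `success_rate` is a hypothetical helper, not part of Flagger.

```python
def success_rate(total_requests: int, failed_requests: int) -> float:
    """Return the percentage of requests that did not fail."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    return 100.0 * (total_requests - failed_requests) / total_requests


# Assumption: wrk sends 1,000 successful requests per second over a
# 60-second metric window, plus one manual curl to /status/500.
window_total = 1000 * 60 + 1
rate = success_rate(window_total, failed_requests=1)

print(f"success rate: {rate:.4f}%")
print(f"below the 99% threshold: {rate < 99}")
```

Under these assumed numbers a single failed request barely moves the percentage, so the request-success-rate check would still pass.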

Cobaramin · 2019-11-15T06:54:49Z

@stefanprodan Actually, I have already tried watch -n 1 curl, but it's still not working. I haven't tried #368 yet, though.

stefanprodan · 2019-11-15T07:31:24Z

It all depends on how many requests/sec wrk generates; I guess one error per second doesn’t amount to one percent of the total traffic. Use wrk from inside the pod to generate errors. You could also set the success rate threshold to 100 and test it like that.
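As a rough sketch of that dependency, the Python snippet below estimates how many failing requests per second are needed to push the success rate below a given threshold. It assumes that all wrk traffic succeeds and that a check fails only when the rate is strictly below the threshold; `min_error_rps` and the sample throughput values are hypothetical, not taken from Flagger or from this thread.

```python
import math


def min_error_rps(ok_rps: float, threshold_pct: float) -> int:
    """Smallest whole number of failing requests per second that pushes the
    success rate strictly below threshold_pct, given ok_rps successful
    requests per second."""
    if ok_rps < 0:
        raise ValueError("ok_rps must not be negative")
    if not 0 < threshold_pct <= 100:
        raise ValueError("threshold_pct must be in (0, 100]")
    # ok / (ok + err) < t / 100   <=>   err > ok * (100 - t) / t
    bound = ok_rps * (100 - threshold_pct) / threshold_pct
    return math.floor(bound) + 1


# Assumed throughputs; the real wrk rate depends on the cluster and the app.
for ok_rps in (50, 99, 1000):
    needed = min_error_rps(ok_rps, 99)
    print(f"{ok_rps} ok req/s -> at least {needed} failing req/s at threshold 99")

# With the threshold set to 100, any failing traffic is enough.
print(min_error_rps(1000, 100))
```

With a threshold of 99, one error per second (as with watch -n 1 curl) is only enough when wrk produces fewer than 99 successful requests per second, which is why raising the threshold to 100 makes the rollback easier to reproduce.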

stefanprodan · 2019-11-28T08:43:02Z

Going to close this; please open a new issue if rollback doesn’t work for you.

stefanprodan mentioned this issue Nov 13, 2019

Add wrk to load tester tools #368

Merged

stefanprodan closed this as completed Nov 28, 2019
