Detect and flag missed notifications from RabbitMQ #4317

Merged
34 commits
6d1f792
[wip] Handling missed notifications via dead letters.
elland Oct 28, 2024
405ff35
fixup! [wip] Handling missed notifications via dead letters.
pcapriotti Oct 29, 2024
c04d4e0
Add cassandra to cannon's environment
pcapriotti Oct 29, 2024
e824b04
Refactor rabbitMQWebSocketApp
pcapriotti Oct 29, 2024
6c1a8ce
Read from rabbitmq and websocket concurrently
pcapriotti Oct 30, 2024
0faf96c
Fix and clean up BGWorker/cannon/gundeck/test.
elland Oct 31, 2024
cfda2da
gundeck: add rabbit/cassandra to charts/hack.
elland Nov 4, 2024
e022d9a
Gundeck: add rabbitmq creds to gundeck-integration.yaml.
elland Nov 5, 2024
3ba9878
ci: More rabbit settings.
elland Nov 5, 2024
74ca434
cql: Update local schema file.
elland Nov 5, 2024
dc8c5c6
Reverted notification QOS.
elland Nov 5, 2024
4b42b1a
Restore modservice settings.
elland Nov 5, 2024
04cc154
Cannon: Avoid deadlock in RabbitMqConsumerApp.
elland Nov 5, 2024
37f2e7d
Cannon: handle deadlocks in the consumer
elland Nov 6, 2024
8c6919c
integrations: Removed redundant test.
elland Nov 6, 2024
98dfe2f
wip: Cannon config for CI.
elland Nov 6, 2024
5c9b3db
[temp] Add debug logs for cannon.
elland Nov 6, 2024
23d552c
bgworker: Add more logs
elland Nov 7, 2024
a5cf223
wip: Try using modified backends for the tests
elland Nov 7, 2024
42a312c
wip: Try without modified backend for remaining failing tests
elland Nov 7, 2024
3e66288
BGWorker: Drop transient messages from the dlx queue.
elland Nov 7, 2024
125b8cc
integration: Revert 'debug' log level to 'info' for cannon
elland Nov 7, 2024
68643a0
integration: Add transient msg handling test for DLX.
elland Nov 7, 2024
73ea202
integration: Disable `failureContext` for failing test, add FUTUREWORK.
elland Nov 7, 2024
6eaa1ff
bgworker: Refactor connection creation to use a single function
elland Nov 11, 2024
a57c3fa
bgworker: Improve error handling in BackendDeadUserNotificationWatcher
elland Nov 11, 2024
2c46c2e
bgworker: Add health tracking.
elland Nov 11, 2024
b493832
fix: delete dead import, thanks hls.
elland Nov 11, 2024
ea56ef0
Merge branch 'WPB-10308-use-rabbti-mq-classic-queues-for-notification…
elland Nov 11, 2024
02c5fa6
Added comment.
elland Nov 11, 2024
a8d3540
Removed unnecessary timeout in test.
elland Nov 11, 2024
255b841
Fixed weird reversal of makefile cmd rename.
elland Nov 11, 2024
26deb72
De-weed.
elland Nov 11, 2024
9d8a4ce
Hi ci.
elland Nov 12, 2024
4 changes: 2 additions & 2 deletions Makefile
@@ -49,8 +49,8 @@ install: init
./hack/bin/cabal-run-all-tests.sh
./hack/bin/cabal-install-artefacts.sh all

-.PHONY: clean-rabbit
-clean-rabbit:
+.PHONY: rabbit-clean
+rabbit-clean:
rabbitmqadmin -f pretty_json list queues vhost name \
| jq -r '.[] | "rabbitmqadmin delete queue name=\(.name) --vhost=\(.vhost)"' \
| bash
20 changes: 20 additions & 0 deletions cassandra-schema.cql
@@ -1729,6 +1729,26 @@ CREATE TABLE gundeck_test.meta (
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

+CREATE TABLE gundeck_test.missed_notifications (
+user_id uuid,
+client_id text,
+PRIMARY KEY (user_id, client_id)
+) WITH CLUSTERING ORDER BY (client_id ASC)
+AND bloom_filter_fp_chance = 0.01
+AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
+AND comment = ''
+AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
+AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
+AND crc_check_chance = 1.0
+AND dclocal_read_repair_chance = 0.1
+AND default_time_to_live = 0
+AND gc_grace_seconds = 864000
+AND max_index_interval = 2048
+AND memtable_flush_period_in_ms = 0
+AND min_index_interval = 128
+AND read_repair_chance = 0.0
+AND speculative_retry = '99PERCENTILE';
+
CREATE TABLE gundeck_test.push (
ptoken text,
app text,
6 changes: 6 additions & 0 deletions charts/background-worker/templates/configmap.yaml
@@ -21,6 +21,12 @@ data:
host: federator
port: 8080

+cassandra:
+endpoint:
+host: {{ .cassandra.host }}
+port: 9042
+keyspace: gundeck
+
{{- with .rabbitmq }}
rabbitmq:
host: {{ .host }}
2 changes: 2 additions & 0 deletions charts/background-worker/values.yaml
@@ -29,6 +29,8 @@ config:
# tlsCaSecretRef:
# name: <secret-name>
# key: <ca-attribute>
+cassandra:
+host: aws-cassandra

backendNotificationPusher:
pushBackoffMinWait: 10000 # in microseconds, so 10ms
37 changes: 29 additions & 8 deletions charts/cannon/templates/configmap.yaml
@@ -1,25 +1,46 @@
apiVersion: v1
data:
+{{- with .Values }}
cannon.yaml: |
-logFormat: {{ .Values.config.logFormat }}
-logLevel: {{ .Values.config.logLevel }}
-logNetStrings: {{ .Values.config.logNetStrings }}
+logFormat: {{ .config.logFormat }}
+logLevel: {{ .config.logLevel }}
+logNetStrings: {{ .config.logNetStrings }}

cannon:
host: 0.0.0.0
-port: {{ .Values.service.externalPort }}
+port: {{ .service.externalPort }}
externalHostFile: /etc/wire/cannon/externalHost/host.txt

gundeck:
host: gundeck
port: 8080

+cassandra:
+endpoint:
+host: {{ .config.cassandra.host }}
+port: 9042
+keyspace: gundeck
+
+{{- with .config.rabbitmq }}
+rabbitmq:
+host: {{ .host }}
+port: {{ .port }}
+vHost: {{ .vHost }}
+enableTls: {{ .enableTls }}
+insecureSkipVerifyTls: {{ .insecureSkipVerifyTls }}
+{{- if .tlsCaSecretRef }}
+caCert: /etc/wire/gundeck/rabbitmq-ca/{{ .tlsCaSecretRef.key }}
+{{- end }}
+{{- end }}
+
drainOpts:
-gracePeriodSeconds: {{ .Values.config.drainOpts.gracePeriodSeconds }}
-millisecondsBetweenBatches: {{ .Values.config.drainOpts.millisecondsBetweenBatches }}
-minBatchSize: {{ .Values.config.drainOpts.minBatchSize }}
+gracePeriodSeconds: {{ .config.drainOpts.gracePeriodSeconds }}
+millisecondsBetweenBatches: {{ .config.drainOpts.millisecondsBetweenBatches }}
+minBatchSize: {{ .config.drainOpts.minBatchSize }}
+
+disabledAPIVersions: {{ toJson .config.disabledAPIVersions }}
+{{- end }}

-disabledAPIVersions: {{ toJson .Values.config.disabledAPIVersions }}

kind: ConfigMap
metadata:
29 changes: 29 additions & 0 deletions charts/cannon/values.yaml
@@ -11,6 +11,35 @@ config:
logLevel: Info
logFormat: StructuredJSON
logNetStrings: false
+rabbitmq:
+host: rabbitmq
+port: 5672
+vHost: /
+enableTls: false
+insecureSkipVerifyTls: false
+cassandra:
+host: aws-cassandra
+# To enable TLS provide a CA:
+# tlsCa: <CA in PEM format (can be self-signed)>
+#
+# Or refer to an existing secret (containing the CA):
+# tlsCaSecretRef:
+# name: <secret-name>
+# key: <ca-attribute>
+
+redis:
+host: redis-ephemeral-master
+port: 6379
+connectionMode: "master" # master | cluster
+enableTls: false
+insecureSkipVerifyTls: false
+# To configure custom TLS CA, please provide one of these:
+# tlsCa: <CA in PEM format (can be self-signed)>
+#
+# Or refer to an existing secret (containing the CA):
+# tlsCaSecretRef:
+# name: <secret-name>
+# key: <ca-attribute>

# See also the section 'Controlling the speed of websocket draining during
# cannon pod replacement' in docs/how-to/install/configuration-options.rst
12 changes: 12 additions & 0 deletions charts/gundeck/templates/configmap.yaml
@@ -29,6 +29,18 @@ data:
tlsCa: /etc/wire/gundeck/cassandra/{{- (include "tlsSecretRef" . | fromYaml).key }}
{{- end }}

+{{- with .rabbitmq }}
+rabbitmq:
+host: {{ .host }}
+port: {{ .port }}
+vHost: {{ .vHost }}
+enableTls: {{ .enableTls }}
+insecureSkipVerifyTls: {{ .insecureSkipVerifyTls }}
+{{- if .tlsCaSecretRef }}
+caCert: /etc/wire/gundeck/rabbitmq-ca/{{ .tlsCaSecretRef.key }}
+{{- end }}
+{{- end }}
+
redis:
host: {{ .redis.host }}
port: {{ .redis.port }}
19 changes: 19 additions & 0 deletions charts/gundeck/templates/deployment.yaml
@@ -39,6 +39,11 @@ spec:
- name: "gundeck-config"
configMap:
name: "gundeck"
+{{- if .Values.config.rabbitmq.tlsCaSecretRef }}
+- name: "rabbitmq-ca"
+secret:
+secretName: {{ .Values.config.rabbitmq.tlsCaSecretRef.name }}
+{{- end }}
{{- if eq (include "useCassandraTLS" .Values.config) "true" }}
- name: "gundeck-cassandra"
secret:
@@ -77,7 +82,21 @@ spec:
- name: "additional-redis-ca"
mountPath: "/etc/wire/gundeck/additional-redis-ca/"
{{- end }}
+{{- if .Values.config.rabbitmq.tlsCaSecretRef }}
+- name: "rabbitmq-ca"
+mountPath: "/etc/wire/gundeck/rabbitmq-ca/"
+{{- end }}
env:
+- name: RABBITMQ_USERNAME
+valueFrom:
+secretKeyRef:
+name: gundeck
+key: rabbitmqUsername
+- name: RABBITMQ_PASSWORD
+valueFrom:
+secretKeyRef:
+name: gundeck
+key: rabbitmqPassword
{{- if hasKey .Values.secrets "awsKeyId" }}
- name: AWS_ACCESS_KEY_ID
valueFrom:
2 changes: 2 additions & 0 deletions charts/gundeck/templates/secret.yaml
@@ -11,6 +11,8 @@ metadata:
type: Opaque
data:
{{- with .Values.secrets }}
+rabbitmqUsername: {{ .rabbitmq.username | b64enc | quote }}
+rabbitmqPassword: {{ .rabbitmq.password | b64enc | quote }}
{{- if hasKey . "awsKeyId" }}
awsKeyId: {{ .awsKeyId | b64enc | quote }}
{{- end }}
14 changes: 14 additions & 0 deletions charts/gundeck/templates/tests/gundeck-integration.yaml
@@ -23,6 +23,11 @@ spec:
secret:
secretName: {{ include "redisTlsSecretName" .Values.config }}
{{- end }}
+{{- if .Values.config.rabbitmq.tlsCaSecretRef }}
+- name: "rabbitmq-ca"
+secret:
+secretName: {{ .Values.config.rabbitmq.tlsCaSecretRef.name }}
+{{- end }}
containers:
- name: integration
# TODO: When deployed to staging (or real AWS env), _all_ tests should be run
@@ -72,6 +77,10 @@ spec:
- name: "redis-ca"
mountPath: "/etc/wire/gundeck/redis-ca/"
{{- end }}
+{{- if .Values.config.rabbitmq.tlsCaSecretRef }}
+- name: "rabbitmq-ca"
+mountPath: "/etc/wire/gundeck/rabbitmq-ca/"
+{{- end }}
env:
# these dummy values are necessary for Amazonka's "Discover"
- name: AWS_ACCESS_KEY_ID
@@ -82,6 +91,11 @@ spec:
value: "eu-west-1"
- name: TEST_XML
value: /tmp/result.xml
+# RabbitMQ needs dummy credentials for the tests to run
+- name: RABBITMQ_USERNAME
+value: "guest"
+- name: RABBITMQ_PASSWORD
+value: "guest"
{{- if hasKey .Values.secrets "redisUsername" }}
- name: REDIS_USERNAME
valueFrom:
7 changes: 7 additions & 0 deletions charts/gundeck/values.yaml
@@ -18,6 +18,13 @@ config:
logLevel: Info
logFormat: StructuredJSON
logNetStrings: false
+rabbitmq:
+host: rabbitmq
+port: 5672
+adminPort: 15672
+vHost: /
+enableTls: false
+insecureSkipVerifyTls: false
cassandra:
host: aws-cassandra
# To enable TLS provide a CA:
3 changes: 3 additions & 0 deletions charts/integration/templates/integration-integration.yaml
@@ -261,6 +261,9 @@ spec:
- name: rabbitmq-ca
mountPath: /etc/wire/background-worker/rabbitmq-ca

+- name: rabbitmq-ca
+mountPath: /etc/wire/gundeck/rabbitmq-ca
+
{{- if eq (include "useCassandraTLS" .Values.config) "true" }}
- name: "integration-cassandra"
mountPath: "/certs"
41 changes: 38 additions & 3 deletions hack/helm_vars/wire-server/values.yaml.gotmpl
@@ -80,7 +80,7 @@ brig:
enableTls: true
insecureSkipVerifyTls: false
tlsCaSecretRef:
-name: rabbitmq-certificate
+name: "rabbitmq-certificate"
key: "ca.crt"
authSettings:
userTokenTimeout: 120
@@ -205,7 +205,23 @@ cannon:
memory: 512Mi
drainTimeout: 0
config:
+cassandra:
+host: {{ .Values.cassandraHost }}
+replicaCount: 1
disabledAPIVersions: []
+rabbitmq:
+port: 5671
+adminPort: 15671
+enableTls: true
+insecureSkipVerifyTls: false
+tlsCaSecretRef:
+name: "rabbitmq-certificate"
+key: "ca.crt"
+secrets:
+rabbitmq:
+username: {{ .Values.rabbitmqUsername }}
+password: {{ .Values.rabbitmqPassword }}
+
cargohold:
replicaCount: 1
imagePullPolicy: {{ .Values.imagePullPolicy }}
@@ -252,7 +268,7 @@ galley:
enableTls: true
insecureSkipVerifyTls: false
tlsCaSecretRef:
-name: rabbitmq-certificate
+name: "rabbitmq-certificate"
key: "ca.crt"
enableFederation: true # keep in sync with brig.config.enableFederation, cargohold.config.enableFederation and tags.federator!
settings:
@@ -373,6 +389,14 @@ gundeck:
name: "cassandra-jks-keystore"
key: "ca.crt"
{{- end }}
+rabbitmq:
+port: 5671
+adminPort: 15671
+enableTls: true
+insecureSkipVerifyTls: false
+tlsCaSecretRef:
+name: "rabbitmq-certificate"
+key: "ca.crt"
redis:
host: redis-ephemeral-master
connectionMode: master
@@ -395,6 +419,9 @@ gundeck:
awsKeyId: dummykey
awsSecretKey: dummysecret
redisPassword: very-secure-redis-master-password
+rabbitmq:
+username: {{ .Values.rabbitmqUsername }}
+password: {{ .Values.rabbitmqPassword }}
tests:
{{- if .Values.uploadXml }}
config:
@@ -518,13 +545,21 @@ background-worker:
pushBackoffMinWait: 1000 # 1ms
pushBackoffMaxWait: 500000 # 0.5s
remotesRefreshInterval: 1000000 # 1s
+cassandra:
+host: {{ .Values.cassandraHost }}
+replicaCount: 1
+{{- if .Values.useK8ssandraSSL.enabled }}
+tlsCaSecretRef:
+name: "cassandra-jks-keystore"
+key: "ca.crt"
+{{- end }}
rabbitmq:
port: 5671
adminPort: 15671
enableTls: true
insecureSkipVerifyTls: false
tlsCaSecretRef:
-name: rabbitmq-certificate
+name: "rabbitmq-certificate"
key: "ca.crt"
secrets:
rabbitmq:
2 changes: 1 addition & 1 deletion hack/helmfile.yaml
@@ -1,5 +1,5 @@
---
-# This helfile is used for the setup of two ephemeral backends on kubernetes
+# This helmfile is used for the setup of two ephemeral backends on kubernetes
# during integration testing (including federation integration tests spanning
# over 2 backends)
# This helmfile is used via the './hack/bin/integration-setup-federation.sh' via
7 changes: 0 additions & 7 deletions integration/test/API/GalleyInternal.hs
@@ -114,13 +114,6 @@ patchTeamFeatureConfig domain team featureName payload = do
req <- baseRequest domain Galley Unversioned $ joinHttpPath ["i", "teams", tid, "features", fn]
submit "PATCH" $ req & addJSON p

--- https://staging-nginz-https.zinfra.io/api-internal/swagger-ui/galley/#/galley/post_i_features_multi_teams_searchVisibilityInbound
-getFeatureStatusMulti :: (HasCallStack, MakesValue domain, MakesValue featureName) => domain -> featureName -> [String] -> App Response
-getFeatureStatusMulti domain featureName tids = do
-fn <- asString featureName
-req <- baseRequest domain Galley Unversioned $ joinHttpPath ["i", "features-multi-teams", fn]
-submit "POST" $ req & addJSONObject ["teams" .= tids]
-
patchTeamFeature :: (HasCallStack, MakesValue domain, MakesValue team) => domain -> team -> String -> Value -> App Response
patchTeamFeature domain team featureName payload = do
tid <- asString team
3 changes: 0 additions & 3 deletions integration/test/MLS/Util.hs
@@ -835,9 +835,6 @@ createApplicationMessage convId cid messageContent = do
groupInfo = Nothing
}

-setMLSCiphersuite :: ConvId -> Ciphersuite -> App ()
-setMLSCiphersuite convId suite = modifyMLSState $ \mls -> mls {convs = Map.adjust (\conv -> conv {ciphersuite = suite}) convId mls.convs}
-
leaveConv ::
(HasCallStack) =>
ConvId ->