Use separate services for storing job queues and cache (#7245)

### Motivation and context
Job queues and cached data have different characteristics, and we expect
different things from each:

* job queues are small and we'd rather not lose them (although losing
them is not fatal);

* cached chunks are large and we don't care if we lose them.

We currently store both in KeyDB, which has shown itself to not be
especially reliable. A few times we've had to clear the KeyDB store due
to data corruption, which destroyed the queues as well. While we'll
probably end up replacing KeyDB with something else, it would still be
useful to have the ability to just clear the cache volume without taking
out the job queues in the process.

As a solution to this, add a Redis service to be used only for the
queues (and potentially for other small data items). Using the original
Redis instead of KeyDB should also help with reliability (at least as
far as the job queues are concerned).

### How has this been tested?
I checked that CVAT can still start using the development environment
instructions, the Compose file, and the Helm chart.

### Checklist
- [x] I submit my changes into the `develop` branch
- [x] I have created a changelog fragment <!-- see top comment in
CHANGELOG.md -->
- ~~[ ] I have updated the documentation accordingly~~
- ~~[ ] I have added tests to cover my changes~~
- ~~[ ] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))~~
- ~~[ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))~~

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project.
  Feel free to contact the maintainers if that's a concern.
SpecLad authored Dec 19, 2023
1 parent 80e1212 commit 48ab12b
Showing 21 changed files with 225 additions and 137 deletions.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -177,7 +177,7 @@ RUN if [ "${CVAT_DEBUG_ENABLED}" = 'yes' ]; then \
 COPY cvat/nginx.conf /etc/nginx/nginx.conf
 COPY --chown=${USER} components /tmp/components
 COPY --chown=${USER} supervisord/ ${HOME}/supervisord
-COPY --chown=${USER} wait-for-it.sh manage.py backend_entrypoint.sh ${HOME}/
+COPY --chown=${USER} wait-for-it.sh manage.py backend_entrypoint.sh wait_for_deps.sh ${HOME}/
 COPY --chown=${USER} utils/ ${HOME}/utils
 COPY --chown=${USER} cvat/ ${HOME}/cvat
 COPY --chown=${USER} rqscheduler.py ${HOME}
4 changes: 4 additions & 0 deletions changelog.d/20231215_165536_roman_split_redis.md
@@ -0,0 +1,4 @@
+### Changed
+
+- Job queues are now stored in a dedicated Redis instance
+  (<https://github.com/opencv/cvat/pull/7245>)
68 changes: 29 additions & 39 deletions cvat/settings/base.py
@@ -41,10 +41,6 @@
 ALLOWED_HOSTS = os.environ.get('ALLOWED_HOSTS', 'localhost,127.0.0.1').split(',')
 INTERNAL_IPS = ['127.0.0.1']

-redis_host = os.getenv('CVAT_REDIS_HOST', 'localhost')
-redis_port = os.getenv('CVAT_REDIS_PORT', 6379)
-redis_password = os.getenv('CVAT_REDIS_PASSWORD', '')
-
 def generate_secret_key():
     """
     Creates secret_key.py in such a way that multiple processes calling
@@ -289,62 +285,49 @@ class CVAT_QUEUES(Enum):
     ANALYTICS_REPORTS = 'analytics_reports'
     CLEANING = 'cleaning'

+redis_inmem_host = os.getenv('CVAT_REDIS_INMEM_HOST', 'localhost')
+redis_inmem_port = os.getenv('CVAT_REDIS_INMEM_PORT', 6379)
+redis_inmem_password = os.getenv('CVAT_REDIS_INMEM_PASSWORD', '')
+
+shared_queue_settings = {
+    'HOST': redis_inmem_host,
+    'PORT': redis_inmem_port,
+    'DB': 0,
+    'PASSWORD': urllib.parse.quote(redis_inmem_password),
+}
+
 RQ_QUEUES = {
     CVAT_QUEUES.IMPORT_DATA.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '4h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
     CVAT_QUEUES.EXPORT_DATA.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '4h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
     CVAT_QUEUES.AUTO_ANNOTATION.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '24h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
     CVAT_QUEUES.WEBHOOKS.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '1h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
     CVAT_QUEUES.NOTIFICATIONS.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '1h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
     CVAT_QUEUES.QUALITY_REPORTS.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '1h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
     CVAT_QUEUES.ANALYTICS_REPORTS.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '1h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
     CVAT_QUEUES.CLEANING.value: {
-        'HOST': redis_host,
-        'PORT': redis_port,
-        'DB': 0,
+        **shared_queue_settings,
         'DEFAULT_TIMEOUT': '1h',
-        'PASSWORD': urllib.parse.quote(redis_password),
     },
 }
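The queue definitions in this hunk rely on dictionary unpacking to share one set of connection settings. A minimal, self-contained sketch of that pattern follows. The literal host, port, and password values, the queue names, and the `make_queue` helper are placeholders for illustration and are not part of the change.

```python
# Connection settings shared by every queue (placeholder values for illustration).
shared_queue_settings = {
    'HOST': 'localhost',
    'PORT': 6379,
    'DB': 0,
    'PASSWORD': '',
}


def make_queue(default_timeout):
    """Hypothetical helper: copy the shared settings and add a per-queue timeout."""
    # `**` unpacks into a new dict, so the shared dict itself is never mutated.
    return {**shared_queue_settings, 'DEFAULT_TIMEOUT': default_timeout}


queues = {
    'import': make_queue('4h'),
    'webhooks': make_queue('1h'),
}

for name, settings in queues.items():
    print(name, settings)
```

Because each queue gets its own copy, changing the Redis endpoint only requires touching `shared_queue_settings`, while per-queue keys such as `DEFAULT_TIMEOUT` stay local to each entry.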

@@ -543,15 +526,22 @@ class CVAT_QUEUES(Enum):
     'analytics_visibility': True,
 }

+redis_ondisk_host = os.getenv('CVAT_REDIS_ONDISK_HOST', 'localhost')
+# The default port is not Redis's default port (6379).
+# This is so that a developer can run both in-mem and on-disk Redis on their machine
+# without running into a port conflict.
+redis_ondisk_port = os.getenv('CVAT_REDIS_ONDISK_PORT', 6479)
+redis_ondisk_password = os.getenv('CVAT_REDIS_ONDISK_PASSWORD', '')
+
 CACHES = {
     'default': {
         'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
     },
-    'media' : {
+    'media': {
         'BACKEND' : 'django.core.cache.backends.redis.RedisCache',
-        "LOCATION": f"redis://:{urllib.parse.quote(redis_password)}@{redis_host}:{redis_port}",
+        "LOCATION": f"redis://:{urllib.parse.quote(redis_ondisk_password)}@{redis_ondisk_host}:{redis_ondisk_port}",
         'TIMEOUT' : 3600 * 24, # 1 day
-    }
+    }
 }

 USE_CACHE = True
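The `LOCATION` value for the `media` cache is a `redis://` URL with an empty username and a percent-encoded password. The sketch below shows how such a URL comes out for a couple of inputs. The `redis_location` helper and the sample password are hypothetical and only illustrate the string construction used in the settings.

```python
import urllib.parse


def redis_location(host, port, password=''):
    """Build a redis:// URL with an empty username and a percent-encoded password."""
    return f"redis://:{urllib.parse.quote(password)}@{host}:{port}"


# Development default from the settings: no password, on-disk store on port 6479.
print(redis_location('localhost', 6479))

# Characters such as '@' and ':' are percent-encoded so they cannot be
# mistaken for URL delimiters (sample password, for illustration only).
print(redis_location('cvat_redis_ondisk', 6379, 'p@ss:word'))
```

Note that `urllib.parse.quote` leaves `/` unescaped by default, so this sketch assumes passwords without slashes.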
6 changes: 5 additions & 1 deletion docker-compose.dev.yml
@@ -105,6 +105,10 @@ services:
     ports:
       - '8181:8181'

-  cvat_redis:
+  cvat_redis_inmem:
     ports:
       - '6379:6379'
+
+  cvat_redis_ondisk:
+    ports:
+      - '6479:6379'
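With the development overrides above, the in-memory store is published on host port 6379 and the on-disk store on host port 6479. A small sketch for checking from the host that both ports accept TCP connections is shown below. It is not part of the change: the `port_is_open` helper is hypothetical, and the check only confirms that something is listening, not that it speaks the Redis protocol.

```python
import socket

# Host ports published by docker-compose.dev.yml in this change.
DEV_REDIS_PORTS = {'cvat_redis_inmem': 6379, 'cvat_redis_ondisk': 6479}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == '__main__':
    for name, port in DEV_REDIS_PORTS.items():
        state = 'reachable' if port_is_open('localhost', port) else 'not reachable'
        print(f'{name} (localhost:{port}): {state}')
```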
72 changes: 41 additions & 31 deletions docker-compose.yml
@@ -5,12 +5,23 @@
 x-backend-env: &backend-env
   CLICKHOUSE_HOST: clickhouse
   CVAT_POSTGRES_HOST: cvat_db
-  CVAT_REDIS_HOST: cvat_redis
+  CVAT_REDIS_INMEM_HOST: cvat_redis_inmem
+  CVAT_REDIS_INMEM_PORT: 6379
+  CVAT_REDIS_ONDISK_HOST: cvat_redis_ondisk
+  CVAT_REDIS_ONDISK_PORT: 6379
   DJANGO_LOG_SERVER_HOST: vector
   DJANGO_LOG_SERVER_PORT: 80
   no_proxy: clickhouse,grafana,vector,nuclio,opa,${no_proxy:-}
   SMOKESCREEN_OPTS: ${SMOKESCREEN_OPTS:-}

+x-backend-deps: &backend-deps
+  cvat_redis_inmem:
+    condition: service_started
+  cvat_redis_ondisk:
+    condition: service_started
+  cvat_db:
+    condition: service_started
+
 services:
   cvat_db:
     container_name: cvat_db
@@ -25,8 +36,22 @@ services:
     networks:
       - cvat

-  cvat_redis:
-    container_name: cvat_redis
+  cvat_redis_inmem:
+    container_name: cvat_redis_inmem
+    image: redis:7.2.3-alpine
+    restart: always
+    command: [
+      "redis-server",
+      "--save", "60", "100",
+      "--appendonly", "yes",
+    ]
+    volumes:
+      - cvat_inmem_db:/data
+    networks:
+      - cvat
+
+  cvat_redis_ondisk:
+    container_name: cvat_redis_ondisk
     image: eqalpha/keydb:x86_64_v6.3.2
     restart: always
     command: [
@@ -46,9 +71,10 @@ services:
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
     depends_on:
-      - cvat_redis
-      - cvat_db
-      - cvat_opa
+      <<: *backend-deps
+      cvat_opa:
+        condition:
+          service_started
     environment:
       <<: *backend-env
       DJANGO_MODWSGI_EXTRA_ARGS: ''
@@ -79,13 +105,10 @@ services:
     container_name: cvat_utils
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
-    depends_on:
-      - cvat_redis
-      - cvat_db
-      - cvat_opa
+    depends_on: *backend-deps
     environment:
       <<: *backend-env
-      CVAT_REDIS_PASSWORD: ''
+      CVAT_REDIS_INMEM_PASSWORD: ''
       NUMPROCS: 1
     command: run utils
     volumes:
@@ -99,9 +122,7 @@ services:
     container_name: cvat_worker_import
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
-    depends_on:
-      - cvat_redis
-      - cvat_db
+    depends_on: *backend-deps
     environment:
       <<: *backend-env
       NUMPROCS: 2
@@ -117,9 +138,7 @@ services:
     container_name: cvat_worker_export
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
-    depends_on:
-      - cvat_redis
-      - cvat_db
+    depends_on: *backend-deps
     environment:
       <<: *backend-env
       NUMPROCS: 2
@@ -135,10 +154,7 @@ services:
     container_name: cvat_worker_annotation
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
-    depends_on:
-      - cvat_redis
-      - cvat_db
-      - cvat_opa
+    depends_on: *backend-deps
     environment:
       <<: *backend-env
       NUMPROCS: 1
@@ -154,10 +170,7 @@ services:
     container_name: cvat_worker_webhooks
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
-    depends_on:
-      - cvat_redis
-      - cvat_db
-      - cvat_opa
+    depends_on: *backend-deps
     environment:
       <<: *backend-env
       NUMPROCS: 1
@@ -173,9 +186,7 @@ services:
     container_name: cvat_worker_quality_reports
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
-    depends_on:
-      - cvat_redis
-      - cvat_db
+    depends_on: *backend-deps
     environment:
       <<: *backend-env
       NUMPROCS: 1
@@ -191,9 +202,7 @@ services:
     container_name: cvat_worker_analytics_reports
     image: cvat/server:${CVAT_VERSION:-dev}
     restart: always
-    depends_on:
-      - cvat_redis
-      - cvat_db
+    depends_on: *backend-deps
     environment:
       <<: *backend-env
       NUMPROCS: 2
@@ -371,6 +380,7 @@ volumes:
   cvat_data:
   cvat_keys:
   cvat_logs:
+  cvat_inmem_db:
   cvat_events_db:
   cvat_cache_db:

5 changes: 5 additions & 0 deletions helm-chart/Chart.yaml
@@ -56,6 +56,11 @@ dependencies:
     repository: https://helm.traefik.io/traefik
     condition: traefik.enabled

+  - name: redis
+    version: "18.5.*"
+    repository: https://charts.bitnami.com/bitnami
+    condition: redis.enabled
+
   - name: keydb
     version: 0.48.0
     repository: https://enapter.github.io/charts/
24 changes: 21 additions & 3 deletions helm-chart/templates/_helpers.tpl
@@ -62,18 +62,36 @@ Create the name of the service account to use
 {{- end }}

 {{- define "cvat.sharedBackendEnv" }}
+{{- if .Values.redis.enabled }}
+- name: CVAT_REDIS_INMEM_HOST
+  value: "{{ .Release.Name }}-redis-master"
+{{- else }}
+- name: CVAT_REDIS_INMEM_HOST
+  value: "{{ .Values.redis.external.host }}"
+{{- end }}
+- name: CVAT_REDIS_INMEM_PORT
+  value: "6379"
+- name: CVAT_REDIS_INMEM_PASSWORD
+  valueFrom:
+    secretKeyRef:
+      name: "{{ tpl (.Values.redis.secret.name) . }}"
+      key: password
+
 {{- if .Values.keydb.enabled }}
-- name: CVAT_REDIS_HOST
+- name: CVAT_REDIS_ONDISK_HOST
   value: "{{ .Release.Name }}-keydb"
 {{- else }}
-- name: CVAT_REDIS_HOST
+- name: CVAT_REDIS_ONDISK_HOST
   value: "{{ .Values.keydb.external.host }}"
 {{- end }}
-- name: CVAT_REDIS_PASSWORD
+- name: CVAT_REDIS_ONDISK_PORT
+  value: "6379"
+- name: CVAT_REDIS_ONDISK_PASSWORD
   valueFrom:
     secretKeyRef:
       name: "{{ tpl (.Values.keydb.secret.name) . }}"
       key: password
+
 {{- if .Values.postgresql.enabled }}
 - name: CVAT_POSTGRES_HOST
   value: "{{ .Release.Name }}-postgresql"
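The host selection in `cvat.sharedBackendEnv` can be hard to follow in template syntax. The Python sketch below mirrors only the branching for the two host and port variables (the password secrets are left out). The `backend_redis_env` function, the release name `cvat`, and the external host names in the usage example are assumptions made for illustration; the real rendering is done by Helm.

```python
def backend_redis_env(values, release_name):
    """Mirror the if/else branches of the template for the Redis/KeyDB endpoints."""
    redis = values['redis']
    keydb = values['keydb']
    return {
        # Bundled Redis chart: use its master service; otherwise the external host.
        'CVAT_REDIS_INMEM_HOST': (
            f'{release_name}-redis-master' if redis['enabled'] else redis['external']['host']
        ),
        'CVAT_REDIS_INMEM_PORT': '6379',
        # Bundled KeyDB chart: use its service; otherwise the external host.
        'CVAT_REDIS_ONDISK_HOST': (
            f'{release_name}-keydb' if keydb['enabled'] else keydb['external']['host']
        ),
        'CVAT_REDIS_ONDISK_PORT': '6379',
    }


# Hypothetical values: bundled Redis, externally managed KeyDB.
example_values = {
    'redis': {'enabled': True, 'external': {'host': '127.0.0.1'}},
    'keydb': {'enabled': False, 'external': {'host': 'keydb.example.internal'}},
}
print(backend_redis_env(example_values, 'cvat'))
```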
12 changes: 12 additions & 0 deletions helm-chart/templates/cvat-redis-secret.yml
@@ -0,0 +1,12 @@
+{{- if .Values.redis.secret.create }}
+apiVersion: v1
+kind: Secret
+metadata:
+  name: "{{ tpl (.Values.redis.secret.name) . }}"
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "cvat.labels" . | nindent 4 }}
+type: generic
+stringData:
+  password: {{ .Values.redis.secret.password | toString | toYaml }}
+{{- end }}
14 changes: 14 additions & 0 deletions helm-chart/values.yaml
@@ -243,6 +243,20 @@ postgresql:
     postgres_password: cvat_postgresql_postgres
     replication_password: cvat_postgresql_replica

+# https://artifacthub.io/packages/helm/bitnami/redis
+redis:
+  enabled: true
+  external:
+    host: 127.0.0.1
+  architecture: standalone
+  auth:
+    existingSecret: "cvat-redis-secret"
+    existingSecretPasswordKey: password
+  secret:
+    create: true
+    name: cvat-redis-secret
+    password: cvat_redis
+  # TODO: persistence options

 # https://artifacthub.io/packages/helm/enapter/keydb
 keydb:
@@ -158,7 +158,7 @@ description: 'Installing a development environment for different operating syste
 ```bash
 docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build \
-  cvat_opa cvat_db cvat_redis cvat_server
+  cvat_opa cvat_db cvat_redis_inmem cvat_redis_ondisk cvat_server
 ```
 Note: this runs an extra copy of the CVAT server in order to supply rules to OPA.