Firebreak/data hub test data contacts #5348

Closed
wants to merge 23 commits into from
23 commits
7c48b7d
Merge branch 'main' of https://github.com/uktrade/data-hub-api
bau123 Apr 8, 2024
921f1dd
Merge branch 'main' of https://github.com/uktrade/data-hub-api
bau123 Apr 9, 2024
03a4447
Update redis
marijnkampf Mar 11, 2024
2ffc6ae
Update open search
marijnkampf Mar 11, 2024
dba86b0
Update pingdom end point url
marijnkampf Mar 11, 2024
e9da205
Feature/dpm 174 data hub api investigate rq health check (#5266)
marijnkampf Mar 21, 2024
542a1b5
DPM 199 Data Hub API logging (#5304)
marijnkampf Apr 2, 2024
c5d7bb3
do not run collectstatic at run time on DBT Platform
acodeninja Apr 3, 2024
f34159a
add build configuration for DBT Platform
acodeninja Apr 3, 2024
9e852f4
remove debug setup
acodeninja Apr 8, 2024
565d4c5
Initial commit for factories
marijnkampf Apr 8, 2024
8ab6646
Added progress indicator
marijnkampf Apr 8, 2024
2e20d7f
Reduce default number for testing
marijnkampf Apr 8, 2024
810af9f
Update company factories calls so that they use existing objects when…
swenban Apr 8, 2024
6682b4a
Fix linting errors and improve query efficiency
swenban Apr 9, 2024
4efd9e5
Fix print linting errors
swenban Apr 9, 2024
59ac8ee
Initial commit for factories
marijnkampf Apr 8, 2024
7cc4b96
Added progress indicator
marijnkampf Apr 8, 2024
13787c3
Reduce default number for testing
marijnkampf Apr 8, 2024
161e68d
Update company factories calls so that they use existing objects when…
swenban Apr 8, 2024
1033048
Fix linting errors and improve query efficiency
swenban Apr 9, 2024
de6eaa3
Fix print linting errors
swenban Apr 9, 2024
1c3c20a
generate contact records
bau123 Apr 9, 2024
4 changes: 4 additions & 0 deletions .copilot/config.yml
@@ -0,0 +1,4 @@
repository: data-hub-api
builder:
name: paketobuildpacks/builder-jammy-full
version: 0.3.339
43 changes: 43 additions & 0 deletions .copilot/image_build_run.sh
@@ -0,0 +1,43 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run inside the container after all the other buildpacks have been applied
export ADMIN_OAUTH2_ENABLED="True"
export ADMIN_OAUTH2_BASE_URL=""
export ADMIN_OAUTH2_TOKEN_FETCH_PATH="/o/token/"
export ADMIN_OAUTH2_USER_PROFILE_PATH="/o/v1/user/me/"
export ADMIN_OAUTH2_AUTH_PATH="/o/authorize/"
export ADMIN_OAUTH2_CLIENT_ID="client-id"
export ADMIN_OAUTH2_CLIENT_SECRET="client-secret"
export ADMIN_OAUTH2_LOGOUT_PATH="/o/logout"
export ACTIVITY_STREAM_ACCESS_KEY_ID="some-id"
export ACTIVITY_STREAM_SECRET_ACCESS_KEY="some-secret"
export DATABASE_URL="postgresql://postgres:datahub@postgres/datahub"
export DEBUG="True"
export DJANGO_SECRET_KEY="changeme"
export DJANGO_SETTINGS_MODULE="config.settings.local"
export ES_INDEX_PREFIX="test_index"
export ES5_URL="http://localhost:9200"
export OPENSEARCH_URL="http://localhost:9200"
export OPENSEARCH_INDEX_PREFIX="test_index"
export PAAS_IP_ALLOWLIST="1.2.3.4"
export AWS_DEFAULT_REGION="eu-west-2"
export AWS_ACCESS_KEY_ID="foo"
export AWS_SECRET_ACCESS_KEY="bar"
export DEFAULT_BUCKET="baz"
export SSO_ENABLED="True"
export STAFF_SSO_BASE_URL="http://sso.invalid/"
export STAFF_SSO_AUTH_TOKEN="sso-token"
export DIT_EMAIL_DOMAINS="trade.gov.uk,digital.trade.gov.uk"
export DATA_HUB_FRONTEND_ACCESS_KEY_ID="frontend-key-id"
export DATA_HUB_FRONTEND_SECRET_ACCESS_KEY="frontend-key"
export ES_APM_ENABLED="False"
export ES_APM_SERVICE_NAME="datahub"
export ES_APM_SECRET_TOKEN=""
export ES_APM_SERVER_URL="http://localhost:8200"
export ES_APM_ENVIRONMENT="circleci"
export REDIS_BASE_URL="redis://localhost:6379"

python manage.py collectstatic --noinput
6 changes: 6 additions & 0 deletions .copilot/phases/build.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the build phase
6 changes: 6 additions & 0 deletions .copilot/phases/install.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the install phase
6 changes: 6 additions & 0 deletions .copilot/phases/post_build.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the post_build phase
6 changes: 6 additions & 0 deletions .copilot/phases/pre_build.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the pre_build phase
17 changes: 9 additions & 8 deletions config/settings/common_logging.py
@@ -1,6 +1,7 @@
import sys
import sentry_sdk
from django_log_formatter_ecs import ECSFormatter
from django_log_formatter_asim import ASIMFormatter

from sentry_sdk.integrations.django import DjangoIntegration

from config.settings.common import *
@@ -13,30 +14,30 @@
'verbose': {
'format': '%(asctime)s [%(levelname)s] [%(name)s] %(message)s'
},
'ecs_formatter': {
'()': ECSFormatter,
"asim_formatter": {
"()": ASIMFormatter,
},
},
'handlers': {
'ecs': {
'asim': {
'class': 'logging.StreamHandler',
'formatter': 'ecs_formatter',
'formatter': 'asim_formatter',
'stream': sys.stdout,
},
},
'root': {
'level': 'INFO',
'handlers': ['ecs'],
'handlers': ['asim'],
},
'loggers': {
'django': {
'level': 'INFO',
'handlers': ['ecs'],
'handlers': ['asim'],
'propagate': False,
},
'django.db.backends': {
'level': 'ERROR',
'handlers': ['ecs'],
'handlers': ['asim'],
'propagate': False,
},
},
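
The hunk above swaps the `ecs` handler and formatter for `asim` ones, using the `'()'` key so that `dictConfig` builds the formatter by calling the given class. A minimal sketch of that mechanism using only the standard library; `JsonLineFormatter` is a hypothetical stand-in for `ASIMFormatter` (its real output format is not reproduced here), and an in-memory stream is assumed in place of `sys.stdout`:

```python
import io
import json
import logging
import logging.config


class JsonLineFormatter(logging.Formatter):
    """Hypothetical stand-in for ASIMFormatter: one JSON object per record."""

    def format(self, record):
        return json.dumps({
            'level': record.levelname,
            'logger': record.name,
            'message': record.getMessage(),
        })


stream = io.StringIO()  # assumption: the real settings write to sys.stdout

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        # '()' tells dictConfig to call this factory to build the formatter
        'asim_formatter': {'()': JsonLineFormatter},
    },
    'handlers': {
        'asim': {
            'class': 'logging.StreamHandler',
            'formatter': 'asim_formatter',
            'stream': stream,
        },
    },
    'root': {'level': 'INFO', 'handlers': ['asim']},
}

logging.config.dictConfig(LOGGING)
logging.getLogger('example').info('hello')
```

Because handlers refer to formatters by name, renaming a formatter key means updating every handler that uses it, which is why the diff touches each `'handlers'` and `'formatter'` entry.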
2 changes: 1 addition & 1 deletion config/urls.py
@@ -24,7 +24,7 @@
path('', include('datahub.oauth.admin.urls')),
*admin_oauth2_urls,
path('admin/', admin.site.urls),
path('ping.xml', ping, name='ping'),
path('pingdom/ping.xml', ping, name='ping'),
path('whoami/', who_am_i, name='who_am_i'),
]

122 changes: 92 additions & 30 deletions data_generator.py
@@ -22,14 +22,23 @@
pre_save,
)

from datahub.company.models.adviser import Advisor
from datahub.company.models.company import Company
from datahub.company.models.contact import Contact
from datahub.company.test.factories import (
AdviserFactory,
ArchivedCompanyFactory,
# ArchivedCompanyFactory,
CompanyFactory,
CompanyWithAreaFactory,
DuplicateCompanyFactory,
# CompanyWithAreaFactory,
ContactFactory,
ContactWithOwnAddressFactory,
ContactWithOwnAreaFactory,
SubsidiaryFactory,
)
from datahub.metadata.models import Team


class DisableSignals:
@@ -68,50 +77,103 @@ def reconnect(self, signal):
with DisableSignals():
start_time = time.time()

# Pre fetch Metadata
teams = list(Team.objects.all())

advisers = Advisor.objects.all()

contact = Contact.objects.all()

# In February 2024 there were 18,000 advisers, 500,000 companies, and 950,000 contacts.
# Alter the number of advisers below to create a larger or smaller data set.
advisers = AdviserFactory.create_batch(200)
print(f'Generated {len(advisers)} advisers') # noqa
# Generate Advisers
print('Generating advisers') # noqa
for index in range(10):
AdviserFactory(dit_team=random.choice(teams))
if index % 10 == 0:
print('.', end='') # noqa
advisers = Advisor.objects.all()

print(f'Generated {advisers.count()} advisers') # noqa

# # Generate base companies
print('\nGenerating Companies') # noqa
for index, adviser in enumerate(advisers):
companies = CompanyFactory.create_batch(
random.randint(1, 25),
CompanyFactory.create_batch(
random.randint(0, 25),
created_by=adviser,
modified_by=adviser,
modified_by=random.choice(advisers),
)
if index % 10 == 0:
print('.', end='') # noqa

# The ratios of the below types of companies do not reflect the live database.
companies.extend(
SubsidiaryFactory.create_batch(
random.randint(1, 5),
created_by=adviser,
modified_by=adviser,
),

def generateContacts(advisers, min, max):
print('\nGenerating contacts on advisers')
for index, adviser in enumerate(advisers):
ContactFactory.create_batch(
random.randint(min, max),
created_by=random.choice(advisers),
modified_by=random.choice(advisers),
)

print('\nGenerating contacts on advisers with a different address from company')
for index, adviser in enumerate(advisers):
ContactWithOwnAddressFactory.create_batch(
random.randint(min, max),
created_by=random.choice(advisers),
modified_by=random.choice(advisers),
)

print('\nGenerating contacts on advisers with a different address from the contact company that includes an '
'area')
for index, adviser in enumerate(advisers):
ContactWithOwnAreaFactory.create_batch(
random.randint(min, max),
created_by=random.choice(advisers),
modified_by=random.choice(advisers),
)

print('\nGenerating Company variations') # noqa
companies = Company.objects.all()
# The ratios of the below types of companies do not reflect the live database.
# Generate different type of companies
for index, adviser in enumerate(advisers):
SubsidiaryFactory.create_batch(
random.randint(0, 25),
created_by=adviser,
modified_by=random.choice(advisers),
global_headquarters=random.choice(companies),
)
CompanyWithAreaFactory.create_batch(
random.randint(0, 1),
created_by=adviser,
modified_by=random.choice(advisers),
)
companies.extend(
CompanyWithAreaFactory.create_batch(
random.randint(0, 1),
created_by=adviser,
modified_by=adviser,
),
ArchivedCompanyFactory.create_batch(
random.randint(0, 1),
created_by=adviser,
modified_by=adviser,
)
companies.extend(
ArchivedCompanyFactory.create_batch(
random.randint(0, 1),
created_by=adviser,
modified_by=adviser,
),
DuplicateCompanyFactory.create_batch(
random.randint(0, 1),
created_by=adviser,
modified_by=adviser,
transferred_by=random.choice(advisers),
transferred_to=random.choice(companies),
)

# Show a sign of life every now and then
if index % 10 == 0:
print('.', end='') # noqa

# The below ratio of contacts to companies does not reflect the live database.
for company in companies:
ContactFactory.create_batch(
random.randint(1, 2),
company=company,
created_by=adviser,
)
# for company in companies:
# ContactFactory.create_batch(
# random.randint(1, 2),
# company=company,
# created_by=adviser,
# )

elapsed = time.time() - start_time
print(f'{timedelta(seconds=elapsed)}') # noqa
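
The script prints a dot every tenth iteration as a sign of life while the factories run. A minimal sketch of that pattern as a reusable helper; `with_progress` and its parameters are hypothetical names, not part of this PR, and the doubling in the loop merely stands in for the factory calls:

```python
import io
import sys


def with_progress(items, every=10, out=sys.stdout):
    """Yield items unchanged, writing a dot for each index divisible by `every`."""
    for index, item in enumerate(items):
        if index % every == 0:
            out.write('.')
            out.flush()
        yield item


# Write to an in-memory buffer here so the output can be inspected.
buffer = io.StringIO()
processed = [adviser_id * 2 for adviser_id in with_progress(range(25), every=10, out=buffer)]
```

Wrapping the iterable keeps the `index % 10` bookkeeping out of each generation loop.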
42 changes: 42 additions & 0 deletions datahub/core/management/commands/rq_health_check.py
@@ -0,0 +1,42 @@
import sys

from functools import reduce
from logging import getLogger
from operator import concat

from django.conf import settings
from django.core.management.base import BaseCommand
from redis import Redis
from rq import Worker


logger = getLogger(__name__)


class Command(BaseCommand):
help = 'RQ Health Check'

def add_arguments(self, parser):
"""Define extra arguments."""
parser.add_argument(
'--queue',
type=str,
help='Name of the queue to perform health check on.',
)

def handle(self, *args, **options):
if options['queue']:
queue = str(options['queue'])
redis = Redis.from_url(settings.REDIS_BASE_URL)
workers = Worker.all(connection=redis)
queue_names = reduce(concat, [worker.queue_names() for worker in workers], [])
missing_queues = set([queue]) - set(queue_names)
Contributor review comment, suggested change:
missing_queues = set([queue]) - set(queue_names)
missing_queues = {queue} - set(queue_names)

Using set literal syntax is simpler and computationally quicker.


if missing_queues:
logger.error(f'RQ queue not running: {missing_queues}')
sys.exit(1)
logger.info('OK')
sys.exit(0)

logger.error('Nothing checked! Please provide --queue parameter')
sys.exit(1)
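
The command fails when no running worker listens on the requested queue. A sketch of that check as a pure function, without Redis or RQ; `find_missing_queues` and `FakeWorker` are hypothetical names for illustration, and `itertools.chain` is used here in place of the command's `reduce(concat, ...)`:

```python
from itertools import chain


class FakeWorker:
    """Hypothetical stand-in exposing a queue_names() method like the workers above."""

    def __init__(self, names):
        self._names = list(names)

    def queue_names(self):
        return self._names


def find_missing_queues(required, workers):
    """Return the required queue names that no worker is listening on."""
    listening = set(chain.from_iterable(w.queue_names() for w in workers))
    return set(required) - listening


workers = [FakeWorker(['short-running']), FakeWorker(['long-running'])]
all_present = find_missing_queues({'short-running'}, workers)
one_missing = find_missing_queues({'short-running'}, workers[1:])
```

With both workers present the result is an empty set; with only the `long-running` worker it contains `'short-running'`, which is the case where the command exits with status 1.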
63 changes: 63 additions & 0 deletions datahub/core/test/management/commands/test_rq_health_check.py
@@ -0,0 +1,63 @@
import logging
from unittest import mock
from unittest.mock import patch

import pytest

from django.core.management import call_command


class MockWorker:
"""
Mock queue names object returned by worker
"""

queue_name = ''

def __init__(self, queue_name, *args, **kwargs):
self.queue_name = queue_name

def queue_names(self):
return self.queue_name


def test_rq_health_check_ok():
logger = logging.getLogger('datahub.core.management.commands.rq_health_check')
with patch(
'datahub.core.management.commands.rq_health_check.Worker.all',
return_value=[MockWorker(['short-running']), MockWorker(['long-running'])],
):
with mock.patch.object(logger, 'info') as mock_info:
with pytest.raises(SystemExit) as exception_info:
call_command('rq_health_check', '--queue=short-running')

assert exception_info.value.code == 0
assert 'OK' in str(mock_info.call_args_list)
assert mock_info.call_count == 1


def test_rq_health_check_rq_not_running():
logger = logging.getLogger('datahub.core.management.commands.rq_health_check')
with patch(
'datahub.core.management.commands.rq_health_check.Worker.all',
return_value=[MockWorker(['long-running'])],
):
with mock.patch.object(logger, 'error') as mock_error:
with pytest.raises(SystemExit) as exception_info:
call_command('rq_health_check', '--queue=short-running')

assert exception_info.value.code == 1
assert "RQ queue not running: {\'short-running\'}" in str(mock_error.call_args_list)
assert mock_error.call_count == 1


def test_command_called_without_parameter():
logger = logging.getLogger('datahub.core.management.commands.rq_health_check')
with mock.patch.object(logger, 'error') as mock_error:
with pytest.raises(SystemExit) as exception_info:
call_command('rq_health_check')

assert exception_info.value.code == 1
assert 'Nothing checked! Please provide --queue parameter' \
in str(mock_error.call_args_list)
assert mock_error.call_count == 1
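
The tests above rely on two things: `sys.exit()` raises `SystemExit`, whose `code` attribute carries the exit status, and `mock.patch.object` can replace a logger method to record its calls. A self-contained sketch of the same pattern using only the standard library; `run_check` and `exit_code_and_calls` are hypothetical helpers, not the management command or its tests:

```python
import logging
import sys
from unittest import mock

logger = logging.getLogger('example.health_check')


def run_check(queue=None):
    """Hypothetical check: exit 1 when no queue is given, otherwise exit 0."""
    if not queue:
        logger.error('Nothing checked! Please provide --queue parameter')
        sys.exit(1)
    logger.info('OK')
    sys.exit(0)


def exit_code_and_calls(level, **kwargs):
    """Run run_check with logger.<level> patched; return (exit code, recorded calls)."""
    with mock.patch.object(logger, level) as mock_log:
        try:
            run_check(**kwargs)
        except SystemExit as exc:
            return exc.code, mock_log.call_args_list
    raise AssertionError('run_check did not exit')


code, calls = exit_code_and_calls('error')
ok_code, ok_calls = exit_code_and_calls('info', queue='short-running')
```

Catching `SystemExit` directly plays the role that `pytest.raises(SystemExit)` plays in the real tests.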