TP2000-1493 - TAP driven quota open data export #1302

Draft
wants to merge 18 commits into
base: master
1 change: 1 addition & 0 deletions additional_codes/tests/test_business_rules.py
@@ -84,6 +84,7 @@ def test_ACN2_type_must_exist(reference_nonexistent_record):
def test_ACN2_allowed_application_codes(app_code, expect_error):
    """The referenced additional code type must have as application code "non-
    Meursing" or "Export Refund for Processed Agricultural Goods"."""

    additional_code = factories.AdditionalCodeFactory.create(
        type__application_code=app_code,
    )
50 changes: 50 additions & 0 deletions exporter/management/commands/export_quotas.py
@@ -0,0 +1,50 @@
import logging
from typing import Any
from typing import Optional

from django.core.management import BaseCommand
from django.core.management.base import CommandParser

from exporter.quotas.tasks import export_and_upload_quotas_csv

logger = logging.getLogger(__name__)


class Command(BaseCommand):
    help = (
        "Create a CSV of quotas for use within data workspace to produce the "
        "HMRC tariff open data CSV. The filename takes the form "
        "quotas_export_<yyyymmdd>.csv. Care should be taken to ensure that "
        "there is sufficient local file system storage to accommodate the "
        "CSV file (although it should not be very large: less than 5MB, "
        "1.8MB at time of creation). If you choose to target remote S3 "
        "storage, then a temporary local copy of the file will be created "
        "and cleaned up."
    )

    def add_arguments(self, parser: CommandParser) -> None:
        parser.add_argument(
            "--asynchronous",
            action="store_const",
            help="Queue the CSV export task to run in an asynchronous process.",
            const=True,
            default=False,
        )
        parser.add_argument(
            "--save-local",
            help=(
                "Save the quotas CSV to the local file system under the "
                "(existing) directory given by DIRECTORY_PATH."
            ),
            dest="DIRECTORY_PATH",
        )
        return super().add_arguments(parser)

    def handle(self, *args: Any, **options: Any) -> Optional[str]:
        logger.info("Triggering quotas export to CSV")

        # DIRECTORY_PATH is None unless --save-local was supplied.
        local_path = options["DIRECTORY_PATH"]
        if options["asynchronous"]:
            # Queue the export as a background task.
            export_and_upload_quotas_csv.delay(local_path)
        else:
            # Run the export in the current process.
            export_and_upload_quotas_csv(local_path)
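For reviewers who want to check how the two options above reach `handle`, here is a minimal sketch using plain `argparse` rather than Django's `CommandParser` (which subclasses `argparse.ArgumentParser`). The `build_parser` and `parse` helpers and the `/tmp/exports` path are hypothetical and exist only for illustration; the argument definitions are copied from the command.

```python
import argparse
from typing import Any, Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper mirroring the arguments declared in add_arguments."""
    parser = argparse.ArgumentParser(prog="export_quotas")
    parser.add_argument(
        "--asynchronous",
        action="store_const",
        const=True,
        default=False,
    )
    # dest= means the value is exposed as options["DIRECTORY_PATH"].
    parser.add_argument("--save-local", dest="DIRECTORY_PATH")
    return parser


def parse(argv: List[str]) -> Dict[str, Any]:
    """Return the parsed options as a dict, similar to what handle() receives."""
    return vars(build_parser().parse_args(argv))


# Synchronous run that saves to a local directory (illustrative path).
sync_local = parse(["--save-local", "/tmp/exports"])

# Asynchronous run with no local directory: DIRECTORY_PATH falls back to None.
async_remote = parse(["--asynchronous"])
```

With these definitions, omitting `--save-local` leaves `DIRECTORY_PATH` as `None`, which is the value the command passes on to `export_and_upload_quotas_csv`.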
43 changes: 43 additions & 0 deletions exporter/quotas/__init__.py
@@ -0,0 +1,43 @@
"""
Quotas Export
=============
The quotas export system queries the TAP database for published quota data and stores it in a CSV
file.
The general process is:
1. Query the TAP database for the correct dataset to export.
2. Iterate over the query result and create the data for the output.
3. Write the data to the CSV file.
4. Upload the result to the designated storage (S3 or local).
This process has been chosen to optimise for:
- Speed: querying and data production are a lot faster when processed at source.
- Testability: the output and process can be tested effectively within TAP.
- Adaptability: with test coverage highlighting any issues caused by database changes etc., this
  implementation is highly adaptable.
- Data quality: using TAP to produce the data improves the quality of the output, as it uses the same filters
  and joins as TAP itself does. This removes the need to run queries in SQL, which has been problematic and is
  difficult to maintain.
"""

import os
import shutil
from itertools import chain
from pathlib import Path
from tempfile import NamedTemporaryFile

import apsw
from django.apps import apps
from django.conf import settings

from exporter.quotas import runner
from exporter.quotas import tasks


def make_export(quotas_csv_named_temp_file: NamedTemporaryFile):
    quota_csv_exporter = runner.QuotaExport(quotas_csv_named_temp_file)
    quota_csv_exporter.run()
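The four-step process described in the module docstring can be sketched roughly as follows. This is not the PR's implementation: `query_published_quotas`, `export_filename`, `export_quotas_csv`, the column names, and the sample rows are all hypothetical stand-ins, and only the local-storage branch is shown (the S3 upload is omitted). The filename pattern `quotas_export_<yyyymmdd>.csv` comes from the command's help text.

```python
import csv
import shutil
from datetime import date
from pathlib import Path
from tempfile import NamedTemporaryFile, TemporaryDirectory
from typing import Dict, List

# Hypothetical column names; the real export's columns are not shown in this diff.
FIELDNAMES = ["order_number", "description"]


def query_published_quotas() -> List[Dict[str, str]]:
    """Step 1 stand-in: the real code would query the TAP database."""
    return [
        {"order_number": "050001", "description": "Example quota A"},
        {"order_number": "050002", "description": "Example quota B"},
    ]


def export_filename(today: date) -> str:
    """Build a name of the form quotas_export_<yyyymmdd>.csv."""
    return f"quotas_export_{today:%Y%m%d}.csv"


def export_quotas_csv(directory: Path, today: date) -> Path:
    rows = query_published_quotas()  # Step 1: fetch the dataset.

    # Steps 2 and 3: iterate the rows and write them to a temporary CSV file.
    with NamedTemporaryFile("w", newline="", suffix=".csv", delete=False) as tmp:
        writer = csv.DictWriter(tmp, fieldnames=FIELDNAMES)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
        tmp_path = Path(tmp.name)

    # Step 4 (local branch only): move the finished file into the target directory.
    destination = directory / export_filename(today)
    shutil.move(str(tmp_path), str(destination))
    return destination
```

Writing to a temporary file first and moving it only once complete means a failed export does not leave a partial CSV in the destination directory.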