Add API endpoint for data registry integration for collection metadata #421

Closed
jpmckinney opened this issue Apr 9, 2024 · 2 comments · Fixed by #424
Labels
feature Relating to loading data from the web API or CLI command

Comments

@jpmckinney
Member

jpmckinney commented Apr 9, 2024

The registry currently sets the following from Pelican:

        job.date_from = parse_date(meta.get("published_from"))
        job.date_to = parse_date(meta.get("published_to"))
        job.license = meta.get("data_license") or ""
        job.ocid_prefix = meta.get("ocid_prefix") or ""

Pelican in meta_data_aggregator.py currently just takes the OCID prefix from the ocid of the first release/record, and the license (and publication policy) from the first package.

It also executes a single SQL query to get the published from/to dates.

We can add an endpoint to perform these simple, fast queries.
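As a rough illustration of what such an endpoint might return, here is a minimal sketch in plain Python. It is not Pelican's code: the function name `collection_metadata`, the `publication_policy` key, the ISO date formatting, and the assumption that an OCID prefix is the first 11 characters of the `ocid` ("ocds-" plus a six-character code) are all hypothetical. The other keys mirror the ones the registry reads above.

```python
from datetime import date
from typing import Optional

# Assumption: an OCID prefix is "ocds-" followed by a six-character code.
OCID_PREFIX_LENGTH = 11


def collection_metadata(
    first_ocid: Optional[str],
    first_package: dict,
    published_from: Optional[date],
    published_to: Optional[date],
) -> dict:
    """Build the metadata payload from values that are cheap to look up.

    first_ocid: the ocid of the first release/record in the collection.
    first_package: the first package's metadata (hypothetical dict shape).
    published_from / published_to: results of the single date-range query.
    """
    return {
        # Take the prefix from the first ocid; empty string if there is none.
        "ocid_prefix": (first_ocid or "")[:OCID_PREFIX_LENGTH],
        # License and publication policy come from the first package.
        "data_license": first_package.get("license") or "",
        "publication_policy": first_package.get("publicationPolicy") or "",
        # Serialize dates as ISO 8601 strings (an assumption about the format).
        "published_from": published_from.isoformat() if published_from else None,
        "published_to": published_to.isoformat() if published_to else None,
    }
```

A view would only need to fetch the first ocid, the first package and the date range, then serialize this dict as JSON.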

@yolile
Member

yolile commented Apr 15, 2024

@jpmckinney
Member Author

jpmckinney commented Apr 19, 2024

In #424 I'm also deleting an unused /collections/ list endpoint. It had a query to count the types of steps, which can be more easily rewritten as follows:

https://docs.djangoproject.com/en/5.0/ref/models/conditional-expressions/#conditional-aggregation

from django.db.models import Count, Q

# Collection and ProcessingStep are the project's own models.
Collection.objects.aggregate(
    LOAD_steps_remaining=Count("pk", filter=Q(processing_steps__name=ProcessingStep.Name.LOAD)),
    COMPILE_steps_remaining=Count("pk", filter=Q(processing_steps__name=ProcessingStep.Name.COMPILE)),
    CHECK_steps_remaining=Count("pk", filter=Q(processing_steps__name=ProcessingStep.Name.CHECK)),
)

If we ever need it in future, we can add it to models.py and then use it in the collectionstatus command (e.g. instead of just giving the total count of steps).
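For readers less familiar with conditional aggregation, the query above corresponds roughly to a single SQL statement with one conditional count per step type. The sketch below shows this with the standard library's `sqlite3`. The table and column names (`collection`, `processing_step`, `collection_id`, `name`) are hypothetical stand-ins for the real schema, and it uses `COUNT(CASE WHEN ...)` for portability, whereas Django may emit a `FILTER (WHERE ...)` clause on backends that support it.

```python
import sqlite3


def make_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE collection (id INTEGER PRIMARY KEY);
        CREATE TABLE processing_step (
            id INTEGER PRIMARY KEY,
            collection_id INTEGER NOT NULL REFERENCES collection (id),
            name TEXT NOT NULL
        );
        INSERT INTO collection (id) VALUES (1), (2), (3), (4);
        INSERT INTO processing_step (collection_id, name) VALUES
            (1, 'LOAD'), (1, 'COMPILE'), (2, 'LOAD'), (3, 'CHECK');
        """
    )
    return conn


def count_remaining_steps(conn: sqlite3.Connection) -> dict:
    """Count collections per remaining step type in one pass over the join."""
    row = conn.execute(
        """
        SELECT
            COUNT(CASE WHEN ps.name = 'LOAD' THEN c.id END),
            COUNT(CASE WHEN ps.name = 'COMPILE' THEN c.id END),
            COUNT(CASE WHEN ps.name = 'CHECK' THEN c.id END)
        FROM collection c
        LEFT JOIN processing_step ps ON ps.collection_id = c.id
        """
    ).fetchone()
    keys = ("LOAD_steps_remaining", "COMPILE_steps_remaining", "CHECK_steps_remaining")
    return dict(zip(keys, row))


if __name__ == "__main__":
    print(count_remaining_steps(make_demo_connection()))
```

`COUNT` ignores NULLs, so each expression counts only the joined rows whose step name matches. Collection 4, which has no steps, contributes nothing to any count.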
