Unclear rules on uniqueness of record #847

Closed
jpmckinney opened this issue Mar 29, 2019 · 4 comments

@jpmckinney
Member

jpmckinney commented Mar 29, 2019

In many places, the documentation recommends that there be only one record per contracting process. However:

  • At the same time, the records reference page says that publishers can consider publishing both versioned and non-versioned copies of the record.
  • If a record is updated each time a release is published, old versions likely exist in, for example, ZIP files prepared on a daily, weekly or monthly basis.
  • If a record is updated each time a release is published, it's likely that users, aggregators, etc. will have old copies of the record.

As such, it's clear that there isn't only one record per process.

What we really want is for it to be clear which record is the most up-to-date and complete. We should update the documentation to be clearer about this.
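To illustrate what "most up-to-date" could mean in practice, here is a minimal sketch of how a data user might choose between several copies of a record. It assumes each copy arrives inside a record package with a `publishedDate` and a `records` array whose items carry an `ocid`, and that every `publishedDate` includes a time zone. The helper names `parse_published_date` and `latest_records` are hypothetical, not part of any OCDS tooling.

```python
from datetime import datetime


def parse_published_date(value):
    """Parse an ISO 8601 date-time, accepting a trailing "Z" for UTC."""
    # Older Python versions do not accept "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_records(packages):
    """For each OCID, keep the record from the most recently published package.

    Assumption: `packages` is an iterable of dicts shaped like OCDS record
    packages, each with "publishedDate" and "records".
    """
    best = {}  # ocid -> (published datetime, record)
    for package in packages:
        published = parse_published_date(package["publishedDate"])
        for record in package.get("records", []):
            ocid = record["ocid"]
            if ocid not in best or published > best[ocid][0]:
                best[ocid] = (published, record)
    return {ocid: record for ocid, (_, record) in best.items()}
```

This only ranks the copies by when their package was published; it says nothing about whether the newest copy is complete.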

We need to update releases_and_records.md and records_reference.md, among others.

jpmckinney added the "Focus - Documentation" label (includes corrections, clarifications, new guidance, and UI/UX issues) Mar 29, 2019
jpmckinney added this to the 1.2 milestone Mar 29, 2019
yolile self-assigned this Nov 11, 2020
yolile removed their assignment Jan 5, 2021
@duncandewhurst
Contributor

Related to #1264

@jpmckinney
Member Author

Merging #1264 into this issue:

The implications of this statement are unclear. For example, cases that are allowed:

  • If a publisher creates a bulk download of all data on some schedule, and names the file according to the creation date, it is allowed for a contracting process to have a record in each bulk download.
  • If a publisher has API endpoints for OCDS 1.0 and 1.1, it is allowed for a contracting process to have a record in each endpoint. (See "Upgrade docs: Interaction with release immutability and dates" (#849) for a similar discussion about release IDs.)
  • If a publisher added an API method to view a record "as of" a user-defined date, this should be allowed.
  • If an API endpoint allows the user to define which fields to return (like in GraphQL), this should be allowed.

Cases that we want to disallow:

  • An API endpoint contains multiple records with the same OCID. (A given record, however, is allowed to be modified by user-defined arguments, like an "as of" date or GraphQL field selection.)
  • A bulk download contains multiple records with the same OCID.

As such, we might need a paragraph explaining that the rule is one record per contracting process per distribution (in the DCAT sense), where a distribution might be a specific API endpoint or a specific bulk download file.
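The disallowed cases above amount to a uniqueness check within a single distribution. A minimal sketch of that check, assuming one distribution is loaded as a dict shaped like a record package (a `records` array whose items each have an `ocid`); the function name `duplicate_ocids` is hypothetical:

```python
from collections import Counter


def duplicate_ocids(record_package):
    """Return the OCIDs that appear on more than one record in one distribution.

    Assumption: `record_package` is the parsed content of a single bulk
    download file or a single API endpoint response.
    """
    counts = Counter(record["ocid"] for record in record_package.get("records", []))
    return sorted(ocid for ocid, count in counts.items() if count > 1)


# Two bulk files may each hold a record for the same OCID (allowed),
# but one file should not hold two records for it (disallowed).
march = {"records": [{"ocid": "ocds-213czf-1"}, {"ocid": "ocds-213czf-2"}]}
april = {"records": [{"ocid": "ocds-213czf-1"}, {"ocid": "ocds-213czf-1"}]}
print(duplicate_ocids(march))
print(duplicate_ocids(april))
```

The check is deliberately run per distribution, not across them, so the same OCID appearing in both files is not reported.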

In reviewing the Primer draft, @JachymHercher wrote:

Is our request really for people to delete old records (i.e. old JSONs), e.g. because they could confuse reusers? Or is the point that only the record with the latest date is relevant (because each record contains all the releases up to date) and we don't actually really care about the clean up / it's a detail (possibly not worth mentioning in the primer)?

@duncandewhurst
Contributor

I suggest updating the first sentence of https://standard.open-contracting.org/staging/1.2-dev/en/schema/records_reference/ to read:

A record aggregates the releases related to a contracting process. There should be a single record per contracting process per distribution, where a distribution might be a specific API endpoint or a specific bulk download file.

To account for old records and API methods with an "as of" argument, we might need to clarify the following requirements:

  • a record must provide a list of all the existing OCDS releases
  • the compiled release contains the current state of all the fields
  • a versioned release aggregates all values of all fields from all releases

Variations on those requirements appear in several places:

One option is to add 'at the time of publication', which is already used to describe a compiled release in https://standard.open-contracting.org/staging/1.2-dev/en/schema/records_reference/#record-structure. @jpmckinney, what do you think?

We should also update the following sentence from https://standard.open-contracting.org/staging/1.2-dev/en/getting_started/releases_and_records/:

While releases are never updated, records are updated each time there is a change.

to

While releases are never updated, records can be updated each time there is a change.

duncandewhurst self-assigned this Jun 14, 2021
@jpmckinney
Member Author

All sounds good. I think "at the time of publication" works well.
