rfc | start_date | decision_date | pr | status |
---|---|---|---|---|
21 |
2018-10-03 |
2018-10-30 |
approved |
This RFC proposes creating a new Snapshot resource and redefine the Record resource to have a single responsibility.
This is a breaking change.
One of the most difficult concepts to explain in the Registers ecosystem is the concept of “record”. The reasons boil down to two things: First, the concept of “record” has shifted over time as a natural adaptation to how Registers has evolved and second the current record definition doesn't match the expectations on the record resource. Probably the second is consequence of the first.
Let me elaborate on what does “the current record definition doesn't match the expectations on the record resource” mean.
The current definition of record is: The most recent entry for a given key. The record resource though is the combination of the most recent entry for a given key and the blob of data referenced by it.
The reason for having the blob of data merged in is to make the record resource useful to the type of user that wants to get the data for a given key.
The reason for having the entry data is to provide the context for how it relates to the log of entries. Useful to users that need to deal with entries, blobs and the log itself.
The first reason is “porcelain” in the sense of “don't make me think, give me the data”. The second is “plumbing” in the sense of “let me handle the low level bits myself”.
The result is tension between these two needs and an ill-defined concept.
Finally, there is a discrepancy of structure between JSON and CSV serialisations and not a strong reason for being that way.
The proposal is to split both needs into dedicated resources and to make records converge across serialisation.
Let's define “snapshot” as a set containing the most recent entry for each key.
type Snapshot =
Set Entry
The operation to compute the snapshot for a given log can be defined as:
collect : Log -> Snapshot
EXAMPLE:
For example, given log log
of size 10, we can compute the latest snapshot
with:
collect log
And compute the snapshot at size 5 by slicing the log to that size:
collect (take 5 log)
Let's redefine “record” as the combination of the data with the entry key.
type Record =
{ key : ID
, data : Dict Name Value
}
Note that the attribute data
holds the same type of data structure as the
blob. The reason for not using the blob here is because this opens the door
to return data that is not a perfect match with the blob. For example, the RFC
draft for removing fields would benefit from this flexibility.
Note that Value
is either a string or a set of strings, the same as the blob
definition.
The operation to compute the record for a given entry can be defined as:
view : BlobStore -> Entry -> Result Error Record
Where BlobStore
is the complete set of known blobs. Given that the blob
store may not contain the blob referenced by the entry, this computation can
fail with an error.
EXAMPLE:
For example, to compute the latest set of records for a given log:
map (view store) (collect log)
So, first we compute the snapshot for the full log and then we transform each entry into a record.
Returns an array of entry resources.
This resource MAY be paginated like the rest of the collections.
GET /snapshots/{log-size}
Name | Type | Description |
---|---|---|
log-size |
Integer | The log size to compute the snapshot. |
Note: We can consider some kind of alias for the latest log size like the
latest
tag used by Docker.
Returns an entry resource.
GET /snapshots/{log-size}/{key}
Name | Type | Description |
---|---|---|
log-size |
Integer | The log size to compute the snapshot. |
key |
ID | The unique identifier for an element of the register. |
Returns a record resource.
GET /records/{id}
Name | Type | Description |
---|---|---|
id |
ID | The unique identifier for an element of the register. |
The record resource MUST be a JSON object with the data for the record. It
MUST also have the key of the entry using the reserved attribute _id
. Given
that an attribute name is of type Name
, it is not possible to have a clash
with _id
(underscores (_
) are not allowed as the first character for a
type Name
).
EXAMPLE:
For example, the most recent record for the contry identified by GB
:
GET /records/GB HTTP/1.1
Accept: application/json
Host: country.register.gov.uk
{
"_id": "GB",
"name": "United Kingdom",
"official-name": "The United Kingdom of Great Britain and Northern Ireland",
"citizen-names": ["Briton", "British citizen"]
}
The record resource MUST be a CSV row with the data for the record. It
MUST also have the key of the entry using the reserved attribute _id
. Given
that an attribute name is of type Name
, it is not possible to have a clash
with _id
.
EXAMPLE:
For example, the most recent record for the contry identified by GB
:
GET /records/GB HTTP/1.1
Accept: text/csv
Host: country.register.gov.uk
_id, name, official-name, citizen-names
GB, "United Kingdom", "The United Kingdom of Great Britain and Northern Ireland", "Briton;British citizen"
Returns an array of record resources.
GET /records
This proposal doesn't offer a way to return records for a snapshot other than
the latest one. It could be easily extended with a querystring {?log-size}
if there is need for it.
This is a breaking change.