Implementation proposal: representing changing beneficial ownership over time #475
-
Firstly, thank you for putting this detailed proposal together! As I've argued in the past, and as was pointed out in the document again, there are limitations and complex questions around The fact that the temporal aspect of this data can now be captured more organically is probably the most attractive outcome of this proposal. I can see a series of problems and use cases that could now be solved with this added support.
I'm still reflecting on the ability to produce the latest (or "most recent") version of the ownership and control network. If I'm not mistaken, to achieve that, one would simply discard records that are "not closed"?
I'm also a bit unsure whether there's enough clarity about closing records on the publisher's side and the potential complexity there. For example, in Declaration 4, described in the proposal, R2, R3 and E2 are all marked as closed. So firstly, the publisher needs to make sure all three are marked consistently. But in theory, I believe that closing R2 and R3 would suffice, because E2 would then implicitly be eliminated from the chain when reading only "not closed" relationships.
There is also a subtlety which I think we might need to capture here. What if E2 is still present in the register with other relationships? I'm assuming that in this case a publisher would be advised not to close the E2 record, even if R2/R3 are gone, correct?
I think this raises the question of whether "closing" should be exclusive to relationship records. It's not 100% clear what closing an entity record means. Closing an entity/person record seems to be an implication of relationships being closed, if I understood this correctly. It might be a lot easier for publishers to only worry about closing relationship records, since reaching entities/people that are part of those relationships won't be possible anyway (in the "latest" snapshot). It would also remove the complexity around "is E2 still needed for other things? OK, then don't close it, even if R2 R3 are closed", etc.
-
In terms of the nomenclature we're proposing, good to note that our proposed concept of a 'record' aligns with the European Union's understanding of the same term as set out in the 2021 regulations relating to the Beneficial Ownership Registers Interconnection System (BORIS):
-
@cosmin-marginean - thanks so much for sharing those reflections.
Other way around, I think. Discard all statements which are part of closed series. Then, of the remainder, retain only the latest statement in each record series.
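A minimal sketch of that two-step filter. It assumes each statement is a dict carrying `recordID`, `recordStatus` and `statementDate`, that a series is closed once any of its statements has the status `closed`, and that ISO 8601 date strings in the same format sort chronologically. The field spellings are taken from this thread and the `statementID` values are made up; none of this is a confirmed BODS 0.4 schema.

```python
from typing import Dict, List


def latest_snapshot(statements: List[dict]) -> List[dict]:
    """Return the most recent statement of every record series that is not closed."""
    # Step 1: find every record series that contains a 'closed' statement.
    closed = {s["recordID"] for s in statements if s["recordStatus"] == "closed"}

    # Step 2: of the remaining series, keep only the latest statement in each.
    latest: Dict[str, dict] = {}
    for s in statements:
        rid = s["recordID"]
        if rid in closed:
            continue
        if rid not in latest or s["statementDate"] > latest[rid]["statementDate"]:
            latest[rid] = s
    return list(latest.values())


# Hypothetical data: E1 is updated once, R2 is opened and later closed.
statements = [
    {"statementID": "s1", "recordID": "E1", "recordStatus": "new", "statementDate": "2023-01-01"},
    {"statementID": "s2", "recordID": "E1", "recordStatus": "updated", "statementDate": "2023-06-01"},
    {"statementID": "s3", "recordID": "R2", "recordStatus": "new", "statementDate": "2023-01-01"},
    {"statementID": "s4", "recordID": "R2", "recordStatus": "closed", "statementDate": "2023-06-01"},
]
snapshot = latest_snapshot(statements)
```

Here only the later E1 statement survives: the whole R2 series is dropped because it has been closed.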
Or, theoretically, closing E2 (disclosing that E2 is no longer an intermediary in this chain) should close R2 and R3. It is - yes - something that the publisher would need to handle.
Well, I think this is where the BODS concept of a record differs from a publisher's concept. If a publisher has a single database record for an entity, but different declarants make statements about that entity, then there should be a unique recordID per declarant. So, the E2 BODS record would be closed in this case, but not the DB record which E2 pointed to. I'd diagram that out if I had time!
I need to think a bit more about the implications of centring the relationship records in this way. Interesting idea.
-
Yes, of course, I wrote two half sentences separately and it came out wrong - it makes sense now. Thanks for looking into this! I will ponder and re-read as well.
-
Here's a link to a few slides I created to show how the data model would change in line with this proposal.
-
Noting here an unintended feature of the French BO data that @Blueskies00 was looking at, most probably related to change over time. The data contained a series of records for the same person, containing exactly the same information. It looks like there may have been updates to fields redacted for publication (e.g. person's full address) which kicked off the publication of the equivalent of a new statement. We may want to include advice in the 0.4 documentation about best practice in this situation. Specifically: where fields are not exposed to publication via a given BODS stream/channel, an update to that field should not trigger the publication of a statement to that stream/channel.
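One way a publisher might apply that advice, sketched under assumptions: the publishing system knows which fields are exposed on a given stream and only emits a new statement when the exposed view of the record changes. The field names and the `PUBLISHED_FIELDS` set are hypothetical, not part of BODS.

```python
# Hypothetical: the fields exposed on this particular stream/channel.
PUBLISHED_FIELDS = {"name", "nationality", "birthYear"}


def published_view(record: dict) -> dict:
    """Project an internal record onto the fields exposed for publication."""
    return {k: v for k, v in record.items() if k in PUBLISHED_FIELDS}


def should_publish(before: dict, after: dict) -> bool:
    """Emit a new statement only if something visible on the stream changed."""
    return published_view(before) != published_view(after)


old = {"name": "A. Person", "nationality": "FR", "fullAddress": "1 Rue Exemple"}
address_change = {**old, "fullAddress": "2 Rue Exemple"}  # redacted field only
name_change = {**old, "name": "A. B. Person"}             # published field
```

With this check, `address_change` would not trigger a statement on the stream, while `name_change` would.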
-
I'm quite hesitant about a status field, without some real evidence that publishers have the capacity to update it correctly. I wrote a longer comment about records and metadata more generally in #477 (comment)
-
I've just had a read-through of the implementation proposal for changing the statement data structure for a future version of BODS. It seems mostly okay to me, and I'm supportive of the overall idea. I'm very supportive of the idea of getting rid of
This is minor, but I wonder about using such a term which is already doing a lot of heavy lifting both in data storage and informal discussion. I can foresee discussions similar to: 'the record was updated'—'wait, do you mean the record record, or the statement record', etc.
I think the nomenclature here could lead to a little confusion. Does a record only have
What if the name of the beneficial owner is updated a second time? The
So does this mean the record
What if multiple statements are made about an entity on the same date? I suppose this is unlikely when considering records coming from UK Companies House, for example, but what about for countries which have highly automated, digital corporate registers like Estonia? And what if an entity being updated causes a different source to update their records and make a new statement, resulting in the same date?
Versus the record
But doesn't that mean that entities from different sources would have multiple records? In that case, what about sources which make statements about entities in another country (as already happens)? And how would this be compatible with the idea that a single record refers to an entity? Making multiple records and statement series about the same entity would still necessitate some kind of multi-record reconciliation or merging process, would it not? And this would greatly complicate using the data model.
It's probably worth including some concept of a
It's not clear to me how this would be accomplished without introducing the notion of partial vs full statements, and also introducing traversal within the statement series to resolve partial updates into a full model. This would be much more performance-intensive, but perhaps more importantly, it would be important to be clear when importing statements from a source about whether they are updates or full statements themselves. For example, if an entity had two addresses, and then there was an updated statement with a third address, does that mean that the previous two addresses are replaced, or does it mean that there are now three addresses? And does it mean that company number, etc., is deleted, because it hasn't been mentioned? Or, if it is treated as an update, which fields are updated and what is the replacement or merging strategy? This would require further modelling, since it could become complex.
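To make the ambiguity concrete, here is a small sketch of two possible readings of the same incoming statement. Neither strategy is defined by BODS; the function names, field names and merge rule are hypothetical and only illustrate why the choice would need to be specified.

```python
def apply_as_full(previous: dict, incoming: dict) -> dict:
    """Reading 1: the incoming statement is a complete snapshot and replaces everything."""
    return dict(incoming)


def apply_as_partial(previous: dict, incoming: dict) -> dict:
    """Reading 2: the incoming statement is a partial update merged into the previous one.

    Unmentioned fields are kept; list-valued fields are appended to.
    """
    merged = dict(previous)
    for key, value in incoming.items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            merged[key] = merged[key] + [v for v in value if v not in merged[key]]
        else:
            merged[key] = value
    return merged


previous = {"companyNumber": "123", "addresses": ["Address A", "Address B"]}
incoming = {"addresses": ["Address C"]}

as_full = apply_as_full(previous, incoming)        # one address, no company number
as_partial = apply_as_partial(previous, incoming)  # three addresses, company number kept
```

The same input yields two different entities depending on the reading, which is the modelling question raised above.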
Will the new records be exported at all? I suppose they will have to be, if relationships will be between them. So there will be statements contained within records coming from sources (and potentially including the statement series closed meta-statement…), but also statements outside of records relating records for ownership-and-control?
-
@tiredpixel - Thank you so much for taking time to interrogate these proposals: it's extremely useful to re-examine things from others' point of view. I've tried to answer as best I can below.
The data standard is designed for data exchange so I think it helps to consider record storage and management as something done by data-handling and storage systems (e.g. a company register) that may or may not involve bods-style structured data. So - for example - a system might not even maintain a status field, but generate a In any case, I think a preferable way of saying "many records will continue to be marked as new even if they've been there for a long time" is that: statements are immutable and stick around and the first time a record with a given
No - it's not a transitional status.
Yes, the status would stay as 'updated'.
So, I think this is where the particular use case of the OO register comes in. TBH, the Register is not currently designed to demonstrate how best to handle incoming BODS streams, update related records and then publish reconciled BO information. I'd argue that that should be the next iteration of development! I'd say that, if the Register was designed as a demonstration system, then it might publish a
In BODS 0.4, the statementDate field will accept timestamps, date-time info.
That's fine. You'd have two statements from different sources on the same date and about the same entity. The statement model in BODS exists because of considerations like that. I'd love for the next iteration of the Register to demonstrate how competing statements about the same real world thing can be handled instructively. So, for example, imagine Source A updated the name of a company to MERIDIAN INC on 12th Dec but Source B updated it to MERIDIAN INC on the 14th Dec. Then if you search for the company in the Register you'd see that there was an information conflict for a couple of days.
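A rough sketch of how a consuming system might surface that kind of conflict window. It assumes each statement is reduced to a `(source, date, name)` tuple; the year and the earlier name `OLD NAME LTD` are invented for illustration, and nothing here reflects how the Register actually works.

```python
from datetime import date

# (source, statementDate, name): hypothetical statements about one company.
statements = [
    ("A", date(2023, 1, 1), "OLD NAME LTD"),
    ("B", date(2023, 1, 1), "OLD NAME LTD"),
    ("A", date(2023, 12, 12), "MERIDIAN INC"),
    ("B", date(2023, 12, 14), "MERIDIAN INC"),
]


def conflict_periods(statements):
    """Return (start, end) date pairs during which the sources disagreed on the name."""
    current = {}           # latest name seen per source
    periods = []
    conflict_start = None
    for day in sorted({d for _, d, _ in statements}):
        # Apply every statement made on this day.
        for source, d, name in statements:
            if d == day:
                current[source] = name
        in_conflict = len(set(current.values())) > 1
        if in_conflict and conflict_start is None:
            conflict_start = day
        elif not in_conflict and conflict_start is not None:
            periods.append((conflict_start, day))
            conflict_start = None
    if conflict_start is not None:
        periods.append((conflict_start, None))  # conflict still unresolved
    return periods


periods = conflict_periods(statements)
```

For the data above, the only conflict period runs from 12 December to 14 December, when Source B catches up.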
Record information is always wrapped in a statement. (That is how we're 'repackaging' things, compared with previous versions of BODS: see the spreadsheet linked to from #477.) So, a statement's
Again, I think we're talking about the particular use case of the OO register here. Ideally, yes, the Register would be handling statements from different sources about the same objects (entities, people, relationships). (And each of those sources would have their own separate records for those objects.) The question for the Register (in the future!) is: how are those information streams handled? The answer depends on what the purpose of the Register is (in the future).
That might work for internal information-handling, but not for publishing: it would break immutability. A statement would only have
Yes - I agree. Much more thought would need to be given to any slimmed-down publishing format.
'Record' is really being used as a concept in BODS 0.4 to refer to a record that exists in a publisher's database or system. Data held in those records is published as point-in-time snapshots via BODS Statement objects. We'll be doing a thorough update of the docs to explain the conceptual model and the data model and how they relate to one another.
-
@kd-ods, thank you for taking the time to answer my questions so comprehensively. The proposal makes a lot more sense to me, now. I can see that one thing which was causing confusion was my OO Register-centric thinking, whereas of course the BODS is designed to be far more agnostic than that. Given that OO Register 2 mostly stores things directly in BODS format, this led me to think that perhaps some things should be included in the new standard. But I understand your point about It's great that
This is very interesting. I, too, would be curious to see how these divergent or forked views of reality could be modelled and utilised effectively. Perhaps it would be instructive for OO Register to introduce the concept of an observer, at least in thinking about the implementation. (Out of scope for here, I know, but interesting to think about.) So, to try to ensure I now understand the proposal, please permit me to make some basic statements, from an alternative point of view. I'd be very grateful if you could confirm or correct my refined view of things.
-
Yes
Yes
Yes
Well, Statements (or, more precisely the recordDetails object inside a statement) should reliably refer to a real world entity or person (or relationship). In the case of persons and entities via the
This isn't quite accurate. "BODS Statements refer to a BODS Record". No, BODS statements refer to a record held in a publisher's system. So if we think about the mapping that we currently have of UK Companies House PSC data to BODS 0.2; we need to identify within that PSC data source which id can reliably be mapped to the new
Yes, perhaps. Again, this is an implementation detail for the Register. It's probably helpful to consider that the register works: (1) To ingest several streams of BODS data. (2) To manage, process and display BO data. (3) To provide an export of processed and merged BO data in BODS format. (There is also a prior stage (0) Mapping source data to BODS.) So when you say "the
Exactly.
-
@kd-ods, thank you for the confirmations and corrections to my understanding. This discussion has been very useful in helping me understand the proposal and BODS more comprehensively.
-
Records
I think it's important to always keep in mind what facts BODS intends to represent. I believe it intends to represent declarations (made by declarants) that contain statements relevant to the domain of beneficial ownership (about control, etc.). The record proposal expands the scope of BODS, to cover the representation of these facts within source systems. However, other than as an attempt to clarify changes over time, there are no expressed use cases or user needs for this new scope. As such, I would strongly encourage keeping BODS to its existing scope, and exploring alternative solutions to changes over time.
Changes over time
Within government-controlled systems and processes, my understanding is that the current BO information for a given declarant (company) is whatever is in their most recent declaration. No complex algorithm or special logic is required to disentangle what information is true at a given point in time: you just pull up the most recent declaration prior to the given point in time.
Note: If there are multiple, conflicting declarations about the same entities from different sources, no amount of standardization will help users. Users will need to decide for themselves which sources they trust. This problem of changes over time can therefore be narrowed to changes within a given system.
Are there any real, experienced issues with the above solution? (Other than
Identifiers
#392 (and thus the record proposal) mixes in another issue that isn't directly related to changes over time.
The fundamental problem is that BODS doesn't have people or organizations, it just has statements. In BODS-land, there is no RDF node "Bob" about which Alice states "his eyes are green" (ID 1), and Vaughn states "his eyes are brown" (ID 2), and Noah states "his eyes are blue" (ID 3). There are only their free-floating statements. So, when Alice wants to correct with "his eyes are brown" (ID 4), then every "1" needs to be changed to a "4", because BODS has no way to identify "Bob".
The record proposal is saying: Alice has a compartment in her head about Bob, let's call it "Record B". And similarly for Vaughn and Noah. When Alice makes statement ID 4, nothing needs to be updated, because everything is referring to Record B, which is constant.
I think a much simpler, clearer and more natural solution is this: Add Person and Organization classes to BODS. Forget about "records". The result is the same, but much less confusing. The only difference is perhaps that you might expect "Bob, the person" to be the same node for Alice, Vaughn and Noah, whereas you are okay accepting that "Bob, the mental compartment" is distinct for each of Alice, Vaughn and Noah.
But... we're not trying to solve the holy grail of the semantic web here. On the web and all over RDF, different sources make statements about the same thing without standardizing identifiers or URLs, all the time. So, I wouldn't worry about allowing the UK to have a Person with one ID and France to have the same person with another ID - that's totally normal and not an issue with this new proposal.
-
Thanks for everyone’s input and scrutiny on this proposal.
Response to issues raised above
We will be using the ‘record’ concept explicitly to support the handling of changing beneficial ownership over time, and the representation of the lifecycle of data within publishing systems. There was a question about forcing publishers to create different Systems for handling beneficial ownership data will have various ways of managing the lifecycle of their records. For those with ‘low-resolution’ detail of beneficial ownership over time, it may not be practical to populate a
Accepted proposal
Bringing it all together, we will be implementing this proposal, in line with the initial proposal summary, with the following nuances:
-
See Feature development in BODS in the Handbook.
Comments on this ticket can be used to question, refine and develop this implementation proposal. Interactions and work on this ticket represent a collaborative process. Proposals may be paused, withdrawn, or developed into draft implementation plans. Update the 'Proposal status' as thinking progresses on this thread. Highlight changes or updates to the proposal within thread comments, with a clear 'Updated proposal' heading.
Implementation proposal for: representing changing beneficial ownership over time
Feature ticket link #392
Implementation proposal status: BEING IMPLEMENTED (Not: paused | withdrawn | active)
Initial proposal
Overview
We propose that a series of statements about the same person, entity or ownership-or-control relationship is linked over time by a dedicated identifier. To support this, we propose updating our conceptual and data model. In particular:
We introduce the concept of a record, in recognition that publishing systems are likely to have an updatable record for each entity and individual (and ownership-or-control relationship) stored in their system. In the data model, the identifier that would link a series of statements about the same object over time would therefore be called a record ID (recordID).
The replacesStatements property would be removed.
A relationship statement* is linked to a subject entity and an interested party not by their statement IDs but by their record IDs.
When an entity, person or relationship is no longer disclosed as part of an ownership or control network, its record is closed. This is recorded in a Record Status field (recordStatus).
* See terminology change proposal
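As a rough illustration of the points above, the sketch below links a relationship statement to its subject and interested party by record ID, so that a later statement about the person needs no change to the relationship. The `recordID`, `recordStatus`, `statementDate` and `recordDetails` spellings follow this thread; the overall statement shape, the `statementID` values and the keys inside `recordDetails` are assumptions, not the final schema.

```python
statements = [
    {"statementID": "st-1", "recordID": "rec-entity-1", "recordStatus": "new",
     "statementDate": "2023-01-01", "recordDetails": {"name": "Example Ltd"}},
    {"statementID": "st-2", "recordID": "rec-person-1", "recordStatus": "new",
     "statementDate": "2023-01-01", "recordDetails": {"name": "A. Person"}},
    # The relationship points at record IDs, not at statement IDs.
    {"statementID": "st-3", "recordID": "rec-rel-1", "recordStatus": "new",
     "statementDate": "2023-01-01",
     "recordDetails": {"subject": "rec-entity-1", "interestedParty": "rec-person-1"}},
    # The person's name changes: a new statement in the same record series.
    {"statementID": "st-4", "recordID": "rec-person-1", "recordStatus": "updated",
     "statementDate": "2023-06-01", "recordDetails": {"name": "A. B. Person"}},
]


def resolve(record_id: str, statements: list) -> dict:
    """Return the latest statement in the series for a given record ID."""
    series = [s for s in statements if s["recordID"] == record_id]
    return max(series, key=lambda s: s["statementDate"])


relationship = resolve("rec-rel-1", statements)
owner = resolve(relationship["recordDetails"]["interestedParty"], statements)
```

Because the relationship refers to `rec-person-1` rather than to `st-2`, it resolves to the updated person statement without itself being republished.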
For the full details of the proposal for representing changing beneficial ownership over time see this doc.
Please add comments, thoughts and reactions to this thread, rather than as comments on the doc. Thank you!