Proposal: IBA/IBE/ICE refactor #564
Comments
Bump: This has been in the works for a long time. Could people please weigh in so we can finally get this done?
@alanruttenberg I agree with most of your proposal. I'm not convinced about barcodes. Barcode subclasses do seem to be more about standards than about material artifacts. However, someone went to the trouble to add them to CCO, presumably because they thought that level of detail would be useful. Class Code 93 Barcode has its uses as formulated. Members of the class are individuals that conform to the Code 93 standard. Formalizing that standard as a subclass of Directive Information Content Entity is also useful, but I'm not prepared to advocate for eliminating the class hierarchy under material artifact as well. I would:
In the course of composing this reply, I've begun to wonder why CCO has this much detail on barcodes. Can anyone supply some history? I would be open to moving subclasses of Barcode to a separate module.
I agree with the general point about moving the focus to ICEs and refactoring classes. I dislike spreadsheet being a subclass of document, as I wonder whether that would entail that a lot of other types of databases end up being documents alongside spreadsheets, and that doesn't sound right to me. Can you elaborate a bit on the logistics regarding existing extensions of CCO? I am thinking about cases such as the Cyber Ontology, where a lot of development has been done under the IBA side of the hierarchy. If I understand correctly, you are currently suggesting that direct children of IBA are not necessarily moved?
@alanruttenberg Thinking from an information systems design perspective, "Document serving" is the basis of most of what happens on websites, email, and messaging systems. Document servers handle discrete packages rather than streaming services, for example. Anything that is served on a document server system would be related to a "Document." When we see streaming on a website, the website document is providing a placeholder to serve streaming content from a streaming server. So when a Document is defined in a top-level or mid-level ontology, it has deep technical meaning to the digital system designers around the world dealing in document server systems. For those who deal with the physical world, a document is a discrete, portable artifact used to share information about something. There are identity documents, license documents, etc. But there are also various publications, reports, etc. For the record, information systems design handles how humans interact with both digital and physical information processing technologies :) There is no doubt in my mind that a spreadsheet is a document in the digital or physical world because it has content (Body), a beginning (Header) and an end (Footer); communicates information having a particular data architecture or content and is portable for sharing on paper or electronically.
@giacomodecolle I don't follow the Cyber ontology, but if they are building out under CCO's IBA they are likely making the same conceptual error that we see in CCO. In my proposal I do indeed suggest deprecating most classes of IBA, and I would suggest that, to the extent that the Cyber Ontology builds under IBA, once the change to CCO is made, they appropriately update their ontology, or remain using an earlier version of CCO. Consider 'Email Message'. The sense of that term under IBA would have to be a particular piece of paper with the message written on it. It's highly unlikely that that's the intended sense in almost any ontology that has subject matter regarding email messages. Most phishing emails, for example, have no power in their printed form and instead rely on clicking a link on what is effectively a concretization of the message that is currently being presented on any of their screens. So keeping the term, and having an ontology keep using it, would mean supporting the likely incorrect use of the term. And, once people start using the proper ICE term we would have interoperability problems, with some ontologies using the IBA term and others the ICE term. My proposal did suggest keeping a couple of terms (and perhaps others of a similar nature) like Material Book, or Material Document, to mean those sets of printed pages, because sometimes the physical artifacts are tracked and so these terms would be useful. Should developers of the Cyber Ontology or others affected by this change wish to discuss this in more detail, I'd be happy to do so.
@swartik #561 suggests factoring out most barcode classes into a separate domain ontology. I don't agree with you regarding barcode typically denoting a physical object. The "same" barcode is printed on e.g. all the local brand 1% milk at my store. There would be a stronger argument for RFIDs having an IBA sense, as each is unique, as well as an ICE sense for the identifier.
@alanruttenberg, what you describe is something that I've never seen explicitly stated in CCO, or in BFO for that matter. Please correct me if I'm wrong. You're right that the same barcode is printed in lots of places. The images on each carton are physically different only because they are made with different ink. For two barcodes on two milk cartons, the difference is trivial and maybe not worth the effort. It becomes less trivial as you scale up. Are two cartons of milk of the same brand and fat percentage represented by the same individual? Are two printed copies of the same book represented by the same individual? Are two cars of the same make and model represented by the same individual? To take the example to a (fictional) extreme, are our Earth and the Earth in the Star Trek episode Miri represented by the same individual? There is a case to be made for answering yes to all of these questions. As usual, it depends on what one's ontology is trying to express. Does CCO (or BFO) have a position on whether every physically distinct entity is a different individual? Does the answer depend on the class, and if so, what's the distinguishing characteristic? My personal viewpoint: I wouldn't want to represent every single printed barcode as a different individual because I can't see the need to state that ink batch 1 was used to print barcode 1 and ink batch 2 was used to print barcode 2. But, should I want to express that a carton of Safeway Brand 1% milk has a particular barcode, I'd do it by:
I'd use this approach for books, cars, and planets. But as I say, if CCO already has another approach, let me know.
@swartik To be clear: Both the material artifact and the ICE exist. Both could be part of the ontology. What differs is the sort of thing one says about all of them. To some extent this is a matter of good taste in ontology development. As a practical concern, having both terms will confuse people who aren't completely clear on the distinctions between the two. Most of the things that will be said of a barcode will be said of all barcodes with the same form. For instance, the pattern of marks, who originated it, what it denotes. There is very little to say about an individual barcode. The sort of thing we would say about a material version is that this one on this milk carton is smudged. Since it seems to me that it is vastly more common to want to say something that is true of all the barcodes, the subject would be the ICE, which captures the regularity. So when I see this situation, I vote for ICE representation as primary. In the rarer material usage, one can always define the class: Artifact and bears some barcode. In a few cases the material is as important as the ICE, because it is not uncommon to track specific copies. For example, we track a specific physical copy of a book in a library. In classified settings, how physical copies of reports are to be distributed is constrained and it isn't uncommon to track each individual. So for a few select classes (Books, Documents) I suggest we do also have the material term. I don't understand what you mean by "I didn't expect to populate" - maybe: you don't plan to specifically assert statements like x rdf:type barcode artifact? You should consider such a situation a mark against representing something. It isn't determinative, but one is often (properly) driven to define classes because there are instances of them that are of individual importance. Sometimes we don't plan to directly instantiate, but that happens usually for more general classes - middle classes in an ontology.
The barcode you describe will not be that sort of class. You are also missing out that the barcode is about something, namely some class of Safeway Brand 1% milk. Just because we know every member of a class has a part of a certain class, doesn't mean we know what the relationship between them is. Since I think we can say everything of importance by reference to the ICE in this case, there's also an argument that your choice of representation is more complicated than it need be.
As an example of confusion, one might represent, in CCO, an email message as an IBA instance, asserting there was an author of that message. But that makes an assertion about exactly one physical copy or while-on-screen copy of the message, not all of the messages with the same content. But the author statement will be true of all the copies. Therefore it should be an assertion on the ICE.
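To make the email example concrete, here is a minimal sketch in plain Python, with triples held as tuples rather than in a real RDF store. Every identifier in it (`ex:message_content`, `ex:has_author`, `ex:is_carrier_of`, and so on) is hypothetical and not a CCO IRI. The author is asserted once, on the ICE, and is then recoverable for every copy that carries it.

```python
# Triples as (subject, predicate, object) tuples. All names are hypothetical.
triples = {
    ("ex:message_content", "rdf:type", "ex:EmailMessage"),        # the ICE
    ("ex:message_content", "ex:has_author", "ex:alice"),          # asserted once, on the ICE
    ("ex:screen_copy", "ex:is_carrier_of", "ex:message_content"),   # on-screen copy
    ("ex:printed_copy", "ex:is_carrier_of", "ex:message_content"),  # paper copy
}


def authors_of_copy(copy, triples):
    """Return the authors of whatever content the given copy carries."""
    contents = {o for s, p, o in triples if s == copy and p == "ex:is_carrier_of"}
    return {o for s, p, o in triples if s in contents and p == "ex:has_author"}
```

Had the author been asserted on `ex:screen_copy` instead, nothing would follow about `ex:printed_copy`, which is the confusion described above.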
Here is the outline of what I would change. I'll rewrite definitions if there is consensus that these are the changes to make. There are further changes I would make, adding axioms and additional classes, but this is the bare minimum.
IBAs that make more sense as ICEs
Rearrangements:
Book, Journal article to be subclass of Document. I'm unsure whether
spreadsheet should also be but lean that way. Document, I think, connotes a whole. Images
and Charts may be documents, but also may be only parts of documents.
Rename
Document Form -> Form Document. The label makes it sound like a part of a document.
Material artifacts
Some of the above are sometimes interesting as physical items, things
you would track copies of.
Book Artifact: Material artifact and is carrier of some Book
Document Artifact : Material artifact and is carrier of some Document
An alternative would be not to define them and just have material
artifact instances and relate them to what they carry, in the RDF.
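As a rough illustration of the two options, the sketch below checks the "Material artifact and is carrier of some Book" condition directly over asserted triples. It is plain Python with hypothetical identifiers, and it is a closed-world lookup over what is asserted, not OWL reasoning. Under the alternative of not defining the class, the same check simply becomes a query people run when they need it.

```python
# Triples as (subject, predicate, object) tuples. All names are hypothetical.
triples = {
    ("ex:copy_17", "rdf:type", "ex:MaterialArtifact"),
    ("ex:copy_17", "ex:is_carrier_of", "ex:moby_dick"),
    ("ex:moby_dick", "rdf:type", "ex:Book"),
    ("ex:poster_3", "rdf:type", "ex:MaterialArtifact"),
    ("ex:poster_3", "ex:is_carrier_of", "ex:some_image"),
    ("ex:some_image", "rdf:type", "ex:Image"),
}


def has_type(x, cls, triples):
    """True if x is directly asserted to be an instance of cls."""
    return (x, "rdf:type", cls) in triples


def is_book_artifact(x, triples):
    """Material artifact and is carrier of some Book (asserted facts only)."""
    if not has_type(x, "ex:MaterialArtifact", triples):
        return False
    carried = (o for s, p, o in triples if s == x and p == "ex:is_carrier_of")
    return any(has_type(c, "ex:Book", triples) for c in carried)
```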
remaining IBA classes
Timekeeping Instrument with subclass System Clock, Instrument Display
Panel, seem to me to be Information Medium Artifacts.
IBE
Document Field: Move to ICE. Add axiom: continuant part of some Document
Deprecate IBA.
The distinction between IBE and IBA is minor and IBE is more general. It's arguable whether
a tree with a carving "A heart L" is an IBA, but it is clearly an IBE.
Properties whose domain or range is IBE
Deprecate the 'has value' properties like 'has text value' in
favor of a single 'has value' property. That would include all data
properties other than: 'has latitude value', 'has longitude value', and
'as WKT'. The typing of the relation is unnecessary - the type information is available from the value, and restrictions could be added to constrain types where relevant.
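A small sketch of why the typed properties are redundant, in plain Python with hypothetical identifiers. Each value carries its own type, so a single `ex:has_value` suffices, and a restriction is just a check on the value's type where one is wanted. Keying the restrictions by individual rather than by class is a simplification for the sketch.

```python
from decimal import Decimal

# (subject, "ex:has_value", value): the value itself carries the datatype.
values = [
    ("ex:name_ice", "ex:has_value", "Alan"),
    ("ex:count_ice", "ex:has_value", 42),
    ("ex:mass_ice", "ex:has_value", Decimal("3.5")),
]

# Hypothetical restrictions constraining the value type where relevant.
allowed_type = {"ex:count_ice": int, "ex:mass_ice": Decimal}


def violations(values, allowed_type):
    """Return subjects whose value does not satisfy their type restriction."""
    return [
        s for s, _p, v in values
        if s in allowed_type and not isinstance(v, allowed_type[s])
    ]
```

Subjects with no restriction, such as `ex:name_ice` here, accept any value type.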
In the below relations, change IBE to ICE in domain or range
is_tokenized_by
Deprecate and suggest has value instead.
Logistics
The proper thing to do is to deprecate old terms and create new terms
where there is a significant change, like IBA->ICE. The alternative is
to keep using the same IRIs, but this risks cases where domain
ontologies have specialized below the top-level classes. So specializing
Document will be fine in the switch, because a subclass of Document will
remain a subclass of Document in the switch. However, if a new direct
subclass of IBA is created, that won't move, as the proposal does not
include equating ICE with the old IBE.
For rearrangements, such as putting Book and Journal article below
Document, the potential damage is less, and in such cases the terms do
not need to be deprecated.
There will be a mapping file from English name IRIs to numeric IRIs. I'd
suggest that the old classes not be included in that mapping, but a
supplementary information mapping that maps deprecated terms to
alternatives be included as an adjunct.
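For downstream ontologies, the supplementary mapping could be consumed roughly as follows. This is only a sketch in plain Python: the IRIs are hypothetical placeholders, and the actual file format of the mapping is not decided here.

```python
# Hypothetical supplementary mapping: deprecated IRI -> suggested alternative.
replaced_by = {
    "old:EmailMessage_IBA": "new:EmailMessage_ICE",
}


def migrate(triples, replaced_by):
    """Rewrite every term through the mapping; unmapped terms pass through unchanged."""
    return {tuple(replaced_by.get(term, term) for term in triple) for triple in triples}
```

A blind rewrite like this is only appropriate where the alternative really is a drop-in replacement. For an IBA to ICE change the surrounding assertions usually need review, since statements made about a physical copy may not hold of the content.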