Annotation Data Design #40

balmas · 2020-10-14T18:56:14Z

Related discussions: #33 #38 #24

See also https://github.com/alpheios-project/documentation/blob/master/development/lex-domain-design.tsv which defines some domain requirements for creation of data in the data store

Data design needs to accommodate:

linking of all known Alpheios (both user and alpheios supplied) data about a lexical entity
a wide variety of lexical entity types
a wide variety of languages
ability for data to be public or private
ability to consume data from and contribute data to LOD sources such as the the Lila Lemma Bank (https://lila-erc.eu/sparql/) the Latin Wordnet (https://latinwordnet.exeter.ac.uk/), the Greek Wordnet etc.
standard lexical ontologies
user contribution of both data entities and relationships between entities
user comments on any entity, relationship or comment
ability to filter data by a wide variety of criteria (including its provenance)

balmas · 2020-10-14T18:59:19Z

Proposed design:

Lexical entities are nodes with properties specific to the entity type
Relationships are represented as edges between nodes; the edges themselves can (1) carry additional properties and (2) be nodes in other relationships
Distinct relationships should only be expressed once (i.e. there is no reason to say wordX hasLemma lemmaY more than once) but assertions of the correct/incorrect nature of that relationship can be many, are specific to the user making the assertion, may be public or private. It will be up to the consumer to decide the meaning of multiple assertions that disagree based upon characteristics such as who asserted it, how many other people asserted it, what degree of confidence a user has assigned to an assertion, etc.
A relationship requires an assertion of its truth (or falseness) to be published and used.

(Entity details still very much a work in progress)

balmas · 2020-10-14T19:16:03Z

Some of the known issues with the current Alpheios morphology service engines can serve as a set of use cases for annotations on morphological data.

Reference: alpheios-project/morphsvc#38

Summary: Whitaker Engine of Morph Service reports the lemma of 'afore' as 'afore'. While it's possible that 'afore' is an accepted lemma variant of 'absum' our inflection table and full definition for this verb is keyed off of the lemma 'absum' as the "canonical" lemma

Entity Nodes:

[Lemma:afore]
[Lemma:absum]
[User:alpheios]

Edges:

[Lemma:afore isLemmaVariant Lemma:absum]@prefer=Lemma:absum
[User:alpheios assertsTrue [afore isLemmaVariant absum]]

Sample Query: https://gist.github.com/balmas/e7e0e6bc16f2501f3ca06f7462203f70
Sample Response: https://gist.github.com/balmas/ff9ae018feaccfb8fbda4dff618bf4a8

Reference: alpheios-project/morphsvc#29

Summary: Whitaker Engine of Morph Service is missing the identification of the vocative case as a possible inflection of the form senatu of the lemma senatus

Entity Nodes:

[Lemma:senatus]
[Inflection:Form=senat-u|Case=vocative]
[User:alpheios]

Edges:

[Inflection:Form=senat-u|Case=vocative canBeInflectionOf Lemma:senatus]
[User:alpheios assertsTrue [Inflection:Form=senat-u|Case=vocative canBeInflectionOf Lemma:senatus]]

Sample Query: https://gist.github.com/balmas/f6e55dc3b3551a60d034ef131798ba4d
Sample Response: https://gist.github.com/balmas/623f26e6dc5abbb43e5646b6658bdfd8

Reference: alpheios-project/morpheus#28

Summary: Morpheus Engine of Morph Service parses τίνος with the Lemma τίς specifying pos of irregular (along with a parse of a demonstrative pronoun). The irregular lemma should be the interrogative pronoun τίς with one genitive singular inflection

Entity Nodes:

[Lemma:τίς|pos=X]
[Lemma:τίς|pos=PRON]
[Word:τίνος]
[Inflection:Form=τίνος|Case=Genitive|Number=Singular]
[User:alpheios]

Edges:

[Word:τίνος hasLemma Lemma:τίς|pos=X]
[Word:τίνος hasLemma:τίς|pos=PRON]
[Inflection:Form=τίνος|Case=Genitive|Number=Singular canBeInflectionOf Lemma:τίς|pos=PRON]
[User:alpheios assertsFalse [Word:τίνος hasLemma Lemma:τίς|pos=X]]
[User:alpheios assertsTrue [Word:τίνος hasLemma:τίς|pos=PRON]]
[User:alpheios assertsTrue [Inflection:Form=τίνος|Case=Genitive|Number=Singular canBeInflectionOf Lemma:τίς|pos=PRON]]

Sample Query: https://gist.github.com/balmas/ecc9db3da04fbf32d3e0f8efdf6b2774
Sample Response: https://gist.github.com/balmas/2639a0e14248e6da6cc98905cfd643cd

Reference: alpheios-project/morpheus#32

Summary: Morpheus Engine of Morph Service doesn't parse the word μεμνήμεθα because it only recognizes this word by the alternate spelling μεμνῄμεθα.

Entity Nodes:

[Word:μεμνῄμεθα]
[Word: μεμνήμεθα]
[User: alpheios]

Edges:

[Word:μεμνῄμεθα isSpellingVariant Word:μεμνήμεθα]
[User:alpheios assertsTrue [Word:μεμνῄμεθα isSpellingVariant Word:μεμνήμεθα]]

Sample Query: https://gist.github.com/balmas/f402883b85041e5227737509be6adce3
Sample Response: https://gist.github.com/balmas/e52f54a2f5e32adf92d8738c4f195dde

(Additional use cases from the morph bugs can be found at https://docs.google.com/spreadsheets/d/1ej-7dAntWQZVASg7aQp0P-PRo2u9Nkn3LYChclYtDVo/edit?usp=sharing)

irina060981 · 2020-10-15T12:10:44Z

I need more time to read and understand the idea, will work on it tomorrow.

kirlat · 2020-10-15T13:12:33Z

Need time to study this and mull over it as well

irina060981 · 2020-10-16T06:34:03Z

About the data structure:

If I understood right, the structure has two main entities - Word (I believe that in terms of Alpheios extension it is TargetWord) and User, all other entities are around them:

lexical data with relationships from various source
alignment data
comments data

And here I have some questions:

A User could have the following roles according to Words:
- a word could be inside his WordList (owner type1)
- a word could be inside Aligned Groups in Alignment (owner type 2)
- a word (or some sub-data connected to this word) could be the target of his comments (contributor)

How this roles are defined in the Model? Where would be defined user's rights for the words, tokens and comments?
May be it is worth to add additional entity - UserRole - and attach a userRole to appropriate type/domain of data?
Such a division could create an early data separation for various requests and helps with perfomance.

We have an important property of each word - language, in some word sub-entities we could have even ywo defined languages. In future we could think about the fact that any User could have its own language (defined in browser for example). And we have some specific language separations - inflections, word usage and etc. And most times specific users would work only with a specific languages.
What do you think about adding an additional division by language - to reduce an amount of all words to Word+Language pieces?
About Alignment structure data - the schema has a direct relation between Word and AlignedGroup, but in the application it has a middle object Token, that has a text property with the word . And Token could be included in several AlignedGroups. Do you suggest to reduce here relationships somehow?

According to the examples I think it is a worthy structure, but it is really difficult to see how it would work with all languages in GraphQL bparadigm for me.

balmas · 2020-10-16T16:45:28Z

These are really good questions.

If I understood right, the structure has two main entities - Word (I believe that in terms of Alpheios extension it is TargetWord) and User, all other entities are around them:

lexical data with relationships from various source

alignment data

comments data

I think it's not really true that Word and User are the main entities. There can be relationships that don't involve either of these -- for example inflectionA canBeInflectionOf lemmaA and so on.

The connection points to Alpheios applications will in many cases be specific to a User and a Word, but they are not the core of the data model.

User does have a somewhat special place in the model though, because it is through a User's assertion that a relationship between entities is True or False that makes the data usable.

But it isn't true that a user of Alpheios will only have access to data that is connected to the data that is asserted as true or false by their User id. We will have to give users control over what data they do and do not see, based upon who asserted it. And we also have to give users control over whether the data they create is available to other users. For now, we have decided that there are 2 possibilities: public or private. In the future it is very likely we will need to be able to express finer-grained levels of access - such as group-level, site-level etc. But to start we are going to support these two.

So for example, as the "alpheios.net" user, we may publish corrections to the results of the morphological parsers as annotations (these are the use cases I've described above). The assertions of their truth will be made by the "alpheios.net" user (exact identifier TBD) and they will be available to anyone.

On the client-side, a user will have the choice of which data to retrieve -- they will be able to say, for example, 'give me all data asserted by alpheios.net and by myself, and no other' or 'give me all data that is publicly asserted, but exclude data that is asserted by userx'. Or maybe even 'give me all data asserted by alpheios.net and myself, plus any data that is public and which has been asserted X number of times' (the implicit assumption being that the more people agree with a statement the more it is likely to be correct).

And here I have some questions:

A User could have the following roles according to Words:

a word could be inside his WordList (owner type1)

a word could be inside Aligned Groups in Alignment (owner type 2)

a word (or some sub-data connected to this word) could be the target of his comments (contributor)

How this roles are defined in the Model? Where would be defined user's rights for the words, tokens and comments?
May be it is worth to add additional entity - UserRole - and attach a userRole to appropriate type/domain of data?
Such a division could create an early data separation for various requests and helps with perfomance.

I need to think more about this. For the moment, there are 2 roles identified in the data model - (1) creator of data (the user that put the data into the database) and (2) subject or object of an assertion (as in User X assertsTrue Relationship Y). At the moment, I'm thinking that access restrictions would be tied only to the former, and the isPublic property on the assertion is what gates the access to it.

In this approach, it means that we CANNOT make the data or a query endpoint to it itself publicly available, all access would have to be gated through queries that enforce the access restriction rules. This is how the wordlist works currently although it's not a graph-based model.

There are certainly other roles with respect to data that it might be good to record, some of which get quite philosophical (who actually "created" the words of a text that is aligned? the author of the text?).

I should also explain that I don't right now at least envision making this database the source of truth for User information. I would like to keep that data separate, as it is much more sensitive. I think we will continue to use Auth0 for our User database, and at a minimum, just the opaque userId would be available in this new shared data store. Additional information about themselves that a user chooses to make public might be retrieved from the Auth0 database at runtime, or synced, depending upon performance. But I want to keep user-identifying data separate from this new data store as much as possible. We WILL have to make enhancements to our use of Auth0 to support this to allowing users to enter and edit profile information.

We have an important property of each word - language, in some word sub-entities we could have even ywo defined languages. In future we could think about the fact that any User could have its own language (defined in browser for example). And we have some specific language separations - inflections, word usage and etc. And most times specific users would work only with a specific languages.
What do you think about adding an additional division by language - to reduce an amount of all words to Word+Language pieces?

I though a lot about this and I'm not sure what makes the most sense. For the initial modeling, I used a document property for language to limit the proliferation of document collections as I was still trying to figure out what they should be, but it could definitely be better to have individual collections of document per language. It's something that I think we'll have to analyze the performance of different queries to decide. The indexing options in Arangodb are pretty flexible and we can index on property values, but it does definitely introduce potential error into the data using a language property unless we enforce a schema on its values.

About Alignment structure data - the schema has a direct relation between Word and AlignedGroup, but in the application it has a middle object Token, that has a text property with the word . And Token could be included in several AlignedGroups. Do you suggest to reduce here relationships somehow?

I think Token probably does need to be represented because it isn't exactly the same as Word -- i.e. multiple Tokens could be combined to make up a single Word. In order to make that connection though, we would have to retain that information from the tokenizer. It's something that needs to be thought through more.

kirlat · 2020-10-16T17:49:15Z

Thanks for the diagrams and the detailed description. I have several small questions:

What does a Lemma below the other Lemma designate?

Does the lower lemmf is a lemma variant of the lemma above?

Can we say that connections between the nodes (edges) are not set once and for all but are flexible, as they can be edited, added, and changed in other ways during the lifetime of the lexical data piece? Would we store nodes and edges in a "disassembled" state with some links that represent "glue points" between them and then "assemble" them into a graph that will depend on the user dynamically (i.e. different users will be presented with a different version of a graph)? Or do we plan to build one huge graph and then store it in the database? Or would it be something in between? I think we cannot store everything in our own database and we have to provide some way to link to external resources that are outside of our control, such as Tufts morphology.
Will data access be based on a GraphQL API? I think it's really powerful and fits naturally into how the relationships are constructed.

I like the idea of separating nodes (entities) and edges (connections) very much. It creates a strong concept and a meaningful vocabulary to represent the concepts of handling the lexical data.

Many details of the implementation are not defined yet but I think there is already a solid foundation that we can use to move forward.

balmas · 2020-10-16T19:53:33Z

What does a Lemma below the other Lemma designate? ... Does the lower lemmf is a lemma variant of the lemma above?

Yes, in this case it's showing 2 separate Lemma nodes, connected by the isLemmaVariant edge

Can we say that connections between the nodes (edges) are not set once and for all but are flexible, as they can be edited, added, and changed in other ways during the lifetime of the lexical data piece?

I think we will need to have delete protection on anything that can be pointed at, which includes edges (that can function as nodes). Otherwise we will end up with meaningless assertions of the truth or falseness of a relationship. I think this may also mean that edges cannot be edited substantially once they are created, otherwise it would put into doubt the validity of the assertions that point at them. I think anything that can be pointed at might need to be in a frozen state once it participates in a relationship.

Would we store nodes and edges in a "disassembled" state with some links that represent "glue points" between them and then "assemble" them into a graph that will depend on the user dynamically (i.e. different users will be presented with a different version of a graph)?

Yes, I think the graphs will be built dynamically based upon the query.

Or do we plan to build one huge graph and then store it in the database? Or would it be something in between? I think we cannot store everything in our own database and we have to provide some way to link to external resources that are outside of our control, such as Tufts morphology.

Definitely we cannot store everything that might ever be part of the graph, but we need to store them once they become part of the graph. That is, it's the point at which someone annotates a relationship that is asserted by an external resource that that relationship (and the nodes in it) will get added to the database. When we have a persistent IRI for an external resource (which right now is rare) we should use it. We can also use properties to identify the original source of data (see for example the lemma properties in the sample query at https://gist.github.com/balmas/f6e55dc3b3551a60d034ef131798ba4d where I am specifying that the data I'm looking for annotations on has come from the "net.alpheios:tools:wordsxml.v1" source.)

lemma:{
          representation:"senatus",
          lang: "lat",
          pos: NOUN,
          source: "net.alpheios:tools:wordsxml.v1"
          principalParts: ["senatus"]
        },

Will data access be based on a GraphQL API? I think it's really powerful and fits naturally into how the relationships are constructed.

Yes, I have some naive first attempts at the api in the Gists I've linked to above in the sample use cases.

kirlat · 2020-10-19T11:50:40Z

I think we will need to have delete protection on anything that can be pointed at, which includes edges (that can function as nodes). Otherwise we will end up with meaningless assertions of the truth or falseness of a relationship. I think this may also mean that edges cannot be edited substantially once they are created, otherwise it would put into doubt the validity of the assertions that point at them. I think anything that can be pointed at might need to be in a frozen state once it participates in a relationship.

I'm afraid that the protection deletion would be pretty hard to manage. I've started to think if we could find a way around it without imposing and maintaining such restrictions.

I might be wrong, but it seems to me that the nodes are more "stable" pieces than the edges in the lexical data model. If there is a lexeme, or an inflection, their existence is probably a more-or-less reliable truth. The question usually arises around whether a particular inflection belongs to a particular lexeme, or several lexemes (it might in theory not belong to anything at all if it is considered incorrect), i.e. if there should be a relationship (an edge) between the one and the other. Someone may say that "A is an inflection of lexeme B". This statement not only asserts the relationship, but also establishes a relationship itself (creates an edge) between the "inflection A" and the "lexeme B" (if such edge was not already established by the other assertion before). The relationship is solely based on the assertion, if we can put it this way. If the assertion will be revoked, relationship should be destroyed too.

I'm wondering if, in situations like this, it would make sense to store relationship along with the corresponding assertions. If the assertion is edited, the relationship is edited too. If the assertion is destroyed, the relationship should cease to exist too.

It's like storing parts of a graph separately. When we construct a graph for a specific word, we can check if there are any relationships for its lexemes that are part of it, and if there are, a final graph will reflect that.

I think this will make management of relationships more flexible. If we want to remove or edit a relationship or an assertion, we can do it all in one place. There will be no need to "lock" the whole graph or the parts of it. I think this way it should be easier to manage the lexical data than when it's all in one complex graph.

Does such an approach make sense? What do you think?

kirlat · 2020-10-19T12:39:42Z

The service for the decentralized annotations publishing is dokieli. It uses, among other, the following technologies:

Linked Data Platform
RDFa
Web Annotation Data Model
JSON-LD
Turtle can be useful too.
I think these technologies were created to solve problems that we're trying to solve. As those are established standards I think it would make sense for us to use these technologies whenever appropriate.

We might also look at Solid as it is within pretty much the same problem domain as wll.

balmas · 2020-10-19T14:38:53Z

We should also look at https://www.hypergraphql.org/ when designing the GraphQL API

balmas · 2020-10-19T16:42:04Z

https://linkeddatafragments.org/ is relevant to suggestions from both @irina060981 and @kirlat

balmas · 2020-10-20T22:56:45Z

Revised sample use cases:

Reference: alpheios-project/morphsvc#38

Summary: Whitaker Engine of Morph Service reports the lemma of 'afore' as 'afore'. While it's possible that 'afore' is an accepted lemma variant of 'absum' our inflection table and full definition for this verb is keyed off of the lemma 'absum' as the "canonical" lemma

LexicalEntity Nodes:

{ IRI:"https://alpheios.net/data/lemma/lat/afore" 
  type:"lemma", 
  lang: "lat",
  representation: "afore", 
  pos: "VERB", 
  source: "net.alpheios:tools:wordsxml.v1",
  creator: "net.alpheios"
}
{ IRI:"https://alpheios.net/data/lemma/lat/absum" 
  type: "lemma", 
  lang: "lat",
  representation: "absum", 
  pos: "VERB", 
  source: "net.alpheios:tools:wordsxml.v1",
  creator: "net.alpheios"}

LexicalEntityRelation Edges:

{ _from: "https://alpheios.net/data/lemma/lat/absum", 
    _to: "https://alpheios.net/data/lemma/lat/afore" 
    type: "isLemmaVariant" 
    creator:"net.alpheios"
    prefer: "https://alpheios.net/data/lemma/lat/absum"
}

User Collection (not part of the graph}

{ IRI: 'net.alpheios' }

Prototype GraphQL Query: https://gist.github.com/balmas/e7e0e6bc16f2501f3ca06f7462203f70
Prototype GraphQL Response: https://gist.github.com/balmas/ff9ae018feaccfb8fbda4dff618bf4a8

Reference: alpheios-project/morphsvc#29

Summary: Whitaker Engine of Morph Service is missing the identification of the vocative case as a possible inflection of the form senatu of the lemma senatus

Entity Nodes:

{ IRI:"https://alpheios.net/data/lemma/lat/senatus" 
  type:"lemma", 
  lang: "lat",
  representation: "senatus", 
  pos: "NOUN", 
  source: "net.alpheios:tools:wordsxml.v1"
  creator: "net.alpheios"}

{ IRI:"https://alpheios.net/data/infl/lat/senatuvoc" 
  type:"inflection", 
  lang: "lat",
  form:senatu", 
  stem: "senat", 
  suffix:"u", 
  udfeatures: { 
    Case: "vocative" 
  },
  source: "net.alpheios:tools:wordsxml.v1",
  creator: "net.alpheios"
}

Edges:
LexicalEntityRelation Edges:

{ _from: "https://alpheios.net/data/infl/lat/senatuvoc", 
    _to: "https://alpheios.net/data/lemma/lat/senatus", 
    type: "isInflectionOf", 
    creator:"net.alpheios"
}

User Collection (not part of the graph}

{ IRI: 'net.alpheios' }

Sample Query: https://gist.github.com/balmas/f6e55dc3b3551a60d034ef131798ba4d
Sample Response: https://gist.github.com/balmas/623f26e6dc5abbb43e5646b6658bdfd8

Reference: alpheios-project/morpheus#28

Summary: Morpheus Engine of Morph Service parses τίνος with the Lemma τίς specifying pos of irregular (along with a parse of a demonstrative pronoun). The irregular lemma should be the interrogative pronoun τίς with one genitive singular inflection

Entity Nodes:

{   IRI:"https://alpheios.net/data/lemma/grc/tisx"
    type: 'lemma',
    representation: 'τίς',
    lang: 'grc',
    pos: 'X',
    langpos: 'irregular',
    source: 'net.alpheios:tools:morpheus.v1',
    creator: 'net.alpheios',      
}

{   IRI:"https://alpheios.net/data/lemma/grc/tis"
    type: 'lemma',
    representation: 'τίς',
    lang: 'grc',
    pos: 'PRON',
    source: 'net.alpheios:tools:morpheus.v1',
    creator: 'net.alpheios',
}

{  IRI: "https://alpheios.net/data/inflections/grc/tinosgensing"
   type: 'inflection',
   form: "τίνος",
   stem: "τίνος",
   udfeatures: {
        Case: 'genitive',
        Number: 'singular'
    },
    xfeatures: {
        stemtype: 'inter',
        morphtype: 'enclitic indeclform'
    }
}

{   IRI: "https://alpheios.net/data/words/grc/tinos"
    type: 'word',
    representation: 'τίνος',
    lang: 'grc',
    creator: 'alpheios.net' 
}

LexicalRelation Edges:

{
   _from: "https://alpheios.net/data/lemma/grc/tisx",
    _to: "https://alpheios.net/data/words/grc/tinos",
   type: 'isNotLemmaOf',
   isPublic: true,
   confidence: 1,
   creator: 'net.alpheios',
}
{
   _from: "https://alpheios.net/data/lemma/grc/tis",
    _to: "https://alpheios.net/data/words/grc/tinos",
    type: 'isLemmaOf',
    isPublic: true,
    confidence: 1,
    creator: 'net.alpheios',
}
{
   _from: "ttps://alpheios.net/data/inflections/grc/tinosgensing",
    _to: "https://alpheios.net/data/lemma/grc/tis",
    type: 'isInflectionOf',
    isPublic: true,
    confidence: 1,
    creator: 'net.alpheios'
}

Sample Query: https://gist.github.com/balmas/ecc9db3da04fbf32d3e0f8efdf6b2774
Sample Response: https://gist.github.com/balmas/2639a0e14248e6da6cc98905cfd643cd

Reference: alpheios-project/morpheus#32

Summary: Morpheus Engine of Morph Service doesn't parse the word μεμνήμεθα because it only recognizes this word by the alternate spelling μεμνῄμεθα.

Entity Nodes:

  {
      IRI: 'https://alpheios.net/data/words/grc/memenealt1',
      type: 'word',
      representation: 'μεμνῄμεθα',
      lang: 'grc',
      createdBy: 'alpheios.net' 
}
{
      IRI: 'https://alpheios.net/data/words/grc/memenealt2"
      type: 'word',
      representation: 'μεμνήμεθα',
      lang: 'grc',
      createdBy: 'alpheios.net'     
}

** LexicalRelation Edges**:

{ 
  _from: "https://alpheios.net/data/words/grc/memenealt2",
   _to: "https://alpheios.net/data/words/grc/memenealt1"
   type: 'isSpellingVariant',
   isPublic: true,
   confidence: 1,
   creator:'net.alpheios'
}


Sample Query: https://gist.github.com/balmas/f402883b85041e5227737509be6adce3
Sample Response: https://gist.github.com/balmas/e52f54a2f5e32adf92d8738c4f195dde

@irina060981 and @kirlat thank you both for your feedback and for talking me off the complexity ledge :-)

Above is a revised approach to the data model, based upon your suggestions and the additional reading mentioned above.

A few things to point out:

In this scenario, we have the potential for many edges that essentially say the same thing, asserted by different authorities. I think ideally the query implementation would dedupe and aggregate them, but it will be up to the client to decide how to handle conflicting assertions.
I am not at all sure of what ontologies we will use for all of the properties of the nodes and relationship edges. It will likely be a combination of mostly existing ontologies and a few alpheios-specific vocabulary items. For the most part I think we will use dublin core terms, the ontolex ontology and olia but we will have to fill in some gaps.
the diagram tries to show the abstract concept of nodes and edges, examples of potential concrete but database-agnostic implementation, and database-specific details. (I always pack too much into my diagrams, I know). So, for example, in the implementation details, _key, _id, _from, _to are special properties in ArangoDB.
In response to the suggestion about separating by language, I would like to start by having language-specific collections of the node documents, but keeping the edge collections language agnostic. I will keep language properties on the node documents, however, because I believe they should be able to stand on their own outside of the database structure. If performance testing indicates that we need to further breakdown the edges by language we can, but I think since we are not normally starting queries from an edge, but a node, I'm not sure there would be a benefit in doing so.
I think conceptually, it makes sense to have separate edge collections for the nature of a relationship. So here I am proposing an edge collection for LexicalRelationships where the edge documents themselves would specify the type of the relationship in a property. We can index on relationship type to facilitate query performance, and if that isn't sufficient we can break it down further. I have also proposed an edge collection for "attestedAt" relationships, which are essentially the identification of a specific lexical entity in a specific context. We might also have one for the commentingOn relationship, one for the relationship between parts of alignment nodes, etc. Those all still need to be fleshed out.
I have greatly reduced the doubling up of edges as nodes, but not completely. I think we need to be able to support comments on edges. I think there are some clear separations of the different graph types, and so we can have a graph of comments and a graph of lexical entities and a graph of alignments and they can have intersection points but not all be in the same graph.
I have removed the User from the graph model for now, opting instead to use a reference to the user in the createdBy property of both Node and Edge documents. I want to be able to associate a User with both an edge and a node within the same graph, and there isn't a graph-like way to do that that I can think of that doesn't break the graph. It might also make sense to some point to have completely separate User collections for data (similar to the language-specific collections of nodes). In LDP terms, the User is probably more a container than either a node or an edge.
The change in data model in my prototype implementation hasn't really affected the GraphQL API at all. The handling of the negated lemma is a little different (it's an assertion on the word now rather than on the lemma which is maybe more accurate anyway).
apologies that the prototype details in the prototype JSON data objects above don't match exactly the prototype details in the diagram or the GraphQL output. As this is all just prototype for discussion, it's a little messy but I'm hoping you get the gist.
I still need to think about how we will handle versioning of the data.
finally, although not outlined above, here's a little sample of what a graph of a disambiguation scenario would look like (e.g. an assertion that a word in a particular context has a specific lemma and inflection):

Also, in case it helps with understanding this, my code for the ArangoDB prototype where I have been working through all of this is at https://github.com/alpheios-project/arangodb-svcs

kirlat · 2020-10-21T13:37:34Z

Thanks for the detailed description! I like the new model, I think it's much more flexible and extendable now. A few comments to it:

I am not at all sure of what ontologies we will use for all of the properties of the nodes and relationship edges. It will likely be a combination of mostly existing ontologies and a few alpheios-specific vocabulary items. For the most part I think we will use dublin core terms, the ontolex ontology and olia but we will have to fill in some gaps.

I think it does not matter what ontology do we use as long as we specify the ontology IRI along with the ontology terms. This will provide client with the reference to the ontology and make things non-ambiguous. I think this, even being more verbose, will give us flexibility to use any ontologies we want without any limitations.

In response to the suggestion about separating by language, I would like to start by having language-specific collections of the node documents, but keeping the edge collections language agnostic. I will keep language properties on the node documents, however, because I believe they should be able to stand on their own outside of the database structure. If performance testing indicates that we need to further breakdown the edges by language we can, but I think since we are not normally starting queries from an edge, but a node, I'm not sure there would be a benefit in doing so.

I agree with keeping the edges language agnostic. Edge is a connection between two nodes and it does not, on my opinion, "belong" to any language on itself as a word, an entity that caries some language-specific data, does.

I think we can use language-based indexes to group nodes into language-based collections. Edges can be included into such collections based on nodes it connects: if those nodes do belong to a certain language, edges can be included to collections for that language too.

I have greatly reduced the doubling up of edges as nodes, but not completely. I think we need to be able to support comments on edges. I think there are some clear separations of the different graph types, and so we can have a graph of comments and a graph of lexical entities and a graph of alignments and they can have intersection points but not all be in the same graph.

I think we can also have comments on edges being shown on the lexical entities graph as well if we will introduce edges as entities in the query. We can have something like:

word {
   isSpellingVariang { // An edge included into the diagram
        word { // Information on the related word
          representation
        }
        comment // A comment on the "isSpellingVariang" edge
   }
   representation // Information on the main word
}

I've seen this approach used in GraphQL queries on many occasions, including the popular Gatsby generator. We can do something along those lines.

I have removed the User from the graph model for now, opting instead to use a reference to the user in the createdBy property of both Node and Edge documents. I want to be able to associate a User with both an edge and a node within the same graph, and there isn't a graph-like way to do that that I can think of that doesn't break the graph.

I think we can add user information to the edge using the approach shown above. Extending the example above, we could have something like:

word {
   isSpellingVariang { // An edge included into the diagram
        word { // Information on the related word
          representation
        }
        comment // A comment on the "isSpellingVariang" edge
        user {
          name // A name of the user who asserted the relationship
        }
   }
   representation // Information on the main word
}

Or if there are multiple users, we can show an array of users using the plural users filed. It will contain an array of user entities instead of a singular user one.

I still need to think about how we will handle versioning of the data.

Maybe we can use a timestamp as a version? In that case, if we would like to assemble the latest version of a graph, we'll use entities with the most recent timestamp. If we would like to go back to the previous version, we can specify a certain point in time and assemble a graph from entries that have a timestemp below that date. Of course that would not work if we would need to establish specific snapshots that are not time-synced (such as a special version of a graph for some particular purpose).

kirlat · 2020-10-21T14:33:10Z

I'm thinking if our GraphQL queries may be simpler if we introduce edges into them For example, if to take a request from https://gist.github.com/balmas/e7e0e6bc16f2501f3ca06f7462203f70 and change it to something like:

query {
  wordAnnotations(...) {
     lemmaVariants { // Renamed from 'isLemmaVariant' to 'lemmaVariant' and used in plural form to make it more readable; this represents an edge requested (i.e. the type of relationships)
      lemma { // This is what information we want to get for this relationship
         IRI,
         lang,
         representation, 
         pos, 
         source
         creator
      }
    } 
  }
}

Then the response might be something like:

data: {
    wordAnnotations: {
      lemmaVariants [
         lemma { 
            IRI: "https://alpheios.net/data/lemma/lat/afore"  
            lang: "lat",
            representation: "afore", 
            pos: "VERB", 
            source: "net.alpheios:tools:wordsxml.v1",
            creator: "net.alpheios"
       },
      lemma {
         // Some other lemma variant
      }
    ] 
  }
}

I might mess up some syntax and details but hope the code above conveys the idea.

What do you think? Would it work for us? Would it make things simpler?

kirlat · 2020-10-21T14:51:20Z

I also have a question about query parameters of the sample query
https://gist.github.com/balmas/e7e0e6bc16f2501f3ca06f7462203f70#file-gistfile1-txt-L3-L26
Do I understand correctly that their purpose is to set a filter on what word annotations to be returned?

I'm thinking if it would make sense to use a less formal, but a simpler approach and specify only the fields whose values would serve as an actual filter only? Something like:

wordAnnotations(
  word: "afore",
  lang: "lat",
  pos: VERB,
  ... // Other values that would serve as a filter
)

This would be consistent with usage examples I've seen and will make the query simpler. What do you think? Does it make sense?

balmas · 2020-10-21T17:29:18Z

I think we can also have comments on edges being shown on the lexical entities graph as well if we will introduce edges as entities in the query. We can have something like:

Yes, I agree -- this is the nice thing about the GraphQL api being separate from the database implementation. Even if in the database the comments are in a separate graph from the data they comment on, we can present them as a single graph in the GraphQL API.

balmas · 2020-10-21T17:34:35Z

Maybe we can use a timestamp as a version? In that case, if we would like to assemble the latest version of a graph, we'll use entities with the most recent timestamp. If we would like to go back to the previous version, we can specify a certain point in time and assemble a graph from entries that have a timestemp below that date. Of course that would not work if we would need to establish specific snapshots that are not time-synced (such as a special version of a graph for some particular purpose).

I agree timestamps make sense as a way to identify versions. I am not sure how many past versions of data we want to keep available in the graph though. I like the approach of having a non-versioned IRI for data that always returns the latest version, and referencing the prior versions in the the data (per the approach described in http://lrec-conf.org/workshops/lrec2018/W23/pdf/2_W23.pdf) but I don't think we will keep unlimited versions of all data points. This may not be too big of an issue though, because for the most part, at least for the lexical data, we are talking about very small data objects that likely won't change once created. We could also have different status for data such as draft and published, and only allow referencing published data.

balmas · 2020-10-21T17:38:54Z

I'm thinking if our GraphQL queries may be simpler if we introduce edges into them For example, if to take a request from https://gist.github.com/balmas/e7e0e6bc16f2501f3ca06f7462203f70 and change it to something like:

I think this is an interesting suggestion. The type of relationships (edges) that we might query will be many and will grow over time and the same input will feed into many of them. I think we can use variables in GraphQL to keep the queries concise in case (i.e. to keep from repeating the same expanded lemma object over and over). But I think your suggestion is very much in line with the approach outlined in the linked fragments proposal in that it puts in the hands of the client to know exactly what it is asking for.

balmas · 2020-10-21T17:47:12Z

I also have a question about query parameters of the sample query
https://gist.github.com/balmas/e7e0e6bc16f2501f3ca06f7462203f70#file-gistfile1-txt-L3-L26
Do I understand correctly that their purpose is to set a filter on what word annotations to be returned?

I'm thinking if it would make sense to use a less formal, but a simpler approach and specify only the fields whose values would serve as an actual filter only? Something like:
wordAnnotations(
  word: "afore",
  lang: "lat",
  pos: VERB,
  ... // Other values that would serve as a filter
)
This would be consistent with usage examples I've seen and will make the query simpler. What do you think? Does it make sense?

I think generally I agree with you. I'm a little uncertain about the example though. In the data model, word is an abstract object (based upon the Ontolex ontology https://www.w3.org/community/ontolex/wiki/Final_Model_Specification), with a property "representation" that contains the actual letters that make up the written representation of the word. We can hide that detail from the client of course in the GraphQL api, but that's why "representation" is there along with "pos" and "lang" which are also properties.

kirlat · 2020-11-13T14:44:08Z

I've added #42 for the discussion of the annotation UI concepts.

balmas · 2020-11-18T18:21:04Z

See #43 for discussion of PIDs for data objects.

balmas · 2020-11-19T17:49:13Z

note that we might want to consider TEI Lex-0 as a possible export format for the lexical data. https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html#

kirlat · 2021-01-05T18:03:12Z

I've started to work on the implementation of annotations for short definitions and I think our requirements dictate us to change the way we store and serve lexical data within the application. The major driver for the change is the requirement for the user to specify dynamically what annotations should be applied to the data model displayed in the UI.

I think the best way to achieve this is to move from static props to methods that return data dynamically, based on user preferences supplied to it. Let's take a DefinitionSet class, for example. Now it has a shortDefs prop, an array. It should, I think, be replaced with a getShortDefs(options) method. The options argument mentioned in this example is the object with parameters that specifiy what annotations and corrections should be taken into consideration (author's only, alpheios.net only, both, etc.). This method will return an array of short defs,, but the information it will contain will be defined by the user preferences passed via the options object.

It seems to be very similar by the approach to the GraphQL where any field requested may have options that would specify what data should be returned and how it must be filtered.

If to accept the approach above, the following implementation would make sense, on my opinion. Upon a lexical query request for the target word a word object containing all information related to the target word specified is returned from the GraphQL facade. This information as returned is, maybe with some amends made by the lexical query, stored within the Word JS object. This data is hidden from the user and is not exposed directly. The data stored within the Word object would be represented by both nodes (lexeme, lemma, definition, etc.) and edges (connection, such as "D is a definition of the lexeme L"), as in the GraphQL response.

The user would then use methods of the Word object to retrieve specific lexical entities: getHomonym(), getLexemes(), and others. getLexemes() may be part of the Homonym object returned by the getHomonym() call or, if called on the Word object, would return all lexemes contained within the GraphQL query response. The objects returned by those methods would contain no edges, only nodes (to be compatible with what we have now), but information from the edges would be attached somehow to the nodes, probably, and thus will be accessible by the requester.

In order to avoid data duplication and to let data changes to be reflected in all instances of the return objects, the nodes data has to be references to the "original" objects stored within the instance of the Word.

It could be backward compatible as the "old" props would be combined with the "new" methods within same objects. The annotation-aware code would call the "new" methods while the existing code would use data from the props.

When the user would want to annotate (edit) a connection (an edge), as when saying "D is not a definition of the lexeme L", a method of the Word instance will be called, Its arguments would be IDs of the definition and the lexeme whose relationship is being edited, along with other information about the edit (is that an assertion or a negation, the user who made the change and so on). Based on that, the Word would create an additional edge representing the edit and all other necessary changes to the lexical data within the Word instance.

This data objects that were created as results of the Word methods would be notified about the data updates. The Word object itself will send a GraphQL request to the backend with the information about the edge created. The backend will store it in its database and will include in all consequential requests.

The Word would become some kind of the "mother" object that will be able to produce data in various forms. That data would exist as long as the mother object stays alive (it can be copied if it need to live longer).

Would the approach like this allow us to achieve what we want? I think it's more complex but infinitely more flexible. As we were discussing before that the Homonym is not the right structure on many occasions, the approach like the above would allow us to retrieve data in any form we want, be it a homonym, or a list of lexemes, or something else. We could vary it depending on the situation and user preferences.

What do you think about the approach? Do you see any pitfalls in it?

balmas · 2021-01-05T18:19:00Z

I agree this is the right direction. In #37 we also have proposed changes to the data model to introduce the Word as the driving object. I also agree with the suggestion for the props (e.g. DefinitionSet.shortdefs) to become methods that can be used to retrieve a filtered set of data according to user preferences. This all seems to me to be the way we should be going. There may well be pitfalls but we need to expose them as we go!

kirlat · 2021-01-06T09:54:34Z

I believe that we should save this cross-independency. Then DataModels would be safely used in other applications (like AlignmentEditor). Also we have WordList workflow with saved Homonym data - that doesn't need any annotations addons there, I believe.

I think this is a good point. If not all clients of data models need annotations, it should not be in the data model package. I think we should try to keep annotation-related and the "regular" business logic separated, if possible.

I think it's too hard to build such model theoretically so we should probably try to implement it in code keeping in mind a separation of knowledge domains.

kirlat · 2021-01-06T11:35:52Z

This is how I think we should represent edges. All edges should always represent asserting, not negating, statements, like "D is a definition of a Lexeme L". The reason for this, I think, is that an asserting statement creates a connection. There is no reason to deny something that does not exist in the first place. So the connection should always be created first before any statements can be made about it.

So here is a statement that defines an edge: "D **is a definition** of the Lexeme L".

Then we can start to gather statements about this edge. The statements could be either assertions, confirming that this connection is valid, or the negations, denying the connection's existence. There could be multiple instances of both from various users. The first assertion should probably come from the party that created it (as alpheios.net) and should be created automatically, along with the edge.

Statements could be attached to the connection as metadata:

"D is a definition of a Lexeme L"
    assertions:
        alpheios.net
        The current user
    negations:
        User 1
        User 2
        User 3

So in this case we have two assertions and three negations and under the normal conditions the connection should not be used during the lemma construction: the definition should not be attached to the lexeme. But if the user sest an option to respect only his/her statements, then the definition should be attached to the lexeme: we have 1 assertion versus 0 negations.

When someone creates an assertion or a negation, it will be passed to the GraphQL API in order to be stored as the edge's metadata. If every edge would have its unique ID, that would be easy to do: we would need to pass an assertion/negation and the ID of the edge it should be attached to.

There are also comments. I think we should be able to attach comments to anything that has an ID: to the node, to the edge, or to the other comment (that would allow to create threaded discussions, if necessary).

So if the connection has comments, it would look like the below:

"D is a definition of a Lexeme L"
    assertions:
        alpheios.net
        The current user
    negations:
        User 1
        User 2
        User 3
    comments:
        Comment 1
        Comment 2

Let's say that the User 4 wants to confirm the assertion and create a comment about that. In that case two new objects need to be created: an assertion and a comment. Both of those new objects will be connected to the edge by the edge's ID, nothing else should be needed. The transaction to update the annotation database will send both objects along with the ID of the edge they're attached to. The resulting edge would look like below:

"D is a definition of a Lexeme L"
    assertions:
        alpheios.net
        The current user
        The user 4
    negations:
        User 1
        User 2
        User 3
    comments:
        Comment 1
        Comment 2
        Comment 3 from the user 4

Now let's assume that someone wants to create a new definition for the existing lexeme. In that case we'll need to:

Create a new definition object and assign a new unique ID to it.
Create a new edge: a connection stating that a new definition belong to the existing lexeme. It will hold two IDs of objects it connects: an ID of the existing lexeme and the ID of the newly created definition.
An assertion by the user who created the definition. It will state that the connection is truthful.
(Optional) a comment explaining why this definition should belong to the lexeme.
All four of those objects will be sent to the annotation DB backend.

Would something like this work? if so, I will create GraphQL transactions around them.

balmas · 2021-01-06T14:38:55Z

I believe that we should save this cross-independency. Then DataModels would be safely used in other applications (like AlignmentEditor). Also we have WordList workflow with saved Homonym data - that doesn't need any annotations addons there, I believe.

I think this is a good point. If not all clients of data models need annotations, it should not be in the data model package. I think we should try to keep annotation-related and the "regular" business logic separated, if possible.

I think it's too hard to build such model theoretically so we should probably try to implement it in code keeping in mind a separation of knowledge domains.

Agree this is an interesting point. However, to be clear, it's not just annotations we're talking about here. Another reason for this refactoring is that we need to be able to retrieve resources from a wider variety of sources and let the user choose which to include and how to combine them. But that's also the point of using GraphQL -- it is supposed to address just this use case. We should keep that business logic for combining resources behind the GraphQL facade, but I'm not sure it means that we shouldn't have a method on the data model object to specifically request the data according to the user preferences.

I think it's good to proceed cautiously here and and at each step ask ourselves if we have appropriate separation of concerns.

balmas · 2021-01-06T14:54:53Z

This is how I think we should represent edges. All edges should always represent asserting, not negating, statements, like "D is a definition of a Lexeme L". The reason for this, I think, is that an asserting statement creates a connection. There is no reason to deny something that does not exist in the first place. So the connection should always be created first before any statements can be made about it.

So here is a statement that defines an edge: "D **is a definition** of the Lexeme L".

Then we can start to gather statements about this edge. The statements could be either assertions, confirming that this connection is valid, or the negations, denying the connection's existence. There could be multiple instances of both from various users. The first assertion should probably come from the party that created it (as alpheios.net) and should be created automatically, along with the edge.

Statements could be attached to the connection as metadata:
"D is a definition of a Lexeme L"
    assertions:
        alpheios.net
        The current user
    negations:
        User 1
        User 2
        User 3
So in this case we have two assertions and three negations and under the normal conditions the connection should not be used during the lemma construction: the definition should not be attached to the lexeme. But if the user sest an option to respect only his/her statements, then the definition should be attached to the lexeme: we have 1 assertion versus 0 negations.

When someone creates an assertion or a negation, it will be passed to the GraphQL API in order to be stored as the edge's metadata. If every edge would have its unique ID, that would be easy to do: we would need to pass an assertion/negation and the ID of the edge it should be attached to.

There are also comments. I think we should be able to attach comments to anything that has an ID: to the node, to the edge, or to the other comment (that would allow to create threaded discussions, if necessary).

So if the connection has comments, it would look like the below:
"D is a definition of a Lexeme L"
    assertions:
        alpheios.net
        The current user
    negations:
        User 1
        User 2
        User 3
    comments:
        Comment 1
        Comment 2
Let's say that the User 4 wants to confirm the assertion and create a comment about that. In that case two new objects need to be created: an assertion and a comment. Both of those new objects will be connected to the edge by the edge's ID, nothing else should be needed. The transaction to update the annotation database will send both objects along with the ID of the edge they're attached to. The resulting edge would look like below:
"D is a definition of a Lexeme L"
    assertions:
        alpheios.net
        The current user
        The user 4
    negations:
        User 1
        User 2
        User 3
    comments:
        Comment 1
        Comment 2
        Comment 3 from the user 4
Now let's assume that someone wants to create a new definition for the existing lexeme. In that case we'll need to:

Create a new definition object and assign a new unique ID to it.

Create a new edge: a connection stating that a new definition belong to the existing lexeme. It will hold two IDs of objects it connects: an ID of the existing lexeme and the ID of the newly created definition.

An assertion by the user who created the definition. It will state that the connection is truthful.

(Optional) a comment explaining why this definition should belong to the lexeme.
All four of those objects will be sent to the annotation DB backend.

Would something like this work? if so, I will create GraphQL transactions around them.

I need to think about this a bit. Is the primary difference between this and the original model I proposed (and then revised to remove the assertions as nodes), is that rather than assertions/negations being nodes with a user at one end and an edge (treated as a node) at the other, they are properties of the edge itself?

kirlat · 2021-01-06T16:24:12Z

Is the primary difference between this and the original model I proposed (and then revised to remove the assertions as nodes), is that rather than assertions/negations being nodes with a user at one end and an edge (treated as a node) at the other, they are properties of the edge itself?

I was drawing a lot of diagrams on paper picturing possible ways to express lexical relationships and then was trying to match it to our existing data structures and to the possible GraphQL API. What I've described is the simplest way to achieve what we want that I've found.

There is an edge between lexicaI nodes and the user, but they are straightforward, has no metadata attached, is always one-to-one, would never be amended once created, so I decided to omit them and show users as properties. But technically it is an edge. I just did expose it as that for simplicity.

I think it's very similar to your approach:

I have removed the User from the graph model for now, opting instead to use a reference to the user in the createdBy property of both Node and Edge documents. I want to be able to associate a User with both an edge and a node within the same graph, and there isn't a graph-like way to do that that I can think of that doesn't break the graph. It might also make sense to some point to have completely separate User collections for data (similar to the language-specific collections of nodes). In LDP terms, the User is probably more a container than either a node or an edge.

I believe the "main" graph should portray relationships between lexical entities only. Users represent a different concept and I think they probably should not be on the graph. Having them as props that hold a reference to the user object in the user collection (as multiple objects could refer to the same user) should be sufficient for us, I think.

... we have the potential for many edges that essentially say the same thing, asserted by different authorities. I think ideally the query implementation would dedupe and aggregate them, but it will be up to the client to decide how to handle conflicting assertions.

I liked this approach and I've tried it first, but I think it's too complex and would create too many issues with representing it in both GraphQL and JS objects. So I've tried to replace it with a simpler one: one edge with many assertions attached. The sum of those assertions would decide how "strong", or "valid" the connection is.

So I think the major difference is that I suggest to replace multiple relationships representing an individual assertion or negation each with a single relationship having many assertions/negations attached to it (unless I'm missing any other important points). I think it would be way simpler to store it in DB this way and to present in GraphQL results.

We might also have one for the commentingOn relationship, one for the relationship between parts of alignment nodes, etc.

Similar to users, I think it's simpler not to create an edge, but just to attach a comment object to either node, edge, or another comment. First, comments are conceptually different from lexical entities; they probably belong to other, "non-lexical" dimension. Second, we have to be able to add comments to relationships (edges) but then we wouldn't be able to do so because we'd have to create an edge between an edge (a lexical relationship we want to comment on) and the comment itself (the node). Edges can connect nodes only. So we should not have an edge here, I think.

Here are my thoughts on this. What do you think? It's still fresh in my head and not fully formalized, but I think it's probably enough to represent some adjustments to a concept.

balmas · 2021-01-06T16:50:24Z

I think your approach about the assertions/negations is worth trying. It is probably easier to support than having edges as negations and I agree with the philosophical point that creating an edge to say a relationship doesn't exist is counterintuitive. However, we need to be able to have properties on the Assertions other than the user -- they also need, for example, level of confidence, and creation dates.

For comments, I'm a little less certain. I agree comments are probably in a separate dimension but we have to also consider the use case of comments on other comments. Maybe they need to go the other way --- i.e. Comments are Nodes and there can be a commentsOn relationship between two Comment Nodes, but a Comment on a LexicalRelationship references the LexicalRelationship edge it comments on as a property?

kirlat · 2021-01-06T17:37:27Z

However, we need to be able to have properties on the Assertions other than the user -- they also need, for example, level of confidence, and creation dates.

Should have no problems with it, I think.

For comments, I'm a little less certain. I agree comments are probably in a separate dimension but we have to also consider the use case of comments on other comments. Maybe they need to go the other way --- i.e. Comments are Nodes and there can be a commentsOn relationship between two Comment Nodes, but a Comment on a LexicalRelationship references the LexicalRelationship edge it comments on as a property?

What if in order to solve this conundrum we follow the FB approach and split comments on "comments" and "replies"? Comments could be attached to both lexical entities and lexical relationships. But if someone wants to add a comment on a comment, that would be a reply, and there will be an edge connecting the comment and the reply, or two replies, if it's a threaded discussion. It also would be in-line with original meanings of the terms: https://ux.stackexchange.com/questions/118624/comment-vs-reply-in-a-social-feed.

Here is a piece of documentation confirming that FB treats comments and replies differently. Not sure what's the reason, but maybe they faced issues similar to what we're trying to solve.

We could probably think of it as of different transparent planes stacked on top of each other. The base plane is the one with the lexical relationships graph. The one on top is comments/replies. The comment prop on the lexical graph plane may become a node on the comments plane to which replies can be attached.

So the comments/replies graph would exist only if there are replies to a comment. There would be multiple reply graphs each having a comment as a root node.

balmas · 2021-01-06T17:56:58Z

Hmm. It's not clear to me from that FB link that Facebook really treats comments and replies separately -- to get the comments on a comment you access the /comments edge and it says that a comment may be a reply.

I think we have the following use cases for comments

(1) a comment on a lexical entity node
a comment on a lexical entity relationship
(3) a comment on (or reply to) a comment

for (1) and (3) it seems pretty clear to me that the comment should be a separate node, and the relationship between the comment and the thing it comments on is an edge.

for (2) it's murkier, but it seems like perhaps then we still create the comment as a node, but here it is referenced as a property of the lexical entity relationship, and then comments/replies to it are in the comments/replies graph.

I think this is essentially what you were suggesting, except I think the comment should always be a node regardless of whether it has any comments/replies to it.

kirlat · 2021-01-06T18:29:04Z

I'm not fully familiar with the FB approach but that phrase from documentation

Replies - If this is a comment that is a reply to another comment, the permissions required apply to the object that the parent comment was added to.

and the way they use /comments edge hints to they way that they have replies as a special type of comments, yet within the comments group.

See also the user comment in the other link stating that

Facebook also allows "Replies" to "Comments"

But those probably are just terms used so that the model made more sense.

I guess generally we have a plan and this is just one of the minor points. Technically, since edge would have an ID, we can attach anything to it (a node). And if we consider comments to be on a different plane this would not break the model of the lexical entity relationships graph. But I'm not sure if it's the best way to implement it. Will think more about it.

kirlat · 2021-01-07T12:35:37Z

I was thinking about it, and reading about different implementations, and I think we might solve our issues if we introduce a node in the middle (NiM) between a lexeme and a definition. It is something similar conceptually to the "connection" concept that many GraphQL tools have (but probably it's not exactly the same). A connection point would provide a place where comments, along with other data, can be attached.

This is how it might look on a diagram:

So the connection point reflects the fact that there is a definition attached, but it that does not describe what this definition exactly is. I think those two concepts are related, but yet separate.

I think it might be very beneficial not only as a point to attach comments, but as a way to track changes. There could be two types of edits: one to replace a definition text with the other one, and the other to add a new definition, or remove definition altogether. The diagram with connection points allows to track both those cases.

Let's check the diagram with direct connections between a lexeme and definitions:

Let's say someone decided that the first definition does not make sense for the lexeme and has to be the deleted and the text of the second definition should be edited. The resulting diagram would be:

From the resulting state, however, is not clear whether it is a result of editing the first definition and removal of the second or the opposite. The other problem is: if history and comments are attached to the edge, how would it be kept if a definition is removed? We cannot have an edge that points to nothing. It has to be deleted. As a result, the related history will go away with it, but I think we would like to know why, when, and why removed the definition.

If we'll use connection points, the state after the edits will look like below:

It is clear from the diagram now that it's the first definition being deleted and the text of the second definition being replaced. We will also retian an empty connection point that would keep the data about the deletion.

Let's complicate the situation and say that an addition to the edits described above (first definition removed, second one edited) there is also the third definition added. With direct connection, the resulting diagrams would be:

I think it does not convey what exactly happened here.

With connection points, a diagram would be much more informative, on my opinion:

The connection point might seem to be an unnecessary complication, but, at the same time, it helps (I think) to solve our problems and would ultimately serve to our convenience. I think we can think of connection points as of the number of definitions attached to the lexeme (i.e. if we can say "this lexeme has three definitions" this will mean three connection points). The definitions itself, on the other hand, could be think of as just a representation of a definition text. As with value objects, if we want to edit a definition text, we create a new Definition object with the text edited and reattach it to the same connection point, replacing the old definition. The old definition object could be kept in database, and being referenced from the history of definition edits. If someone would decide that the old definition was better, it can be reattached to the same connection point.

What do you think about this approach as a whole?

balmas · 2021-01-07T13:36:31Z

I think that could work. Would there also be a “replaces” edge between definition c and b ? (Eg definitionC replaces definitionB)?

kirlat · 2021-01-07T14:49:17Z

I think that could work. Would there also be a “replaces” edge between definition c and b ? (Eg definitionC replaces definitionB)?

I'm not sure that we need it. I thought we might keep all edit-related info at the connection point. If I understand use cases correctly (please let me know if not), we would never ever replace Def.C with Def.B once and for all. We probably have to keep both indefinitely.

So let's say we retrieved lexical data and are building a lexeme. According to our algorithms we think that is should have a Def.B attached to this lexeme. We create an assertion stating that the Def.B is attached to this connection point, authored as "alpheios.net". Some time later someone thinks this is wrong. He/she creates a negation stating that the Def.B does not belong to this connection point and an assertion that the Def.C is the right one. So the Def.C (it has one assertion, assertion score is 1) should normally be attached to the lexeme. That beats the Def.B which has an assertions score of 0 (one assertion vs. one negation; I'm ignoring confidence, weight, and other factors here).

But if an option be set to ignore user's annotation, then the same lexeme would be rendered with the Def.B.

So it seems both definitions need to be connected at the same time, and the definitions are never really removed. The diagram would look more like below:

It also means that assertions, negations, and comments should contain the ID of the definition to which they are attached. We then would check all connections existing from this connection point (to Def.B and Def.C), gather assertions/negations related to each, and, based on that, decide what connection should prevail.

Is that so? Do I understand things correctly?

balmas · 2021-01-07T19:58:50Z

Ok. I guess I was thinking of the versioning scenario, where definition c was a correction of the text of definition b . But I think we should not get too bogged down with all of the possible variations right now. I think the structure you have proposed, with the node-in-the-middle addresses one of the key things that was still troubling me about the data model design and is a reasonable jumping off point.

I will work on introducing that into the prototype ArangoDB model.

kirlat · 2021-01-08T09:24:05Z

As we've discussed previously, it would be not a good idea to change the existing Data Model objects in order for them to support annotations. That's because other apps are using them that do not embrace the annotations concept.

How about the components library, and, especially, the UI components themselves? In order to support annotations and annotation options they should be aware at leas of the existence of annotations: there would be annotation options to display, and many annotation-related data objects, such as comments, assertions, and negations that need to be displayed by the UI. The lexical query should also be aware of the annotation-related data in order to factor it into the resulting Word object. The best we could do is probably to keep some UI and other components annotation-agnostic, but we can't do it for all of them. The annotation knowledge has to trickle in somewhere inside the components (but we can try to keep this tightly controlled).

So my question is that: would we ever need an assembly of components (i.e. the build of the components library) that is annotation-unaware? If so, we can handle it, probably, this way:

Make annotations be a package on top of components, providing some alternative component and method implementations that will include annotations data.
If we want an "annotation-free" build of components, we'll use the components library.
If we want annotations to be included, then the build from the annotations package should produce the "annotatable" version of components.

If that is not needed, we can simply let the (hopefully) limited amount of annotation knowledge to trickle into the components, maybe in a form of plug-ins and/or modules.

What would be the best approach to handle that?

balmas · 2021-01-08T13:28:06Z

As we've discussed previously, it would be not a good idea to change the existing Data Model objects in order for them to support annotations. That's because other apps are using them that do not embrace the annotations concept.

I'm not sure we have concluded that. See my comments at #40 (comment)

balmas · 2021-01-08T13:34:45Z

I would rather not think of this as annotations-included or annotation-free. But instead, recognize that the data sources that contribute to produce the final data that the user sees are fluid and both the user and the application may influence not only which data sources are included but also how they are combined.

kirlat · 2021-01-08T15:45:09Z

Here is the first take at GraphQL type definitions with annotation support: https://gist.github.com/kirlat/5c36baaf26e3ea399bfe36d0a354c7b1. Only some objects are annotatable (Lexeme, Definition, and the connection between them); I think we can add that to other objects later.

What do you think? Am I missing anything there?

balmas · 2021-01-08T19:25:33Z

Thanks! I added some comments directly in the Gist.

kirlat · 2021-01-11T15:28:00Z

Please check an updated version with the suggested changes implemented and some mutations added: https://gist.github.com/kirlat/5c36baaf26e3ea399bfe36d0a354c7b1

I've also made types more specific by introducing the Assertion and the Negation types. I was following an advice from this article.

balmas · 2021-01-11T17:45:33Z

Comments added to the gist.

kirlat · 2021-01-15T13:44:59Z

Per discussion on Slack with @balmas: we need a way to integrate annotation data into our existing data model without significant changes to the data model itself (for several reasons).

The current situation: Lexical query produces a Homonym or HomonymGroup object which contains all necessary information encapsulated within the objects it is comprised of.

How could this be changed to accommodate for the annotations data? Two approaches comes to mind.

One is the centralized annotation data storage:

Data Model Object stores the results of the word lexical query. It has a method that returns the word object. That might be used instead or along with both the Homonym and the HomonymGroup (there could be methods to return either Word, Homonym, and HomonymGroup). The lexemes and other objects down the hierarchy are exactly the same as they are now. They do not contain any annotation data.

An annotation data could be retrieved, updated, added, or removed via specialized methods of the Data Model Object class. Lexical elements to which annotation data is connected are referred by their IDs.

In this model any piece of code has to have a reference to the Data Model Object and can use its methods to retrieve/alter the annotation data.

The other approach is when annotation data is spread across all lexical data objects within the hierarchy. We could keep the structure of the lexical objects (Lexeme, DefinitionSet, Definition) the same as it is now, but add an AnnotationData object to the prop dynamically. This way changes to the object would be minimal. It would also be backward compatible: the parties that are not aware of the annotation data would simply ignore it and work with lexical objects as before:

Whoever need annotation data would use methods of the AnnotationData objects to obtain and edit it.

Another option, a combination of the two approaches described above, is to integrate not the annotation data, but the annotation API to the lexical data items. These API methods would retrieve and change the data that is located within the Data Model Object:

I think of it as of a convenience version of the first approach. Annotation methods are grouped into objects whose annotation data will be accessed. As a result, instead of a "big ball of methods" we have methods grouped in a nicer way. Also, since methods are aware about what data object they're attached to, their signatures can become simpler. For example, to add an annotation to the lexeme, instead of dataModelObject.addLexemeAssertion(lexemeID, assertion) we can use lexeme.annotations.addAssertion(assertion).

We might use the same "distributed API approach" in other cases. For example, if the lexeme has changed and we want to pull the updated data, we can use something like lexeme.update() instead of dataModelObject.updateLexeme(lexemeID). The method will request an updated data from the Data Model Object and will update the lexeme data fields with it.

Are there any other approaches possible?

What do you think would be the best way to go for us?

P.S. After reading this interesting Stack Oveflow question I've started to think that another beneficial approach might be for the Data Model Object to return lexical objects without the annotation data attached but then the annotation-aware code to use the annotation API of the Data Model Object (or something else) to pull the corresponding annotation data and, possibly, attach it to the lexical data objects (or not to attach it and use as separated objects). That would provide the best isolation between the lexical data and the annotations as only the annotation-aware code would pull the annotation data into the application context.

balmas · 2021-01-15T15:56:07Z

I think the 3rd approach is the closest to what we need to do, not only for annotations but also for the lexical data itself.

A big problem for the 2nd approach (annotation data kept in the data models) is that it doesn't account for the inter-dependencies between the different parts of the lexical data objects.

The work on the treebank disambiguation has made this a little clearer to me and I think it applies to both the aggregation/disamibugation workflow and the annotation workflow.

Looking at that a little more closely, we currently have something like this:

Lexical module queries the individual data sources via the client adapters
The client adapters apply domain business logic to create alpheios data objects out of the raw data, applying both language-specific and resource-specific rules.
The lexical query aggregates the results
The lexical query tells the Homonym object to disambiguate all the possible Lexemes
the Homonym object tells each Lexeme to compare itself to another Lexeme
The Lexeme object tells its Lemma to compare itself to the other Lexeme's Lemma
The Lexeme object tells each Inflection to compare itself the other Lexeme's inflections
Only when we have all of the comparison results, can we make a decision about how to combine those results to create the Homonym object we show the user.
Lexical query then passes the resulting aggregated/disambiguated data model objects back to the application, discarding the original query results

There are a number of problems with this including:

the code cannot yet handle comparing more than 2 possible sources of data, and the rules about how they are combined are essentially hard-coded.
By the time we show the user the results, the details of how we made our decision of what to show them are no longer present in the object, so we can only show the results and now how we came to them. This will be a problem for annotation because we need to be able to be explicit about each piece of data we are annotating and tie it back to its original source.
The query results are transitory. The annotation/disambiguation process changes the results and there is no way to go back in and recombine them in a different way using different rules.

balmas · 2021-01-15T16:02:27Z

In domain driven design there is a repository pattern that doesn't apply perfectly to our use cases, but I think we need something like that. I think we need a place where we can aggregate the results of individual resource queries (instantiated as alpheios data model objects) and reach back into as needed to recompose the data objects we present to the available for the user to view and annotate.

kirlat · 2021-01-18T12:45:52Z

I think that based on the all above we can conclude that lexical data and annotations are different domain contexts and should be kept separate as much as possible. We, however, need a context mapping between those two contexts. The lexical data could be the supplier of the information and the annotation data would be a consumer.

I think something like the repository pattern may work well. There could be two repositories: of lexical data and of annotations. The code that does not use annotations would pull data from the lexical data repository (get lexical data for a specific word). The annotations-aware code would pull data from the annotation repository (annotations for a specific word). The code of the annotation data object then would get what it needs from the lexical data repository, and will combine information from two repositories together.

The lexical data should assembled so that it would be possible to track how this data as combined. It should allow data to be recombined a different way at any moment. Same can be said about annotations.

kirlat · 2021-01-19T17:41:02Z

I've tried to apply principles of the DDD to everything that was said above and here is what I came up with:

Let's consider two workflows of retrieving data by the client that needs lexical data only (as we do currently) and the other client that expect it to be enriched with the annotation data. Please note that I'm using the terms in their Domain Driven Design (DDD) meanings.

Workflow without annotations:
Initiated by the user input, the presentation layer calls a getLexeme() (should probably be a getWord()) method of the Lexical Data App Service (LDAS). This is something that Lexis does currently. LDAS checks the Lexical Data Repository (LDR) for a word. We don't have a repository object now but need it for the history support. If the word is not in the repository, the LDR will go to the corresponding client adapters to obtain data, create a Word Aggregate out of it, store it internally, and return to the LDAS service. The LDAS will transform the Word Aggregate (i.e. combine data from different sources, disambiguate it) and will create a Word DTO (Data Transfer Object). The Word DTO will be returned to the presentation layer that will display it to the user. We might have several various DTOs each serving its specific purpose.

The Word Aggregate is the object that stores lexical data from all sources separately, without transforming it in any way other than mapping source values such as grammatical features into the values of our domain context. The mapping would be a responsibility of client adapters (they do that currently). The lexical data from the sources would be combined using either the methods of the Word Aggregate or the LDAS service.

The change we need to make to match this model is to move the logic to combine data from different lexical sources out of the client adapters into the LDAS.

Workflow with annotations:
Initiated by the user input, the annotations-aware presentation layer calls a getWord() method of the Annotated Lexical Data App Service (ALDAS). ALDAS checks the Lexical Data Repository for a word and the Annotations Data Repository (ADR) for the annotations data. If the word and the annotations are not in the repository, it will go to the corresponding client adapters to obtain data. It will create a Word Aggregate and put it into the LDR repository. It will create one ore several annotation aggregates and will store them in the ADR. All aggregates will be returned to the ALDAS service. The ALDAS will combine the Word Aggregate and the Annotations Aggregate data in accordance with the getWord() options and will create an Annotated Word DTO. The Annotated Word DTO will be returned to the presentation layer that will display it to the user.

With the architecture as the above we will keep data from different lexical sources separate and will be able to recombine it in different ways. During this we would store some transformations history data that will describe what sources were used to produce the final data and what transformations were applied. That will allow to display that information within the presentation layer.

Does the above make sense?

Right now our lexical objects from data models (Lexeme, Lemma, Inflections) has traits of both aggregates (because they have the domain business logic such as disambiguation methods) and of the DTOs (because their purpose is to be used in the presentation layer). I'm not sure how it's best to deal with them. Should we put them into the aggregate and treat as entities and value objects? That will require some changes that may make them incompatible with the other clients that are using them. Should we treat them as DTOs (because I think what they actually are now, functionally) and ignore the business logic functionality? That would require to create lexical entities and aggregates from scratch but will give us much more freedom. Or should we do something else? What do you think?

balmas · 2021-01-19T19:16:01Z

I think this is about right.

The change we need to make to match this model is to move the logic to combine data from different lexical sources out of the client adapters into the LDAS.

A small point but I think technically this logic to combine data from different lexical sources actually currently lives in a combination of the lexis module and the data model objects (e.g. Homonym, etc.)

I need to think a bit about the question about the lexical objects and the DTOs.

kirlat · 2021-01-22T13:39:36Z

I would like to summarize what I think we can do in order to support annotations. Here is a detailed diagram showing all the architectural components and the workflow of getting the word data:

The object that stores lexical data is the Word. It is an aggregate root, a container that stores lexemes. Each lexeme contains definitions and inflections (I'm listing only the most important items here). Each inflection contain features. Words, lexemes, definitions, and inflections are entities, they have their own unique IDs that are based on content. Assertions, negations, and comments are entities too. All other objects are value objects. All the object mentioned above would be domain objects, existing only within the domain context.

The rule for the aggregate roots is there should be no references from the outside to the objects that are stored inside the root. All changes to the object within the aggregate root (i.e. the Word) should be done only through the methods of the Word. I think we can satisfy this requirement so the Word can hold instances, not ID references, of lexemes. If lexemes would be accessed from the outside, they should be stored in their own repository and the Word object should keep their IDs, not direct references.

Keeping object instances within the Word object is the simplest solution so I think we should go with it unless it will cease to satisfy our requirements.

Actually, the Word may not need to hold the lexemes objects at all. It may store results of the lexical queries returned by the client adapters in form of plain JSON-like objects. But the Word could provide a method with options that, when called, will return a set of lexemes (a DTO) that will be constructed out of those lexical query results according to the user options provided to the method. This lexemes DTOs will then be used to show lexical data to the user in the UI. The lexemes DTOs might be cached using a combination of word, language, context, and options as a key in order to avoid duplicate method calls. DTOs exist in the presentation layer.

I think our current lexical objects (Lexeme, Inflection, Definition, etc.) are better suited for the role of DTOs because that's what they were created for: to display lexical data to the user. So if we would need corresponding domain object, the should be presented by different, newly designed classes.

The data retrieval workflow could work the following way.

When user selects a word in the UI, the presentation layer (the Vue component) sends request to the application layer represented by the Lexis module. The Lexis module then asks the Word Repository to return the Word object. If the requested word is in the repository, it is returned to the Lexis module. If not, the Word Repository creates an empty Word object and returns that empty object to the Lexis.

Once created, the Word object initiates queries to client adapters and/or GraphQL interfaces to obtain all lexical and annotations data that is required for its existence. What exactly is needed is determined by the word data (word, language, context). Those are passed as parameters to the Word constructor. References to client adapters and GraphQL interfaces (so that the Word would know where to obtain the data) are passed to the Word constructor as well.

Once each piece of lexical or annotations data is retrieved, the Word object fires a domain event. Lexis and Annotations modules listen to those events and update the flags in the Vuex store. The tracking structure in the Vuex could be a an object with the following structure:

{
 wordID1: {
     lexicalDataUpdated: dateTimeValue,
     annotationsDataUpdated: dateTimeValue
     },
 wordID2: {
     lexicalDataUpdated: dateTimeValue,
     annotationsDataUpdated: dateTimeValue
     }
}

The presentation layer (Vue components) tracks changes in those flags and, when the changes affect the data it displayes, it runs a method on Lexis or Annotations to return the updated DTOs.

How does the data flow from the Word object to the presentation layer? The method of the Word object (something like getLexemes()) returns a DTO with the lexemes (or an object that is converted to the DTO in the Lexis and/or Annotations). The lexemes are processed (disambiguated, changed according to annotations) to form the output DTO. How this processing is done depends on the values of the options provided to the getLexemes() method. As a result, different options applied to the same Word would yield DTOs containing different data. The application layer (the Lexis and Annotations modules) then passes the DTOs to the presentation layer.

How would the data updates be handled in this model? Updates are simpler, in a way. Only annotations can be updated, not the lexical data itself. The update of annotations, however, may affect the resulting lexical data DTOs, so the Vue components that display those DTOs would need to pull an updated data.

In order to update annotation, the Vue component in the presentation layer uses a method of the Annotations module from the application layer. Upon receiving such a request the Annotations module gets the corresponding Word object from the Word Repository and executes an updateAnnotations() method of it. The Word object updates the annotations data internally and sends a request to the annotations adapter or the annotations GraphQL API to update data on the remote server. The Word object also publishes data update events that will notify code modules that display this word's data that there was an update and modified data has to be pulled out.

This is how this process looks on the diagram:

There should be specialized methods to change, add, or remove each type of annotation. The method's argument should be specialized DTOs containing the word ID and the data describing the change in the annotations to be made. So we'll need to have multiple annotations input DTOs, each for a specific type of operation.

That's what I think might work for the purpose. It should be flexible and extendable, but I've also tried not to overcomplicate it. It may use many objects of the existing infrastructure and requires only smaller amount of newer objects to be created. Some details are probably missing but I think we'll be able to figure them out once this will go into implementation.

I would greatly appreciate your feedback on this.

balmas · 2021-01-22T14:55:26Z

I think this approach makes perfect sense.

Actually, the Word may not need to hold the lexemes objects at all. It may store results of the lexical queries returned by the client adapters in form of plain JSON-like objects. But the Word could provide a method with options that, when called, will return a set of lexemes (a DTO) that will be constructed out of those lexical query results according to the user options provided to the method. This lexemes DTOs will then be used to show lexical data to the user in the UI. The lexemes DTOs might be cached using a combination of word, language, context, and options as a key in order to avoid duplicate method calls. DTOs exist in the presentation layer.

This is an important point. One of the difficulties we have right now, when we have only one source of annotations impacting the DTOs, is that there are interdependencies between the components of a DTO that are needed to be taken into account in order to construct the DTO that is displayed to the user. For example, a inflection can impact a decision about whether a lexeme is equivalent to another and needs to be merged with it. As we increase the number of data sources, the possible permutations will only grow. I think that storing the results from adapter queries as plain (but normalized) JSON objects within the Word repository would probably make it easier to deal with this.

balmas added the discussion label Oct 14, 2020

balmas assigned irina060981 and kirlat Oct 14, 2020

balmas mentioned this issue Nov 18, 2020

persistent identifiers for annotation data #43

Open

balmas mentioned this issue Nov 23, 2020

Annotation UI Design #42

Open

kirlat mentioned this issue Jan 7, 2021

Session History/Multiple Popup #37

Open

balmas mentioned this issue Jan 11, 2021

treebank: option to show all inflections, highlighting the one from the tree alpheios-project/alpheios-core#540

Closed

Annotation Data Design #40

Annotation Data Design #40

Comments

balmas commented Oct 14, 2020 • edited Loading

balmas commented Oct 14, 2020 • edited Loading

balmas commented Oct 14, 2020 • edited Loading

irina060981 commented Oct 15, 2020

kirlat commented Oct 15, 2020

irina060981 commented Oct 16, 2020

balmas commented Oct 16, 2020

kirlat commented Oct 16, 2020

balmas commented Oct 16, 2020

kirlat commented Oct 19, 2020

kirlat commented Oct 19, 2020 • edited Loading

balmas commented Oct 19, 2020

balmas commented Oct 19, 2020

balmas commented Oct 20, 2020

kirlat commented Oct 21, 2020

kirlat commented Oct 21, 2020

kirlat commented Oct 21, 2020

balmas commented Oct 21, 2020

balmas commented Oct 21, 2020

balmas commented Oct 21, 2020

balmas commented Oct 21, 2020

kirlat commented Nov 13, 2020

balmas commented Nov 18, 2020

balmas commented Nov 19, 2020

kirlat commented Jan 5, 2021

balmas commented Jan 5, 2021

kirlat commented Jan 6, 2021

kirlat commented Jan 6, 2021

balmas commented Jan 6, 2021

balmas commented Jan 6, 2021

kirlat commented Jan 6, 2021

balmas commented Jan 6, 2021

kirlat commented Jan 6, 2021

balmas commented Jan 6, 2021

kirlat commented Jan 6, 2021

kirlat commented Jan 7, 2021

balmas commented Jan 7, 2021

kirlat commented Jan 7, 2021

balmas commented Jan 7, 2021

kirlat commented Jan 8, 2021

balmas commented Jan 8, 2021

balmas commented Jan 8, 2021

kirlat commented Jan 8, 2021

balmas commented Jan 8, 2021

kirlat commented Jan 11, 2021 • edited Loading

balmas commented Jan 11, 2021

kirlat commented Jan 15, 2021 • edited Loading

balmas commented Jan 15, 2021 • edited Loading

balmas commented Jan 15, 2021

kirlat commented Jan 18, 2021

kirlat commented Jan 19, 2021

balmas commented Jan 19, 2021

kirlat commented Jan 22, 2021

balmas commented Jan 22, 2021

balmas commented Oct 14, 2020 •

edited

Loading

balmas commented Oct 14, 2020 •

edited

Loading

balmas commented Oct 14, 2020 •

edited

Loading

kirlat commented Oct 19, 2020 •

edited

Loading

kirlat commented Jan 11, 2021 •

edited

Loading

kirlat commented Jan 15, 2021 •

edited

Loading

balmas commented Jan 15, 2021 •

edited

Loading