WordList refactoring and approaches to GraphQL #38
I think this makes a lot of sense for the data model objects
Yes, I agree, and it's one of the reasons motivating the refactoring: the client code can easily specify data sources and priorities and let the GraphQL API manage the details. However, we have to think carefully about the API. It will need to be flexible enough that user preferences — which data sources to use, and which take precedence in a merge/disambiguation scenario — can be specified as inputs to the GraphQL query.
Let's talk through this at our check-in on Monday. I would like to understand how the session history fits in here.
In our current wordlist actions we use both lean and rich data. We store data in local and remote storage in a JSON-like form, and convert it to an object with methods while working with it. The main problem with lazy conversions is that the object has to be converted both ways each time it is updated. From my point of view, an object-oriented model with inheritance and methods allows us to have more abstract code. So conversions in both directions would be used often.
But it is interesting to try new technology, and GraphQL looks like the next step in abstracting data retrieval.
Some links for reference:
- Original requirements for the wordlist:
- Architecture design discussion for the wordlist:
As we proceed with this, I don't think we should assume that our current approach to local/remote database storage for the wordlist is something we want to retain. Due to the syncing requirements across applications and devices, at the moment the only scenario in which the local indexedDb provides any value is one where storage to the remote database stops working somehow mid-session without the user knowing about it (but the lookups otherwise continue working — so not a working-offline scenario). This is an edge case, and it doesn't really justify the use of indexedDb for the wordlist as it currently stands. That isn't to say we don't want to make better use of indexedDb generally across the application, to improve performance and reduce remote lookups — I think we do — but that should probably be tackled separately.
Stepping back to look at the overall architecture, framed according to the philosophy outlined at https://khalilstemmler.com/articles/client-side-architecture/introduction/, here's what I think it looks like (the core library — I'm not trying to represent the client applications of Alpheios core here). I think, at least according to this architecture approach, our model and presenter layers are not separate enough, and within the model there isn't much clean separation of concerns between interaction and infrastructure.
Discussion from Slack:
I agree (1) that the Lexical Query, Wordlist (and the future Session History) are all related, in that they make use of domain data (which is retrieved from multiple sources and currently aggregated in the Homonym data model object), and (2) that it does make sense for Apollo Client to provide a facade API (using GraphQL) to the shared domain data objects. I would like to be a bit more cautious about the jump to Apollo Client for state management and direct access from the Vue components to the GraphQL storage. Ultimately that may make sense, but I think we have a fair amount of detangling to do first.
I think one thing that worries me about this is that we don't really have unique IDs for some of the data we're talking about.
We have them for
We can of course create a UUID for anything, but in some cases they may not be helpful for reuse of data across sessions or even across word lookups. For the user wordlist, where we aggregate results for a single word form from whatever context it was looked up in, we are able to use a combination of the language code and word form as a unique identifier for the word. But the specific lexemes that result from a word lookup can depend upon the context of the word, so we will need a more complex solution. On the server side, we cache requests based upon HTTP request parameters. We will need to look at each data object we store, decide when and how it will be reused, and develop the identification key accordingly.
For the lexemes that depend on the context, can we use language + word + hashes of the texts of the pre and post text selectors?
I have a feeling that this is an essential part of a solution. Once we decide it, other components may fall into place much more easily. Can we approach this using Domain-Driven Design principles? First we could define events, then commands, and data objects may arise from the intersection of the two (i.e. from the aggregates). Of course, that would require defining domain boundaries beforehand, which may be the hardest part, in my opinion.
Possibly
Yes, I'm working on this now. And you're right, defining the domain boundaries is very difficult.
Ok, the results of my first pass on the domain design for this are at
I've taken some liberties with the approach, by including the business processes as actors, supplying preconditions where appropriate, and distinguishing between originating and affected views. Also, our use of data is a little different from the traditional scenario for this sort of design approach, in that the queries are not just producing views on user-created data. Until now, except for the user word list, we have been a query-only system when it comes to domain data. With the introduction of the features that allow users to annotate the results of queries and create their own data, we now have queries populating data objects that may then be "corrected" and saved by users as their own. So what I have done is to model both the Query and the User as potential actors on commands which create data. This may not be a proper application of the domain-driven design approach, but I couldn't find any examples that showed how to apply it to our scenario, and I wanted to reflect the "creation" of the most granular level of the domain model elements from the query data, because that's what's happening in our system: we aren't just creating views on already existing data; we're aggregating data objects from the results of queries and then doing further operations on those data objects. @kirlat and @irina060981, let me know what you think.
Thank you for the model! It's very helpful in understanding what we do at the different levels. I will study it carefully, but it already helps to see some things better. As you've noted (I had not really thought about it before), our queries do not always produce views. What we do is, I think, create new data out of the existing pieces. Once that's done, we display some pieces of the data to the user. So maybe we can make a distinction between these two groups of operations (data synthesis and data display)? Data synthesis:
Once the new data is constructed and stored, we can query it and display it to the user as in the traditional scenarios. There is probably nothing special about this part of the workflow. If the user decides to correct the displayed data, we go to data synthesis again. Once the data modification is done, we use the presentational part of the workflow to display it. Would a separation like that make sense? Can we consider those two areas as separate domains (we could subdivide them further)? Can we separate those two areas clearly? I think that could make things simpler. What do you think?
I like the "data synthesis" description. I think we won't really know until we get into the details of the user annotations whether this separation between synthesis and display will be sufficient, but I think it is a good place to start.
The more I read about DDD and other related concepts, the more I feel that the right solution would be to put the lexical query behind a GraphQL facade. What does the user, in most cases, want from our system? A To get a word, the user makes a (conceptual)
Those status props would be reactive. When data is loaded, the That would effectively make a live Having the What do you think about it as a concept?
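The reactive status props described above might look roughly like this (a toy sketch with hypothetical names; in the real application Apollo Client and Vue's reactivity system would supply the machinery):

```javascript
// A minimal query-result wrapper with status props (loading / error / data)
// that notifies watchers when the state changes, so the UI can render a
// spinner while loading and swap in the data once it arrives.
class QueryResult {
  constructor () {
    this.loading = true
    this.error = null
    this.data = null
    this._watchers = []
  }

  watch (fn) { this._watchers.push(fn) }

  _notify () { this._watchers.forEach(fn => fn(this)) }

  resolve (data) {
    this.data = data
    this.loading = false
    this._notify()
  }

  reject (error) {
    this.error = error
    this.loading = false
    this._notify()
  }
}

// Usage: a watcher fires when the (conceptual) query completes.
const result = new QueryResult()
const states = []
result.watch(r => states.push({ loading: r.loading, hasData: r.data !== null }))
result.resolve({ targetWord: 'cepit' })
```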
This is generally what I was thinking as well. |
As discussed in today's check-in, we agree generally on this model, although we need to move incrementally towards it, reusing existing code sensibly. |
After making some successful tests with Apollo GraphQL and realizing its strong points (the ease of combining local and remote queries, and a full-fledged in-memory cache, among other things) and limitations (there is no way to make truly asynchronous requests for local data; see the notes about the read method that is used for that), I would like to offer for discussion a concept for a possible WordList refactoring.
One of the important facts about GraphQL queries (applicable to other queries as well) is that they return "lean" JSON-like objects (i.e. data only, with no methods). We, on the other hand, use "rich" JS objects (full-fledged objects with powerful methods and many auxiliary data items) almost everywhere. Once we receive a lean data object, we, following our current practices, would convert it to a JS object. If we then wanted to update a `WordItem` in the GraphQL storage, we would have to convert the JS object back to a JSON object.

I'm wondering: what if we were lazy with such conversions? Having a JSON word item object, we could always convert it to a `WordItem` JS object whenever necessary. Having a plain JSON object has several advantages, in my opinion. So what do you think if we start using JSON objects more as our main source of truth?
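The lazy-conversion idea could be sketched like this (hypothetical method names, not the actual Alpheios `WordItem` API): the lean JSON object stays the source of truth, and a rich object is hydrated from it only when its methods are needed, then dehydrated back before writing to storage.

```javascript
// A minimal sketch of lazy conversion between a lean JSON word item
// and a rich WordItem JS object.
class WordItem {
  constructor ({ languageCode, targetWord, important = false } = {}) {
    this.languageCode = languageCode
    this.targetWord = targetWord
    this.important = important
  }

  // Hydrate a rich object from lean JSON only when methods are needed
  static fromJsonObject (jsonObj) {
    return new WordItem(jsonObj)
  }

  // Dehydrate back to a lean JSON-like object before writing to storage
  toJsonObject () {
    return {
      languageCode: this.languageCode,
      targetWord: this.targetWord,
      important: this.important
    }
  }

  // An example of a "rich" method that the lean object lacks
  toggleImportant () {
    this.important = !this.important
  }
}

// Usage: hydrate on demand, mutate via methods, dehydrate to store.
const lean = { languageCode: 'lat', targetWord: 'cepit', important: false }
const rich = WordItem.fromJsonObject(lean)
rich.toggleImportant()
const updatedLean = rich.toJsonObject()
```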
Apollo has powerful caching. It is behind almost every GraphQL request that goes through Apollo. If we were to accept GraphQL as our API for data retrieval, it would, in my opinion, make sense to make full use of Apollo's caching instead of our own solutions (and we have to use caching within many Apollo use cases anyway). Would that be acceptable for us?
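For context, the core idea behind Apollo's `InMemoryCache` is normalization: entities are stored flat under a `Typename:id` key, so every query that touches the same entity shares one record. A toy illustration of that idea (not Apollo's actual implementation):

```javascript
// A toy normalized cache: objects are stored flat under a `Typename:id` key,
// and partial results for the same entity are merged into a single record.
class NormalizedCache {
  constructor () {
    this.store = new Map()
  }

  identify (obj) {
    return `${obj.__typename}:${obj.id}`
  }

  write (obj) {
    const key = this.identify(obj)
    // Merge with any existing entry so partial query results accumulate
    this.store.set(key, { ...(this.store.get(key) || {}), ...obj })
    return key
  }

  read (key) {
    return this.store.get(key) || null
  }
}

// Usage: two different "queries" returning the same WordItem update one record.
const cache = new NormalizedCache()
const key = cache.write({ __typename: 'WordItem', id: 'lat:cepit', targetWord: 'cepit' })
cache.write({ __typename: 'WordItem', id: 'lat:cepit', important: true })
const item = cache.read(key)
```

This is also why the unique-ID question discussed earlier matters so much: without a stable identifier per entity, the cache cannot merge results from different queries.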
If GraphQL proves itself, it might make sense to use it for many things as a standardized data retrieval and/or data update interface. We could use it to store options, for example, as part of the options refactoring work. A universal API for storing different types of data may simplify things a lot.
The question with GraphQL is where to put the business logic related to data management (i.e. data merges and transformations). It could be that a GraphQL data provider returns "raw" data, and the requesting object is then responsible for transforming it into the form that is required. Or the GraphQL data provider could allow data to be retrieved in many different forms, with the requester specifying, via the GraphQL query, the form in which the data should be obtained. The GraphQL data provider would do the necessary transformations behind its facade and return data formatted according to the needs of the client. I think the latter approach is more in the spirit of GraphQL, and we should probably use it whenever possible. What do you think?
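The "transformations behind the facade" option could be sketched as a resolver-style function that accepts a shape argument (the `format` parameter, the resolver name, and the inlined data are all hypothetical, used only for illustration):

```javascript
// A minimal sketch of a provider that returns data already shaped for the
// client, so transformation logic lives behind the facade, not in the caller.
const rawLexemes = [
  { lemma: 'capio', pos: 'verb', senses: ['take', 'seize'] },
  { lemma: 'capis', pos: 'noun', senses: ['bowl'] }
]

function wordItemResolver ({ word, format = 'full' }) {
  // In a real provider the raw data would come from remote/local sources;
  // here it is inlined so the sketch is self-contained.
  switch (format) {
    case 'short':
      // Client asked for a compact shape: lemmas only
      return { word, lemmas: rawLexemes.map(l => l.lemma) }
    case 'full':
    default:
      // Client asked for the complete raw records
      return { word, lexemes: rawLexemes }
  }
}

// Usage: the caller declares the shape it wants; no client-side reshaping.
const short = wordItemResolver({ word: 'capio', format: 'short' })
const full = wordItemResolver({ word: 'capio' })
```

In actual GraphQL the field selection in the query itself already does part of this shaping; resolver arguments like the one above handle transformations (merges, disambiguation) that go beyond simple field selection.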
For the `WordListController` architectural change, I think it would make sense to create a GraphQL-enabled object that would sit between the `WordListController` and the `UserDataManager`. We could call it `WordListDataManager`, or something else. Instead of keeping all `WordItems` in the `WordListController`, they could be stored in the cache of the `WordListDataManager` instead. The `WordListController` would then issue GraphQL requests to retrieve or update individual word items from the `WordListDataManager`. The `WordListController` would receive word item(s) as JSON object(s) and then convert them to `WordItem` JS object(s) as necessary (hydrate the `WordItem` object with the word item data). Would an approach like that make sense?

@balmas, @irina060981, please let me know what you think about all this. Thanks!