Investigate Micro ORM #12
Comments
Why would you still need EF? Pick one.
There is also Repository, but arguably this should never have been public. The rare cases where you need it can be replaced by direct use of the ORM. The way I see it, in this new world, raw HQL would correspond to raw text queries for the micro-ORM, HqlQuery would be replaced with a similar object representation built with Projections in mind exclusively (we would not encourage its use outside of projections, and maybe it's even in the projection module instead of on ContentManager), and GetMany would remain unchanged in interface, only the implementation would be new. Not sure what to do with Query. Maybe build a full IQueryable implementation at the content item level this time. |
What I don't like about this new world is that Orchard remains dependent on an SQL db and, in fact, invites devs to use raw SQL extensively for getting ids. I agree that Repository, as you say, should disappear in favor of accessing the ORM directly. But please let me query this ORM using LINQ. Regarding your point that Repository should be private (in this hypothetical future we would be talking about the ORM): the reason you want it private is that devs break consistency (I read your blog post). I agree it is a bad use. But I think a better solution than making it internal is to offer only read-only access to the data provided by a repository. If we are talking about an ORM as a replacement for Repository, we can force in the DbContext that all data access is done with NoTracking (in EF terms). Finally, regarding Get and GetMany, I agree that for the moment (while we continue with a relational db) it is better to implement them ourselves instead of relying on an ORM, for two reasons: the code will be very localized, so future refactorings will be easy, and it is the most often used method, so performance is important. However, I see a chance to simplify its code and improve its performance if Orchard finally moves to graph DBs + EF, because those DBs work especially well at joining nested objects stored in different tables (or nodes), and because the Include() method fits this purpose perfectly, simplifying other approaches not based on it. |
IQueryable is Linq. A document db would be lovely, but we would need to find one that is xcopy-deployable, and well-accepted. I don't think "raw SQL" was suggested anywhere. |
Yes, IQueryable is LINQ, but you only propose to offer it at the ContentItem level. The ContentItem level is not always the most efficient way of accessing data. Sometimes you need to work at a finer-grained level to get OIDs, as you mentioned. At that level I would also like to use LINQ, especially if Orchard starts supporting another db that makes LINQ an efficient way of accessing data. That's why I prefer a hybrid db like ArangoDb or OrientDb to a document db: it gives you more flexibility for queries at a fine-grained level while maintaining good performance, and it lets you have relations between content items that benefit from that performance without worrying about adding redundancy to improve speed. Mmmh, xcopy-deployable... well, Siaqodb is xcopy-deployable but it is not free, but I think BrightStar is. For ArangoDb and OrientDb I don't know; I will investigate further. You are right, I made a mistake ;) I wrote raw SQL in place of raw text queries. And it is a big mistake, because there is an important difference regarding independence from the db. But when we talk about NoSQL dbs, to what extent do those raw text queries based on a micro-ORM give us independence from the db? |
To propose how I would design Orchard data access to harness the good performance and flexibility graph dbs would give us, using ORM navigation properties and the Include method, I have been reviewing how the Orchard data schema is organized. But, as you surely saw years before me :[ a traditional ORM doesn't fit the dynamic nature of Orchard data. The problem is that current ORMs, for the moment, have limitations that prevent what I want. What I want is for the ContentItemRecord table to contain items with their own navigation properties to the part records that compose the ContentItem. Graph dbs would support this without problems, but ORMs are focused on relational tables, which makes them typed in nature. That forces us to have the same navigation properties for all ContentItems in the table. The other option is to use ORM inheritance, but it is not recommended for performance reasons. So it makes no sense for the moment. We would have to wait and see to what extent EF7 supports the dynamic nature of NoSQL dbs, to see whether it lets us harness the power of those dbs while maintaining independence from the db. So now I see your point: a micro-ORM and a schema like a document db is enough for the moment. As you pointed out, most of what an ORM gives you is useless in Orchard due to its dynamic nature. If this circumstance changes and EF7 finally offers support for the dynamic nature of NoSQL, I would prefer a hybrid db with a graph+document schema to a document-only schema. It would improve relations between ContentItems and the performance of complex queries, and would help end the current redundancy of data in the infoset and in the content part tables, which exists for performance reasons. |
Well, thinking twice, inheritance could be OK. As with other ORM performance problems, the problem comes from the gap between OO and relational dbs. But to reflect that, every content item would inherit from its content type (which would be defined as a code-generated class), and that one would inherit from an abstract ContentItem class. It could be a good approach if EF is able to reflect this inheritance in a natural way in a graph db: a set of content item nodes where each node is an instance of a different content type. The point is, I know it would change how everything is done in Orchard. Content types would be generated with code generation techniques and dynamic compilation, storing in the db only the metadata for configuring their behavior. Much dynamic stuff would turn into typed stuff, which is better for devs. In general it looks like a big challenge, but a very interesting one to explore, in my opinion. I hope I don't disturb you with all these entries changing what I propose. "Conversation" helps refine ideas (brainstorming), even with myself XD ------- EDIT ------- |
I'm sorry but all this sounds like a big step backwards. The type system is built at runtime for a reason, and Orchard uses composition rather than inheritance for a reason as well. Those are not the features that we should reconsider, they are the features that made Orchard successful. Thanks for the suggestions though. |
The EF comment I left in there was because of something I heard Sebastien say on the last call. The way I see it, the current ORM solution is quite a bulky way to do things. As @bleroy mentioned earlier, we have several layers of abstraction over NHibernate through which to query, which gives us a massive amount of flexibility. But for that flexibility we have also had to write a lot of code. In the new world I would like to see a much lighter solution - I love the idea of using something like Massive or Dapper, which allow much faster querying against the DB. The idea of building the full IQueryable at the content item level is, I think, perfect - this would open up a lot of doors. My only question is how we would expose records that are not content items. I would guess people would use an HqlQuery or raw HQL, right? Or something else? |
@bleroy said "The type system is built at runtime for a reason, and Orchard uses composition rather than inheritance for a reason as well". Regarding the other option, based on composition and on a graph+document db, I have added an issue on the EF7 GitHub asking them to support dynamic objects with navigation properties. Regarding the requirement of an xcopy-deployable NoSQL db, ArangoDb doesn't support it out of the box, but they are interested in hearing our requirements for this feature: ------ EDIT ----- |
Notes from meeting.
|
Orchard enables code to compose entirely new content items from arbitrary parts at runtime. You can't do that with code generation. |
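To make the composition argument concrete, here is a minimal sketch in Python (not Orchard's actual API). `ContentItem`, `PART_FACTORIES`, `compose`, and the part names are all hypothetical and only illustrate how a new content type can be assembled from existing parts at runtime, with no generated class per type.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ContentItem:
    """A content item is a type name plus a bag of named parts (hypothetical model)."""
    content_type: str
    parts: Dict[str, Dict[str, Any]] = field(default_factory=dict)


# Hypothetical registry: part name -> factory producing that part's default state.
PART_FACTORIES: Dict[str, Callable[[], Dict[str, Any]]] = {
    "TitlePart": lambda: {"title": ""},
    "BodyPart": lambda: {"text": ""},
    "TagsPart": lambda: {"tags": []},
}


def compose(content_type: str, part_names: List[str]) -> ContentItem:
    """Build an item by attaching each requested part; no per-type class exists."""
    item = ContentItem(content_type)
    for name in part_names:
        item.parts[name] = PART_FACTORIES[name]()
    return item


# Type definitions are plain data, so a new type can be added while running.
type_definitions: Dict[str, List[str]] = {"Page": ["TitlePart", "BodyPart"]}
type_definitions["TaggedNote"] = ["TitlePart", "TagsPart"]

note = compose("TaggedNote", type_definitions["TaggedNote"])
note.parts["TagsPart"]["tags"].append("orm")
```

With code generation, `TaggedNote` would have to exist as a compiled class before any item of that type could be created; here it is only an entry in a dictionary.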
And just a note: what about 2nd level caching? Is it needed, or is querying not affected by caching? |
So far the first issue when looking at Massive and Dapper is that they rely on System.Data, which requires us to target only ASP.Net 5.0 and removes support for ASP.Net Core 5.0 (which would suck from a cross platform perspective). More investigation needed. |
So, based on Sebastien's suggestion, we should look at creating two APIs: one for storage and one for querying. How would we see those APIs looking? How would a full implementation of IQueryable at the content item level look? / @bleroy
From a storage perspective, if we used an ORM, we would still allow direct querying against the storage mechanism of choice for that user, but when storing the document we would store in multiple locations to allow the higher up query api to take advantage of faster query mechanisms. Correct? |
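One possible shape for that split, sketched in Python under stated assumptions: `ContentStore` and `FieldIndex` are hypothetical names, not anything from Orchard or the branch under discussion. The storage API saves whole documents and notifies subscribers; the query API keeps its own lookup structure and resolves matches back to documents through the store.

```python
from typing import Any, Callable, Dict, Iterable, List

Document = Dict[str, Any]


class ContentStore:
    """Storage API (hypothetical): saves whole documents and notifies index writers."""

    def __init__(self) -> None:
        self._documents: Dict[int, Document] = {}
        self._listeners: List[Callable[[int, Document], None]] = []

    def subscribe(self, listener: Callable[[int, Document], None]) -> None:
        self._listeners.append(listener)

    def save(self, doc_id: int, document: Document) -> None:
        self._documents[doc_id] = document
        for listener in self._listeners:  # the "store in multiple locations" step
            listener(doc_id, document)

    def get_many(self, ids: Iterable[int]) -> List[Document]:
        return [self._documents[i] for i in ids]


class FieldIndex:
    """Query API (hypothetical): maintains a field value -> ids lookup."""

    def __init__(self, store: ContentStore, field_name: str) -> None:
        self._store = store
        self._field = field_name
        self._ids_by_value: Dict[Any, List[int]] = {}
        store.subscribe(self._on_saved)

    def _on_saved(self, doc_id: int, document: Document) -> None:
        for ids in self._ids_by_value.values():  # drop any stale entry first
            if doc_id in ids:
                ids.remove(doc_id)
        if self._field in document:
            self._ids_by_value.setdefault(document[self._field], []).append(doc_id)

    def where_equals(self, value: Any) -> List[Document]:
        return self._store.get_many(self._ids_by_value.get(value, []))


store = ContentStore()
by_author = FieldIndex(store, "author")
store.save(1, {"title": "Dapper notes", "author": "ann"})
store.save(2, {"title": "EF7 notes", "author": "bob"})
store.save(3, {"title": "Massive notes", "author": "ann"})
ann_titles = [doc["title"] for doc in by_author.where_equals("ann")]
```

The consumer of `where_equals` never sees how the primary documents are stored, which is the property that would let either side be swapped for a different engine.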
Finally, I had time to watch the meeting video where Sebastien shares his vision for the data access layer. I agree it is good to have two APIs, one document-oriented and one query-oriented. Indeed, it is the common approach people are taking for apps based on document dbs to get the best of both worlds. @bleroy regarding your requirement of an xcopy-deployable db: the answer is yes, ArangoDb is going to support xcopy deployment at the end of February, and a LINQ provider at the end of Q2. https://twitter.com/weinberger/status/564185891271630848 One thing I think the new DAL API should help us solve in a generic and elegant way is the performance problems of Taxonomies and Query-Projections; I mean the N+1 queries problem. The solution cannot be to cache in memory every Taxonomy we have, or to add custom code that loads all the content item ids in one initial query, in order to get all the content items in a second query and cache them for the subsequent requests from the shapes that render the current projection. What if the content items used in a taxonomy or returned by a projection have parts with LazyLoad properties referencing other content items? Then the problem is bigger. You need custom code not only for loading all the content item ids in one initial query, but also initial queries for loading at once the ids of the content items referenced by the first-level content items in the hierarchy. I would want a solution that makes Orchard work properly when I add a Query-Projection or Taxonomy. I mean I want to get the best performance without needing to add the custom code I mentioned previously. If I were to develop the DAL using a LINQ object-oriented db or a hybrid db, I would offer methods that let me set Includes for the query I'm performing. That would help consumers indicate how deep they want the query to go in the hierarchy of relations to get child content items for all the root content items of the query.
This would let the DAL use the minimum number of queries to retrieve all the data at once, which would solve the problem with taxonomies and the problem with LazyLoad properties. Adding generic code to Taxonomies and Projection-Queries to harness those new DAL API methods is out of the scope of this discussion, but with the Include feature in the DAL it doesn't look difficult to implement. As I pointed out in previous posts, this Include thing is where graph dbs really shine. But for the moment, no matter which storage we select, we should offer an Include-like solution for that problem. If graph dbs can improve performance in the future, that will be welcome, but for the moment we at least need to improve what we have, don't you think? |
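The difference between lazy loading and an Include-style depth setting can be shown by counting round trips. This is a minimal Python sketch, not EF's `Include()` and not an Orchard API: `RECORDS`, `CountingStore`, `load_lazy`, and `load_with_include` are hypothetical, and the "database" is an in-memory dictionary.

```python
from typing import Dict, Iterable, List

# Hypothetical data: id -> record holding the ids of the items it references.
RECORDS: Dict[int, dict] = {
    1: {"id": 1, "related": [10, 11]},
    2: {"id": 2, "related": [11, 12]},
    10: {"id": 10, "related": []},
    11: {"id": 11, "related": []},
    12: {"id": 12, "related": []},
}


class CountingStore:
    """Counts round trips so the two loading strategies can be compared."""

    def __init__(self) -> None:
        self.queries = 0

    def get_many(self, ids: Iterable[int]) -> List[dict]:
        self.queries += 1  # one round trip, however many ids are requested
        return [RECORDS[i] for i in ids]


def load_lazy(store: CountingStore, root_ids: List[int]) -> Dict[int, dict]:
    """Select N+1: one query for the roots, then one more per root."""
    loaded: Dict[int, dict] = {}
    for root in store.get_many(root_ids):
        loaded[root["id"]] = root
        for child in store.get_many(root["related"]):
            loaded[child["id"]] = child
    return loaded


def load_with_include(store: CountingStore, root_ids: List[int], depth: int) -> Dict[int, dict]:
    """Include-style loading: one batched query per level of the hierarchy."""
    loaded: Dict[int, dict] = {}
    frontier = list(root_ids)
    for _ in range(depth + 1):
        if not frontier:
            break
        next_ids: List[int] = []
        for record in store.get_many(frontier):
            loaded[record["id"]] = record
            next_ids.extend(record["related"])
        # De-duplicate and skip anything already loaded.
        frontier = [i for i in dict.fromkeys(next_ids) if i not in loaded]
    return loaded


lazy_store, eager_store = CountingStore(), CountingStore()
lazy_result = load_lazy(lazy_store, [1, 2])
eager_result = load_with_include(eager_store, [1, 2], depth=1)
```

Both strategies load the same five records, but the lazy version issues one query per root on top of the initial one, while the batched version issues one query per level, so its cost grows with the requested depth rather than with the number of items.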
The API shouldn't depend on a specific choice of database engine. It should be doable to build a storage and query provider for pretty much any type of DB, resulting in specific performance differences, but that shouldn't affect the API. In particular, in your example of taxonomies, the storage API should enable you to specify query hints, a little like you do today with nHibernate, but hopefully easier and more flexible, so that you can get a list of content items, and related items, without causing select N+1 issues. You should also be able to filter on taxonomy terms by including them in the index that you will query on. |
Don't get me wrong: when I say that, if I were to develop the DAL using a LINQ object-oriented db or a hybrid db, I would offer methods to set Includes for the query I'm performing, I reference those technologies explicitly because they are the ones I'm used to, but what matters is the functionality. I'm very new to Orchard; I thought query hints only allowed you to filter. What I want to stress is that the new API should provide a mechanism not only to filter but also to select which related content items the query should return for each content item in the result. |
Right: query hints don't filter, they tell the system what to eagerly fetch. |
I'm not sure if this is the right place to ask - but did anybody consider using Lucene for content queries? I think if the right information is indexed it would also avoid N+1 query issues. |
The idea is that the Content Query is abstracted off.... so the flow is...
We might be able to cut that short and actually store the content in the IndexEngine too... but let's get this working first. |
You missed the reduce part of it, but essentially yes. There's also the initial indexing, that requires the storage providers to be able to enumerate their contents. |
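A minimal Python sketch of the map and reduce steps, including the initial indexing pass that needs the storage provider to enumerate its contents. All names here (`map_tags`, `reduce_count`, `build_index`, `enumerate_store`) and the tag-count index are hypothetical illustrations, not code from the MapReduce branch.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Document = Dict[str, Any]


def map_tags(document: Document) -> Iterator[Tuple[str, int]]:
    """Map step: emit one (key, value) pair per tag on a content item."""
    for tag in document.get("tags", []):
        yield tag, 1


def reduce_count(values: List[int]) -> int:
    """Reduce step: collapse all values emitted for one key into one entry."""
    return sum(values)


def build_index(
    documents: Iterable[Document],
    map_fn: Callable[[Document], Iterable[Tuple[str, int]]],
    reduce_fn: Callable[[List[int]], int],
) -> Dict[str, int]:
    """Group the mapped pairs by key, then reduce each group."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for document in documents:
        for key, value in map_fn(document):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


def enumerate_store(store: Dict[int, Document]) -> Iterator[Document]:
    """Initial indexing relies on the storage provider listing everything it holds."""
    yield from store.values()


store: Dict[int, Document] = {
    1: {"title": "Dapper notes", "tags": ["orm", "dapper"]},
    2: {"title": "EF7 notes", "tags": ["orm"]},
    3: {"title": "Untagged", "tags": []},
}
tag_counts = build_index(enumerate_store(store), map_tags, reduce_count)
```

A query against `tag_counts` never touches the content store; only the initial build and later updates do.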
@Jetski5822 : that's great to see. A question: are the lambda expressions developer sugar used to create otherwise dynamic content query definitions? Dynamic content query definitions could be created within the UI, and could probably be used as the building blocks for projections. |
It looks great how decoupled this design looks guys! To be sure I grasp what you propose some points:
If one day we move to a graph db, it will be enough to disable all the indexing and provide an implementation of this query class that uses the where lambdas to get root content items and the hint lambdas to eagerly fetch content items. The main differences are:
So it would be interesting for the new query abstraction to return root content items no matter which underlying implementation it is using, instead of delegating to the consumer the task of composing content item results from the ids returned by the query. |
@bleroy Ah yeah, missed that part :) good spot. @brentbysouth the idea is for projections to use this. The filter would directly translate to a lambda filter and that would become an index in its own right. @jersiovic If you want to swap the query store implementation for something else later, then the api it provides should be generic enough to do that, hence allowing you a graph db implementation. |
Just my 2 cents... Besides Dapper, have you ever heard of https://github.com/jonwagner/Insight.Database ? I've used it many times in preference to Dapper; not sure about the dependency on System.Data. I'm so happy to see that so many clever people are questioning EF6 / 7. I can assure you that whatever you try to do with an ORM that is just a little bit more complicated than mini sites, EF is going to haunt you night and day. Well... not just EF, but any full-blown ORM. What I really would like to see from Orchard regarding data access: |
I haven't seen it, but will take a look, thanks :) I plugged EF7 into Orchard VNext on the entityframework branch, and funnily enough I haven't actually come across any issues... 'as of yet' - of course my use case is simple and it is using the in-memory db - so we shall see. I am not averse to getting rid of it in favor of something else, but I still feel we need to provide a high-level API and the ability to do something low level, like use SQL. I think with EF, at least, a lot of the examples I have seen are where people have put EF into a system and tried to do something without understanding the technology - a rushed implementation that gets pasted onto the internet and copied. Out of interest, why would you not rely on MVC? |
Does EF7 perform better on Azure SQL? I've had a few issues with application restarts and crashes due to Azure SQL DB maintenance in a cloud service configuration. Also, anyone look into Azure DocumentDB? It looks interesting. |
EF7 + DocumentDB dotnet/efcore#1035 |
Awesome, thanks! |
Regarding EF: https://orchard.codeplex.com/discussions/631681 :-) |
Meh - as I have said before, I am not averse to removing it; there is just no viable alternative at the moment that is CoreCLR compliant. |
This is all an extremely interesting discussion. Are we trying to solve these issues by implementing something different? I know there is a strong desire to support alternative storage systems, but I have a feeling EF's drivers for these other storages will be a compromise. It would seem that Orchard should just abstract the DA technology out of its DA layer, then ship with NH as the default and let the community build implementations for other technologies, so Orchard can stay focused on its platform and one DA technology. If everyone feels so strongly about other solutions, then the Orchard team should empower them to build their solution to just drop in. My gut feeling is that others don't want to do the hard work of implementing the DA drivers. I feel this is not really a concern of the Orchard team, other than allowing for drop-in replacements. |
Yeah, I don't know about that. Many experts seem to agree that abstracting the ORM is a bad idea. We do need a higher level API because content items are a higher abstraction than the parts that correspond to tables, but there will always be a need for data access other than for content items, and for that, exposing the ORM directly is a pretty good choice. |
@Jetski5822, using MVC from Microsoft does require you to use some folder conventions, at least by default. Each time I start a project with MVC I soon realize that the folder conventions are annoying to follow. Why group unrelated classes, i.e. controllers in one folder and views in another? When the project reaches a certain size it is quite cumbersome to work with. But still... MVC is just a pattern, and not the magic that many try to read into it. Many people tend to think of MVC as a framework, but in reality it is nothing more than a View of some data. That was a long and very opinion-based explanation :) One of my main concerns about EF is the tight coupling to the database, no joins between 2 or more databases, and the requirement of an identity column. I know I'm biased, but I always refactor VS-templated MVC projects and re-implement IUserStore (not that big a deal after the first time :) ), just so I can remove the dependency on EntityFramework.dll. All that said, NHibernate is still light years ahead of EF; let us think a bit about: http://blogs.msdn.com/b/adonet/archive/2014/10/27/ef7-v1-or-v7.aspx Don't fall into the trap of depending on EF and its never-ending releases/bug-fixes. |
I have started to sketch out a MapReduce model on a separate branch, so as not to break master for the time being. https://github.com/OrchardCMS/Brochard/commits/MapReduce EF is the chosen ORM at the moment, because it is the only one that supports CoreCLR, and as of yet it does not leak into any part of the system and is abstracted to just the EF namespace. I had a think about Nancy support, and don't see why both couldn't be supported. But not straight away; we just need the right abstraction to allow registration of the Nancy routing system. Using MVC with no changes to any extensions forces you to use its folder structures, yes. But we are not confined to that; we can have any folder structure we want. You just need to create an IExtensionsFolder implementation and away you go! |
Using EF (For now) with a Map Reduce pattern. |
Look into Dapper and Massive as data access layers. How will the Content Manager API's functions talk to it? How can you surface the APIs of the ORMs themselves?
Can it work side by side with EF?