Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Prevent additional query when updating new graphs (faster inserts) #146

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

mk1024
Copy link

@mk1024 mk1024 commented Jul 30, 2015

I'm using GraphDiff in a few projects that perform a lot of batch updates and inserts. Here are my optimizations that made it about 15-20x faster in my use cases. All tests in GraphDiff.Tests and in my own projects run fine. Perhaps Brent or Andreas can have a look at my changes.

Inserting new entities

When updating new graphs (insert) GraphDiff tries to load the persisted graph, which always returns null and requires a lot of unneccessary work (predicate expression, include strings, actual query to db). If an entity is new and not persisted yet, you should not try to load it from the database.

This optimization is optional for simple int and long primary keys (GraphDiffConfiguration) and is automatically ignored in other scenarios (string/guid/composite/etc).

Updating of collections

When updating many entities of the same type you have to call UpdateGraph in a loop. This results in a lot of queries from QueryLoader and extremely slows down large batch updates. As proposed by DixonD-git (Issue https://github.com/refactorthis/GraphDiff/issues/127), loading of many entities can be done in a single query to increase performance.

The performance boost was quite a surprise (15-20x on my local machine). I only had to change internal interfaces and classes, while the public interface (DbContextExtensions) did not change. Composite keys create predicates like (KeyA='a1' AND KeyB='b1') OR (KeyA='a2' AND KeyB='b2') OR ... and single int keys are translated to ...Key in (1,2,3,4,5...).

@kabua
Copy link

kabua commented Feb 10, 2016

@mk1024 - Thanks! We have incorporated your changes into our private code base.

@cryo75
Copy link

cryo75 commented Mar 14, 2016

After this fix, on inserting of new entities, the associated entities are still being loaded again before saving the graph.

The code where the loading happens is located here:
AssociatedEntityGraphNode.Update()

Does this fix actually work?

@mk1024
Copy link
Author

mk1024 commented Mar 14, 2016

@cryo75 - If your entities have int/long primary keys and if you set GraphDiffConfiguration.SkipLoadingOfNewEntities = true the loading of the new entity itself is skipped. Depending on your model it can still be neccessary to load associated entites to be updated.

@cryo75
Copy link

cryo75 commented Mar 14, 2016

I have the following:
`
public class AddressDto()
{
public int Id {get; set;}
public PostCodeDto {get; set;}
}

public class AddressModel() 
{
    public int Id {get; set;}
    public PostCodeModel {get; set;}
    public PostCodeModelId {get; set;}
}

`

In my particular scenario, the associated entities in the Dto classes are the actual entities themselves. I do no expose the foreign keys.

So, in the above, when inserting a new AddressModel, the model will only have a value for PostCodeModel. On saving I do:

return ((DbContext)dataContext).UpdateGraph<AddressModel> (entity, map => map.AssociatedEntity(x => x.PostCodeModel));

But GraphDiff loads all the associated entites before saving, which, in my case, is unnecessary. Is there a way to trim down the loading of associated models before inserts/updates? (However, note that the returned model must have the actual foreign keys set after saving).

@mk1024
Copy link
Author

mk1024 commented Mar 14, 2016

@cryo75 - I think that not exposing the fk property (PostCodeModelId) is causing this issue. If PostCodeModel should not be updated/inserted with AddressModel anyway, you could just set PostCodeModelId and leave PostCodeModel empty when inserting. Depending on your model and configuration, this should do the trick.

@cryo75
Copy link

cryo75 commented Mar 14, 2016

Preferably I would need a mixture so that GraphDiff:

  1. sets the foreign id's automatically from the associated models.
  2. doesn't reload the associated models before inserting/updating
  3. return the complete inserted/updated entity with the associated models and foreign id's all set.

As for OwnedCollections, the default will still apply. Is this possible?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants