Limit collections #10
This is already handled in a couple ways, but I'm sure there is room for improvement. See Limiting Recursion, and specifically Limiting Recursion With Object Constraints. This doesn't limit the total entities returned, like what you're suggesting, but that kind of arbitrary limiting is probably not desirable default behavior since it could be very confusing. I'll have a look at it though and get back to you.
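For reference, constraining generation to specific persisted objects looks roughly like the minimal sketch below. The `addPersistedObjectConstraint` and `setMaximumRecursion` calls are the ones named in this thread; the surrounding generator wiring, entity variables, and file path are assumptions for illustration.

```php
<?php

use Trappar\AliceGenerator\FixtureGenerationContext;

// Assumes $fixtureGenerator was built elsewhere (via the builder or the bundle's
// service) and $post / $user are managed Doctrine entities picked as test cases.
$context = FixtureGenerationContext::create()
    // Only this specific Post may be included when the crawler reaches Post
    ->addPersistedObjectConstraint($post)
    // Keep recursion shallow so unrelated relations are not crawled
    ->setMaximumRecursion(2);

$yaml = $fixtureGenerator->generateYaml($user, $context);
file_put_contents('fixtures.yml', $yaml);
```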
@trappar using addPersistedObjectConstraint does not prevent large collections from being loaded. It creates a huge performance problem. Please check the PR - I just created a dump of a few-GB database with 40 entities within seconds. It has all entities and relations dumped. :) Agreed, default limiting could be confusing. maximumCollectionChilds is not set by default.
@trappar the PR does not work correctly - the generated YAML has null values for limited collections. Working on a patch now.
@trappar fixed
@trappar what do you think about this PR?
I'm not sure. I feel like the idea is somewhat flawed. It limits results to totally arbitrary sets of entities - just whatever comes first. A similar but more controlled style of limiting can already be done by just constraining all the desired objects. So you can already say to return a specific set of comments rather than returning just the first X number of them. There's also the ignore annotation/metadata which can be used to arbitrarily prevent crawling. Also, the way you implemented it doesn't quite fit in with the rest of the library.

I went ahead and implemented it myself just to see how difficult it would be. It's a little sloppy and requires more testing, but here's what I came up with: db84b4e. This adds three methods to the FixtureGenerationContext:
Well, the main goal of this PR is to add an easy way of creating a test data sample from a very large database. For example, I have an API with tons of entities and GBs of data. Whenever I get a bug like "in case X the API provides wrong output", I simply add a test fixtures dump right after the buggy data set is loaded. After that I clean up the generated file and have a YAML file for isolated tests within a few minutes. Right now I use it like this:
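(The original snippet was not preserved; a rough sketch of that usage follows, assuming a setter named after the maximumCollectionChilds option mentioned above - the setter, entity, and file path are guesses, not the library's confirmed API.)

```php
<?php

use Trappar\AliceGenerator\FixtureGenerationContext;

// Hypothetical per-collection limit from the PR: dump at most 10 children per
// collection instead of crawling everything. Setter name and entity are assumed.
$context = FixtureGenerationContext::create()
    ->setMaximumRecursion(3)
    ->setMaximumCollectionChilds(10);

// $buggyIncident is the entity loaded right after the buggy data set is reproduced
$yaml = $fixtureGenerator->generateYaml($buggyIncident, $context);
file_put_contents(__DIR__ . '/fixtures/isolated-test.yml', $yaml);
```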
Looks like your commit adds similar functionality, though I don't really get how to use it :)
I see what you're doing. The code I came up with allows you to designate a total object limit, and a per-entity limit, but it doesn't do anything about limiting the number of entities returned on a per-relation basis like what you're doing (if I'm understanding what you're doing right).

I get that in your case the database you're working with is massive and you don't want to just dump everything, but in my experience it's better to cherry-pick specific test cases that you want using object constraints than it is to get a random sampling of the whole database - like what you're doing. If you just queried for a handful of useful matches/incidents and set those as object constraints then none of this would be an issue.

I don't feel like your changes are ready to be merged as is. I'd like to see the functionality entirely controlled by setting values in the FixtureGenerationContext, and I'd want to make sure that ValueVisitor wasn't visiting objects unnecessarily. I'm not 100% sure, but I'm guessing in your code that some objects get visited even if they aren't added to the results, which would be a performance problem. You can try to make those changes if you want, or just keep using your fork. For now I'll just leave this issue/PR/my branch open.
Using my own fork would be problematic - I didn't find any easy way to load it in Composer with "minimum-stability": "stable".

Regarding controlling the functionality from the FixtureGenerationContext: are you talking about the CollectionHandler::limitCollection() function? FixtureGenerationContext seems to be the wrong place for it - should I place it in some new helper class?
I have found another approach to generating test data. Something like:
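(The snippet itself is missing; a plausible reconstruction, assuming a handful of entities are picked by hand and recursion is disabled - the entity name and repository call are illustrative only.)

```php
<?php

use Trappar\AliceGenerator\FixtureGenerationContext;

// Pick a handful of entities by hand and dump them with recursion disabled.
// The Incident entity and repository alias are assumptions for illustration.
$incidents = $em->getRepository('AppBundle:Incident')->findBy([], null, 5);

$context = FixtureGenerationContext::create()
    ->setMaximumRecursion(0); // dump only the selected entities themselves

$yaml = $fixtureGenerator->generateYaml($incidents, $context);
file_put_contents('test-sample.yml', $yaml);
```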
It's pretty simple and dumps the data fixture that I actually need. The only problem: even with setMaximumRecursion(0), CollectionHandler still fetches all collections from the database, which makes the script very slow (a couple of minutes); with setMaximumRecursion(-1) only the first entity is dumped and all others are skipped. So I added a simple property which makes it possible to skip preloading collections from the database when they were not used before. It's in PR https://github.com/trappar/AliceGenerator/pull/13/files. With that PR the generator runs in less than a second (it took a few minutes without it). Let me know if it's acceptable.
@trappar what do you think about the second approach? Right now it's very hard to build a "default" fixture set for a large number of entities.
@trappar can you please check https://github.com/trappar/AliceGenerator/pull/13/files and let me know if it's acceptable?
In order to generate fixtures from live data in Symfony2 I use trappar/AliceGeneratorBundle and code like:
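(The original code block was not preserved; a minimal sketch of the usual bundle usage follows - the service id, repository alias, and entity name are assumptions rather than the bundle's confirmed names.)

```php
<?php

// Inside a Symfony2 controller or command with access to the container.
// The service id below is a guess; check the bundle's service definitions.
$someEntity = $em->getRepository('AppBundle:SomeEntity')->find($id);

$fixtureGenerator = $this->container->get('trappar_alice_generator.fixture_generator');
$yaml = $fixtureGenerator->generateYaml($someEntity);

file_put_contents('fixtures.yml', $yaml);
```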
The problem is that when $someEntity has large collections, the generator just hangs.
I didn't find a solution, so I added a simple wrapper:
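(The wrapper's code is missing; below is a hypothetical reconstruction that slices oversized Doctrine collections down to their first N elements before generation - the class and method names are invented for illustration.)

```php
<?php

use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;

// Hypothetical wrapper: truncate every oversized collection on the entity so the
// generator does not crawl thousands of children. Names are illustrative only.
class LimitedFixtureGenerator
{
    private $fixtureGenerator;
    private $limit;

    public function __construct($fixtureGenerator, $limit = 10)
    {
        $this->fixtureGenerator = $fixtureGenerator;
        $this->limit = $limit;
    }

    public function generateYaml($entity)
    {
        $reflection = new \ReflectionObject($entity);
        foreach ($reflection->getProperties() as $property) {
            $property->setAccessible(true);
            $value = $property->getValue($entity);
            if ($value instanceof Collection && $value->count() > $this->limit) {
                // Replace the collection with a slice of its first N elements
                $property->setValue($entity, new ArrayCollection($value->slice(0, $this->limit)));
            }
        }

        return $this->fixtureGenerator->generateYaml($entity);
    }
}
```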
And I use it like:
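(Again a guess at the missing snippet, reusing the hypothetical wrapper sketched above.)

```php
<?php

// Assumes $fixtureGenerator is the bundle's generator service and $someEntity
// is the loaded entity from the earlier snippet.
$limited = new LimitedFixtureGenerator($fixtureGenerator, 10);

file_put_contents('fixtures.yml', $limited->generateYaml($someEntity));
```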
I have a few questions.