[8.x] Add Builder@lazy() and Builder@lazyById() methods #36699
This looks awesome! Such a powerful feature. I currently have some Builder macros similar to this and they have been critical when working with large data processing tasks. I think it would be great to have them in core so everyone can use them. |
Thanks! Would you mind sending me some documentation on this to laravel/docs? |
Sure! I'll try to whip something up later today or tomorrow. |
Is the |
@JosephSilber much love from here. Thanks for this PR |
@tpetry The PHP docs only explicitly mention MySQL, so maybe this doesn't apply to other drivers. Knowing with any certainty would require testing the memory consumption on different databases. |
@JosephSilber can you look at this bug with Looks like fix will be this check:
$lastId = $results->last()->{$alias};
if ($lastId === null) {
    throw new RuntimeException("The lazyById operation was aborted because the [{$alias}] column is not present in the query result.");
} |
Fixed here #48436 |
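To illustrate why the guard proposed above matters, here is a minimal, framework-free sketch of ID-based lazy pagination. It is not the code from either PR: the standalone `lazyById()` function, the `$fetchAfter` callback, and the in-memory `$table` are all hypothetical stand-ins for a query of the form "rows whose `$alias` is greater than the last seen value, ordered by `$alias`, limited to the chunk size". If the alias column is missing from the results, the next page cannot be located, so the sketch aborts instead of re-fetching the same page.

```php
<?php

// Hypothetical sketch: lazily yields rows page by page, keyed on the $alias column.
function lazyById(callable $fetchAfter, int $chunkSize, string $alias = 'id'): Generator
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('The chunk size should be at least 1.');
    }

    $lastId = null;

    while (true) {
        // Assumed contract: rows with $alias > $lastId, ordered by $alias, at most $chunkSize of them.
        $rows = $fetchAfter($lastId, $chunkSize);

        foreach ($rows as $row) {
            yield $row;
        }

        // A short page means the data set is exhausted.
        if (count($rows) !== $chunkSize) {
            return;
        }

        $lastId = $rows[array_key_last($rows)][$alias] ?? null;

        // Without a usable key there is no way to find the next page.
        if ($lastId === null) {
            throw new RuntimeException("The lazyById operation was aborted because the [{$alias}] column is not present in the query result.");
        }
    }
}

// In-memory stand-in for a database table.
$table = [['id' => 1], ['id' => 2], ['id' => 3], ['id' => 4], ['id' => 5]];

$fetchAfter = fn ($lastId, int $limit): array => array_slice(
    array_values(array_filter($table, fn (array $row): bool => $lastId === null || $row['id'] > $lastId)),
    0,
    $limit
);

$ids = array_column(iterator_to_array(lazyById($fetchAfter, 2), false), 'id');
```

In this sketch the exception is raised only after a full page has been yielded, because that is the first point at which the next page's starting key is needed.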
Background

For querying large datasets, the Builder currently has the cursor() method, which returns a LazyCollection. This uses less memory than a regular Collection (returned from the get() method), since it never keeps more than a single Eloquent model in memory.

However, the cursor() method still has several drawbacks:
- It's not truly lazy, since PHP still caches all query results in its buffer. Turning off buffering introduces its own set of challenges (namely not being able to execute other queries simultaneously).
- It cannot eager load relationships, since it only ever deals with a single record at a time.
- Depending on the DB, opening a cursor to a huge dataset may have a slight delay vs. running a query with a LIMIT.

We also have the chunk() method, which is kinda lazy, but with a clunky API, as is evident by the fact that we needed to introduce separate each() and chunkMap() methods (with chunkMap() having to build up the whole result set in memory 🙈).

Introducing lazy()
The new lazy() method introduced in this PR will chunk results behind the scenes, and return a single LazyCollection of results.

Since it's a lazy collection, you have the full power of collections at your fingertips: you can call each() directly on it, call map() on it, or even chunk() it.

The possibilities are endless, and we'll no longer have to create all of these separate one-off methods to query and manipulate results lazily.
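The idea of chunking behind the scenes while exposing a single lazy stream can be sketched with a plain PHP generator. This is an illustration under stated assumptions, not the implementation from the PR: `lazyChunks()` and the `$fetchPage` callback are hypothetical names, and the callback stands in for a query with an OFFSET and a LIMIT against an in-memory array.

```php
<?php

// Hypothetical sketch: fetches one page at a time, but yields rows one by one.
function lazyChunks(callable $fetchPage, int $chunkSize = 1000): Generator
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('The chunk size should be at least 1.');
    }

    $offset = 0;

    do {
        // Assumed contract: at most $chunkSize rows, starting at $offset.
        $rows = $fetchPage($offset, $chunkSize);

        foreach ($rows as $row) {
            yield $row;
        }

        $offset += $chunkSize;
    } while (count($rows) === $chunkSize); // a short page means there is nothing left
}

// In-memory stand-in for a table, with a counter for "queries" actually run.
$table = range(1, 25);
$queries = 0;

$fetchPage = function (int $offset, int $limit) use ($table, &$queries): array {
    $queries++;

    return array_slice($table, $offset, $limit);
};

// Consuming only the first three rows triggers a single page fetch.
$firstThree = [];

foreach (lazyChunks($fetchPage, 10) as $row) {
    $firstThree[] = $row;

    if (count($firstThree) === 3) {
        break;
    }
}
```

Because nothing is fetched until the consumer asks for the next row, stopping early avoids the remaining page queries, and at most one page of rows is held in memory at a time.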