-
Notifications
You must be signed in to change notification settings - Fork 11.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[5.8] Add duplicates method to collection #28181
Conversation
Thanks for the PR first. I'd love to know more about what the key represents. because in the second example the value "g'day" has as key 2 (which I understand two occurrences) but in the first example the value (1) has as key 2 which should have 3? if so what if the are two entries with the same count of duplicate? |
@akiyamaSM the keys on the "duplicates" resulting collection are a reference to the keys of the original collection, the first ocurrence of each "item" is not taken as duplicate only second and so on, that is why on the first example we get the 1, and 2 keys back (second and third ocurrence of number 1 in the collection) and 2 in the gday example since gday is duplicated on the 2nd position of the collection. To me it looks great @timacdonald |
Hey - so the Here is a more explicit version... $items = collect([
0 => 1,
1 => 1,
2 => 1,
3 => 2,
]);
$items->duplicates();
// [1 => 1, 2 => 1];
Which gives our final value I umm'd and arrr'd about whether or not I should retain the key or just return a re-indexed collection. I thought it doesn't hurt to have it retain the original index, so I left it, and in some cases it might be handy to know the index. Does that implementation seem right to you @akiyamaSM ? Would love to know if you think it works okay with that explanation. I guess the idea was to replicate this plain PHP for collections. |
Thanks @timacdonald & @alberto1el Its more clear now. Good luck for the proposal 💃 |
I would definitely make use of this. I question whether the first occurrence should also be returned, though I can see that's not consistent with the plain PHP array example. I would probably use this to get a list of indexes, and then use those to remove the duplicates from the original collection. However, there's a chance that the first occurrence of the item isn't actually the one I want to keep. Perhaps I want to keep the second occurrence instead? In a simple example (checking on integers), it makes no difference whatsoever which one is kept. But in a more complex example (checking a specific key or property like 'framework'), it might be that other keys/properties are not duplicated. Let's say that each element had a 'framework' and a 'version' key. I could use duplicates() to get a list of repeated frameworks, then I'd pass this list into some other logic that checks the 'version' of each duplicate and deletes all but the most recent one. In that case, I'd want the first index of that framework as well! Sure, there are plenty of ways to get around this and end up with the result I'd be looking for. Just thought I'd throw another potential use case in here that may not have been considered. |
Hey Shane, thanks for the feedback on this. Seen as though this has been merged as is, hopefully it will still work for your use case. I'm just glad to have something in there to be able to find duplicates! One less google search for me haha |
Thanks for this contribution. Would you be able to submit a PR to the docs? |
It is on my list. I just wanted to get the collection PR (#28193) out first. Will try and PR the docs tomorrow. P.S. Thanks for reminding me! |
This PR introduces the ability to get all the duplicate values from an
Illuminate\Support\Collection
.Usage
Loose and Strict
I've come at this from a bunch of different angles. I am more than happy to hear ways you think this could be improved - but please confirm it works with all the test cases (or I can for you), as I've found a lot of variations on this do not pass all the tests e.g. methods in the
array_diff
family don't play nice with multidimensional arrays.I'd also like to hear if anyone thinks there are additional test cases I should account for.
Does not work with integers and objects in loose comparison
Loose comparison between integers and objects will trigger a notice and fail. Good news is that the
unique
method actually triggers the notice when we call it from theduplicates
method, so the issue already exists.Eloquent keys
If you are happy with this idea / approach for the base collection I will update this PR or do another to make it take model keys into account on the
Illuminate\Database\Eloquent\Collection
Fingers crossed I never have to google "PHP how to find duplicates in an array" again 🤞