Closes #31. Add \iter\unique which returns an iterator with unique va… #51
base: master
@@ -85,6 +85,41 @@ function map(callable $function, $iterable) {
    }
}

/**
 * Leaves only unique occurrences by using a provided hash function.
 *
 * If hash function is not provided values of the iterable will be used for comparison. Storing values instead of hashes
 * can require more memory but it prevents possible false positives if there are hash collisions.
 *
 * @param array|Traversable $iterable Iterable to remove duplicates from
 * @param callable|null $hashFunction Hash function that returns the value which will be used to determine
 *                                    uniqueness of the element
 * @param bool $strict If is set to true the types of the values from hash function will also be checked
 * @return \Iterator
 */
function unique($iterable, callable $hashFunction = null, $strict = false) {
    _assertIterable($iterable, 'First argument');

    $hashSet = [];

    foreach ($iterable as $key => $value) {

        if ($hashFunction === null) {
            $hash = $value;
        } else {
            $hash = $hashFunction($value);
        }

        if (\in_array($hash, $hashSet, $strict)) {
this could be way more efficient with a keyed lookup and

It would work with hashes, but what if we use raw values and the values are arrays, objects or nulls? And what about type checking?

Yes, this implementation has quadratic time complexity and as such is a no-go. As @staabm said, the correct way to do this is to use the value as a key. Objects and other values are exactly why there is the $hashFunction (more typically $keyFunction). This also obviates the need for the $strict parameter.

Is it a good idea to use serialize() to get a key for the default implementation (if the hash function is not provided), to keep the ability for strict comparison? $hashSet[serialize($value)] = '';

For objects I would do something like

@staabm why?

Because you don't need it and it's slow.

I meant you shouldn't serialize scalars, sorry.

@staabm how will you check the types then? $arr[5] and $arr['5'] will be the same keys.

Hmm, good point 🤔
            continue;
        }

        $hashSet[] = $hash;

        yield $key => $value;
    }
}

/**
 * Applies a mapping function to all keys of an iterator.
 *
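The key-collision point raised in the inline thread can be checked with a few lines of plain PHP. This is only an illustrative sketch, not code from the PR: the variable names are made up, and using serialize() as the key is the option floated in the thread, not an agreed design.

```php
<?php
// Integer-like string keys are cast to int, so both writes hit the same slot.
$plain = [];
$plain[5] = true;
$plain['5'] = true;
$plainCount = count($plain); // one entry

// serialize() encodes the type in its output, so the two keys stay distinct.
$typed = [];
$typed[serialize(5)] = true;   // key 'i:5;'
$typed[serialize('5')] = true; // key 's:1:"5";'
$typedCount = count($typed);   // two entries

// Membership becomes a hash lookup instead of an in_array() scan.
$seenIntFive = isset($typed[serialize(5)]);     // true
$seenFloatFive = isset($typed[serialize(5.0)]); // false: a float serializes differently
```

The trade-off mentioned in the thread still applies: serialize() costs more than using a scalar directly as the key, and it cannot handle values such as closures.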
This test could be moved out of the foreach loop. Something like this:
$hashFunction = $hashFunction ?? 'serialize';
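Put together with the keyed-lookup suggestion from the earlier thread, the hoisted default might look roughly like the sketch below. The function name uniqueSketch is hypothetical, the serialize() default is the reviewer's proposal and not merged behaviour, and the library's internal _assertIterable() check is left out so the snippet runs on its own.

```php
<?php
/**
 * Hypothetical variant of unique(); a sketch under the assumptions above.
 */
function uniqueSketch($iterable, ?callable $hashFunction = null) {
    // Resolve the default once, before the loop, instead of testing per element.
    $hashFunction = $hashFunction ?? 'serialize';

    $seen = [];
    foreach ($iterable as $key => $value) {
        // The result is used as an array key, so it must be an int or a string.
        $hash = $hashFunction($value);

        if (isset($seen[$hash])) {
            continue; // already yielded an element with this key
        }

        $seen[$hash] = true;
        yield $key => $value;
    }
}

// Default key: 1 and '1' are kept apart because serialize() records the type.
$values = iterator_to_array(uniqueSketch([1, '1', 1, 2, 2]), false);

// Custom key function: elements are unique by string length.
$byLength = iterator_to_array(uniqueSketch(['a', 'b', 'cc'], 'strlen'), false);
```

With this shape the lookup is constant time per element, and a caller who wants loose comparison can pass a key function that normalizes types, which is why the $strict parameter would no longer be needed.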