Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add allequal #43354

Merged
merged 6 commits into from
Feb 11, 2022
Merged

Add allequal #43354

merged 6 commits into from
Feb 11, 2022

Conversation

CameronBieganek
Copy link
Contributor

@CameronBieganek CameronBieganek commented Dec 7, 2021

Fixes #43353.

@CameronBieganek CameronBieganek changed the title Add allequal. Add allequal Dec 7, 2021
@oscardssmith oscardssmith added the triage This should be discussed on a triage call label Dec 7, 2021
@oscardssmith
Copy link
Member

Adding to triage to decide if we want this.

@rfourquet
Copy link
Member

Very much in favor, and this was just on my todo list. This is a feature which is requested from time to time and easy to get wrong. For example, on a zulip thread, the OP thought allunique was doing isequal, and then someone proposed foldl(==, collection) as an implementation for allequal (which works only with the most simple collections), which was also mentioned in a separate thread.

base/set.jl Show resolved Hide resolved
@timholy
Copy link
Member

timholy commented Dec 7, 2021

I'm opposed to growing our exports for essentially trivial functions.

@CameronBieganek
Copy link
Contributor Author

I don't personally care whether allequal gets added to Base, but here are the two arguments that I see in favor:

  1. allequal complements allunique, which is already in Base.
  2. Users implementing this function themselves might be likely to define allequal(itr) = length(unique(itr)) <= 1, which is much less efficient.

Of course the "users might implement it inefficiently" argument could be used for lots of different functions, so it's probably not a very strong argument.

@CameronBieganek
Copy link
Contributor Author

There are some errors in the CI tests. It seems like they're not related to the unit tests that I added, but I can't tell for sure.

@goretkin
Copy link
Contributor

goretkin commented Dec 7, 2021

I'm opposed to growing our exports for essentially trivial functions.

What about not exporting it, then?

For me, allequal is kind of in the same category as only. only is trivial. When you're feeling lazy or reckless, you can just use first. You can also just implement allequal(x) = all(==(first(x), x), but it's even more wrong than using first in place of only.

@CameronBieganek
Copy link
Contributor Author

You can also just implement allequal(x) = all(==(first(x), x), but it's even more wrong than using first in place of only.

Actually, there are scenarios where you might reasonably want all(==(first(x), x). For example, if you want to use three-valued logic:

julia> x = [missing, missing];

julia> all(isequal(first(x)), x)
true

julia> all(==(first(x)), x)
missing

If you're looking to use three-valued logic, then you expect allequal([missing, missing]) to be missing. One might also want allequal([NaN, NaN]) to return false:

julia> y = [NaN, NaN];

julia> all(isequal(first(y)), y)
true

julia> all(==(first(y)), y)
false

I used isequal in this PR because allunique uses isequal (actually, the implementation does not, but that's the API that allunique presents). And of course allunique uses isequal because unique uses isequal, and unique uses isequal because sets and dictionaries use isequal.

@goretkin
Copy link
Contributor

goretkin commented Dec 8, 2021

I should have have used isequal. I intended only to make a point about empty collections, which requires a length check, as does implementing only.

@Moelf
Copy link
Contributor

Moelf commented Dec 8, 2021

slightly against, the justification for allunique (IMO) is the large speed up + non-trivial implementation:

julia> @benchmark allunique(x) setup=(x = rand(1:10, 100))
BenchmarkTools.Trial: 10000 samples with 962 evaluations.
 Range (min  max):   83.630 ns    2.047 μs  ┊ GC (min  max):  0.00%  93.80%
 Time  (median):     102.137 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   118.439 ns ± 166.812 ns  ┊ GC (mean ± σ):  12.62% ±  8.44%

       ▁▁▁▄▃▅▆█▅▆▄▂▁                                             
  ▁▂▃▄▆█████████████▇▇▆▄▄▄▃▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  83.6 ns          Histogram: frequency by time          171 ns <

 Memory estimate: 400 bytes, allocs estimate: 4.

julia> @benchmark length(unique(x))==length(x) setup=(x = rand(1:10, 100))
BenchmarkTools.Trial: 10000 samples with 183 evaluations.
 Range (min  max):  558.109 ns   11.964 μs  ┊ GC (min  max): 0.00%  93.17%
 Time  (median):     592.109 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   632.985 ns ± 566.994 ns  ┊ GC (mean ± σ):  5.22% ±  5.48%

     ▁▄▃▅▅██▆▄▃▂                                                 
  ▁▂▄████████████▆▅▄▄▃▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  558 ns           Histogram: frequency by time          769 ns <

 Memory estimate: 880 bytes, allocs estimate: 7.

allequal seems to be something users can easily come up with (a good enough implementation) & type it out

@mcabbott
Copy link
Contributor

mcabbott commented Dec 8, 2021

Have certainly written naive versions of both of these quite a few times. It is tidy to have shared verbs so you don't have to invent a name, or figure out someone else's -- like only, as mentioned.

Re timing allunique, at the size shown a different naive version is much faster -- perhaps Base's implementation should switch below some size?

julia> _allunique(x::AbstractVector) = all(!(x[i] in @view x[i+1:end]) for i in eachindex(x));

julia> @btime allunique(x) setup=(x = rand(1:10, 100));
  min 66.872 ns, mean 90.156 ns (4 allocations, 400 bytes)

julia> @btime _allunique(x) setup=(x = rand(1:10, 100));
  min 4.333 ns, mean 9.602 ns (0 allocations)

Edit: that's now #43375.

@JeffBezanson JeffBezanson added the needs news A NEWS entry is required for this change label Dec 16, 2021
@JeffBezanson
Copy link
Member

Triage is in favor. Personally I like simple utilities like this that save you a bit of cognitive load, provided the meaning is really obvious which I think it is in this case.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Dec 16, 2021
base/set.jl Show resolved Hide resolved
@CameronBieganek
Copy link
Contributor Author

CameronBieganek commented Feb 5, 2022

I added a NEWS.md entry and updated the docstrings, including a compat entry for the allequal docstring. So, I think this is ready for a final review and merge.

@CameronBieganek
Copy link
Contributor Author

@oscardssmith Can this be merged before the 1.8 deadline?

@oscardssmith
Copy link
Member

LGTM

@CameronBieganek
Copy link
Contributor Author

If anyone is listening, this should be good to merge before the 1.8 deadline tomorrow. (I don't have merge rights.)

@N5N3 N5N3 removed the needs news A NEWS entry is required for this change label Feb 11, 2022
@oscardssmith oscardssmith merged commit 98e60ff into JuliaLang:master Feb 11, 2022
@CameronBieganek CameronBieganek deleted the clb/allequal branch February 11, 2022 16:08
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request Feb 17, 2022
* Add `allequal`.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
* Add `allequal`.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
* Add `allequal`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add allequal