Skip to content

Default implementation of immutable.Set has non-deterministic iteration order #8184

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
scabug opened this issue Jan 27, 2014 · 9 comments
Closed

Comments

@scabug
Copy link

scabug commented Jan 27, 2014

Run the following code in the Scala REPL:

val o1 = new Object()
val o2 = new Object()
val o3 = new Object()
val o4 = new Object()
val o5 = new Object()
val items = Set(o1, o2, o3, o4, o5)
items.foreach(println)

Observe that the order the items are printed is not the order they were inserted into the set constructor. Further observe that the order is not consistent across multiple invocations of the REPL (i.e. different JVM runs).

This non-deterministic ordering can make bugs harder to reproduce and can lead to confusion if it impacts user-visible output or behaviour. It's also easy to miss this when writing tests, since sets of size < 5 are special implementations Set1...Set4 which do have deterministic iteration order.

@scabug
Copy link
Author

scabug commented Jan 27, 2014

Imported From: https://issues.scala-lang.org/browse/SI-8184?orig=1
Reporter: David North (dtnox)
Affected Versions: 2.10.3

@scabug
Copy link
Author

scabug commented Jan 28, 2014

Adam Voss (vossad01) said:
Not meaning to be daft, but isn't the behavior described the expected behavior? I thought the Set trait did not provide any guarantees on preserving the add order. If I am right about that, and I could be wrong, your point of contention would be that since the specialized sets do preserve order people may code against that implementation specific behavior thinking that it is a general guarantee by Sets?

@scabug
Copy link
Author

scabug commented Jan 28, 2014

David North (dtnox) said:
You are quite right that the Set trait does not make any guarantees about order.

My point is that it would be useful if the default implementation did have a deterministic iteration ordering. Non-determinism is bad - it makes bugs harder to reproduce.

My company came to this conclusion for Java a long time ago. Again, though the 'Set' interface does not guarantee any sort of ordering, we found it best to ban the use of HashSet and force people to use LinkedHashSet.

@scabug
Copy link
Author

scabug commented Jan 28, 2014

lvfangmin (alan) said:
It depends, from my perspective most of the cases I don't need the Set to be ordered.
If one need the order guaranteed feature, he/she should explicitly use the ordered collections other than the default one.

@scabug
Copy link
Author

scabug commented Jan 28, 2014

David North (dtnox) said:
Two problems with that:

(1) I can only make that choice in my own code, so I'm left having to force an arbitrary ordering on any Set coming out of a third-party library which I want to have a deterministic order

(2) Scala does not have an immutable Set implementation with guaranteed iteration order.

@scabug
Copy link
Author

scabug commented Feb 2, 2014

Adam Voss (vossad01) said:
Hi David, regarding your two problems:

Recall that immutable.Set is an inheritable trait. Changing the particular type of Set returned by the Companion Object would not fix this issue because methods could still return any subclass of the trait. Your first problem could only be addressed changing the contract of Set to require ordering; that would be both very unconventional and quite the breaking change.

I thought immutable.ListSet gave a consistent iteration order?

@scabug
Copy link
Author

scabug commented Feb 4, 2014

David North (dtnox) said (edited on Feb 4, 2014 2:06:00 PM UTC):
Fair point regarding the trait - but even though changing the type returned by the companion object would not fix the problem universally, it would still be an enormous improvement over the current situation.

immutable.ListSet does give a consistent iteration order, but inspection suggests that its contains() method appears to be O( n ) for an n-sized set, which is undesirable. What I'd really like is an immutable LinkedHashSet or similar, with near-constant contains performance.

@scabug
Copy link
Author

scabug commented Feb 4, 2014

@odersky said:
It would be great to have an efficient

collection.immutable.LinkedSet

which is analogous to the mutable.LinkedHashSet in that it preserves ordering. Contributions would be very much welcome here,

@scabug
Copy link
Author

scabug commented Feb 4, 2014

@adriaanm said:
We can't make the default Set preserve order because it would affect performance. I agree it's useful to have such a Set, though, so I've opened #8234 to track introducing this new collection class.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant