Sets.transform(Set<F>, Converter<F, T>) or the equivalent - WontFix: Use ImmutableSet.copyOf(Collections2.transform(...)) #219
Comments
Original comment posted by jim.andreou on 2009-08-13 at 11:59 AM How would you implement Set<T>.contains(Object) in the resulting set? (Or if that set |
Original comment posted by butkovic on 2009-08-13 at 12:11 PM Sorry, I didn't notice that it's implemented lazily for lists; I only checked the sources just now. |
Original comment posted by jared.l.levy on 2009-08-13 at 01:57 PM Sets.transform isn't feasible. The problem is that the function could map two distinct Instead, you can pass a set to Collections2.transform(). If you need a set at output, Status: |
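The concern about a function mapping distinct inputs to the same output can be illustrated with a small sketch. It uses only `java.util`; the `transformedView` helper is a hypothetical stand-in for a lazy transformed view and is not Guava's implementation of `Collections2.transform`.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

final class LossyTransformDemo {
    // Hypothetical helper: a lazy, read-only view that applies fn on iteration.
    static <F, T> Collection<T> transformedView(
            Collection<F> from, Function<? super F, ? extends T> fn) {
        return new AbstractCollection<T>() {
            @Override public Iterator<T> iterator() {
                Iterator<F> it = from.iterator();
                return new Iterator<T>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public T next() { return fn.apply(it.next()); }
                };
            }
            @Override public int size() { return from.size(); }
        };
    }

    public static void main(String[] args) {
        Set<String> words = new LinkedHashSet<>(Arrays.asList("apple", "avocado", "banana"));
        // First letter is not injective: "apple" and "avocado" collide.
        Collection<Character> initials = transformedView(words, w -> w.charAt(0));
        System.out.println(initials);                       // prints [a, a, b]
        // Copying into a Set removes the duplicates, at the cost of losing the live view.
        System.out.println(new LinkedHashSet<>(initials));  // prints [a, b]
    }
}
```

The view is a legal `Collection` because duplicates are allowed there, but it could not honestly be typed as a `Set`. Copying into a set is the plain-Java analogue of the `ImmutableSet.copyOf(Collections2.transform(...))` workaround named in the issue title.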
Original comment posted by jim.andreou on 2009-08-13 at 02:13 PM To complement Jared's response, he describes a situation where there wouldn't exist an But we should note that if there was a "Bijection" interface, which defined both |
Original comment posted by joe.kearney%morganst...@gtempaccount.com on 2009-08-13 at 02:29 PM I have a use case for this, where I'm trying to implement a Map. I want the keySet to The simplest solution is to extend AbstractSet and implement iterator() using What would you recommend to implement this? Just a fuller implementation of Set<F> transform(Set<F> fromSet, Function<? super F,? extends T> function, Function<? I realise that this would be a rather complex addition at this stage, but I'd be Thanks, |
Original comment posted by cpovirk+external@google.com on 2009-08-13 at 03:12 PM Joe, what features do you need that AbstractMap.keySet() doesn't provide? A fast |
Original comment posted by joe.kearney%morganst...@gtempaccount.com on 2009-08-13 at 03:38 PM Yes, sort of. I'm not using AbstractMap, but in any case that implementation of We're not using an AbstractMap for slightly obscure memory reasons, we're optimising I don't think it's particularly complex to just delegate everything back to the |
Original comment posted by kevinb9n on 2009-11-04 at 11:04 PM Reopening, because in the future we likely will have an invertible Function (we call Status: |
Original comment posted by kevinb@google.com on 2010-02-12 at 05:06 PM Upon further discussion, I'm coming to the same conclusion that I had a few years ago |
Original comment posted by kevinb@google.com on 2010-04-23 at 08:27 PM note: depends on bug 177. |
Original comment posted by kevinb@google.com on 2010-07-30 at 03:53 AM (No comment entered for this change.) Labels: - |
Original comment posted by fry@google.com on 2011-01-26 at 10:11 PM (No comment entered for this change.) Labels: |
Original comment posted by maliniakh on 2011-02-02 at 01:04 PM @Kevinb: Developers may be aware that some transformations are truly lossless, like converting db entities into their ids. If a transformation happens to be lossy, a runtime exception could simply be thrown within the Sets.transform() method. Sounds like a reasonable solution to me. |
Original comment posted by kevinb@google.com on 2011-02-02 at 03:03 PM We wouldn't be able to throw such an exception; we wouldn't know there was anything wrong. It would just produce a buggy Set. Again, we're not rejecting the idea and it will probably happen eventually. |
Original comment posted by finnw1 on 2011-02-02 at 04:42 PM I think that's an example of where lossiness does not matter. If a DB entity has been deleted, yes regenerating it from its id will fail, but then the object must have been invalidated already (even if the fact was not known.) |
Original comment posted by cpovirk@google.com on 2011-02-17 at 04:30 PM Now that Converter has some traction in Google-internal code, we decided to revisit this. I did a survey of callers of {Iterables,Lists,Collections2}.transform(..., someConverter.asFunction()), which is the closest analogue to the requested Sets.transform, to see whether they might benefit from the new method. The result is that Sets.transform is unnecessary or bug-promoting for all of them. Here's a list of criteria that a caller would need to meet for the method to be helpful:
I remain convinced that Sets.transform would be very cool -- I had a CL to implement it myself 3 years ago, one of my first Java-libraries CLs -- but the more research I do, the more convinced I become that it would cause more problems than it solves. I'm re-closing the bug. The Converter class itself will still likely see the light of day. Status: |
Original comment posted by andreou@google.com on 2011-03-18 at 10:21 PM The comments below were written without seeing Chris' comments above. I agree the contract required for the Sets#transform's bijection is tricky, but not as tricky as described. I think the confusion comes from defining a bijection between all of A and all of B, while we only require it between a Set<A> and a Set<B>, as I explain below, which is nowhere near as hard to implement correctly as the former one. (For me, the tricky part is this: Do we define a complete A<-->B bijection type, and reduce its requirements in Sets#transform, or build a constrained (Set<A> <--> Set<B>, i.e. something like an "isDefinedAt(A)" method) bijection type to begin with, with exactly the right spec for Sets#transform? I would definitely choose the latter, which is as useful and general - it can mimic the former just by implementing "isDefinedAt(A)" to return true, if the implementor can truly uphold the contract.) Just a comment from my recent experience implementing something like a Sets#transform: the tricky part is to specify the bijection contract. I mean the exact semantics of the methods, not just the signatures. For example: This, without any other specification, implies a bijection A <--> B, i.e. from every A to every B (oh, yes, the (infinite) sets A and B should have the same cardinality too). This specification makes it the easiest to implement Sets#transform, but also the hardest to implement InvertibleFunction correctly. At least from the perspective of Sets#transform, what is really needed is something far less, and far easier to implement: a bijection not from all A to all B, but one that needs to be defined only for those A's in a specific Set<A>. The image of that would be a Set<B>, and similarly, the inverse function would be required to map only the B's in that Set<B> back to the A's of Set<A>. 
InvertibleFunction as given above doesn't make it easy to restrict ourselves to a smaller domain than the type parameter itself (Set<A> is just a subset of A). Aside: Scala functions have an "isDefinedAt(x: A)" method, which is precisely what I'm talking about. They don't define functions over all of A, but only over a subset of it (a predicate on elements of A, such as isDefinedAt, defines a set of A just as Set<A> does, modulo not providing an iterator). But we have no isDefinedAt method, and we can't define one, and even if we could, interfaces with two methods are going to become much more costly than those with a single method when closures arrive, so I think we are pretty much stuck with the clumsier approach. The good news is that the clumsier approach is workable, but the specification of a potential Sets#transform would be forced to explain this subtlety: a complete bijection is not required, only one defined for the given Set<A> and its image Set<B>; for other inputs it would be free to return whatever. |
Original comment posted by andreou@google.com on 2011-03-18 at 10:38 PM "Maybe we're OK if "01" and "1" map to the same thing" -- indeed, this is where nastiness begins. That the user can create a broken implementation in which all of these hold:
Which is subtle, but as the fundamental CS tenet goes, 'garbage in, garbage out'... |
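To make the "01" versus "1" problem concrete, here is a minimal sketch of a set view whose `contains` delegates to an inverse function, as a bijection-based `Sets.transform` would. Everything here is an assumption for illustration: the `transformedSet` helper, its `Class<T>` parameter (used to work around erasure), and the parse/format pair are hypothetical and are not Guava code.

```java
import java.util.AbstractSet;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

final class BrokenBijectionDemo {
    // Hypothetical helper: a live Set view that answers contains() via the inverse function.
    static <F, T> Set<T> transformedSet(
            Set<F> from, Function<F, T> forward, Function<T, F> backward, Class<T> typeOfT) {
        return new AbstractSet<T>() {
            @Override public Iterator<T> iterator() {
                Iterator<F> it = from.iterator();
                return new Iterator<T>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public T next() { return forward.apply(it.next()); }
                };
            }
            @Override public int size() { return from.size(); }
            @Override public boolean contains(Object o) {
                // Map the candidate back to the source type and ask the backing set.
                return typeOfT.isInstance(o) && from.contains(backward.apply(typeOfT.cast(o)));
            }
        };
    }

    public static void main(String[] args) {
        Set<String> raw = new LinkedHashSet<>(Collections.singleton("01"));
        // parseInt and toString are not inverses on "01": parse("01") = 1, but format(1) = "1".
        Set<Integer> parsed =
                transformedSet(raw, s -> Integer.parseInt(s), i -> i.toString(), Integer.class);
        System.out.println(parsed);              // prints [1]
        System.out.println(parsed.contains(1));  // prints false
    }
}
```

No exception is thrown anywhere: the view iterates over 1 yet denies containing it, and as a consequence it does not compare equal to a one-element set holding 1. This is the sense in which a lossy converter silently produces a buggy `Set`.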
Original comment posted by tre...@scurrilous.com on 2012-02-27 at 08:46 PM The discussion on this issue is long and complex, with lots of good arguments for and against using Sets.transform in specific scenarios. However, the arguments around the necessity of a bijection seem to be a primary reason for not including it in the library at all. This confuses me a bit because it seems like all that is really needed is an injection (aka a one-to-one (but not onto) function). Analogous to how Set refines Collection purely in terms of semantics (i.e. Javadoc only), couldn't a new Injection interface refine Function semantically as f(a) = f(b) implies a = b? This provides a well-defined Set with a linear-time contains() (as provided by AbstractSet), which is good enough in many common cases. I've done this myself, so I could easily provide code if desired. Adding a Bijection (which refines Injection) with an inverse function would allow the transformed set to delegate contains() to the base set, preserving its likely better than linear runtime. This could be provided as a Sets.transform overload. I think the main trick here is that type erasure doesn't allow contains(Object) to safely call inverse(F) (since you can't use instanceof with a type parameter). Of course, that test could be included in the Bijection interface, e.g. rangeContains(Object) or rangeCast(Object). Is this line of thought reasonable, or am I missing something? |
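The injection-only proposal can be sketched roughly as follows. This is not the commenter's code: the `Injection` marker interface is modelled on the one-liner quoted later in the thread, but it extends `java.util.function.Function` here rather than Guava's `Function`, and `transformedSet` is a hypothetical helper.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Function;

/** Marker interface (assumed): implementations promise f(a).equals(f(b)) implies a.equals(b). */
interface Injection<F, T> extends Function<F, T> {}

final class InjectionViewDemo {
    // Hypothetical helper: a live, read-only Set view over 'from'.
    static <F, T> Set<T> transformedSet(Set<F> from, Injection<? super F, ? extends T> fn) {
        return new AbstractSet<T>() {
            @Override public Iterator<T> iterator() {
                Iterator<F> it = from.iterator();
                return new Iterator<T>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public T next() { return fn.apply(it.next()); }
                };
            }
            @Override public int size() { return from.size(); }
            // contains() is inherited from AbstractCollection: a linear scan of the
            // transformed elements, so no inverse function is needed.
        };
    }
}
```

Because the view computes elements on demand, it stays in sync with the backing set, and `equals` and `hashCode` come from `AbstractSet`. Nothing enforces the injectivity promise, so a lossy function would still yield a view that violates the `Set` contract.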
Original comment posted by andreou@google.com on 2012-02-27 at 11:01 PM Note that Sets#transform is already implemented with a bijection, feel free to peek in the code. (Used by something called "WellBehavedMap", if I recall correctly) But yes, this could have been implemented with just an injection, and a linear time Set<B>#contains. But that would be surprisingly slow, right? You do expect Set#contains to be relatively fast. |
Original comment posted by wasserman.louis on 2012-02-27 at 11:11 PM I'm really leery of introducing a whole new Injection interface just for the sake of this one method. I'm convinced by Chris' argument earlier that Collections2.transform gives you something with the same performance guarantees, and if you really want a Set -- you probably really want the Set because you want a fast contains, so just copy it into an ImmutableSet. |
Original comment posted by andreou@google.com on 2012-02-27 at 11:30 PM Unless, of course, you have to keep the transformed set in sync with the original, and you plan to do enough contains() checks to be worth the cost of constructing an ImmutableSet. (Just framing this a bit better) |
Original comment posted by tre...@scurrilous.com on 2012-02-28 at 12:22 AM @andreou: Heh, thanks, I never thought to look for an internal implementation. Not to you personally, but reflectively for all: it's a little disingenuous to argue against the need for/validity of something while using it privately, no? ;-) @louis: No, I promise, I really don't want fast contains(). :-) I think there are many common usages that don't ever need to call contains() and primarily use size() and iterator(). However, Collection isn't always an option because of the need to interoperate with APIs that use Set because of its semantics (i.e. no duplicates), without expectations of the performance of contains(). Consider java.util.concurrent.CopyOnWriteArraySet as an example. Copying into ImmutableSet has worse performance in such cases, because each construction redundantly verifies the uniqueness constraint. I really find the pushback surprising here, given that: |
Original comment posted by andreou@google.com on 2012-02-28 at 12:31 AM Trevor, it would be funny if you directed that to me personally; I was arguing in favor of this. :) (#17 sums up the story of writing this) |
Original comment posted by cpovirk@google.com on 2012-02-28 at 03:48 AM Another way to convert your Collections2.transform Collection to a Set: class ForwardingCollectionSet extends ForwardingCollection implements Set { |
Original comment posted by wasserman.louis on 2012-02-28 at 04:23 AM ...except you need to override hashCode and equals, but other than that, Chris is correct. (That is the easiest way to view a Collection known to be unique as a Set, and it is quite easy.) My main objection is the increased API surface -- creating an interface just for the sake of one method seems a bit much. |
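A plain-Java sketch of the adapter idea, including the `equals` and `hashCode` overrides mentioned above, might look like this. The class name `UniqueCollectionAsSet` is hypothetical, and it extends `java.util.AbstractCollection` instead of Guava's `ForwardingCollection` so the example has no external dependencies. It assumes the caller guarantees that the wrapped collection has no duplicates.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Iterator;
import java.util.Set;

/** Views a collection that is known to contain no duplicates as a Set (assumption: caller ensures uniqueness). */
final class UniqueCollectionAsSet<E> extends AbstractCollection<E> implements Set<E> {
    private final Collection<E> delegate;

    UniqueCollectionAsSet(Collection<E> delegate) {
        this.delegate = delegate;
    }

    @Override public Iterator<E> iterator() { return delegate.iterator(); }
    @Override public int size() { return delegate.size(); }
    @Override public boolean contains(Object o) { return delegate.contains(o); }

    // Set.equals: equal to any other Set of the same size with the same elements.
    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Set)) return false;
        Set<?> other = (Set<?>) o;
        try {
            return other.size() == size() && containsAll(other);
        } catch (ClassCastException | NullPointerException e) {
            return false;
        }
    }

    // Set.hashCode: the sum of the element hash codes (null counts as 0).
    @Override public int hashCode() {
        int h = 0;
        for (E e : this) {
            if (e != null) h += e.hashCode();
        }
        return h;
    }
}
```

Without the two overrides the adapter would inherit identity-based `equals` and `hashCode` from `Object`, so it would not compare equal to a `HashSet` with the same elements, which breaks the symmetry the `Set` contract requires.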
Original comment posted by tre...@scurrilous.com on 2012-02-28 at 09:21 PM That's not a terrible solution, but overriding hashCode and equals (correctly) seems like a subtlety that cries out to be done carefully in a library such as this, rather than being done ad hoc. Louis, I understand that you need to worry about API surface in general, but that argument seems like an oversimplification in this case: public interface Injection<F, T> extends Function<F, T> {} Other than Javadoc, that's it. It hardly seems like a maintenance problem, and I'm happy to submit a patch including the Javadoc. Speaking of subtlety, I'd like to point out a slightly different argument in favor of this feature: Obviously people are reaching for Sets.transform. Finding it missing, they likely end up here. If ImmutableSet.copyOf solves their situation, they're on their way in 20 minutes. Otherwise, they spend a day trying to understand this thread (and possibly want to rehash it) and then kludge together a solution that likely misses a few of the subtleties. Distilling an overview of these issues into a single static method and empty interface (and accompanying Javadoc) seems worth even more than the actual code, even if it encourages them to use ImmutableSet in the end anyway. |
Original comment posted by wasserman.louis on 2012-02-28 at 09:25 PM Let me put it another way: there's already a decent amount of momentum behind Converter, which you describe as Bijection. I'd be absolutely okay with waiting for Converter to get off the ground and then providing Sets.transform after that. I'd like to hear opinions from other Guava team members, though. |
Original comment posted by cpovirk@google.com on 2012-02-28 at 09:36 PM Indeed, my mistake makes the ForwardingCollection solution look bad, and it's not something I'd so much "recommend" as "fall back on if necessary." The main problem with Sets.transform remains that it doesn't work right in the main cases where people seem to want it to work -- parsing numbers, for instance. Dimitris and I disagree over whether it's even a good fit for WellBehavedMap. (That use, too, had a subtle bug, fortunately caught by another reviewer.) Plus, ImmutableSet appeared to be an adequate solution for all the potential users we looked at. There will be users for whom this is not true, but overall I suspect that the method would cause enough bugs to outweigh any improvements it can produce. People are welcome to continue to discuss, but we have had this discussion repeatedly over the years, and I'm trying my best not to sink a few hours into it for the nth time. I'll update the bug title to make the workaround clear. |
Original comment posted by cpovirk@google.com on 2012-02-28 at 09:45 PM (No comment entered for this change.) |
Original comment posted by andreou@google.com on 2012-02-28 at 10:30 PM "Dimitris and I disagree over whether it's even a good fit for WellBehavedMap" This is the first I have heard of this. There are two reasons why WellBehavedMap can't use the suggested "ImmutableSet.copyOf(Collections2.transform(...))"
Sets#transform is really the best fit for WellBehavedMap - if you disagree, please provide data to back up that opinion. |
Original comment posted by cpovirk@google.com on 2012-02-28 at 10:36 PM Right -- I didn't mean that we should use ImmutableSet.copyOf(...) there, merely that I didn't think a general-purpose Sets.transform was necessary -- and, it seems, it actually distracted from the need for a live view. (Or maybe I'm just making excuses for not spotting the problem :)) |
Original issue created by butkovic on 2009-08-13 at 11:52 AM
I'd like to use method:
Sets.transform(Set<F> fromSet, Function<? super F,? extends T> function)
But there is no such method. I would expect it to behave similarly
to Lists.transform().
thanks.
Peter B.