Skip to content
This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

Flatten and interaction between flatMap and Options #48

Merged
merged 2 commits into from
Jun 27, 2017
Merged

Conversation

julienrf
Copy link
Contributor

@julienrf julienrf commented Mar 22, 2017

This is based on szeiger:wip/constrained-collections2. I only added the last commit to check that flatten would work with Iterable[Option[?]] values. This requires flatten to take an implicit view parameter and to define an implicit conversion from Option[A] to Iterable[A] (as it is currently the case in 2.12).

I also checked that it was possible to use flatMap with a function returning an Option[?] (instead of the expected IterableOnce[?]), which is a common pattern I think.

@szeiger
Copy link
Contributor

szeiger commented Mar 22, 2017

Good to see that it works, but should it continue to work? Should Options be converted implicitly to collections rather than requiring an explicit conversion?

@julienrf
Copy link
Contributor Author

That’s a good question… I think this is convenient but it’s a bit messy to change the signature of flatten from flatten[B](implicit ev A <:< IterableOnce[B]): C[B] to flatten[B](implicit ev: A => IterableOnce[B]): C[B] just to support Options…

What do others think? @odersky @Ichoran

@szeiger
Copy link
Contributor

szeiger commented Mar 22, 2017

If we want it to work we could also make Option implement IterableOnce. Unlike Iterable and the old TraversableOnce it contains only a single iterator method and none of the other baggage, so this would be feasible in the new design.

@julienrf
Copy link
Contributor Author

Then it raises the following question: is it desirable to have Option depend on the collections? Maybe it’s better the other way around. Or maybe it doesn’t matter because both will eventually end up in Scala core…

@szeiger
Copy link
Contributor

szeiger commented Mar 22, 2017

Option in 2.12 has an iterator method so it already depends on collections.

@julienrf
Copy link
Contributor Author

Oh yes, good point…

@ritschwumm
Copy link

ritschwumm commented Mar 22, 2017

the implicit conversion from Option to Iterable as it is in 2.12 is quite annoying, imho. some methods that could/should exist on Option don't - but do exist on Iterable, just with the wrong return type.

zip, for example. always good for a surprise, especially as scaladoc says it returns an Option:

scala> Some(1) zip Some(2)
res0: Iterable[(Int, Int)] = List((1,2))

@Ichoran
Copy link
Contributor

Ichoran commented Mar 22, 2017

I would worry that Option would be slowed down by anything extra in its inheritance hierarchy. It should be tested. But the very very common idiom of flatMaping a collection with something that produces an Option result as a way to filter would break if Option were not either convertible to or actually was a collection. Given that it's common, it argues that as long as the inheritance thing doesn't cause a problem, it would be best to have Option as a collection member so you don't create all the code to convert it and hope that the optimizer and JVM can rip it out again.

@odersky
Copy link
Contributor

odersky commented Mar 22, 2017

I also would be very reluctant to add superclasses to Option. @sjrd has a design to make options unboxed, and we should investigate whether we can adopt this. That would be a significant performance win. But with added superclasses it's not clear we can do it so easily.

@sjrd
Copy link
Member

sjrd commented Mar 22, 2017

But with added superclasses it's not clear we can do it so easily.

Actually it's pretty clear we can't. At least my design would not accommodate that at all.

@Ichoran
Copy link
Contributor

Ichoran commented Mar 22, 2017

Unboxed options (with null as the None type, and possibly a sentinel for Some(null)) would be really nice from a performance and ease-of-use standpoint. In that case, maybe we could overload the flatMaps to take an Option directly so the IterableOnce conversion is unnecessary. That would enable use of flatMap like collect except it would run faster and be more flexible.

@sjrd
Copy link
Member

sjrd commented Mar 22, 2017

FTR my proof of concept is at https://github.com/sjrd/scala-unboxed-option
Note that it does not use null as None, as most people think is a good idea. It isn't, it keeps us on a path that leads nowhere. My design keeps null as Some(null), just like any other a: A is its own Some(a). None has a a sentinel object, as well as every nesting of None in Somes, e.g., Some(Some(None)).

@Ichoran
Copy link
Contributor

Ichoran commented Mar 22, 2017

@sjrd - It's more robust to use sentinels as then you never rely upon the type seen by the compiler. Isn't it slower, though, in the case where types are all known (i.e. the typical case)? Then you don't need to consider the Some(None) case and the branching logic is simpler.

I guess fragility under erasure is sufficiently unacceptable so that a performance hit, even if there, is okay. (Note: I implemented a scheme very like yours with right-unboxed either, but didn't feel like working on macros for long enough to get it all the way across the zero-cost finish-line.)

@sjrd
Copy link
Member

sjrd commented Mar 22, 2017

Isn't it slower, though, in the case where types are all known (i.e. the typical case)?

Even if the type is known to be Option[String], you still have to deal with both None and null. One of them has to be a sentinel. There's simply no way around it.

Then you don't need to consider the Some(None) case and the branching logic is simpler.

My design lends itself to that kind of optimizations by a general-purpose(-ish) optimizer, after inlining. For example, consider Some.apply, whose source is

    def apply[A](value: A): USome[A] = value match {
      case value @ UNone      => value.wrap.asInstanceOf[USome[A]]
      case value: WrappedNone => value.wrap.asInstanceOf[USome[A]]
      case _                  => value.asInstanceOf[USome[A]]
    }

Once inlined, the optimizer can tell that value is of type, say String. It is then straightforward(-ish) for it to prove that neither value eq UNone nor value.isInstanceOf[WrappedNone] is possible, and hence everything collapses to value.asInstanceOf[USome[A]], which is in fact value.asInstanceOf[Object] due to erasure, i.e., value.

forceGet is less lucky. Its source is:

    private def forceGet: A = (self: Any) match {
      case none: WrappedNone =>
        none.unwrap.asInstanceOf[A]
      case _ =>
        self.asInstanceOf[A]
    }

But here, since self has an erased type of Object, it is always potentially the case that self is an instance of WrappedNone, and therefore that test cannot be eliminated.

@szeiger
Copy link
Contributor

szeiger commented Mar 23, 2017

You'd have to do implicit class UOptionOps[A](...) extends Any with IterableOnce[A] and incur the boxing overhead when you use a UOption in this way. Unless you use the same Option value multiple times as an IterableOnce this would be the same overhead that we currently have with the boxed Option plus the implicit conversion.

Or you define a separate implicit class that represents a true IterableOnce (which cannot be traversed multiple times) for a UOption. It would be its own Iterator, so you can drop one layer of boxing (UOption -> IterableOnce with Iterator instead of UOption -> IterableOnce -> Iterator). This should be similar in efficiency to an Option extends IterableOnce that allows multiple traversals.

@julienrf
Copy link
Contributor Author

Hi, I rebased this PR and applied the trick suggested by @szeiger: the implicit conversion to IterableOnce directly returns an Iterator, in order to avoid one intermediate allocation.

@julienrf
Copy link
Contributor Author

For some reasons the tests fail under dotty. Any idea of why?

@julienrf julienrf added this to the 0.2.0 milestone Jun 27, 2017
@szeiger
Copy link
Contributor

szeiger commented Jun 27, 2017

The test depends on the iteration order of HashSet which is undefined

@julienrf
Copy link
Contributor Author

Indeed, I just fixed that.

@julienrf julienrf merged commit 4da00e3 into master Jun 27, 2017
@julienrf julienrf deleted the flatten branch June 27, 2017 18:44
@szeiger
Copy link
Contributor

szeiger commented Jun 28, 2017

Why is it different between Scala and Dotty though? It's the same HashSet implementation and hash codes for Ints are identities in both.

@julienrf julienrf mentioned this pull request Jul 20, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants