Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding NonEmptyVector #1137

Merged
merged 11 commits into from
Jul 13, 2016
230 changes: 230 additions & 0 deletions core/src/main/scala/cats/data/NonEmptyVector.scala
Original file line number Diff line number Diff line change
@@ -0,0 +1,230 @@
package cats
package data

import scala.annotation.tailrec
import scala.collection.immutable.VectorBuilder
import scala.util.Try
import cats.instances.vector._

/**
* A data type which represents a non empty Vector.
*/
final case class NonEmptyVector[A] private (toVector: Vector[A]) {

/** Gets the element at the index, if it exists */
def get(i: Int): Option[A] =
Try(getUnsafe(i)).toOption
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that toVector.lift(i) would probably be a more efficient implementation (though this is performance speculation without a benchmark).


/** Gets the element at the index, or throws an exception if none exists */
def getUnsafe(i: Int): A = toVector(i)

/** Updates the element at the index, if it exists */
def updated(i: Int, a: A): Option[NonEmptyVector[A]] =
Try(updatedUnsafe(i, a)).toOption
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More performance speculation: if (toVector.isDefinedAt(i)) Some(toVector.updated(i, a)) else None.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops I guess that would need to wrap it in a NonEmptyVector too.


/** Updates the element at the index, or throws an exeption if none exists */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small typo: exeption -> exception.

def updatedUnsafe(i: Int, a: A):
NonEmptyVector[A] = NonEmptyVector(toVector.updated(i, a))

def head: A = toVector.head

def tail: Vector[A] = toVector.tail

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it's worth supporting apply and updated for this structure, since indexed access is one of the best reasons to want to be using a vector. I'd probably use the standard signatures but I could be persuaded to have them return Option[A] and Option[Unit] respectively.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or both? :-) But agree on wanting indexed access.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@non you mean Option[NonEmptyVector[A]] for updated right? :)

I'd be in favor of updated that returned an Option and updatedUnsafe that had the behavior of the the std lib (throwing an exception if the index doesn't exists). Similarly for accessing an index, but the name may be a little awkward there if we use apply. Maybe atIndex/atIndexUnsafe or at or elem or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Sorry, in my defense it was late! ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not have:

def get(i: Int): Option[A]
// This is unsafe, it may throw on out-of-bounds, see `get` for a safe variant.
def apply(i: Int): A

for updated I would like:

def updatedOption(i: Int, a: A): Option[NonEmptyVector[A]]
// this will throw is i is out of bounds.
def updated(i: Int, a: A): NonEmptyVector[A]

I feel like adding Option as a suffix signals (like headOption, reduceOption is pretty standard), and I think the unsafe methods should keep the same name (lest we train people that this method is safe in some contexts but not others. Reading the code you have to know exactly if it is NonEmptyVector or Vector to see if apply or updated is safe, I'd rather always make them unsafe so we are consistent).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively, since toVector is free, we could have only get and require ne.toVector(12) for unsafe apply.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tough. I think that updated and apply have names that seem too benign for throwing exceptions in common cases. It's nice to be consistent with the standard library, but I also have seen lots of scala devs mess up with head and reduce when I don't think they would have if these had been the safe options and separate headUnsafe and reduceUnsafe options had existed. I'm hesitant to pull in gotchas just to have consistent naming with the standard library. Hmmm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see your point on safety, but there is also a cost (especially to reviewers of future code) of using names that are unsafe on Vector to be safe here (if they change the type signature).

My bias is using cats in a large codebase with many novices that mostly work with the standard library. I'd rather avoid confusion in naming (not at the cost of safety though). So, I would say updateOption and maybe updateUnsafe would be the best solution.

Maybe I would say: when we can safely follow standard naming conventions, we should. Otherwise, if we want to add methods that can throw, use Unsafe as a suffix (or prefix).

head has the same type signature, so I am not worried about it. Changing updated to be optional seems like it will create reading difficulties, so I prefer a new name for a safe variant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnynek everything that you said sounds good to me. So it sounds like it'll be updateOption and updateUnsafe for the update methods. And maybe get and getUnsafe for index access? Or it could be getOption to be symmetric if you'd prefer. Either one is fine with me.

/**
* remove elements not matching the predicate
*/
def filter(f: A => Boolean): Vector[A] = toVector.filter(f)

/**
* Append another NonEmptyVector to this
*/
def concat(other: NonEmptyVector[A]): NonEmptyVector[A] = NonEmptyVector(toVector ++ other.toVector)

/**
* Alias for concat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to make this Alias for [[concat]] so that the scaladoc has a link.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see that we are using overloading, so I don't know how that works with scaladoc. I'm generally skeptical of overloading, though it's probably fine in this instance. I think it would prevent us from being able to turn NonEmptyVector into a value class though, wouldn't it?

*/
def ++(other: NonEmptyVector[A]): NonEmptyVector[A] = concat(other)

/**
* Append another Vector to this
*/
def concat(other: Vector[A]): NonEmptyVector[A] = NonEmptyVector(toVector ++ other)

/**
* Alias for concat
*/
def ++(other: Vector[A]): NonEmptyVector[A] = concat(other)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again some minor nitpicking about formatting, but both the ++ methods ScalaDoc's can be changed to

/**
 * Alias for concat
 */


/**
* find the first element matching the predicate, if one exists
*/
def find(f: A => Boolean): Option[A] = toVector.find(f)

/**
* Check whether at least one element satisfies the predicate.
*/
def exists(f: A => Boolean): Boolean = toVector.exists(f)

/**
* Check whether all elements satisfy the predicate.
*/
def forall(f: A => Boolean): Boolean = toVector.forall(f)

/**
* Left-associative fold using f.
*/
def foldLeft[B](b: B)(f: (B, A) => B): B =
toVector.foldLeft(b)(f)

/**
* Right-associative fold using f.
*/
def foldRight[B](lb: Eval[B])(f: (A, Eval[B]) => Eval[B]): Eval[B] =
Foldable[Vector].foldRight(toVector, lb)(f)

/**
* Applies f to all the elements
*/
def map[B](f: A => B): NonEmptyVector[B] =
NonEmptyVector(toVector.map(f))

/**
* Applies f to all elements and combines the result
*/
def flatMap[B](f: A => NonEmptyVector[B]): NonEmptyVector[B] =
NonEmptyVector(toVector.flatMap(a => f(a).toVector))

/**
* Left-associative reduce using f.
*/
def reduceLeft(f: (A, A) => A): A =
tail.foldLeft(head)(f)

/**
* Left-associative reduce using the Semigroup of A
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove this comment: "Left-associative" since Semigroup is associative, it does not matter which order you do the combine, the result is the same.

*/
def reduce(implicit S: Semigroup[A]): A =
S.combineAllOption(tail).foldLeft(head)(S.combine)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personally I would do: S.combineAllOption(toVector).get since we know the get can't throw since toVector.nonEmpty.


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about reduce? Since, we know that this is non-empty the reduce(fn: (A, A) => A): A is a safe signature. That or we could have: def combineAll(implicit: sg: Semigroup[T]): T (or we could have both).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reduce and several other methods are available via the Reducible instance, but I agree that it would probably be nice to have some of these helpers on the class itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they should be called reduceLeft(f: (A, A) => A): A and reduce(implicit S: Semigroup[A]): A to be consistent with cats.Reducible. AFAK, the standard library has similar names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could add def zip[B](that: NonEmptyVector[B]): NonEmpty[(A, B)] which could be nice.

/**
* Typesafe equality operator.
*
* This method is similar to == except that it only allows two
* NonEmptyVector[A] values to be compared to each other, and uses
* equality provided by Eq[_] instances, rather than using the
* universal equality provided by .equals.
*/
def ===(that: NonEmptyVector[A])(implicit A: Eq[A]): Boolean = Eq[Vector[A]].eqv(toVector, that.toVector)

/**
* Typesafe stringification method.
*
* This method is similar to .toString except that it stringifies
* values according to Show[_] instances, rather than using the
* universal .toString method.
*/
def show(implicit A: Show[A]): String =
s"NonEmptyVector(${Show[Vector[A]].show(toVector)})"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will give us something like NonEmptyVector(Vector(1, 2, 3)) for NonEmptyVector(1, 2, 3).
Maybe we could do "NonEmpty" + Show[Vector[A]].show(toVector) which should give us NonEmptyVector(1,2,3) ?

}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this show correct?
Or should I have s"NonEmptyVector(${Show[Vector[A]].show(vector)})" instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better than the cons like show.


private[data] sealed trait NonEmptyVectorInstances {

implicit val catsDataInstancesForNonEmptyVector: SemigroupK[NonEmptyVector] with Reducible[NonEmptyVector]
with Monad[NonEmptyVector] with Comonad[NonEmptyVector] with Traverse[NonEmptyVector] with MonadRec[NonEmptyVector] =
new NonEmptyReducible[NonEmptyVector, Vector] with SemigroupK[NonEmptyVector] with Monad[NonEmptyVector]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be a MonadRec using the same style as:
https://github.com/typelevel/cats/blob/master/core/src/main/scala/cats/instances/list.scala#L31

and also I am pretty sure MonadCombine

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do MonadCombine because it has an empty method that you can't write for non-empty structures.

👍 for the MonadRec instance though.

with Comonad[NonEmptyVector] with Traverse[NonEmptyVector] with MonadRec[NonEmptyVector] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: since MonadRec extends Monad, you can replace the with Monad[NonEmptyVector] in the two places it shows up here.


def combineK[A](a: NonEmptyVector[A], b: NonEmptyVector[A]): NonEmptyVector[A] =
a concat b

override def split[A](fa: NonEmptyVector[A]): (A, Vector[A]) = (fa.head, fa.tail)

override def size[A](fa: NonEmptyVector[A]): Long = 1 + fa.tail.size.toLong

override def reduceLeft[A](fa: NonEmptyVector[A])(f: (A, A) => A): A =
fa.reduceLeft(f)

override def reduce[A](fa: NonEmptyVector[A])(implicit A: Semigroup[A]): A =
fa.reduce

override def map[A, B](fa: NonEmptyVector[A])(f: A => B): NonEmptyVector[B] =
fa map f

def pure[A](x: A): NonEmptyVector[A] =
NonEmptyVector(x, Vector.empty)

def flatMap[A, B](fa: NonEmptyVector[A])(f: A => NonEmptyVector[B]): NonEmptyVector[B] =
fa flatMap f

def coflatMap[A, B](fa: NonEmptyVector[A])(f: NonEmptyVector[A] => B): NonEmptyVector[B] = {
@tailrec def consume(as: Vector[A], buf: VectorBuilder[B]): Vector[B] =
as match {
case a +: as => consume(as, buf += f(NonEmptyVector(a, as)))
case _ => buf.result()
}
NonEmptyVector(f(fa), consume(fa.tail, new VectorBuilder[B]))
}

def extract[A](fa: NonEmptyVector[A]): A = fa.head

def traverse[G[_], A, B](fa: NonEmptyVector[A])(f: (A) => G[B])(implicit G: Applicative[G]): G[NonEmptyVector[B]] =
G.map2Eval(f(fa.head), Always(Traverse[Vector].traverse(fa.tail)(f)))(NonEmptyVector(_, _)).value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly nitpicking but the 2 spaces for indentation seem to be missing.



override def foldLeft[A, B](fa: NonEmptyVector[A], b: B)(f: (B, A) => B): B =
fa.foldLeft(b)(f)

override def foldRight[A, B](fa: NonEmptyVector[A], lb: Eval[B])(f: (A, Eval[B]) => Eval[B]): Eval[B] =
fa.foldRight(lb)(f)

def tailRecM[A, B](a: A)(f: A => NonEmptyVector[A Xor B]): NonEmptyVector[B] = {
val buf = new VectorBuilder[B]
@tailrec def go(v: NonEmptyVector[A Xor B]): Unit = v.head match {
case Xor.Right(b) =>
buf += b
NonEmptyVector.fromVector(v.tail) match {
case Some(t) => go(t)
case None => ()
}
case Xor.Left(a) => go(f(a).concat(v.tail))
}
go(f(a))
NonEmptyVector.fromVectorUnsafe(buf.result())
}
}

implicit def catsDataEqForNonEmptyVector[A](implicit A: Eq[A]): Eq[NonEmptyVector[A]] =
new Eq[NonEmptyVector[A]]{
def eqv(x: NonEmptyVector[A], y: NonEmptyVector[A]): Boolean = x === y
}

implicit def catsDataShowForNonEmptyVector[A](implicit A: Show[A]): Show[NonEmptyVector[A]] =
Show.show[NonEmptyVector[A]](_.show)

implicit def catsDataSemigroupForNonEmptyVector[A]: Semigroup[NonEmptyVector[A]] =
catsDataInstancesForNonEmptyVector.algebra

}

object NonEmptyVector extends NonEmptyVectorInstances {

def apply[A](head: A, tail: Vector[A]): NonEmptyVector[A] =
NonEmptyVector(head +: tail)

def apply[A](head: A, tail: A*): NonEmptyVector[A] = {
val buf = Vector.newBuilder[A]
buf += head
tail.foreach(buf += _)
NonEmptyVector(buf.result)
}

def fromVector[A](vector: Vector[A]): Option[NonEmptyVector[A]] =
vector.headOption.map(apply(_, vector.tail))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More performance speculation: if (vector.isEmpty) None else Some(new NonEmptyVector(vector)).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I think we should do the horrible, but fast stuff internally when we can prove it is correct (and then test it to be sure).


def fromVectorUnsafe[A](vector: Vector[A]): NonEmptyVector[A] =
fromVector(vector) match {
case Some(v) => v
case None =>
throw new IllegalArgumentException("Cannot create NonEmptyVector from empty vector")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably just do an isEmpty check here to avoid the extra Option allocation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

}
6 changes: 0 additions & 6 deletions core/src/main/scala/cats/data/package.scala
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@ package cats

package object data {
type NonEmptyList[A] = OneAnd[List, A]
type NonEmptyVector[A] = OneAnd[Vector, A]
type NonEmptyStream[A] = OneAnd[Stream, A]
type ValidatedNel[E, A] = Validated[NonEmptyList[E], A]

Expand All @@ -11,11 +10,6 @@ package object data {
def NonEmptyList[A](head: A, tail: A*): NonEmptyList[A] =
OneAnd[List, A](head, tail.toList)

def NonEmptyVector[A](head: A, tail: Vector[A] = Vector.empty): NonEmptyVector[A] =
OneAnd(head, tail)
def NonEmptyVector[A](head: A, tail: A*): NonEmptyVector[A] =
OneAnd(head, tail.toVector)

def NonEmptyStream[A](head: A, tail: Stream[A] = Stream.empty): NonEmptyStream[A] =
OneAnd(head, tail)
def NonEmptyStream[A](head: A, tail: A*): NonEmptyStream[A] =
Expand Down
6 changes: 3 additions & 3 deletions docs/src/main/tut/oneand.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,10 +21,10 @@ type NonEmptyList[A] = OneAnd[List, A]

which is the actual implementation of non-empty lists in cats. By
having the higher kinded type parameter `F[_]`, `OneAnd` is also able
to represent other "non-empty" data structures, e.g.
to represent other "non-empty" data structures e.g.
Copy link
Contributor

@kailuowang kailuowang Jun 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly about this but shall we replace this example with NonEmptyStream, instead of just removing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that after this PR (or possibly as a part of this PR) we can probably rip out OneAnd as discussed here. Someone can follow up with a NonEmptyStream if it's something they really want -- I suspect it's not used as commonly as NonEmptyList and NonEmptyVector.


```tut:silent
import cats.data.OneAnd

type NonEmptyVector[A] = OneAnd[Vector, A]
```
type NonEmptyStream[A] = OneAnd[Stream, A]
```
3 changes: 3 additions & 0 deletions laws/src/main/scala/cats/laws/discipline/Arbitrary.scala
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,9 @@ object arbitrary extends ArbitraryInstances0 {
implicit def catsLawsArbitraryForOneAnd[F[_], A](implicit A: Arbitrary[A], F: Arbitrary[F[A]]): Arbitrary[OneAnd[F, A]] =
Arbitrary(F.arbitrary.flatMap(fa => A.arbitrary.map(a => OneAnd(a, fa))))

implicit def catsLawsArbitraryForNonEmptyVector[A](implicit A: Arbitrary[A]): Arbitrary[NonEmptyVector[A]] =
Arbitrary(implicitly[Arbitrary[Vector[A]]].arbitrary.flatMap(fa => A.arbitrary.map(a => NonEmptyVector(a, fa))))

implicit def catsLawsArbitraryForXor[A, B](implicit A: Arbitrary[A], B: Arbitrary[B]): Arbitrary[A Xor B] =
Arbitrary(Gen.oneOf(A.arbitrary.map(Xor.left), B.arbitrary.map(Xor.right)))

Expand Down
Loading