-
Notifications
You must be signed in to change notification settings - Fork 708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a cumulative sum to KeyedList #1085
Conversation
def cumulativeSum[U, V](implicit ev: T <:< (U, V), sg: Semigroup[V], ordU: Ordering[U], ordK: Ordering[K]): SortedGrouped[K, (U, V)] = { | ||
toTypedPipe.group | ||
.sortBy { case (u: U, _) => u } | ||
.scanLeft(Nil: List[(U, V)]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be Option[(U, V)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to recall flattenValues
was giving me grief when I was using an Option
... I'll confirm though
These look like useful functions, but the placement does not seem quite right. The first thing you do is call .toTypedPipe on these things. This is because KeyedListLike does not have any notion of sorting because (sadly) Cascading does not support sorting during a join. So, I think this might be a case for enrichment: package com.twitter.scalding.typed
/**
* Extension for TypedPipe to add a cumulativeSum method
*/
object CumulativeSum {
implicit class Extension[K, V](val pipe: TypedPipe[(K, V)]) extends AnyVal {
def cumulativeSum[......
}
} Then a user accesses it with: import CumulativeSum.Extension. It's tough to draw the line between what the core methods are and what should be on the edge. This feels like the latter to me. |
This rings true to me that this doesn't feel like a core method. I'll move this around. |
0737c66
to
f788c5c
Compare
Moved the files around. Note this breaks building the 2.9 target. Also, |
bump |
This is failing the tests, we still have a 2.9 build that isn't working here. |
@ianoc that's from using an implicit class. Is there a 2.9 way to add methods like this? |
implicit def toMyType(a: BaseType) = new MyType(a) |
implicit def toMyType(a: BaseType) = new MyType(a) On Mon, Nov 3, 2014 at 2:50 PM, snoble notifications@github.com wrote:
|
tests are passing now |
Added a new test for this method |
import com.twitter.scalding.typed.CumulativeSum._ | ||
|
||
class AddRankingWithCumulativeSum(args: Args) extends Job(args) { | ||
TypedTsv[(String, Double)]("input1", ('gender, 'height)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't need field names when using the typed api
you have added scalding-hadoop-test/NOTICE also, looks by mistake... |
} | ||
} | ||
.flattenValues | ||
.toTypedPipe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we remove this, you could return an SortedGrouped, I think, which can still do reduces on, if people want that. I think returning the TypedPipe is only a loss of power. Consider if you want to follow up with a fold or another scan (or a filter followed by a fold or scan).
Make the simple case return type SortedGrouped: https://github.com/twitter/scalding/blob/develop/scalding-core/src/main/scala/com/twitter/scalding/typed/Grouped.scala#L55 and I'll click merge. |
Done. (as an aside I'm still confused why Options don't seem to be working with |
@snoble because Option is not TraversableOnce, there is an implicit to that. You could have done: |
oops, fixed |
Add a cumulative sum to KeyedList
Thank you, sir! |
Not sure where the best place to put this is (or where to add tests) but this is a method for taking cumulative sums that doesn't require all the entries to go through a single scanLeft. It accomplishes this by
This has been compiled against master.