Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shift should accept negative n #1708

Closed
eantonya opened this issue May 17, 2016 · 17 comments
Closed

shift should accept negative n #1708

eantonya opened this issue May 17, 2016 · 17 comments

Comments

@eantonya
Copy link
Contributor

eantonya commented May 17, 2016

I think using "type = lag/lead" is a lot of unnecessary typing, and using a negative shift would be much simpler and more intuitive. It would also allow for more nifty multiple shifts going from lead to lag by typing e.g. shift(vec, -5:5).

The "type" argument can still stay and negative shift would simply reverse the meaning of "type".

n = 0 should also be accepted and simply return the original vector.

@MichaelChirico
Copy link
Member

Could make moving-average windowing very simple

@arunsrinivasan
Copy link
Member

Agreed that this looks much better. At the time, I added type= with more types in mind. Particularly cyclic shift. But that doesn't seem to come up much (on SO). Perhaps type= could be removed altogether.

@eantonya
Copy link
Contributor Author

Having a cyclic shift would be interesting, and that would also benefit from having negative n's. I agree though that it doesn't come up much - I can't think of any real life use cases right now.

@MichaelChirico
Copy link
Member

could you elaborate what you guys have in mind with cyclic shift?

@franknarf1
Copy link
Contributor

@MichaelChirico I think they just mean that the unused values are shifted around to "fill", like

cyc_shift = function(x, n) c(tail(x,-n), head(x,n))

cyc_shift(LETTERS[1:5], -2) # DEABC lag  by 2
cyc_shift(LETTERS[1:5], 2)  # CDEAB lead by 2

@arunsrinivasan
Copy link
Member

arunsrinivasan commented May 19, 2016

Michael, google for circshift function in Matlab. That's more or less the functionality I had in mind.

@MichaelChirico
Copy link
Member

Would still love to have this -- so many times I want to facilitate moving averages like sapply(-2:2, shift, x = V1)

@jangorecki
Copy link
Member

jangorecki commented Jun 30, 2018

Is there agreement on negative n mapping to type?
Default type is "lag", so it make sense to treat negative n as type="lead", while @franknarf1 example of cyc_shift propose the opposite.
So n=-2 would be translated to n=2 and type "lead" or "lag"?
@arunsrinivasan what is your preference?

@DavidArenburg
Copy link
Member

I think that having negative n translated to lead would be very confusing to quite about everyone. This will bring issue/questions on SO and a lot of fuss in general IMO.

@jangorecki
Copy link
Member

jangorecki commented Jul 1, 2018

It will be more consistent to current code, as described above.
Also more conceptually when you compare output of lag and lead you can observe than lag adds extra elements in front of vector, while lead removes them. And that somehow maps to lag -> + and lead -> -.
No matter what we will decide it will be backward compatible anyway.

@DavidArenburg
Copy link
Member

Yeah, I guess you are right if we want to keep this backward compatible. I was thinking it could be a good thing to deprecate type altogether, but that will break too much code.

@jangorecki
Copy link
Member

We can just map it change n argument, so we don't need to support type in C api. Also mention in docs to not use type. Then after few years should be safe to add warning and after another few to error.

@jaapwalhout
Copy link

jaapwalhout commented Jul 1, 2018

@jangorecki stats::lag also accepts negative values to get a lead. So, I would use negative values here as well to get a lead (which makes it consistent with the behavior of stats::lag).

@MichaelChirico
Copy link
Member

MichaelChirico commented Jul 1, 2018

i think because it's ambiguous it makes sense to keep so that code can be more readable. shift(x, 1, type = 'lead') in anyone's code can instantly be understood, I think.

Unless we want to go the NAMESPACE-heavy route & add alias functions lead and lag.

That said, I think most of the use cases involving mixed indices are of the form -n:n for which type is irrelevant.

If we're going to deprecate anything, maybe the name shift in favor of the more universal lead and/or lag (e.g. they're SQL-standard). I think lag(x, -3:5) is also pretty clear.

@jangorecki
Copy link
Member

jangorecki commented Jul 1, 2018

I don't see anything wrong about name shift, good thing is that it is not masking other functions, thus better to keep it instead of lag. Eventually flag for fast-lag, but shift is pretty fine, and flag is already another existing word :)
Lets put it into 1.12.0.

@jangorecki jangorecki added this to the 1.12.0 milestone Jul 1, 2018
@jangorecki jangorecki self-assigned this Jul 1, 2018
@franknarf1
Copy link
Contributor

Default type is "lag", so it make sense to treat negative n as type="lead", while @franknarf1 example of cyc_shift propose the opposite.

That looks like a typo on my part; sorry for the confusion.

@franknarf1
Copy link
Contributor

SO post to update, https://stackoverflow.com/a/52432516

# now ...
n0 <- V2 != 0
n0 | shift(n0) | shift(n0, type = 'lead')

# should be ... 
Reduce(`|`, shift(V2 != 0, -1:1))

Tangent: This "shift a single logical vector and then collapse by OR or AND" use case is pretty common for me at least. It is essentially a rolling any/all and can be more done efficiently than via shift-Reduce, I guess. (For OR, find positions where the input is TRUE, subtract n to navigate to rows in output that should be TRUE.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants