abbreviated output for data.frame is wrong #28

jankatins · 2016-02-25T08:44:50Z

I've a data frame (> 100 rows) which is abbreviated when I print it with full_df %>% select(raw_lname, raw_fname). But the index is not showing the right numbers but the numbers in the current data frame:

1
2
[...]
29
30
...
32
33
[...]
60
61

-> It doesn't show the real row number but the one from the printed frame :-(

The text was updated successfully, but these errors were encountered:

karldw · 2016-02-26T01:15:49Z

It's possible I broke this when I was fixing things in the array abbreviating function. Can you tell me more about how this output is wrong? In particular, I'm not sure I understand the phrase "the index is not showing the right numbers but the numbers in the current data frame".

jankatins · 2016-02-26T06:59:35Z

I would expect that the index shows the row number in the old dataframe or none. So

1
2
[...]
29
30
...
150
151
[...]
180
181

karldw · 2016-03-07T15:49:55Z

I think this was fixed along with #31. @JanSchulz, can you confirm?

jankatins · 2016-03-07T17:03:47Z

Nope, still the same behavior, I think I tracked it down to the use of the readr package:

ensure.package("dplyr")
ensure.package("readr")
df = data.frame(a=c(1:1000, 1111111111111), b=1:1001)
write_csv(df, "test_delete.csv")
df2 = read_csv("test_delete.csv")
#df2 %>% select(a) %>% glimpse()
df %>% select(a) # ok
df2 %>% select(a) # no ok...

karldw · 2016-03-07T17:16:59Z

Interesting, thanks. I think it's something with the way repr shortens dplyr's tbl_dfs. For instance, this code has the same problem:

library(dplyr)
df <- data.frame(a=c(1:1000, 111111), b=1:1001)
df2 <- tbl_df(df)
df  # ok
df2  # not ok

karldw · 2016-03-10T04:52:34Z

The problem is that dplyr automatically reassigns row names, which makes a ton of sense, but causes a bit of hassle here. The problem should be fixed whenever my current set of commits is merged in. (#32)

jankatins · 2016-03-30T15:22:46Z

Confirmed as fixed in a dataframe and with dplyr. Closing.

@karldw Thanks!

jankatins · 2016-03-30T16:21:07Z

Ok, I take that back:

library("dplyr")
options(jupyter.log_level = 3)
options(jupyter.rich_display = FALSE)
data.frame(a=1:200, b=201:400) %>% select(a)
[... shortend...]
27        27
28        28
29        29
30        30
<U+22EE>   <U+22EE>
171      171
172      172
173      173
174      174
[...shortend...]

In the html output it's fine...

jankatins · 2016-04-07T14:14:06Z

Is this #21?

karldw · 2016-04-11T15:36:10Z

Maybe? I don't run into your issue with Firefox or Chromium on Ubuntu.

flying-sheep · 2016-04-11T15:56:00Z

OK, looks like it. thanks!

jankatins · 2016-04-11T17:20:57Z

With me, it's still broken!

@karldw are you on linux? If yes, this is "just" #21 again... If so we should replace the char with something which works on windows as well... At least as long as we don't have a fix for #21...

flying-sheep · 2016-04-11T18:06:39Z

@karldw are you on linux? If yes, this is "just" #21 again...

this is why i closed it:

I don't run into your issue with Firefox or Chromium on Ubuntu

you have missed that part 😁

jankatins · 2016-04-11T18:13:02Z

Oups, yes... should go home...

jankatins · 2016-04-11T21:42:21Z

Ok, it doesn't look like #21 will be fixed soon, so we should change the char: r-lib/evaluate#66 (comment)

flying-sheep · 2016-04-12T07:33:48Z

nooooooooooooooooooooooooooooooooooooooooooooooooooo

flying-sheep · 2016-04-12T07:33:50Z

no

flying-sheep · 2016-04-12T07:36:03Z

no fucking windows preventing us from having pretty things.

if you want you can create code that detects if the character is supported and replaces it if it isn’t. with your strlen checking, there’s a good avenue for this.

like:

ellip.h <- if (strlen('\u22EF') == 1) '\u22EF' else '...'
ellip.v <- if (strlen('\u22EE') == 1) '\u22EE' else ellip.h
ellip.d <- if (strlen('\u22F1') == 1) '\u22F1' else ''

jankatins · 2016-04-12T08:18:23Z

This check doesn't work, you would have to print it in a capture.output... Is it such a problem to just use three dots?

flying-sheep · 2016-04-12T08:22:45Z

yes!

we can encapsulate the correct check in a function:

char_fallback <- function(char, default) {
    real_len <- nchar(char)
    r_len <- nchar(capture.output(cat(char))
    if (real_len == r_len) char else default
}
ellip.h <- char_fallback('\u22EF', '...')
ellip.v <- char_fallback('\u22EE', ellip.h)
ellip.d <- char_fallback('\u22F1', '')

would that be correct?

jankatins · 2016-04-12T09:18:00Z

This works on my windows box:

char_fallback <- function(char, default) {
  real_len <- nchar(char)
  r_len <- nchar(capture.output(cat(char)))
  if (real_len == r_len) char else default
}
ellip.h <- char_fallback('\u22EF', '...')
ellip.v <- char_fallback('\u22EE', ellip.h)
ellip.d <- char_fallback('\u22F1', '')

If you go that way, you could also do something like this in the executer:

if(.Platform$OS.type == "windows") {
  issue_warning_if_bad_unicode <- function(code){
    real_len <- nchar(code)
    r_len <- nchar(capture.output(cat(code)))
    if (real_len != r_len){
       # Whatever is the right method here...
      display_error("Your code contains a unicode char which is not displayable in your current locale. This can lead to subtile errors if you use such chars to do comparisons. Please see ...")
    }
  }
} else {
   # Nothing on other systems
  issue_warning_if_bad_unicode <- function(code) NULL
}
issue_warning_if_bad_unicode(code)

flying-sheep · 2016-04-12T10:10:04Z

display_error() would be warning() 😉

and your idea is to call that function every time something is executed?

jankatins · 2016-04-12T11:29:21Z

display_error() would be warning() 😉

I'm not sure if the warning would end up in the right place in the frontend... -> Similar to handle_display_error:

 if (!silent) {
                send_response('stream', request, 'iopub', list(
                    name = 'stderr',
                    text = msg))
            }

and your idea is to call that function every time something is executed?

Yes, only on windows and maybe configurable.

I'm a big fan of "explicit error messages": better do a check too many (or a fallback) than getting obscure error messages (or even silent errors, like here) in places where it's not obvious what went wrong...

flying-sheep · 2016-04-12T12:36:58Z

ah, good. we should create a display_error method for the Executor then and call it from handle_display_error and when the code has unicode and people are on shitty OSs 😜

jankatins · 2016-04-12T13:17:42Z

more display_stream(..., stream=c('stdout', 'stderr')), but this could also simple be a private helper in the irkernel package.

jankatins · 2016-04-12T13:18:39Z

If you want, just go ahead, I won't work on this until wednesday evening...

jankatins · 2016-04-13T17:38:12Z

Ok, on it...

flying-sheep · 2016-04-13T17:41:14Z

👍 sorry, wasn’t in the mood

On windows, ellipses are usually outside the chars which can be displayed in our way to display plain text /with `capture.output(print(...))` in the current locale. The leads to strangely escaped chars and we would have such a escaped chars each time we display a dataframe. This problem is probably not going away soon: IRkernel#28 (comment) Work around this by simple using three dots.

karldw mentioned this issue Mar 2, 2016

Fixes for array indexing #31

Merged

jankatins closed this as completed Mar 30, 2016

jankatins reopened this Mar 30, 2016

jankatins added the bug label Apr 7, 2016

jankatins added this to the 0.5 milestone Apr 7, 2016

jankatins mentioned this issue Apr 7, 2016

broken unicode uses <U+884C> which needs to be escaped #21

Open

flying-sheep closed this as completed Apr 11, 2016

jankatins reopened this Apr 11, 2016

flying-sheep closed this as completed Apr 11, 2016

jankatins mentioned this issue Apr 13, 2016

Add workaround for displaying ellipsis on windows #43

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

abbreviated output for data.frame is wrong #28

abbreviated output for data.frame is wrong #28

jankatins commented Feb 25, 2016

karldw commented Feb 26, 2016

jankatins commented Feb 26, 2016

karldw commented Mar 7, 2016

jankatins commented Mar 7, 2016

karldw commented Mar 7, 2016

karldw commented Mar 10, 2016

jankatins commented Mar 30, 2016

jankatins commented Mar 30, 2016

jankatins commented Apr 7, 2016

karldw commented Apr 11, 2016

flying-sheep commented Apr 11, 2016

jankatins commented Apr 11, 2016

flying-sheep commented Apr 11, 2016

jankatins commented Apr 11, 2016

jankatins commented Apr 11, 2016

flying-sheep commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

jankatins commented Apr 12, 2016

jankatins commented Apr 13, 2016

flying-sheep commented Apr 13, 2016

abbreviated output for data.frame is wrong #28

abbreviated output for data.frame is wrong #28

Comments

jankatins commented Feb 25, 2016

karldw commented Feb 26, 2016

jankatins commented Feb 26, 2016

karldw commented Mar 7, 2016

jankatins commented Mar 7, 2016

karldw commented Mar 7, 2016

karldw commented Mar 10, 2016

jankatins commented Mar 30, 2016

jankatins commented Mar 30, 2016

jankatins commented Apr 7, 2016

karldw commented Apr 11, 2016

flying-sheep commented Apr 11, 2016

jankatins commented Apr 11, 2016

flying-sheep commented Apr 11, 2016

jankatins commented Apr 11, 2016

jankatins commented Apr 11, 2016

flying-sheep commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

flying-sheep commented Apr 12, 2016

jankatins commented Apr 12, 2016

jankatins commented Apr 12, 2016

jankatins commented Apr 13, 2016

flying-sheep commented Apr 13, 2016