-
-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
timevis can leak (potentially) private information #44
Comments
A bit more context: In order to create a timevis widget, the user calls the
results in Since this package is a binding to a third party javascript library and that javascript library is responsible for building the visualization, the I want to hear from others if they think this is a big security issue and if timevis should explicitly remove any non standard columns before passing the data to javascript. I don't love that idea because 1. then the package will have to play catchup with visjs to make sure we always have a full list of all the supported I think simply documenting the I wonder if other htmlwidgets whitelist columns in the data they use? |
This is entirely my opinion, but for what it is worth...
|
@greg-minshall My thoughts are along the same as what @timelyportfolio said. For now I would feel better about just adding a few words in the parameter documentation rather than making assumptions about and changing the data being passed in |
hi. so, i sort of convinced myself other than the two of you. let me see if i can explain, both my proposed solution, and (harder) my rationale. my proposed solution:
the warning in #1 is to (try to) prevent mystification of the user when their "mightynew" column doesn't seem to be having any effect. "passcolumns=" then decouples, to some extent, releases of timevis() from upgrades in vis.js() Timeline. my reasoning is that security is a hard problem. while we can't fix the problem, we can semi-plug one hole, maybe stopping some very embarrassing leak from happening in the future. (if we do prevent it, we'll never know it; if we don't, we'll possibly hear about it.) on the other hand, every small possibility increases the overall likelihood of security failures. |
I do understand your viewpoint eg. this is a security issue that timevis can potentially take some measure against. But in my opinion this is overstepping what timevis is, it's simply a visualization, if someone has sensitive data then it's on the user to not pass that sensitive data around if not needed, to any function. You shouldn't pass it to R functions if it's truly sensitive information that is not needed for the function you're calling, because every function you call is a blackbox inside and you don't know what it does with it (unless you actually look at the source code). To me it doesn't seem like it makes sense for such a simplistic project that's an R binding to a JS library to try to solve this problem. If others will agree that it's the correct thing to do, I'm open to changing my mind, which is why I specifically asked on Twitter for people to comment here if they have an opinion. Or, alternatively, if you'll find several other htmlwidgets that also have that line of thinking that variables should be whitelisted for security reasons, then it would also prompt me to reconsider. But the argument of a security hole sounds too weak for me because it seems like the user should just not use a dataframe with information that they really don't want to share. I'll leave this issue open for a while to see if others chime in! |
Dean, @timelyportfolio, one thing to keep in mind is that, by definition, the user we are protecting is the user who (after the warning is on the man page) is, in fact, doing the wrong thing, making a mistake. the user who reads, understands, and pays attention to the warning is not the person we are trying to protect. rather, it is the person who, for whatever reason, doesn't do all those things. but, you know how we are when we are trying to get something to work, we read this and that, look at google, stackexchange, etc. then, we try this and that, possibly forgetting -- certainly at my age! -- some things we previously read while doing so. the programmer who reads all the documentation, follows all the guidelines, imho, is a bit like homo economicus. in a scenario, the department clerk is told to put together a presentation for the next City Council meeting of what how the department employees are scheduled for the next six months. remembering that Janet did something similar, he asks and she sends him some code, using timevis. working on that, the clerk notices Janet gets the employee task database using the office API (which maybe involves, behind the curtains, a few joins), and then forms a new datafram with "name", "group", etc. but, the clever clerk says, i already have a dataframe, i'll just rename a few columns ("Name" to "name"; "Section" to "group", etc.), pass it to timevis(), and wow! since the meeting is public, the resulting web page is published online for the council members, and interested members of the public, to see. to see (well, for hackers), including, the non-timevis, non-vis.js columns such as mobile phone numbers, salary, etc. timevis is a really nice package, and hopefully it will get an expanding user base. but, the likelihood of a leak is, i would guess, proportional to the number of users. (of course, i already did a leak; luckily not that horrible a one, just some e-mail address that, presumably, nobody noticed.) so, obviously it's you decision, but i still offer up my solution (which i'm happy to code up): new columns can be passed, but they must be explicitly specified as new columns. |
i think i did a boo boo. reopening. |
You did in fact boo boo, let's keep it open :) I completely understand your argument. However, I still personally think that would be overstepping the boundary of the package, and that if someone has sensitive info the burden of being careful with it and not giving it away to other services is on them, not on the service to try to throw away what's given to it. This is clearly not a black and white matter and involves some personal judgment on what the solution (or no solution) is, so it would help if I would see that other htmlwidgets do take this into consideration. I haven't looked, but I don't think they do, and I imagine it's from similar reasons to mine. But I may be wrong . |
As mentioned in PR #42 , instead of keeping this open forever, I'm closing this issue as a non-fix |
the entire dataframe provided as the 'data' argument to timevis (same for 'groups' argument) ends up in the resulting HTML file, including columns not actually used by timevis/vis.js's Timeline. if the user has passed an existing dataframe, this can cause information leakage.
The text was updated successfully, but these errors were encountered: