Let's do the rank job! #4323
Comments
As stated multiple times, homebrew-cask is not about app discoverability — this would fall into that category. |
Okay... but just collect the data, and let someone else do the ranking. THIS IS NOT THE SAME. |
@vitorgalvao Correct me if I'm wrong, but I think @viztor is going for statistics collection (or at least he wants to provide an interface for such collection), not necessarily for the use of app discoverability. That being said, I agree with @vitorgalvao that even such a feature would probably fall outside of this project's scope. |
@alebcay Thank you! Cask may be better placed to do this than any other program. |
@vitorgalvao I read the references you provided, and I can't agree with lumping these things together. The reason cask doesn't do categories and descriptions is mainly a human problem (describing and categorising are things only people can do), and opinions may be too diverse to process; this program can't wait on that. That is not the same as saying homebrew-cask is not about app discoverability. But statistics? Write the code, and let the machine run. |
Yes, but the main (only?) use case for collecting these statistics is app discoverability (as the original premise implies, to build a rank). There’s still so much left to do right now (this is very much alpha software); diverting resources (time) to implement such a non-essential feature would be, in my view, counter-productive. Naturally, if someone wants to give their time to this feature, and only to this feature, that’s a different story. There’s a reason (with every software) we choose to implement some features and not others, though, be it amount of code (increased complexity and possibility of bugs), speed, interface, or others. In addition, I don’t see this being as useful as envisioned. For this to be remotely useful, we have to collect data from users (naturally). As stated, since we care about privacy, this should be done with user consent only. Now, try to picture a typical user you believe would give consent, and another who would not. These users are likely very different; they likely have different concerns about sharing their information with a service, and use different apps and services as a result. What this means is that the data would only reflect usage by a specific subset of users; it wouldn’t be representative of the real numbers at all. What would those values be useful for? |
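To make the representativeness argument concrete, here is a small worked sketch. Every number in it is invented purely for illustration (nothing here was measured from homebrew-cask users): it assumes two equally sized groups that differ both in how often they consent to tracking and in how often they install a given app, and compares the app's true share with the share an opt-in counter would report.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    users: int          # hypothetical group size
    opt_in_rate: float  # hypothetical fraction who consent to tracking
    app_share: float    # hypothetical fraction who install the app


def true_share(groups):
    """Fraction of all users who installed the app."""
    total = sum(g.users for g in groups)
    installs = sum(g.users * g.app_share for g in groups)
    return installs / total


def observed_share(groups):
    """Fraction of *tracked* (consenting) users who installed the app."""
    tracked = sum(g.users * g.opt_in_rate for g in groups)
    tracked_installs = sum(g.users * g.opt_in_rate * g.app_share for g in groups)
    return tracked_installs / tracked


# Invented example: privacy-minded users rarely opt in but favour this app.
GROUPS = [
    Group("privacy-minded", users=500, opt_in_rate=0.1, app_share=0.6),
    Group("unconcerned", users=500, opt_in_rate=0.8, app_share=0.1),
]

if __name__ == "__main__":
    print(f"true share:     {true_share(GROUPS):.3f}")
    print(f"observed share: {observed_share(GROUPS):.3f}")
```

Under these made-up assumptions the opt-in count understates the app's share by more than half. How large the distortion would be in practice depends on how strongly consent correlates with app choice, which is exactly the unknown being argued about.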
@vitorgalvao You mentioned 'diverting resources', so we could just count installs. A plain count doesn't raise privacy problems; the only one it might raise is that commit history is open to everyone, so problems could arise if we kept the statistics in git. Why do all the successful companies go open? Let others use these numbers and refer back to caskroom. Just a thought: we can do this in one really easy way, like when I ran
after finished, automatically do
md5: generated from the machine's MAC address or similar, to identify the machine. We can also do
Just like this. Think about the costs and benefits; it's not mine to decide. Plus: the value passed to the address could also contain the version of brewcask, to make sure the latest version or commit is in use. This feature could also bring update alerts (or could we do auto-update?). |
@viztor While your idea sounds pretty interesting, @vitorgalvao's point was that Homebrew Cask is still alpha-stage software, and we still have many other lacking features that need to be implemented first, such as a much requested However, I'd like to note that the Node.js package manager |
@alebcay I've already given my view.
(This may not need you to build your own server, if someone has the API.) That's the simplest, most basic way to achieve our goal. We can promote this from time to time, just like I've posted. People are skilled in different areas, and that's good. |
before we do this - think about privacy concerns |
@muescha covered in the following comments |
That was not the only reason given, so I’ll ask you not to hang on that. However, I’ll reopen this so we can get a few more opinions. @alebcay’s point about I’ll point out in advance (following @muescha’s comment) that privacy concerns were not reasonably addressed, and I’ll give my reasoning below.
Not if they don’t represent reality (which was my final point in the previous comment).
I meant development resources (time), not bandwidth or system resources. Privacy is a concern from the moment (as you suggested) we have a unique identifier for a certain user/computer. Naturally, we can’t simply count download numbers, as that’s pretty useless: scripts (like @alebcay’s own cask-tasting) would skew the numbers unrealistically. Consider this: let’s say popcorn time finally gets merged in, and we’re tracking users who downloaded the app. Suddenly, it gets into even more legal issues, and they go after every user of the app. We just put our users at legal risk for a dubiously useful feature. There’s no such thing as tracking anonymously; metadata can be used to identify you uniquely (there are tons of relevant examples).
No, we are not. A lot of users of homebrew-cask are not on github, nor are they developers, even. GUI apps like cakebrew will only expand its reach of non-technical users.
I think a considerable number might (previous popcorn time example).
This is not a company, nor is it a business. It is an open-source project that aims to make app installations easier. In this sense, the uses of statistics you mention are meaningless. Showing how good and useful the project is, is incredibly more valuable than spewing numbers of downloads/installations done with it. Why would any user care how many times a specific app was downloaded via homebrew-cask? What matters is if they can get the ones they want.
You’ve mentioned my app discoverability links repeatedly, so I’ll clarify. Granted, I gave you the links more at hand at the time, so I’ll show you a more relevant one (start on the fourth paragraph). It’s not only about categories and other human-needed-decisions, it’s also about that issue being already tackled by other services (other services, mind you, geared specifically for this). That is not the goal of homebrew-cask; its goal is in the README. It is not about difficulty. Please realise this is not simply a technical issue. |
@vitorgalvao Thank you, kind of admirable in some level |
@vitorgalvao cask can be the data source for them. we needn't actual do the rank job =Popcorn Time?= JUST SKIP Yes, the company like to see their customers(potential-buyers) got angry, If they already get Popcorn time DOWN. What one would do this... costs their work hour and money and =As you mentioned:=
This metadata is collected by us, so we can decide how much information it contains. If you worry that a HASHED string would sell you out, you should drop Google Chrome, because that's far more dangerous than Popcorn Time. And as I've said, just leave the idea of identifying machines behind. I think I've gone too far. |
Be very careful not to make the same mistakes as bower: bower/bower#1102 You do not want an interactive prompt in a tool that is used with automation. Why not just output to the console instructions on how to manually opt-in (or opt-out, I have no opinion on that)? Any human being who reads the output can run the command when they want. |
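The suggestion above (print opt-in instructions instead of prompting) could look roughly like the sketch below. It is only an illustration: `cask-stats opt-in` is a hypothetical command name, not something homebrew-cask provides, and the `opted_in` flag stands in for whatever setting a real tool would read.

```python
import sys

# Hypothetical command name, used only for illustration.
OPT_IN_COMMAND = "cask-stats opt-in"


def should_report(opted_in: bool, out=sys.stderr) -> bool:
    """Decide whether to send a statistics ping.

    Never reads from stdin, so it cannot block a script or CI job.
    When the user has not opted in, it prints a one-line hint and
    returns False.
    """
    if opted_in:
        return True
    print(
        f"Install statistics are off. To share them, run: {OPT_IN_COMMAND}",
        file=out,
    )
    return False
```

Because the function only writes a hint and returns, an automated run behaves the same as before, while a person reading the output can opt in later if they choose.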
@viztor You completely missed my points.
Again, it’s not “just” doing it, this is not simply a technical issue.
Yes, it is. This tool has a goal — features that bring it closer to that goal should be included, while the ones that distance it should not. Regarding popcorn time, the point is not piracy; that was one example (chosen for being at hand), and focusing solely on that would be extremely short-sighted. Let’s say a country employs censorship. Let’s say that country decides you deserve to be in jail for using twitter, or a DRM removal tool — things that are fine in some countries, but not in others. Do you really believe the “I’ve downloaded it but didn’t use it” defence would be an effective one? And don’t forget litigation is expensive, even basing a court defence on what the other side can or cannot prove can be monetarily draining (hence unfeasible).
That is not enough, because we can never be sure how little is little enough. We don’t decide how the wrong hands use the data — they could deduce more from it than what we could anticipate. As soon as we have a unique identifier, all bets are off.
You mean one of those that are purposefully flawed, or one that was flawed by accident and might’ve been exploited for years?
Sure there is. With enough patience, and specially if you’re after someone specifically, you can definitely do it.
You’re basically making the point that users who are concerned about this type of things use different types of software, which is precisely what I pointed out two comments ago. This isn’t black and white either, different users will have different levels of concern. Which leads us back to the usefulness of the data.
Why would we want the data if it isn’t accurate? What good would it do? If the data doesn’t represent reality, why collect it all? That’d be equivalent to making up numbers, it would be worthless. I’d also like to point out that the amount of cryptography technical discussion already present is a fine example of why we can’t “just” do this. I like @leoj3n’s idea, though. Making it into a separate project that you specifically have to get and start would solve a lot of concerns. |
@vitorgalvao Sure you are right, of course, nothing is impossible. |
"It is never too late. Even if you are going to die tomorrow, keep yourself straight and clear and be a happy human being today." © Lama Yeshe |
@vitorgalvao The kind of privacy problems you are talking about already exist. And please don't mix up safety and privacy. Though, |
Nor did I claim it would.
Again, that was an example, it’s not meant to be taken as the end-all. Here, they are intertwined. I was merely answering your points, showing an extreme case of how a privacy breach can be harmful.
Once more, nor did I say you did. I did, however, say it is needed to get relevant results, and that if we’re not doing it to get accurate, relevant results, it isn’t worth doing at all.
We don’t, since a lot of casks (last time I counted, about 42%, if I recall correctly) use an always up-to-date url; those are never updated for versioning. Why would update frequency matter, anyway? And that has nothing to do with the original idea.
That’s a long quote, which is why I’m truncating and addressing it at the end. When a user gets homebrew-cask — or any other software — it’s a tradeoff. You're connecting yourself to the internet (and using computers) for a myriad of reasons, (hopefully) understand the risks and rewards, and decide it's worth it (or not). It’s about convenience, and about if that convenience is worth it. Statistics (in the scope of this project, and as far ranking goes), I argue, are not. Here’s what matters, and we’re getting sidetracked from. App statistics are worthless in the scope of this project. At least in so far as the arguments given. You clamour for them, but haven’t argued for a good use case; ranking (your original idea) doesn’t count, as we already decided discoverability is not part of our goal. Why are they worthless? Because we cannot get accurate results without compromising user privacy. I’ve asked this multiple times already, and you’ve never addressed it. It’s no longer rhetorical. If we cannot get accurate results, why measure them at all? I reiterate, @leoj3n’s idea is a good one. Having a different project on top of this one that tracks those operations is more than fine. |
Firstly, you did tell me something I would otherwise never have known, like 1. Can it be of use (and for whom?) About the privacy problem: as I’ve said, with the user’s consent we can collect data to identify the machine, and if you are not comfortable with the idea of collecting things from the user’s computer, we can just drop the idea for the time being. As you said And I’ve said that whether the data is accurate is NOT as important as you think. Raw data is naturally of no use to an ordinary user, come on, but adjusted and analysed data would be amazing. Since you said the rank job would stray from cask’s goal (why didn’t I know this before? Is cask really a program driven by one set goal?), would you tell me what this program is for, what benefits it brings to users, and what all this is for? That’s the purpose. If it’s done as you said, as GUI software. Nobody outside of brew cask can provide data like this.
|
I collect those statistics as a by-product of my bot checking the download of every Cask on a daily basis. These statistics are not collected from users, and are not a privacy threat.
Homebrew Cask, like its cousin Homebrew, aims to be automated. That means as little interactivity as possible (e.g. yes/no prompts, etc.). Asking for consent is therefore not an option. Having users manually allow data collection would mean that this feature is more appropriate as its own Cask subcommand, which is beyond the scope of this project.
@vitorgalvao's point is that there is no plausible way to "adjust" the data to a degree of usefulness.
If you remove the mechanism to obtain statistics, then what good are methods to adjust and analyze them?
From the README (second paragraph):
Any deviation from the goal would justify a separate project. Hence my own project, Cask Developer Tools is separate from this project. |
We’ve kept ourselves as not a discoverability service from the start, and that seems like it’s the way it’s going to stay. Multiple requests for this have been made, and it was always decided the best course of action would be to not pursue that route. Maybe that discussion will become relevant again in the future, but it’s unlikely. Progress in the project itself will tell if it’s an idea worth revisiting. |
=why doesn't homebrew-cask count=
how many times the script is run,
then produce a ranking?
Surely most of us who use homebrew-cask are fairly experienced, or geeks.
WITH THE USER'S AGREEMENT, of course.
We could count an app's users by the number of people who run the script,
or by the machine?
Can we get the computer name?
or simply count the number of runs (but how often apps are updated differs a lot,
so every time an app is updated, archive the old number and use an algorithm to process it? Or treat every version as a separate app? Or this could be set in the .rb file)
use git
Just add an database-like file.(or use gist?)
every time the script ran, commit it.
OR JUST USE A SERVER
My own comments are attached below (tidied up a bit)
==about privacy==
A plain count doesn't raise privacy problems; the only one it might raise is that commit history is open to everyone, so problems could arise if we kept the statistics in git.
Also, I doubt that sharing your app list would raise concerns: we are all on GitHub, and people like to share (their dotfiles, for example). Even publicly, I don't think people would mind sharing which software they use.
And if you worry about privacy at this level, your "Repositories contributed to" has already sold you out.
Actually, given how many download sites there have been, I really didn't know caskroom kept no numbers like 'download count' before I opened this issue (☆_☆). I looked for them several times.
==About use==
Why do all the successful companies go open?
Providing statistics is like providing an API.
Let others use these numbers and refer back to caskroom.
And have the benefits of supporting app discoverability ever crossed your mind? (Many are doing this.)
App discoverability is not bad; the reasons not to do it come from the references you provided, and they don't cover this.
Maybe this is an easy way for caskroom to go more social (that is, to get onto "everyone's computer").
Statistics can be cited in articles, tweets, reports from statistics and analysis organizations, and so on.
This can bring more people to use the tool and to contribute to it. Maybe it would even get ordinary users interested in Ruby and learning it (that may be a stretch, but the possibility isn't small).
==About Actual Plan==
Just thought, we can do this in one really easy way
You said 'diverting resources', and on that I no longer agree. The easiest way is to build a URL-shortener-like service (many such services provide URL counts, i.e. how many times the URL is visited). Just add a script that automatically visits a URL, and the thing is done.
**This solves the privacy concerns, mentioned before, of using git to do this.
like when I ran
after finished, automatically do
md5: generated from the machine's MAC address or similar, to identify the machine
We can also do
Plus: the value passed to the address could also contain the version of brewcask, to make sure the latest version or commit is in use. This feature could also bring update alerts (or could we do auto-update?).
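The commands originally shown in this section did not survive, so the following is not a reconstruction of them. It is only a minimal sketch of the "md5 from the machine's MAC address" idea, assuming Python's standard `uuid.getnode()` as the source of the MAC address; the function name `machine_id` is made up for this example.

```python
import hashlib
import uuid
from typing import Optional


def machine_id(mac: Optional[int] = None) -> str:
    """Return the MD5 hex digest of a 48-bit MAC address.

    If no MAC is passed, uuid.getnode() is used. Note that getnode()
    may fall back to a random number when no hardware address can be
    read, in which case the value is not tied to the machine.
    """
    if mac is None:
        mac = uuid.getnode()
    mac_text = f"{mac:012x}"  # 12 lowercase hex digits, no separators
    return hashlib.md5(mac_text.encode("ascii")).hexdigest()


if __name__ == "__main__":
    print(machine_id())
```

The digest is the same on every run for a given MAC address. That stability is what would let a server tell machines apart, and it is also what makes the value a unique identifier in the sense debated in the comments.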
=="Actual" Plan==
I bring up the URL-shortener service because it means... open source! We already have the code: I believe many URL shortener codebases exist (like Your Own URL Shortener, which includes statistics).
Now that's about one-line thing
(This may not need you to build your own server, if someone has the API.)
That's the simplest, most basic way to achieve our goal. We can promote this from time to time, just like I've posted. People are skilled in different areas, and that's good.
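For illustration only, the "visit a URL and let the service count the hits" idea might be sketched as below. The endpoint `stats.example.invalid` and the parameter names `cask`, `version` and `tool` are assumptions made up for this example, and the counting side is reduced to an in-memory `Counter` rather than a real server. No network request is made.

```python
from collections import Counter
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; no such service is described in the thread.
STATS_ENDPOINT = "https://stats.example.invalid/hit"


def build_ping_url(cask: str, version: str, tool_version: str) -> str:
    """Build the URL a client would visit after a successful install."""
    query = urlencode({"cask": cask, "version": version, "tool": tool_version})
    return f"{STATS_ENDPOINT}?{query}"


def record_hit(counts: Counter, url: str) -> None:
    """Server-side stand-in: count one hit per (cask, version) pair."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    counts[(params["cask"][0], params["version"][0])] += 1


if __name__ == "__main__":
    counts = Counter()
    url = build_ping_url("firefox", "30.0", "0.34.0")
    record_hit(counts, url)
    record_hit(counts, url)
    print(counts)
```

Keying the counter on the (cask, version) pair corresponds to the "treat every version as a separate app" option raised earlier. A counter like this tallies requests, not users, so repeated or scripted installs inflate it.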