-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Per user filtering #30
Comments
ech3> At work I have to monitor files that differ on a per-user basis, I have a similar issue with my users as well. This is an obvious ech3> The basic use case is I want to have a cron job that sends out For my users, I use Netapps, so I turn on unlimited quotas, which gets I also use yet another script which looks for the top N largest files Unfortunately, keeping all this kind of data is expensive. Both in I think for now, this suggestion will have to go into the nice to have John |
We use use Netapps as well and have an internal script which produces a per user statistics, probably similar to what you are using. I saw philesite, but when I actually went to implement a test machine, I found duc and have a bias towards speed since the IT folks around here don't like us scanning the disk. When I started at my present job, I was assigned as being the maintainer of our script which goes and looks for big files, so we have a script that does something similar in that regard, it's just that presenting this visually makes it easier for the user to take action. It would be nice if I could pull from the scanned database the bigfiles as well, but that would probably be another request, and it seems you would want that as well. I am fine with this getting put on the way back burner if that's where it needs to go, but I get none of things I don't ask for, and thought it would be a good feature. I am not concerned with the DB size personally, but I may once I found out just how much bigger it was ;). Thanks for looking this issue over, and thanks for this program. |
ech3> We use use Netapps as well and have an internal script which Well, duc will be faster than philesight, and the DBs are much I'm happy to share my perl script which pulls out quota reports and Now how to present that visually will be hard to do clearly and One of the simpler options we've been thinking about is adding a ech3> I am fine with this getting put on the way back burner if that's Let's see, my largest philesight DB is 24G in size. I have a 3.6G Speedwise, I find that duc is faster by about 50% or so, but I don't John |
I'm not in your league yet in terms of DB size, the largest duc database I have laying around is 72MB at present. We mostly have a bunch of source code and a few really big files on our drives. If that perl script extracted the data from the tokyodb, it would be interesting, but otherwise I will go with our legacy system due to institutional momentum. All I am looking to do is to consolidate my drive scanning to one scan and get all the data I need so I don't have to do multiple passes over the drive. If it can't be done at this point, it can't be done at this point. If you guys don't want to gold plate and you considered any of this gold plating, I wouldn't blame you. Right now my plan is to only run duc on drives that are getting pretty full so that we get pretty graphs when bad stuff is going on. I would agree with you that getting the CGI working is a higher priority, I have not messed with that, but I could see how it would be much more valuable than what I am requesting. I know that just e-mailing a link to the CGI would probably be a lot less headache than getting the mime working for attaching the duc png since I have done that before. Plus interactivity is really nice if you guys have that and blows a static picture out of the water. |
Hi ech3, is this request still relevant for you? I'm considering adding an option to add user information, although the implementation will probably get quite hairy. |
I would still like to have this. For my case, I would like to have it even at the cost of having a super-huge database, a very long run time, and/or having to use non-standard switches. My situation is I have a NetApp filer which tells me what users are the major user of the files on a particular partition, but it gives me no idea of where those files are located. 9 times out of ten it's easy to figure out where the files are thanks to a normal duc graph, but every so often it's a royal pain. My use case is to send an individually tailored e-mail with a graph limited to only their user files so they know where their overages are located. I am fine with individually making each graph manually by specifying the user via command line. If you need me to provide any additional info let me know. |
Well, probably far from your ideal solution, but would it be acceptable :wq |
Well my understanding of the UNIX file structure was that the real problem with this feature was that you had to do an additional request for inode information for each file to get file ownership info. Sorry, I would have to pull my APUE to remember the exact way things are organized and I don't have it handy. The problem is that if I have to re-index for each user, I would probably have to beat on the filers with more traffic is something that my IT department will not be particularly overjoyed about. I will admit that most of the time I am doing this I care mostly about one user who is particularly offensive in terms of disk usage. Since I don't grok the code the way you do, I don't know how is easy or difficult it is to add this, and don't have the ability to invest the time you do, I won't criticize you if you decide to implement this whatever way you need to. |
Understandable, having to run multiple indexes kind of defy the whole I do like the feature myself, and I think it would make a valuable
The problem is not so much the gathering or storing of the user data per Having only uids/gids is not practical, since these will likely not map I think the following would make for the best solution:
The extra complexity is mostly in the handling of the names. Any thoughts? I'll give it a go and see if I can come up with a nice implementation. Thanks for the feedback! :wq |
I've been experimenting a bit with this feature, the results so far: I That was the easy part. Now I've run into an interesting problem with generating the graph If it wants to know this size, it has to calculate the graph from the A pragmatic solution would be to simply leave the graph shape intact, Example output: http://i.imgur.com/rlbBsXZ.png That's a directory structure with files owned by 3 different users. It's Not sure if this is usable at all in this state though? Another way of handling this would be to change the indexing and :wq |
Zevv:
Mostly I am on NIS managed systems, so I hadn't thought that far ahead. Zevv:
How hard would it be to create a "shadow" database that inserted the files one by one into the database, doing "appropriate" calculations for the directory sizes, then calculated the graph based on the "shadow", finally deleting it. |
Side note, Zevv, your sig reminds me of this page: https://www.gnu.org/fun/jokes/ed-msg.html |
That would be the kind of workflow indeed; the whole tree needs to be I'm planning to create import functionalty for reading out from tools
This way we keep duc simple, and with a small wrapper shell script :wq |
You saw that just right, and I did not make the signature up. One day I :wq |
Hello Zevv, Thank for your work on DUC. It is actually really helpful for us. We would be also interested in having this per user filter. Is there any ETA on the feature? Regards, |
Not yet; I'm still not quite sure how to implement this. It seems that This could be combined with an export/import function with filter index db1 ---> export db1 (user=john) --> import db2 --> graph db2 It's quite cumbersome though because an export and import needs to be :wq |
Closing this ticket for now. The request has been added to |
Adding user information to the index is a big +10 from me. Our use case is similar in that we want to know which users to notify that are using the most the disk space. I agree that per-user graphs can get complicated. Maybe a more immediate solution would be to color the files in the graph by the user that owns them. It could be a toggle between the default graph color scheme and the user color scheme. This would be for admin use to get an idea of which users own what. This still needs to mapping between uid and user names though. |
At work I have to monitor files that differ on a per-user basis, so it would be nice to produce graphs that are limited to a single user and / or use the data to produce a graph that showed the percentage of the disk space used by a specific user. This way I could send users graphs of the disk usage limited to their user.
The basic use case is I want to have a cron job that sends out an e-mail to each user telling them how much disk they are using as a percentage in a pie chart with all users listed, and below that another graph that shows them where their specific user's disk usage is concentrated in, which is a normal duc graph filtered for their user. The user could tell from the first graph whether they needed to take action, then from the second graph they could figure out where big files were if they wanted to delete them.
The text was updated successfully, but these errors were encountered: