Skip to content

Conversation

@viper-kun
Copy link
Contributor

link #2391

@SparkQA
Copy link

SparkQA commented Sep 20, 2014

Can one of the admins verify this patch?

@viper-kun viper-kun changed the title Periodic cleanup event logs [SPARK-3562]Periodic cleanup event logs Sep 20, 2014
@mattf
Copy link

mattf commented Sep 21, 2014

i strongly suggest against duplicating functionality that is already provided by the system where these logs are written.

however, if you proceed with this, the logic for triggering a clean needs to be improved. first, interrupts to sleep need to be considered. second, clean should occur on startup and the interval, otherwise it may never occur at all. third, if cleaning may be an expensive operation, it more desirable to trigger at a known off peak time / predictable time, instead of X seconds since startup (default every day keyed off startup).

if there's special handling that should be done with these log files, i'd suggest a log clean utility that can be triggered by cron.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a.k.a.

Duration(1, TimeUnit.DAYS).toSeconds()

@vanzin
Copy link
Contributor

vanzin commented Sep 23, 2014

I think there's a very unlikely race in your code: it's possible, if things are messed up just right, that the reader thread might try to read a log file that is being deleted by the cleaner thread. I believe that the code will handle that correctly, but it doesn't hurt to check.

@mattf don't know what you mean by "functionality that is already provided by the system". I'm not aware of HDFS having any way to automatically do housekeeping of old files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a lot like logCheckingThread. Maybe it makes sense to make an abstract parent thread that both of these extend.

@viper-kun
Copy link
Contributor Author

Thanks for your options. @vanzin @andrewor14 .i have changed code according your options.

@mattf
Copy link

mattf commented Oct 2, 2014

don't know what you mean by "functionality that is already provided by the system". I'm not aware of HDFS having any way to automatically do housekeeping of old files.

a system approach means using something like logrotate or a cleaner process that's run from cron.

such an approach is beneficial in a number of ways, including reducing the complexity of spark by not duplicating functionality that's already available in spark's environment - akin to using a standard library for i/o instead of interacting w/ devices directly. in this case the context for the environment is the system, where you'll find things like logrotate and cron readily available.

as for rotating logs in hdfs - i wouldn't expect hdfs to provide such a feature, because the feature serves a specific use case on top of hdfs. some searching suggests that there are a few solutions available for doing rotation or pruning of files in hdfs and points out that rotating/pruning/cleaning/purging can be done remotely and independently from spark since hdfs is distributed.

@vanzin
Copy link
Contributor

vanzin commented Oct 2, 2014

a system approach means using something like logrotate or a cleaner process that's run from cron.

The only thing you can really use system utilities for is cron, which is the least important part of this change. Really, this is not an expensive process that will bring down the HDFS server, and it's scheduled to run at very long intervals. The constant polling for new logs is orders of magnitude more disruptive than this cleanup thread.

AFAIK, logrotate doesn't work on HDFS. Now you'd be asking for people to set us the NFS bridge or even fuse-hdfs just to clean up Spark event log files.

Finally, Spark theoretically supports Windows. This is a simple way to achieve compatibility with that. And it doesn't require people to set things up outside of their Spark ecosystem, meaning it's easier to maintain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use a thread factory so that you can set the thread name and daemon status. See com.google.common.util.concurrent.ThreadFactoryBuilder.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, another thing: you should override stop() and shut down this executor cleanly (it's mostly a "best effort" thing, but still).

@vanzin
Copy link
Contributor

vanzin commented Oct 16, 2014

@viper-kun mostly good, just a few minor things left as far as I'm concerned.

@viper-kun
Copy link
Contributor Author

@vanzin. is it ok to go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add $dir to the log message, in case it does not show up in the exception.

@vanzin
Copy link
Contributor

vanzin commented Oct 21, 2014

@viper-kun lgtm, but you'll need to get the attention of a committer. :-)

@viper-kun
Copy link
Contributor Author

@vanzin @andrewor14. is it ok to go?

@viper-kun
Copy link
Contributor Author

@vanzin @andrewor14 @srowen . is it ok to go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: you could use Utils.namedThreadFactory() here (just noticed that method the other day).

@vanzin
Copy link
Contributor

vanzin commented Nov 19, 2014

LGTM. Everybody else is kinda busy with releases so I doubt they'll look at this in the next several days...

@suyanNone
Copy link
Contributor

@vanzin Is this patch ok to merge?

@vanzin
Copy link
Contributor

vanzin commented Jan 22, 2015

I'm not a committer so I can't merge the patch. But it has merge conflicts now, so that at least needs to be fixed.

@suyanNone
Copy link
Contributor

@vanzin = =! I got it, sigh~

@viper-kun
Copy link
Contributor Author

I have file a new pr #4214

@vanzin
Copy link
Contributor

vanzin commented Jan 27, 2015

@viper-kun could you close this one in that case? thanks!

@viper-kun viper-kun closed this Jan 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants