Skip to content

Improve logging accuracy#299

Merged
dlang-bot merged 1 commit intodlang:masterfrom
marler8997:rdmdFixLogging
Feb 8, 2018
Merged

Improve logging accuracy#299
dlang-bot merged 1 commit intodlang:masterfrom
marler8997:rdmdFixLogging

Conversation

@marler8997
Copy link
Contributor

No description provided.

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @marler8997! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@WebDrake
Copy link
Contributor

Thanks, this looks much more tractable.

I'll have a proper look tomorrow, but in the meanwhile:

  • it'd be a good idea to write up a more detailed commit message summarizing the problems this patch is trying to address, the rationale behind the current approach, and its limitations (e.g. it doesn't address the difference between Windows and POSIX commands)

  • the existsAsFile function outputs one stat report but invokes 2 functions that use stat under the hood

  • do you need generic wrapper functions? Will it suffice to use declarations like bool exists (string name) ... ?

@marler8997
Copy link
Contributor Author

...it doesn't address the difference between Windows and POSIX commands...

I don't see this as a problem. I want to know that rdmd is "poking" files, but it doesn't really matter to me what specific functions it's using. This seems consistent with how the logging currently exists.

the existsAsFile function outputs one stat report but invokes 2 functions that use stat under the hood

In my mind, the point of the logging is to tell the developer what's going on. Yes rdmd actually calls stat twice, but that detail isn't important when it comes to see what rdmd is doing "functionally".

@WebDrake
Copy link
Contributor

I don't see this as a problem.

It's not a big deal; I used it as an example. The point is that it's helpful if you can write up a more detailed commit message describing in more detail the motivation for these changes, how these changes address those concerns, and so on.

In my mind, the point of the logging is to tell the developer what's going on. Yes rdmd actually calls stat twice, but that detail isn't important when it comes to see what rdmd is doing "functionally".

This seems at odds with what you said in the previous PR:

I noticed that rdmd was doing multiple stat operations one after the other on the same file. Anyone looking at the logs would immediately question why it was doing the exact same thing twice.

... which suggests that you wanted to ensure that verbose output matched rdmd behaviour. Can you clarify?

In any case, the existsAsFile function is unnecessary. All the other functions make sense as direct wrappers of a call to a similarly named function in std.file. existsAsFile is superfluous given that you already have exists and isFile wrappers. It's only used in one place in your PR (leaving other places where the exists && isFile combo are used). All the other functions can be introduced without any need to modify their call sites.

So, I'd suggest dropping existsAsFile. It just gets in the way and leaves confusion about what functions to use.

Copy link
Contributor

@WebDrake WebDrake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more thoughts and notes on the changes proposed here.

I kind of feel that this approach is right but not necessarily complete enough. In particular, there's a discussion to be had about the way that dryRun interacts with verbose output.

I'm going to try to summarize this in a few issues.

rdmd.d Outdated
auto cutoff = Clock.currTime() - dur!"hours"(lifetimeInHours);
yap("dirEntries ", pathname);
foreach (DirEntry d; dirEntries(pathname, SpanMode.shallow))
foreach (d; std.file.dirEntries(pathname, std.file.SpanMode.shallow))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is odd: why not create a similar wrapper function here that handles the yap under the hood? Why preserve a manual yap call in the code as is done here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seemed unnecessary here. this function is only called once and the wrapper function would be larger than then other functions.

rdmd.d Outdated
std.digest.md, std.exception, std.getopt,
std.parallelism, std.path, std.process, std.range, std.regex,
std.stdio, std.string, std.typetuple;
static import std.file;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that an alternative to static import std.file could be to use selective imports that very carefully determine which std.file functions are allowed to be used unwrapped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I'm aware. I think static in this case makes more sense.

rdmd.d Outdated
try
remove(fullExe);
catch (FileException e)
std.file.remove(fullExe);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use std.file.remove here (with a manual yap above) rather than a call to the remove wrapper function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at the code and I think you'll see why.

rdmd.d Outdated
{
yap("mkdirRecurse ", name);
if (!dryRun)
return std.file.mkdirRecurse(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This touches on a rather interesting problem of rdmd: dry-run mode is implemented manually throughout the codebase. This means that in various places if (!dryRun) blocks mean that different output will actually be produced for --dry-run as opposed to just --chatty.

It might therefore be worth rethinking the wrapper functions as not just encapsulating the needed yap calls but also encapsulating the if (!dryRun) as and where needed (i.e. for those filesystem calls that actually change the filesystem, as opposed to just querying it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the only function affected by dryRun. this approach makes sense to me

rdmd.d Outdated
string absPath = buildPath(envPath, path ~ extension);
yap("stat ", absPath);
if (exists(absPath) && isFile(absPath))
if (existsAsFile(absPath))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use if (exists(absPath) && isFile(absPath)) here. It's simpler, it saves you defining an unnecessary extra wrapper function, and it makes the verbose report more correct.

rdmd.d Outdated
//
// std.file wrappers that also log their operations
//
auto exists(R)(R name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any need for the templates here? It's easier to just support string and modify if another use-case ever comes up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

templates are easier for me

@marler8997
Copy link
Contributor Author

So, I'd suggest dropping existsAsFile. It just gets in the way and leaves confusion about what functions to use.

The point of existsAsFile is to make both checks without having to have 2 log statements. Otherwise you'll get a bunch of duplicate stat log statements.

@WebDrake
Copy link
Contributor

The point of existsAsFile is to make both checks without having to have 2 log statements. Otherwise you'll get a bunch of duplicate stat log statements.

The absence of those 2 log statements was something that you were complaining about in the previous PR. Can you please clarify?

@marler8997
Copy link
Contributor Author

I was complaining about the logging accuracy. File operations that were being done with NO log statements at all. Combining 2 log statements for the same 2 operations is not a problem.

@WebDrake
Copy link
Contributor

What you said at the time was:

Once I did this, I noticed that rdmd was doing multiple stat operations one after the other on the same file. Anyone looking at the logs would immediately question why it was doing the exact same thing twice.

... which seemed to suggest this was the focus of your problem.

Since you now say it's other things that were concerning you, can you give some specific examples of the missing yap statements?

The concern I have is that in some cases it might be deliberate that nothing gets reported.

@marler8997
Copy link
Contributor Author

marler8997 commented Jan 25, 2018

Since you now say it's other things that were concerning you, can you give some specific examples of the missing yap statements?

Look at the code yourself. I don't want to spend the time to explain every little thing to you.

@WebDrake
Copy link
Contributor

Look at the code yourself. I don't have to to explain every little thing to you.

Gosh, you are a charmer.

You are the one who is trying to justify making a change to rdmd. It's therefore in your interest (as well as common courtesy) to try to provide as much detail as possible to your reviewers about what you want to change and why.

I'm asking you a polite question to try to better understand your perception of why this change is necessary, so that I can offer more constructive advice. I've also given a motivation for that question which is quite reasonable. So please, just answer it. I'm asking you to save me time by telling me what you already know, instead of wasting my time by asking me to second-guess what your concerns are or what bits of the code you're really trying to fix.

Your refusal (in the previous PR) to answer my questions wound up wasting not only my time, but that of two other reviewers. We're all volunteers here, so our time is precious and limited; anything you can do to save it matters, and anything you do that deliberately wastes it is frankly antisocial.

You've already behaved towards me, on multiple occasions now, with no small amount of arrogance, rudeness, and personal attacks. Despite that, I've consistently offered constructive and detailed feedback on your code and concerns -- feedback that seems to have been useful to you, since this PR takes ideas from it. So I think that, just maybe, it might be reasonable to expect you to start engaging a bit more constructively yourself.

@WebDrake
Copy link
Contributor

I'm asking you to save me time by telling me what you already know, instead of wasting my time by asking me to second-guess what your concerns are or what bits of the code you're really trying to fix.

By the way, if the concern really is about missing yap statements, then there is a much simpler solution than this PR.

Just add them.

That will fix the immediate problem, in a completely non-intrusive way. It will also be much more easy to spot if there are any cases where the omission may be deliberate. And then, after that, we can have a much more effective discussion about how best to refactor the code to avoid relying on developer virtue in reporting chatty output.

@marler8997
Copy link
Contributor Author

I don't mean to be rude. I just don't have time to explain everything. It's a waste of both our times. In my opinion, you're discussing this PR much more than it needs to be. Please limit your suggestions as much as you can. Since I can't spend all the time it takes to address all of your questions, please consider making this change yourself. It will save both of us time.

@WebDrake
Copy link
Contributor

OK. I can understand limited time and its frustrations.

I do strongly advise, though, that in that case you try to communicate much more up-front about the specific reasons and thinking behind your PRs. It takes a little more time when preparing them, but saves a LOT of time (yours too!) when they are being reviewed.

@marler8997
Copy link
Contributor Author

Could we get this one merged? The new logic for rdmd I'm working on is quite complex and fixing the logging would be very helpful.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 3, 2018

The bottom line of this is that it feels like a more intrusive change than is necessary, without a really good sense of the impact it has.

The newly introduced functions are not just well-defined verbose wrappers for std.file functions: some of them intertwine with the dryRun settings, one of them combines two options (divorcing us from the wrapper-of-std.file logic), and their usage elsewhere in rdmd.d is inconsistent (sometimes you use the wrappers, sometimes std.file functions directly). So, without a more careful think through what is wanted and why, it seems more confusing than helpful.

I'd suggest just creating a new PR with the missing yap statements added directly. That's simple and doesn't change anything about the app logic. We can then come back at some later time to work out if it's worth reworking how dry-run and chatty behaviour is managed.

It's better to make the simplest change to get the desired result, rather than add to the existing complexity of rdmd.d.

@marler8997
Copy link
Contributor Author

I respectfully disagree with just about everything you said. If a maintainer could take a look and either merge or provide improvement suggestions it would be great to get this logging fixed.

@marler8997 marler8997 changed the title Improve logging accuracy and make it harder to violate. (Option 2) Improve logging accuracy Feb 3, 2018
@marler8997 marler8997 force-pushed the rdmdFixLogging branch 2 times, most recently from 60cfac1 to 48db854 Compare February 3, 2018 15:55
@WebDrake
Copy link
Contributor

WebDrake commented Feb 3, 2018

Yes, it would be good to get a third viewpoint on this PR.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 3, 2018

FWIW, I'm not opposed to the broad thrust of what you're trying to achieve here. I just think it'd be a good idea to distinguish between the short term need (fixing missing verbose output) and the long term one (providing a more robust and consistently reported structure for rdmd's interactions with the filesystem).

@wilzbach
Copy link
Contributor

wilzbach commented Feb 6, 2018

It's a shame there's no set -x for compiled languages. Imho this PR is worthwhile as it moves all these littered Yap statements in a common block.

Ideally it should could/should use static foreach or opDispatch (I personally wouldn't mind seeing remove in the log instead of rm - what do you think @CyberShadow?)

In general don't think we should waste too time on such bike-shedding discussions. Let's tackle the big things, not the color of the shed!

@WebDrake
Copy link
Contributor

WebDrake commented Feb 6, 2018

Imho this PR is worthwhile as it moves all these littered Yap statements in a common block.

It doesn't though (it moves some, but not all). That's the problem :-(

I agree the aim is worthwhile but I think the issue of chatty output needs a more comprehensive and considered think-through to try to separate out all the concerns. I'll try to file an issue or two on this in the next days.

I don't agree that this is bikeshedding -- code consistency and clarity is not a trivial issue.

@wilzbach
Copy link
Contributor

wilzbach commented Feb 6, 2018

I don't agree that this is bikeshedding -- code consistency and clarity is not a trivial issue.

I agree in general, but from my experience on the effort I spent on introducing and improving the code consistency at phobos, the time is better spent on fixing actual bugs.

I agree the aim is worthwhile but I think the issue of chatty output needs a more comprehensive and considered think-through

I agree that if the yap struct introduces new functions that might have its downsides as a reader isn't familiar with them, but then again existsAsFile is called twice so while not a huge refactoring, it does remove duplicated code.

Anyhow independent to the discussion of whether existsAsFile is worth the troubles,
I'm on the side of using static foreach or opDispatch to generate these forwarding.
There is also the possibility to do yap!exists which would work too.

@CyberShadow
Copy link
Member

I left a review on #295, most of which applies to this PR as well:
#295 (review)

@marler8997 marler8997 force-pushed the rdmdFixLogging branch 2 times, most recently from 2cf20c2 to d556590 Compare February 6, 2018 04:17
@marler8997 marler8997 changed the title (Option 2) Improve logging accuracy Improve logging accuracy Feb 6, 2018
Copy link
Member

@CyberShadow CyberShadow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you'll do another pass, a few more nitpicks:

struct Filesystem
{
static:
static auto opDispatch(string func, T...)(T args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static on function is redundant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the redundancy is ok....if this struct had really big functions, and you couldn't see the static: modifier, having static in the function definition is helpful. But if you really don't like the redundancy, I'll remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK, I didn't realize it was intentional.

rdmd.d Outdated
/**
Update a file's timestamp.
*/
static void touch(string name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto and below

rdmd.d Outdated
Operates on the file system without logging its operations or being
affected by dryRun.
*/
struct DirectFilesystem
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we shouldn't be using this directly, maybe move it under Filesystem?

@wilzbach
Copy link
Contributor

wilzbach commented Feb 6, 2018

BTW I created #tools on Slack in case you want to have a more real-time discussion (it even supports calls :P)

@joseph-wakeling-sociomantic

BTW I created #tools on Slack in case you want to have a more real-time discussion (it even supports calls :P)

Unfortunately I can't get on dlang slack right now :-\

Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it 👍
Thanks for all the hard work!

@joseph-wakeling-sociomantic any final words or can we focus on more impactful PRs? ;-)

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

Well, my objections haven't changed, but if everyone else approves I don't see any reason to stand in the way.

I still strongly dislike the introduction of existsAsFile, not because of 1:1 matching with syscalls, but because it takes away from 1:1 matching with how one would do things if just using std.file functionality directly. I really don't see the point in introducing a new function/method just to avoid an extra line of yap output. touch is different because at least it does something reasonable and unique.

I think the approach taken here can be tightened up further and made into a more rigorous "safety wrapper" around std.file functionality (cf. earlier remarks about guaranteeing dry-run by default), but I guess I can always follow up on that later.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

(Basically, I don't see the point in writing more code for a less accurate and informative outcome. And I don't think I've ever had actually had an explanation for why that's worth doing.)

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 8, 2018

And I don't think I've ever had actually had an explanation for why that's worth doing.

Must be "goldfish syndrome". Here's 4 times I've explained it:

the point of the logging is to tell the developer what's going on. Yes rdmd actually calls stat twice, but that detail isn't important when it comes to see what rdmd is doing "functionally".

The point of existsAsFile is to make both checks without having to have 2 log statements. Otherwise you'll get a bunch of duplicate stat log statements.

I was complaining about the logging accuracy. File operations that were being done with NO log statements at all. Combining 2 log statements for the same 2 operations is not a problem.

The reason for it is to reduce noise in the log that doesn't help understand what rdmd is doing. That's it...it's that simple.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

Yes rdmd actually calls stat twice, but that detail isn't important when it comes to see what rdmd is doing "functionally".

No, but if the detail isn't important, it's also not important enough to write extra code to prevent it happening.

The point of existsAsFile is to make both checks without having to have 2 log statements. Otherwise you'll get a bunch of duplicate stat log statements.

I'm aware that that's the point of it. I'm suggesting that's a bad tradeoff (more code for a less comprehensive result).

Combining 2 log statements for the same 2 operations is not a problem.

No, but why put effort into actively creating that situation when you don't have to?

The reason for it is to reduce noise in the log that doesn't help understand what rdmd is doing.

That it's "noise in the log" is a contention ... not a fact. (For example, part of the reason it may seem "noise" is because yap is asked to report stat for both exists and isFile checks. If instead it was asked to report exists and isFile directly -- and why not? -- it would be more useful and informative.)

@marler8997
Copy link
Contributor Author

Ok it looks like we disagree in this case, and that's ok.

I don't have a problem with disagreement, but I do have a problem with the way you discuss these disagreements. You tend to bring up a few valid points, but then sprinkle in demonstrably false claims and inferences (i.e. no one has explained to you the reason for existsAsFile). You're very quick to make assertions without knowing all the details, declaring how things should be done without first hearing the other side. As tails would say, "dis is not da wae". I hope our conversation can be more constructive in the future.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

I didn't say that no one had explained to me the reason for existsAsFile. I said that I didn't think I'd been given an explanation for why it was worth writing more code for a less informative and accurate outcome.

@marler8997
Copy link
Contributor Author

I didn't say that no one had explained to me the reason for existsAsFile. I said that I didn't think I'd been given an explanation for why it was worth writing more code for a less informative and accurate outcome.

Yes I suppose that's true. So you were saying that the reasons I gave weren't enough to justify a new function with 1 line of code. Ok, glad we got that one cleared up.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

So you were saying that the reasons I gave weren't enough to justify a new function with 1 line of code.

Yes, especially given that the introduction of this function creates a situation where there are 2 different ways of doing the same thing instead of 1 (i.e. a dev can do either Filesystem.exists && Filesystem.isFile or Filesystem.existsAsFile). Given that both options are used even within this patch, it would be nicer to just have one consistent way of doing this for every use case.

@marler8997
Copy link
Contributor Author

@WebDrake We've already established that we fundamentally disagree on this. Rehashing the arguments doesn't help, it just wastes time. I would suggest learning when to move on.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

We've already established that we fundamentally disagree on this.

These discussion points aren't just made for you. It matters that other reviewers understand why there is a disagreement, because this helps them make a more informed decision.

There's a slightly bigger picture line of thought I'm having, which it might be good to explain. I'm considering whether it might be possible to replace the encapsulation-breaking yap usage, and instead of "syscall-like" output, give a higher-level description that will be accurate on both Windows and Posix systems. The most obvious higher-level description is to just yap the names of the std.file functions that are invoked. And then, separate exists and isFile reports start making sense (more sense than two stats).

So, I'd like to ask you to not impose something that I'll have to either unpick or separately extend if I want to do that.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

I would suggest learning when to move on.

I'll just add: I've already stated above, "If everyone else approves I don't see any reason to stand in the way."

Maybe it would be a good idea to drop the ad-hominem?

@dlang-bot dlang-bot merged commit 9ceca03 into dlang:master Feb 8, 2018
@marler8997
Copy link
Contributor Author

If attempting to promote constructive discussion is "ad-hominem", then I think it's a good idea to do so every once in a while when discussion "devolves" :)

I hope there's no ill feelings and we can continue to work together. I'm glad you're looking into adding more testing, that's a great way to help. We may differ on some things but I can see that you're a bright person, even though I think some of your effort is a bit misguided. Anyway, I hope our discussions are more productive in the future.

@WebDrake
Copy link
Contributor

WebDrake commented Feb 8, 2018

Thanks for taking the time to write this. No hard feelings -- constructive disagreement isn't a bad thing, and I always have a lot of gratitude for anyone who takes the time to work on any part of D. I look forward to those productive future discussions :-)

@marler8997 marler8997 deleted the rdmdFixLogging branch September 26, 2019 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants