Skip to content

Isolate tests to one file using HAR#7905

Closed
marler8997 wants to merge 1 commit intodlang:masterfrom
marler8997:har
Closed

Isolate tests to one file using HAR#7905
marler8997 wants to merge 1 commit intodlang:masterfrom
marler8997:har

Conversation

@marler8997
Copy link
Contributor

@marler8997 marler8997 commented Feb 16, 2018

The HAR format allows a test with multiple files to be contained within a single file. This makes it easy to see all the files involved in a single test, making it easier to understand the test and modify it. This also allows tests to use simple filenames without having to worry about stomping on files from other tests.

I've noticed that many of the tests are made up of multiple small files...having the ability to manage them in one file seems easier than dealing with a bunch of little files.

Also note that using the HAR format would allow one to copy/paste a tests with multiple files to/from online posts. A nice convenience.

HAR: https://github.com/marler8997/har
forum discussion about HAR: https://forum.dlang.org/thread/ndgdqraxjkuvfsjhewgy@forum.dlang.org

@dlang-bot
Copy link
Contributor

dlang-bot commented Feb 16, 2018

Thanks for your pull request and interest in making D better, @marler8997! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + dmd#7905"

@marler8997 marler8997 force-pushed the har branch 4 times, most recently from 5ad1011 to ab415ad Compare February 17, 2018 21:32
@MartinNowak
Copy link
Member

We do use ---- in ddoc to delimit code examples, any chance of conflict?
Somewhat of a nice idea, but it's also a new format that needs to be learned, and using files from imports and extra-files has hardly been a problem so far.
I do like the perspective of self-contained test-cases that might be copy+pasted from Bugzilla, and maybe publishing d_do_test to a small dub app to run them.
So if you can teach&convince lots of people to use this format (maybe a forum post), that would definitely pan out in the long-term.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 17, 2018

We do use ---- in ddoc to delimit code examples, any chance of conflict?

The default HAR delimiter requires 3 dashes, a space (ascii 0x20) and then a filename character. This doesn't conflict with ddoc as far as I can tell. However, if a test did have a conflict, HAR allows for a file to use it's own delimiter (https://github.com/marler8997/har#custom-delimiters) even though the standard --- delimiter should be used whenever possible.

Somewhat of a nice idea, but it's also a new format that needs to be learned, and using files from imports and extra-files has hardly been a problem so far.

Not a big problem but a small annoyance for me. Managing tests with multiple files is a pain sometimes. A pain to know all the files involved in the test, a pain to make tests with multiple files. I'm going to need to add some tests pretty soon with up to 5 small files (2 to 5 lines each) and I'm not looking forward to that. I'll probably need to add 5 new tests, each with their own set of files...that's like 25 files...bleh (this is to test the new import order invariance stuff...requires test cases with multiple files).

So if you can teach&convince lots of people to use this format (maybe a forum post), that would definitely pan out in the long-term.

Yeah there's a forum link in the description. @timotheecour and I just came up with the format a couple days ago so it's too soon to know what people think.

@JinShil
Copy link
Contributor

JinShil commented Feb 18, 2018

I very much like the idea of putting multiple modules in a single file. IMO the strict mapping of one module to one file is an arbitrary limitation that doesn't need to exist.

This PR highlights an interesting use case for adding multiple modules to one file, namely making it easier to write, read, and maintain our tests. The ability to create these file hierarchies in a forum post, and the ability to use such examples at http://run.dlang.io to demonstrate and share ideas is another.

I also had an interesting use case that would have benefited from having the ability to put multiple modules in a single file. I wrote a memory-mapped IO library in which I was trying to model peripherals, registers, and bitfields in a memory-mapped hierarchy. Everything about the memory map is known statically. I wanted to use modules in D for this to create a device tree (https://en.wikipedia.org/wiki/Device_tree). However, if would have used modules for that, I would have ended up with 100s of really small files. So, instead, I simulated it with an abstract final class; let the ridiculousness of that sink in for a minute. In the end the abstract final class worked out well, but I think it would have be more elegant to just remove the arbitrary limitation of mapping modules to files so we can have a more flexible and convenient language, in general.

Using the "har syntax" demonstrated in this PR is an interesting approach; it's essentially introducing a preprocessor, which will probably not sit well with @WalterBright. I would prefer if we could address this more fundamentally in the language, rather than introduce such a language "add-on".

Here's what I'm thinking:

package std                // maps to directory `std` and `std/package.d`
{    
    // code

    package algorithm      // maps to directory `std/algorithm` and `std/algorithm/package.d`
    {
        // code

        module comparison  // maps to `std/algorithm/comparison.d`
        {
            // code
        }
    }
    
    module csv             // maps to `std/csv.d`
    {  
        // code
    }
}

Something like this would probably require a DIP, and probably be a much more difficult implementation than what's been done in this PR. But that's the cost of due diligence, and the language would be better off with it, IMO.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 18, 2018

Not bad ideas. But I do want to say that this is definitely not a preprocessor nor a language add on. The compiler knows nothing about the HAR format, it only exists as a tool in the test suite.

Changing the language to support multiple modules in a file is a much larger endeavor...something that would be great to investigate, but I don't want to be conflated with this effort. Even if that feature was implemented, it would add a new mechanism to pass modules to the compiler, but the current mechanism of putting modules in different files would still need to be tested, and that's what HAR can be used for. These are essentially orthogonal features, even though both can be used to put multiple modules in a single file.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 18, 2018

BTW, if the language were changed to support multiple modules per file, your approach makes much more sense than using HAR as a preprocessor. Your approach limits the feature to only allow nested modules, which provides more flexibility without losing the intuitive organization which makes it easier to find source code. It also wouldn't allow weird stuff like making files whose module names are different from their filenames.

@JinShil
Copy link
Contributor

JinShil commented Feb 18, 2018

What about modifying HAR to use the syntax I proposed?

@marler8997
Copy link
Contributor Author

HAR is a general syntax for representing multiple files in a single block of text. Not specific to any language or format. The syntax you presented is for representing D modules in a single file.

@JinShil
Copy link
Contributor

JinShil commented Feb 18, 2018

Ok, then, perhaps we could make DAR (D programming language Archive Format) and use that instead of HAR.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 18, 2018

Sorry I'm failing to see the reason...are you wanting to add a new file format that the compiler can read multiple modules from? or are you talking about a file format for the test suite to represent multiple files? Keep in mind, this isn't just for D files, you could put JSON files/html files/BASH files, etc

@JinShil
Copy link
Contributor

JinShil commented Feb 19, 2018

What I'm proposing is to replace the --- modulename.d syntax with the syntax module modulename { }, but otherwise keep everything else basically the same. Let's call this syntax "DAR" for the sake of this discussion.

I realize that implementing the DAR syntax in the language would be a major undertaking requiring a DIP and a number of other hurdles. So, to keep with the spirit of this PR, while also setting a precedent for a potential language modification, change the implementation, currently in har.d, to support the DAR syntax instead.

That is, instead of the ---- modulename.d syntax, modify har.d (could be renamed to dar.d) to support the syntax module modulename { }. It should produce the same output as any equivalent HAR file.

The parser would probably be more difficult to implement, but it would set a precedent that could motivate a future DIP for incorporating the syntax into the language with potential future benefits such as having accurate line numbers in error messages, just to name one.

The syntax can still be used for the same use cases as HAR (at least in the context of D) For example: improving our test implementation as demonstrated in this PR, sharing code in forum posts, illustrating concepts in DLang Tour, and sharing code and ideas through http://run.dlang.io, just to name a few.

Yes, DAR would only be useful for D source code, unfortunately. Though HAR could still be used to archive DAR files together with the other file formats HAR would support.

@marler8997
Copy link
Contributor Author

I'm not necessarily against these ideas, but they seem orthogonal to me. These 2 formats can coexist and serve different purposes. I don't want this idea to be conflated with a much larger endeaver.

@JinShil
Copy link
Contributor

JinShil commented Feb 19, 2018

The two ideas (HAR and DAR) are orthogonal in that they can coexist with one another, but, in the context of D and in the spirit of this PR as demonstrated in importorder.har, they overlap in use case and purpose (e.g. single-file tests, sharing code in forum posts, illustrating concepts in DLang Tour, and sharing code and ideas through http://run.dlang.io, to name a few)

@marler8997
Copy link
Contributor Author

They would be testing different things. All har does is archive files. Your vision for DAR would be to have the compiler read it directly, but thats different than testing multiple files. These are different use cases.

@JinShil
Copy link
Contributor

JinShil commented Feb 19, 2018

I don't envision the compiler reading the .dar files. At this time, I'm simply proposing replacing the --- modulename.d syntax with module mymodule { }, keeping everything in this PR essentially the same.

Perhaps at some time in the future the DAR syntax can be incorporated into the compiler, so .dar files would no longer be needed, and we could simply use ordinary .d files. But, I'm not proposing that at this time, for this PR.

@marler8997
Copy link
Contributor Author

I'm going to have to respectfully disagree on this one. I think your syntax idea could be a good addition to the D language, but I dont see any benefit in its use here.

@wilzbach
Copy link
Contributor

@JinShil have a look at the HAR parser. It's super-simple at the moment and still general purpose. Your idea of DAR requires at least a D lexer.

Anyhow, I like the idea and imho it would be nice to have it on dlang.org, DFeed and Bugzilla, s.t. all these sites have an easy way to send multiple files to run.dlang.io while still being readable for humans.

(and of course on the test suite here too)

@marler8997 marler8997 force-pushed the har branch 2 times, most recently from fffa7cc to 35d2e20 Compare February 20, 2018 03:04
@marler8997 marler8997 changed the title [RFC] Isolate tests to one file using HAR Isolate tests to one file using HAR Feb 20, 2018
@marler8997 marler8997 force-pushed the har branch 2 times, most recently from 647de54 to 1d633cf Compare February 20, 2018 03:57
@wilzbach
Copy link
Contributor

Anyhow, I like the idea and imho it would be nice to have it on dlang.org, DFeed and Bugzilla, s.t. all these sites have an easy way to send multiple files to run.dlang.io while still being readable for humans.

Turns out it's really simple to do it for run.dlang.io (and thus the DTour, DSpec, and Phobos documentation:

https://run.dlang.io/is/uearnw

-> I vote in favor of giving this a try. How else are people going to know about it and use it?

Also the implementation of har is really simple and it wouldn't be much work to extract all har tests to their files in the source code if we ever decide to drop it in the future.

@JinShil
Copy link
Contributor

JinShil commented Feb 21, 2018

I don't object at all to implementing this feature, and am even willing to pull it as is, but I recognize that the only reason this is required is because of a flaw in the design of the language.

The fact that the source code must mirror the way it is physically stored on the file system is an arbitrary limitation. HAR would not be required if the syntax I proposed earlier were incorporated into the language. Even the shell script tests would be able to generate their requisite files using the shell language itself.

I am well aware that doing the work to incorporate said syntax into the language would be a major undertaking, and the HAR solution is vastly simpler by comparison, but that is the cost of due diligence.

If I'm the only one that sees this, then so be it; it is on me to do something about it, So, I'll support this HAR format as it does indeed provide a simple and immediate solution to a number of inconveniences caused by the aforementioned design flaw.

@marler8997
Copy link
Contributor Author

HAR is useful even if the language supported multiple modules per file. It allows you to create tests with mutiple files. This is useful even if your syntax was integrated.

@JinShil
Copy link
Contributor

JinShil commented Feb 21, 2018

I noticed this PR creates a new directory for the HAR code: test/archive/har.d Can we just put har.d in the root of test, instead? I don't see the need for a separate directory to house one file.

@marler8997
Copy link
Contributor Author

marler8997 commented Mar 19, 2018

Rebased to see if the OSX failure goes away. This one is more curious since only test files were modified but the failure occurs when compiling dmd-unittest which doesn't use any test files as far as I'm aware.

@ibuclaw I didn't quite understand what you were trying to say. You mentioned something wouldn't work on your side..what were you referring to? I also wasn't sure what you were trying to say when talking about how this is adding more time to the test suite...?

@marler8997
Copy link
Contributor Author

Rebase didn't fix it...it looks like there's a bunch of failures with OSx 32 bit right now. Maybe some change got into the compiler that usually fails OSx 32 bit but got lucky and made it through integration testing...

@ibuclaw
Copy link
Member

ibuclaw commented Mar 20, 2018

@ibuclaw I didn't quite understand what you were trying to say. You mentioned something wouldn't work on your side..what were you referring to?

Granted that main affiliation is GDC, what else would I be referring to? ;-)

I also wasn't sure what you were trying to say when talking about how this is adding more time to the test suite...?

This will slow down the testsuite run time. We've already dumped TravisCI because the time it took to finish slowed down so much we couldn't function anymore. There's just a wish of not repeating the same with SemaphoreCI as there probably won't be a faster platform to turn to next time.

@marler8997
Copy link
Contributor Author

marler8997 commented Mar 20, 2018

GDC, what else would I be referring to?

I don't appreciate the "snarky" response. What specifically does this break in GDC?

This will slow down the testsuite run time.

By far the biggest time sink with the test suite is invoking the compiler. Especially when a test is permuting its arguments. I don't anticipate this having a noticeable impact on overall performance, but it is better to measure than theorize.

@ibuclaw
Copy link
Member

ibuclaw commented Mar 21, 2018

This is shared code and every change potentially breaks another compilers workflow. Just be aware of that.

@marler8997
Copy link
Contributor Author

This is shared code and every change potentially breaks another compilers workflow. Just be aware of that.

Oh thanks for telling me that...I had no idea.

@marler8997
Copy link
Contributor Author

I think you just like trolling my pull requests making bogus claims like "this is going to break things for me". Then when I ask what you to explain your claim you go silent or ignore the question.

Then you come back with a comment like "be aware that this code is shared between code bases".
How is it helpful to tell me something so obvious? Seems like you're just sidestepping the question because you know there's no substance behind your claim. Again...how is this breaking GDC? You made the claim, back up what you say or don't say it at all.

@ibuclaw ibuclaw added Review:Needs Work and removed Merge:72h no objection -> merge The PR will be merged if there are no objections raised. labels Mar 21, 2018
@marler8997
Copy link
Contributor Author

Ping

@JinShil
Copy link
Contributor

JinShil commented Apr 5, 2018

@ibuclaw it appears you have reservations about this and had labeled it as "Needs Work" What specifically needs to be done to make this acceptable, or are you against this approach in general?

It will likely add time to the test suite, but I don't know if the time would be significant.

Assigning to @ibuclaw for further consideration and clarification; how do we get to a yay or nay?

@wilzbach
Copy link
Contributor

wilzbach commented Apr 5, 2018

But do bear in mind that this will just add more time onto the testsuite run when - depending on the bug - doing as per this issue might be better.
https://issues.dlang.org/show_bug.cgi?id=18235

FWIW I think the costly bit are the runnable tests.
All 1283 fail_compilation tests compile in ~11s on my machine.
The 774 compilable tests are slightly slower with ~25s ... and you don't want to know how long runnable takes :/
So if someone wants to make the testsuite faster, the runnable tests are the real bad boys.
A simple optimization could be to only link druntime if nothing from Phobos is imported.

Anyhow, I don't think that a creating a few files will lead to a measurable effect - compared to all the crazy things DMD and the linker do. I think @marler8997 even measured it once (?), but I can't find the reference right now.

@marler8997
Copy link
Contributor Author

I analyzed timing on the autotester once upon a time (in #7099). The trick was to make sure you were comparing the same machine.

@marler8997
Copy link
Contributor Author

@ibuclaw could you let me know what you were wanting to be changed? This PR is one that I'm waiting for before I add more testing (i.e. tests for import order invariance which require many small files per test). Thanks.

@marler8997
Copy link
Contributor Author

Pinging @ibuclaw

You last activity on this PR was 20 days ago where you added the "Needs Work" label without clarifying what needed work. Now this PR has been assigned you to and I'm waiting for your feedback. This is blocking other work, please let us know what you'd like changed.

@marler8997 marler8997 force-pushed the har branch 3 times, most recently from 40c0a1c to 47036d3 Compare April 10, 2018 20:44
@ibuclaw
Copy link
Member

ibuclaw commented Apr 11, 2018

I could see this being useful on run.dlang.io, so you should try submitting it there.

Regarding inclusion here, the testsuite is not the place to add more tooling into the mix without good reason, to which I can only see negative points to this.

  1. Testing single test that has been "packed" becomes non-trivial.

  2. Makes the test run more complex by adding in a new step, rather than taking one away.

  3. Said step exposes a new point of failure that didn't exist before.

  4. Intention for the change is sketchy.

  • Is the idea is to make it more clearer to a reviewer what are all the files are in a single test?
    • Can be solved in another way that doesn't alter the way tests are written.
  • Is the idea to make it easier to write/manage tests?
    • This is not a problem that needs solving.
  • Is the idea to obsolete the EXTRA_SOURCES test runner option?
    • No, no, NO!
  1. Breaks downstream for no reason or gain. Files that were once useable can no longer be tested within the testing frameworks used by downstream compilers, i.e: this bespoke format will have to be implemented in dejagnu so that remote execution and emulation continue to work.

This is not something worth pursuing.

@ibuclaw ibuclaw closed this Apr 11, 2018
@marler8997
Copy link
Contributor Author

@ibuclaw

You are free to disagree with this PR but closing it with multiple people in support of it is unreasonable. It's unfortunate that we can't come to an agreement which means we have to escalate this decision up the tree.

I'm not sure if I can "unclose" this PR...am I able to do that or do I have to create another PR?

@andralex andralex reopened this Apr 11, 2018
@andralex
Copy link
Member

Hi folks, please don't close one another's PRs. I know from personal experience it creates a lot of animosity instantly, even when done by folks with authority such as Walter himself. What I do after a number of mistakes was to discuss with people so they close their PRs instead of me closing them. Thanks. I'm reopening this as a courtesy.

That said, I agree with @ibuclaw 's assessment. This PR is illustrative of a trend that goes somewhat sideways relative to what we want. The trend has some of the following:

  • Large code reorganizations of which benefit is tenuous to argue, and therefore needs vigorous argumentation.
  • Change is labor-intensive but of the mechanical kind, often doable with a simple tool.
  • Change reflects more of a preference of the author, than a universally desirable trait that all reasonable people agree with.

We are looking for the opposite:

  • Small code changes of obvious and high impact
  • Change is creative and ingenious
  • Change is motherhood and apple pie: make all of this state private, improve information hiding, improve modularity, decouple modules, measurably improves a desirable metric, etc.

I've reopened this in the spirit of fair play but suggest it be closed by the author.

@marler8997
Copy link
Contributor Author

@andralex thanks for taking a look. I think all arguments have been presented so I'll accept your decision to close.

@marler8997 marler8997 closed this Apr 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants