-
-
Notifications
You must be signed in to change notification settings - Fork 667
fix Issue 10441 - Static libraries too big #2550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today, Module::root field is not used properly. Instead, you can check whether a module m is a root by using m->importedFrom == m.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I'll deal with this after it passes the test suite, maybe refactor things a bit.
|
In #endif
// Instantiate on the root module of import dependency graph.
Scope *scx = sc->push(sc->module->importedFrom);
+ scx->instantiatingModule = sc->module->importedFrom;
ti->semantic(scx);
... |
|
I think that this PR is a right direction. |
Thanks, @9rnsr, I thought it would likely be something like that but had been procrastinating narrowing it down. I just uploaded your fix. |
|
@WalterBright, during I've investigated around template semantics, I got one question about instantiation and codegen strategy. When two modules exist like follows: module a;
import b;
void main() { A!().foo(); }module b;
private alias B!() ti_b;
template A() { void foo() { B!().bar(); } }
template B() { void bar() {} }
Obviously, libaray code could have such a private alias, e.g. for code readability. However currently it would change the compilation behavior on library user side. I'd like to ask you, is this designed behavior, or a problem of unfinished dmd implementation? |
|
It's designed to work that way. The scheme relies on the idea that somewhere, module b is going to be compiled as a root module, which will instantiate B!(), and then b.obj will be linked in. |
|
Good stuff! There's incredible potential in this direction. |
|
Not good... In more complicated case, the design would cause (mostly) unintended link-failure. module a;
import b;
import c;
void main() { A2!().foo(); }module b;
import c;
alias a1 = A1!();
template A1() { alias b = B!(); }module c;
template A2() { void foo() { B!().bar(); } }
template B() { void bar() { C!().baz(); } }
template C() { void baz() { } }
The critical issue is that the link-failure occurrence depends on the semantic analysis order. When you remove Walter, I and you have much knowledge about compiler implementation, therefore we can understand the behavior. But it would be impossible for most D programmers. |
|
The error is in: because nowhere in that command was b.d and c.d compiled and linked in. This PR relies on at some point the programmer is going to compile all the modules and link them in, not just 'import' them. |
|
After few months wasted on experiments in that direction it is great relief to see someone knowledgeable to pay attention to this problem :) I have some concerns about this change in regard of separate compilation though, testing right now. |
|
@WalterBright This fails: module a;
import b;
void main()
{
boo(1);
foo();
}module b;
public import c;
auto foo()()
if (boo(1))
{
return 1;
}module c;
auto boo(T)(T t)
{
return 1;
}$ dmd -c a.d
$ dmd -c b.d
$ dmd -c c.d
$ dmd a.o b.o c.o
a.o: In function `_Dmain':
__entrypoint.d:(.text._Dmain+0xa): undefined reference to `_D1c10__T3booTiZ3booFNaNbNfiZi' |
|
P.S. |
|
Here's a few tests, with build times and static library/object file size printouts:
import std.stdio;
void main() { }2.063.2: 2.064 git-head: Pull 2550:
2.063.2: 2.064 git-head: Pull 2550:
Without unittests: 2.064 git-head: Pull 2550: With unittests: 2.064 git-head: Pull 2550:
2.063.2: 2.064 git-head: Pull 2550:
2.063.2: 2.064 git-head and Pull 2550 - Can't be built, I get: I hope someone can fix this so we can test GtkD against this pull, GtkD was why the initial bug report was made.
2.063.2: 2.064 git-head: Pull 2550: Notice that the average build times when using 2.064+ are twice as slow than using 2.063.2, except for the outlier WindowsAPI which is much faster in 2.064+. This is Issue 10866 which I've filed a while ago. Even with all the recent optimizations in the compiler we've still made 2.064 slower for building many libraries (this is unrelated to this Pull 2550). However the object files do seem to be smaller with this pull. |
It does look worrying that even with this pull, 2.064 compiles 2x slower than 2.063.2... |
Yes, it does. I'd like to see this as a separate bugzilla issue. At least bisect needs to be done to see what caused it. |
|
@Dicebot fixed and added test case. |
Yeah I'm bisecting right now. |
|
I've reduced it to a Phobos commit: If you check out the following commits for DMD+Druntime+Phobos: Then compiling the following: Takes ~1270 msecs. If you checkout one Phobos commit earlier (which is |
|
Btw I can also reproduce this commit is the offending one with other libraries, like DCollections. I'm pretty sure most libraries import |
|
Thanks @Dicebot . The new "instantiatingModule" looks promising for fixing: http://d.puremagic.com/issues/show_bug.cgi?id=9948 & the linking issues you are working on. I'll check it out tomorrow. |
|
Thank you, @AndrejMitrovic , can you please post your findings to a Bugzilla issue as a regression? |
|
Thank you, that'll do nicely. |
|
Let's merge! Pull away, me hearties! |
|
This is exactly the type of pull that I would make in a different branch, then posted a binary of it online, posted in the newsgroup saying "Please try this new experimental compiler build, it's much faster", and receive more bug reports and usage reports for several weeks before merging it into master. But then again I'm not the puppet-master here. Let's do it the old fashioned way and break people's code, let them receive Optlink errors. |
|
Anyone pulling HEAD has to expect the possibility of breakage. We need this pull, and the only practical way to get some mileage into it before release is to put it in HEAD. |
|
@WalterBright |
|
@Dicebot sure, no problem. |
|
@Dicebot not entirely true. There's several tests that use separate compilation w/in the dmd test suite: grep -ir separate test// (I forget the exact tag used). |
The idea is that if an imported module generates the instantiation, then we don't have to generate it for a root module. So, if the instantiator is already listed as a root, and the new instantiator is not root, switch it to the new instantiator. The logic can be shortened to remove the second clause. |
My plan for that, as mentioned earlier in this thread, is to do semantic analysis lazily rather than eagerly. |
|
The issue with that is that if a template is instantiated in both an imported module and a root module, and they both import each other, both will determine that the other is supposed to be the one generating the instantiation, and it will never be generated. The solution to that is to have both generate it. A test that should cause the issue: Module 1: module a;
import b;
struct SomeStruct(T)
{
T field;
T getInnerField()
{
return field;
}
}
immutable globalField = SomeStruct!string("Hello!");
void main()
{
globalField.getInnerField();
anotherGlobalField.getInnerField();
}Module 2: module b;
import a;
immutable anotherGlobalField = SomeStruct!string("Hello Again!");Then compile them with: dmd -c a.d
dmd -c b.d
dmd a.o b.oThat should produce an undefined reference to |
I can't duplicate any error with that, but of course, I am not using the precise gc. |
|
Thank you, @Orvid , that makes sense. I'll figure out a fix. I can duplicate the error, thanks also for the nice test case. |
That won't help in throwing out the code that was semantically analyzed during contraint checking and CTFE. |
Actually, using the druntime patches for the precise GC seems not enough, it probably also needs patches for better RTInfo (#2480). |
Also it won't help to eliminate one of most most problematic sources of code bloat in template-abusing programs - unused bodies of template symbols that got inlined everywhere. I guess it can't be done without some feedback from backend to frontend. |
|
Another way that could decrease the final size is something that no compiler I know of does currently, and would definitely need to have it's own switch, and that's merging duplicate method bodies. It would of course be a backend change, because you'd be comparing the hash of the generated assembly code, and simply refer to only one of the generated bodies, never actually writing the duplicates to file. The only thing is, it's a bit more complicated than it would sound at first, due to the fact you have to account for any relative addresses referenced in the method that point to a location outside of the bod of the method. You'd also have to account for the differences in absolute addresses generated for call instructions to the current method. (methods that call themselves) |
|
@Orvid I actually have sort of a design for that, but it's in the front end. |
Shouldn't that belong into the linker?
I still find it problematic, that we generate different objects for |
|
Looks like this has broken vibe.d: import vibe.d;
void handleRequest(HTTPServerRequest req, HTTPServerResponse res)
{
}
shared static this()
{
listenHTTP(new HTTPServerSettings(), &handleRequest);
}on master results in $ rdmd -L-levent -L-lcrypto -L-lssl -release -O -inline -J. -J./views/ -J../views/ -I/usr/share/vibed/source -version=VibeLibeventDriver source/app.d
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.8.1/../../../../lib/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
/tmp/.rdmd-1000/rdmd-app.d-9069AA21EB13E759B2C02C48539EC93D/objs/app.o: In function `_D4vibe4inet7message13decodeMessageFxAhAyaZAya':
/usr/share/vibed/source/deimos/openssl/ecdh.d:(.text._D4vibe4inet7message13decodeMessageFxAhAyaZAya+0x18b): undefined reference to `_D3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7Decoder8popFrontMFNaZv'
/tmp/.rdmd-1000/rdmd-app.d-9069AA21EB13E759B2C02C48539EC93D/objs/app.o: In function `_D3std6base6424__T10Base64ImplVa43Va47Z153__T7decoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7decoderFNaS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZS3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7Decoder':
/usr/share/vibed/source/deimos/openssl/ecdh.d:(.text._D3std6base6424__T10Base64ImplVa43Va47Z153__T7decoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7decoderFNaS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZS3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7Decoder+0x3d): undefined reference to `_D3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7Decoder6__ctorMFNaNcS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZS3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7Decoder'
/tmp/.rdmd-1000/rdmd-app.d-9069AA21EB13E759B2C02C48539EC93D/objs/app.o:(.data._D211TypeInfo_S3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7Decoder6__initZ+0x38): undefined reference to `_D3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7Decoder11__xopEqualsFKxS3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7DecoderKxS3std6base6424__T10Base64ImplVa43Va47Z153__T7DecoderTS4vibe4inet7message13decodeMessageFxAhAyaZAya79__T12FilterResultS544vibe4inet7message13decodeMessageFxAhAyaZAya9__lambda3TAxhZ12FilterResultZ7DecoderZb'
collect2: error: ld returned 1 exit status
--- errorlevel 1Trying to reduce (no idea how to do it properly). That may also be any other recent change to symbol emitting, as bisecting can't be run because earlier vibe.d release was subject to http://d.puremagic.com/issues/show_bug.cgi?id=11062 |
The inline ICE doesn't happen without -inline, so you can try that for bisecting. |
|
@dawgfoto good catch, thanks, will try it |
|
Now that Issue 10866 has been resolved, here are some new timings and size comparisons: Hello-world:2.063.2:392 msecs - 55 KB object file 2.064 git-head:606 msecs - 6 KB object file WindowsAPI:2.063.2:20_000 msecs - 7,082 KB static lib 2.064 git-head:7708 msecs - 4,361 KB static lib DCollections:2.063.2:420 msecs 2.064 git-head:594 msecs GtkD:2.063.2:37_711 msecs 2.064 git-head:31_519 msecs |
This fix is based on the just merged pull request: dlang#2550 by Walter Bright. It made this fix trivial (at least the test case in the bug report is working correctly now).
|
Hi guys. I think this caused a regression: https://d.puremagic.com/issues/show_bug.cgi?id=11863 Code works fine in CTFE, but fails later at runtime. If the code is copy pasted in place ( Can you give it a look? |
|
I finally discovered codegen bug in glue layer. |
To detect speculative instantiations in codegen phase, `tnext` is neccesary for the `test6()` case `runnable/link13350.d`. That's a big difference from the PR #2550.
I have given up complaining about importing std.stdio (and others) and getting a vast object file of bloat from templates that were instantiated simply by the import, and decided it was a dmd problem. This change addresses it by not generating code from functions that were instantiated by templates from imports, if the standalone import instantiates those same templates.
This should also speed up compiles.
The semantic analysis is still done. In the future I'd like to address this by going "full lazy" on the semantic analysis of imports (currently it is done "full eager"). Such will also produce a big compile time speedup.
This change has a risk of my having missed cases, the symptom of which will be undefined symbols at link time.
This only deals with functions. If it works out, we can extend it to not generating data, either.