Skip to content

fix Issue 14555 - ModuleInfo should weakly link against classes#4638

Closed
MartinNowak wants to merge 2 commits intodlang:masterfrom
MartinNowak:fix14555
Closed

fix Issue 14555 - ModuleInfo should weakly link against classes#4638
MartinNowak wants to merge 2 commits intodlang:masterfrom
MartinNowak:fix14555

Conversation

@MartinNowak
Copy link
Member

No description provided.

@MartinNowak
Copy link
Member Author

What a pity for the time spent on maintaining an outdated backend.

Ironically now that weak external symbols work it turns out that weak linking imported ModuleInfo (here) doesn't work.
I'd be glad for any tip on how to solve this, not sure why it works on Win32.

@MartinNowak
Copy link
Member Author

Correction, I do know why it works on Win32, b/c we're skipping weak external definitions there too.
https://github.com/D-Programming-Language/dmd/pull/4638/files#diff-baf275e112ad2dda18779c8c3c5ed9f9R1249

@MartinNowak
Copy link
Member Author

A solution would be to emit an undefined external reference to the ModuleInfo into every multiobj, that way the ModuleInfo get dragged into any binary that uses part of a module.

@MartinNowak
Copy link
Member Author

Requires dlang/druntime#1265.

@CyberShadow
Copy link
Member

I think this approach will unfortunately break many uses of Object.factory. I imagine that most of its use case is about instantiating classes that are never otherwise referenced.

Question, can we instead create a separate ModuleInfo-like construct just for Object.factory, which is only pulled in if Object.factory is called anywhere in the program, and otherwise GC'd by the linker?

@MartinNowak
Copy link
Member Author

I think this approach will unfortunately break many uses of Object.factory. I imagine that most of its use case is about instantiating classes that are never otherwise referenced.

Yeah, it might cause some trouble. The main use-case for Object.factory is instantiating classes based on runtime data though. And already today you can only load classes whose modules are linked.

Question, can we instead create a separate ModuleInfo-like construct just for Object.factory, which is only pulled in if Object.factory is called anywhere in the program, and otherwise GC'd by the linker?

Pretty difficult because the use of Object.factory is separate from the classes, maybe by emitting a special section that is only referenced by the factory. GCing a reference to a class after the class already has been dragged into the binary doesn't GC that class IIRC.

@CyberShadow
Copy link
Member

Pretty difficult because the use of Object.factory is separate from the classes, maybe by emitting a special section that is only referenced by the factory.

That's what I meant (I think).

GCing a reference to a class after the class already has been dragged into the binary doesn't GC that class IIRC.

Let's see, as I understand we have:

ModuleInfo.localClasses -> ClassInfo.vtbl <-> virtual method table -> virtual method bodies -> ...

Which of these are emitted in their own section?

@MartinNowak
Copy link
Member Author

Which of these are emitted in their own section?

Most of them are in a separate object files (dmd's -lib switch outputs multilibs) except for ClassInfo and vtable.

@CyberShadow
Copy link
Member

So I guess that will need to be fixed. The ClassInfo can be in the same section / object file as the vtable as they both refer to each other.

@MartinNowak
Copy link
Member Author

Let's deprecate the current Object.factory functionality and only make sure that every linked class is contstructable. This change can be accompanied with a warning and fallback to the old solution for some time.
#4754 (comment)

@CyberShadow
Copy link
Member

How do you plan on making Object.factory know about the classes that have been "linked in"? Weak references to all ClassInfos? Aren't those arranged in a linked list right now?

@dnadlinger
Copy link
Contributor

Aren't those arranged in a linked list right now?

Not since the big ModuleInfo rewrite for proper shared library support.

@MartinNowak
Copy link
Member Author

As the title of this PR says "ModuleInfo should weakly link against classes".

- can't use weak linkage for ModuleInfo b/c of ctors/dtors
@MartinNowak
Copy link
Member Author

Note that Object.factory already doesn't work when the ModuleInfo is missing, i.e. b/c a module doesn't have a constructor and wasn't explicitly linked as object into the binary.

@MartinNowak
Copy link
Member Author

rebased and fixed test

@CyberShadow
Copy link
Member

As the title of this PR says "ModuleInfo should weakly link against classes".

Derp, thanks.

Tests are still failing:

backend\cgobj.c(1251) : Error: no instance of class 'Obj' for member 'Obj::external_def'

@jacob-carlborg
Copy link
Contributor

Note that Object.factory already doesn't work when the ModuleInfo is missing, i.e. b/c a module doesn't have a constructor and wasn't explicitly linked as object into the binary.

We cannot remove Object.factory. If something isn't working it should be fixed. Stop trying to sneak by language changes in pull requests without anyone noticing.

@jpf91
Copy link
Contributor

jpf91 commented Sep 21, 2016

We cannot remove Object.factory. If something isn't working it should be fixed. Stop trying to sneak by language changes in pull requests without anyone noticing.

Object.factory is a horrible feature when combined with static linking. This function can instantiate arbitrary classes at runtime (you could even get the class name from user input at runtime). If you now require that it can instantiate all classes you'll have to link at least all classes in phobos into every statically linked executable.

These classes depend on other non-classes phobos stuff, so in the end you'll end up pulling most of phobos into every executable. On my linux machine this increases the size of a empty main-function program from 500kB to 8,9MB. It also means you'll have to always link all libraries included by pragma(lib) even if you never use the modules containing that import.

ping @andralex @WalterBright

@andralex
Copy link
Member

The best solution to this is to only pay the cost of Object.factory in applications that actually use it. @WalterBright how can we do this?

@jacob-carlborg
Copy link
Contributor

Can we just check if Object.factory is called?

@timotheecour
Copy link
Contributor

Can we have an option to specify by name which classes can be constructed via Object.factory? (which could still default to all for backward compatibility)
For most applications, a (small) white list of such classes should be sufficient.
eg:
registerClassNames(["MyClass"]);
registerBaseClassNames(["MyBaseClass"]); // can also call Object.factory on derived classes

@andralex
Copy link
Member

Is Benjamin Thaut on github so we can ask him?

@MartinNowak
Copy link
Member Author

I don't see how we can do this for static libraries w/o resorting to custom linker scripts, only referenced object files will end up in a binary but the compiler doesn't necessarily know all classes.
Ultimately this is a linking problem, and it's a hacky proposal to let the compiler somehow make this work. Also I mentioned plenty of workarounds in Bugzilla (e.g. link object files or use --whole-archive) that are easy to use (the latter one being a simple linker switch).

@MartinNowak
Copy link
Member Author

Note that Object.factory already doesn't work when the ModuleInfo is missing, i.e. b/c a module doesn't have a constructor and wasn't explicitly linked as object into the binary.
We cannot remove Object.factory. If something isn't working it should be fixed. Stop trying to sneak by language changes in pull requests without anyone noticing.

See the proposed deprecation #4638 (comment).
Sorry that the language promised features that can't be properly implemented w/ static libraries.

@jacob-carlborg
Copy link
Contributor

See the proposed deprecation #4638 (comment).

Yeah, I've seen that. I don't agree that it should be deprecated.

Sorry that the language promised features that can't be properly implemented w/ static libraries.

It can if the compiler outputs the all the necessary data.

@CyberShadow
Copy link
Member

CyberShadow commented Sep 22, 2016

With time, it has become increasingly clear to me that Object.factory is a misfeature. Its unlimited scope, ability to bypass access restrictions, and lack of opt-in/opt-out mechanisms make it a bit too much like JavaScript's eval (yes, including some of its security issues). Frankly, I don't see why it can't be replaced with a user-implemented mechanism, either one that's opt-in or based on the language's existing compile-time introspection capabilities.

Jacob, can you describe your use case? Is there something non-obvious that's preventing migrating your code away from the built-in Object.factory?

@jacob-carlborg
Copy link
Contributor

Jacob, can you describe your use case? Is there something non-obvious that's preventing migrating your code away from the built-in Object.factory?

I use it in my serialization library.

@CyberShadow
Copy link
Member

CyberShadow commented Sep 22, 2016

I use it in my serialization library.

So... does that mean that I can feed any program that uses your serialization library a blob of serialized data, which would cause the program to instantiate an arbitrary class anywhere in the program (incl. private classes)?

If so, honestly, I think that only proves my point.

@jacob-carlborg
Copy link
Contributor

There are many languages that allow you do to this.

@CyberShadow
Copy link
Member

@jacob-carlborg
Copy link
Contributor

You can do a lot of stupid things in many ways in many programming languages, what's your point?

@jacob-carlborg
Copy link
Contributor

jacob-carlborg commented Sep 23, 2016

BTW, I forgot that my serialization library is driven by static typing. That is, it will only try to deserialize the types you give it. Example:

class Foo
{
    int a;
    int b;
}

class Bar
{
    Foo foo;
}

auto bar = deserialize!(Bar)(data);

The above code can only deserialize Bar and Foo regardless of what data contains. So if data contains a reference to Socket or some other class in Phobos it will either fail or it will be ignored.

@CyberShadow
Copy link
Member

what's your point?

That Object.factory is a misfeature. I thought we've already established that.

BTW, I forgot that my serialization library is driven by static typing.

Then why do you even need Object.factory? Just instantiate the type directly if you have access to it statically.

@jacob-carlborg
Copy link
Contributor

Then why do you even need Object.factory? Just instantiate the type directly if you have access to it statically.

That's a good point 😃, I need to check.

@jacob-carlborg
Copy link
Contributor

Currently it needs that due to how it's designed. The deserializer has the static type but not the unarchiver. The (un)archiver is format (XML, JSON) dependent and the (de)serializer is not. The (un)archiver is polymorphic and therefore cannot have any templated methods. But maybe I can move the instantiation of the object to the deserializer instead.

@CyberShadow
Copy link
Member

From your description, it sounds like you could also make the deserializer
build an object ClassInfo registry, which the unarchiver could use much
like Object.factory.

On Sep 23, 2016 19:41, "jacob-carlborg" notifications@github.com wrote:

Currently it needs that due to how it's designed. The deserializer has the
static type but not the unarchiver. The (un)archiver is format (XML, JSON)
dependent and the (de)serializer is not. The (un)archiver is polymorphic
and therefore cannot have any templated methods. But maybe I can move the
instantiation of the object to the deserializer instead.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#4638 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAJ0fraX1A4ztX8lp-h3lZ5qZHeCKhGtks5qtAFGgaJpZM4ETIIh
.

@jacob-carlborg
Copy link
Contributor

From your description, it sounds like you could also make the deserializer build an object ClassInfo registry, which the unarchiver could use much like Object.factory.

Yes, I already have something similar, the library requires to register sub classes that should be (de)serialized through a base class reference. I should be able to construct the actual instance using the same mechanism.

@wilzbach
Copy link
Contributor

As this is still based on the C++ frontend, it might take a serious effort to revive it.
Though there's some automation - see #4922 (comment)

@ibuclaw
Copy link
Member

ibuclaw commented Mar 16, 2018

As this is still based on the C++ frontend, it might take a serious effort to revive it.

61 line changes is nothing.

@wilzbach
Copy link
Contributor

(I just did a swipe through the PR queue to make sure that these instructions are easily accessible. Maybe I should have used a better text. Sorry)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants