
Object representation

cadrian edited this page Sep 14, 2010 · 1 revision

Note: unfinished work

Memory layout

How will Liberty lay out objects in memory? Single inheritance is not a problem; any of the following proposals will fit its requirements.

Let’s say we have the hierarchy below. Note: in this post the C language will be used as it was originally meant, as a portable assembler. 1

class FOO feature a: INTEGER; b: STRING
end

class FROB feature fa: INTEGER
end

class BAR inherit FOO feature c: INTEGER
end

class MAN inherit FOO FROB feature d: INTEGER
end

class BARMAN inherit BAR MAN feature e: INTEGER
end

Meta-data

SmartEiffel used to store an “int id” at the beginning of the structure. This kind of “thin” meta-data is easily implemented and easily understood, but it rests on the assumption that the program will always be a monolithic entity that cannot be extended at runtime.
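
A minimal sketch of the “thin” approach, assuming made-up id values and a made-up helper `attribute_count`; it is not SmartEiffel's actual generated code. It shows why the scheme is closed: every dispatch has to enumerate the ids known when the program was compiled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type ids, handed out by a whole-program compiler. */
enum { ID_FOO = 1, ID_BAR = 2 };

/* Every object starts with its type id. */
struct foo { int id; int a; const char *b; };
struct bar { int id; int a; const char *b; int c; };

/* Dispatch on the leading id. The switch has to list every type in
   the program, so adding a type means recompiling this function. */
int attribute_count(const void *obj)
{
    assert(obj != NULL);
    switch (*(const int *)obj) {
    case ID_FOO: return 2;
    case ID_BAR: return 3;
    default:     return -1; /* id unknown to this program */
    }
}
```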

GObject instead stores the address of a GObjectClass struct that describes the actual type. As far as I know, this kind of approach is the only feasible one when you want to provide some of the following features:

  • shared libraries support: putting a Liberty cluster or a Liberty class into a shared library
  • modules loadable at runtime
  • introspection or access to rich meta-data

In those cases the required extra memory is not a burden and may even be desired: you may want to tell the user the name of the effective class you have just loaded. Actually GObject is quite rich, perhaps a little more than what we want.
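
The “fat” variant could look like the sketch below. It does not use the real GObject API; `struct klass`, its fields and `class_name_of` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class descriptor; the fields are illustrative only. */
struct klass {
    const char *name;            /* effective class name, for introspection */
    const struct klass *parent;  /* a single ancestor, to keep the sketch short */
    size_t instance_size;
};

/* Every object starts with a pointer to its descriptor. */
struct object { const struct klass *klass; };

struct foo { struct object header; int a; const char *b; };

const struct klass foo_klass = { "FOO", NULL, sizeof(struct foo) };

/* Needs no list of known types, so it also works for objects whose
   descriptor lives in a library loaded after this code was compiled. */
const char *class_name_of(const void *obj)
{
    assert(obj != NULL);
    return ((const struct object *)obj)->klass->name;
}
```

The price is one pointer per object plus one descriptor per class, which is the extra memory discussed above.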

Choosing between “fat” and “thin” meta-data may actually not be an issue. We may use either an integer index into a table of classes or a pointer to a klass structure; on 32-bit targets they have the same size. To allow modularity of compiled classes, the full (“fat”) version is used. For a boosted, non-extensible program the whole meta-data may be dropped, with the pointers serving only as unique ids.
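
One way to keep both options open is to hide the tag behind a typedef and two macros. This is only a sketch: `LIBERTY_THIN_TAGS`, `type_tag`, `TAG_OF` and `KLASS_OF` are invented names, not part of any existing Liberty runtime.

```c
#include <assert.h>
#include <stddef.h>

struct klass { const char *name; };

const struct klass class_table[] = { { "FOO" }, { "BAR" } };

/* LIBERTY_THIN_TAGS is a made-up build switch for this sketch. */
#ifdef LIBERTY_THIN_TAGS
typedef int type_tag;                     /* index into class_table */
#define TAG_OF(index)  (index)
#define KLASS_OF(tag)  (&class_table[(tag)])
#else
typedef const struct klass *type_tag;     /* direct pointer */
#define TAG_OF(index)  (&class_table[(index)])
#define KLASS_OF(tag)  (tag)
#endif

struct object { type_tag tag; };

/* The tag is unique per class in both modes, so an exact type test
   is a single comparison and never reads the descriptor. */
int same_type(const struct object *x, const struct object *y)
{
    assert(x != NULL && y != NULL);
    return x->tag == y->tag;
}
```

Since `same_type` never dereferences the tag, a closed program could keep the pointers as ids even if the descriptors carried no data.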

Expanded classes

Expanded classes hold a particular place in the Liberty Eiffel design. They do not mix at all with reference classes, so essentially they are like a C struct that may be inserted into other classes, i.e. another class may contain all features (attributes, queries and commands) of the inserted class. Not being part of the inheritance tree makes them as easy to handle as a structure in C.
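
In C terms this could be read as follows. POINT and CIRCLE are hypothetical classes invented for the example, and storing the inserted fields in place is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* An expanded class: plain data, no type header. */
struct point { int x; int y; };

/* A class that inserts POINT: its fields are stored in place,
   not behind a pointer. */
struct circle {
    struct point center;
    int radius;
};

static_assert(offsetof(struct circle, center) == 0,
              "the inserted fields sit directly inside the host");

/* Passing an expanded value copies its fields; the caller's copy
   is left untouched. */
struct point moved(struct point p, int dx, int dy)
{
    p.x += dx;
    p.y += dy;
    return p;
}
```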

Reference classes

SmartEiffel flattens the object hierarchy. Each live type contains all the data needed to build an object, including the data gathered from the type’s ancestors. Liberty Eiffel does the very same thing because it conveniently solves inheritance (single as well as multiple) at compile time.
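
For the hierarchy at the top of this page, flattening might produce one struct per live type, as sketched here. The field order and the leading `id` are assumptions; the real compiler is free to choose another layout.

```c
#include <assert.h>
#include <stddef.h>

/* One flat struct per live type; ancestor attributes are copied in. */
struct foo    { int id; int a; const char *b; };
struct frob   { int id; int fa; };
struct bar    { int id; int a; const char *b; int c; };
struct man    { int id; int a; const char *b; int fa; int d; };
/* FOO is reached through both BAR and MAN but is stored only once. */
struct barman { int id; int a; const char *b; int c; int fa; int d; int e; };

/* Heirs of FOO keep FOO's attributes at FOO's offsets. */
static_assert(offsetof(struct barman, a) == offsetof(struct foo, a),
              "FOO prefix is shared");
static_assert(offsetof(struct barman, b) == offsetof(struct foo, b),
              "FOO prefix is shared");
```

No pointer adjustment or embedded sub-object is needed: each struct is complete on its own.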

Flat structures provide an elegant solution for multiple and repeated inheritance because all issues are handled at compile time. Let’s assume that FOO and BAR are defined in one library and that FROB, MAN and BARMAN are defined in another. Code in the second library may safely assume that any MAN or BARMAN object starts with the content of its FOO ancestor, but any other attribute may have a different offset when diffe
This solution requires that all queries for an object’s attributes be implemented as functions that can be redefined and inlined at link time. In fact attribute
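
A sketch of why attribute queries become functions: with the assumed layouts below, `d` sits at a different offset in MAN and in BARMAN, so callers go through a per-class accessor. The function table `struct klass` and the name `query_d` are hypothetical, and the link-time inlining mentioned above is not shown.

```c
#include <assert.h>
#include <stddef.h>

struct klass;
struct object { const struct klass *klass; };

/* Per-class table of attribute queries (hypothetical layout). */
struct klass { int (*get_d)(const void *self); };

struct man    { struct object header; int a; const char *b; int fa; int d; };
struct barman { struct object header; int a; const char *b; int c; int fa; int d; int e; };

/* Each accessor knows the offset of `d` in its own flat struct. */
static int man_get_d(const void *self)    { return ((const struct man *)self)->d; }
static int barman_get_d(const void *self) { return ((const struct barman *)self)->d; }

const struct klass man_klass    = { man_get_d };
const struct klass barman_klass = { barman_get_d };

/* Callers never hard-code the offset of `d`. */
int query_d(const void *obj)
{
    assert(obj != NULL);
    return ((const struct object *)obj)->klass->get_d(obj);
}
```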

LLVM linking types

LLVM provides several ways to link a symbol into a binary, be it an executable or a library. They are:

  1. private: Global values with private linkage are only directly accessible by objects in the current module. In particular, linking code into a module with a private global value may cause the private to be renamed as necessary to avoid collisions. Because the symbol is private to the module, all references can be updated. This doesn’t show up in any symbol table in the object file.
  2. linker_private: Similar to private, but the symbol is passed through the assembler and removed by the linker after evaluation.
  3. internal: Similar to private, but the value shows as a local symbol (STB_LOCAL in the case of ELF) in the object file. This corresponds to the notion of the ‘static’ keyword in C.
  4. available_externally: Globals with “available_externally” linkage are never emitted into the object file corresponding to the LLVM module. They exist to allow inlining and other optimizations to take place given knowledge of the definition of the global, which is known to be somewhere outside the module. Globals with available_externally linkage are allowed to be discarded at will, and are otherwise the same as linkonce_odr. This linkage type is only allowed on definitions, not declarations.
  5. linkonce: Globals with “linkonce” linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it. Unreferenced linkonce globals are allowed to be discarded.
  6. weak: “weak” linkage has the same merging semantics as linkonce linkage, except that unreferenced globals with weak linkage may not be discarded. This is used for globals that are declared “weak” in C source code.
  7. common: “common” linkage is most similar to “weak” linkage, but it is used for tentative definitions in C, such as “int X;” at global scope. Symbols with “common” linkage are merged in the same way as weak symbols, and they may not be deleted if unreferenced. Common symbols may not have an explicit section, must have a zero initializer, and may not be marked ‘constant’. Functions and aliases may not have common linkage.
  8. appending: “appending” linkage may only be applied to global variables of pointer to array type. When two global variables with appending linkage are linked together, the two global arrays are appended together. This is the LLVM, typesafe, equivalent of having the system linker append together “sections” with identical names when .o files are linked.
  9. extern_weak: The semantics of this linkage follow the ELF object file model: the symbol is weak until linked; if not linked, the symbol becomes null instead of being an undefined reference.
  10. linkonce_odr and weak_odr: Some languages allow differing globals to be merged, such as two functions with different semantics. Other languages, such as C++, ensure that only equivalent globals are ever merged (the “one definition rule” – “ODR”). Such languages can use the linkonce_odr and weak_odr linkage types to indicate that the global will only be merged with equivalent globals. These linkage types are otherwise the same as their non-odr versions.
  11. externally visible: If none of the above identifiers are used, the global is externally visible, meaning that it participates in linkage and can be used to resolve external symbol references.
  12. dllimport and dllexport: These target the Microsoft Windows platform only. They are designed to support importing (exporting) symbols from (to) DLLs (Dynamic Link Libraries).
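
Three of these linkages have direct counterparts in C source, which may help when reading the list. This is a sketch: `__attribute__((weak))` is a GCC/Clang extension rather than standard C, and the function names are made up.

```c
#include <assert.h>
#include <limits.h>

/* `static` gives internal linkage: the symbol is local to this file. */
static int helper(int x)
{
    assert(x < INT_MAX);
    return x + 1;
}

/* Weak definition (GCC/Clang extension): a non-weak definition of the
   same name in another object file would take precedence at link time. */
__attribute__((weak)) int default_answer(void)
{
    return helper(41);
}

/* No keyword: externally visible, participates in normal linkage. */
int answer(void)
{
    return default_answer();
}
```

When nothing overrides `default_answer`, `answer()` uses the weak definition above.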

Liberty linking requirements

  1. frozen
  2. redefine
  3. undefine
  4. multiple inheritance
    1. from different ancestors
    2. diamond
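
| Linkage | Possible usage | frozen | redefine | undefine | MI | MI multiple ancestors | MI diamond inheritance |
| private | May be used to build inner module-specific runtime functions, constants or values since it does not match any actual Liberty feature. | | | | | | |
| linker_private | | | | | | | |
| internal | a.k.a. “static” in C; useful to implement once features | | | | | | |
| available_externally | | | | | | | |
| linkonce | | | | | | | |
| common | | | | | | | |
| weak | | | | | | | |
| appending | | | | | | | |
| extern_weak | | | | | | | |
| linkonce_odr | | | | | | | |
| weak_odr | | | | | | | |
| externally visible | | | | | | | |
| dllimport / dllexport | | | | | | | |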

Issues

Multiple modules: let’s say we want to reuse our cluster and create an heir that redefines an attribute

Links

Some useful links:

  1. GObject
  2. GObjectClass struct
  3. Vtable
  4. Multiple dispatch
  5. Virtual inheritance
  6. SmartEiffel C code generation

1 TODO: provide a small graph of the inheritance tree