Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Research the possibilities of statistically assisted type inference #17

Open
0x7CFE opened this issue Sep 16, 2013 · 0 comments
Open

Research the possibilities of statistically assisted type inference #17

0x7CFE opened this issue Sep 16, 2013 · 0 comments

Comments

@0x7CFE
Copy link
Owner

0x7CFE commented Sep 16, 2013

In statically typed environments Hindley-Milner algorithm may be used to infer the types of expression depending on it's parts. The question is, may this idea be applied to the Smalltalk's pure dynamic environment?

In case of JIT VM we have statistics of which call site affect what classes and potentionally object of what class is returned. Gathering this information we may find a places (call sites and methods) with classes tightly bound to one or more variables. If particular variable appeared to have only one class during the whole runtime, we may then perform an optimization that assumes that current variable always have this class. Thus, specializing the method. Inside, we may treat class as a statically assigned type. This allows us to apply type infering where it is possible.

0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
It allows to compare any two types, store them in STL
container such as std::set, use them as a key in std::map
and use composition operators during type inference procedure.

Operator | is a disjunction-like operator used to get sum of
several possible types within a type.

For example:
    2 | 2 -> 2
    2 | 3 -> (2, 3)
    2 | * -> (2, *)
    (Object) | (SmallInt) -> ((Object), (SmallInt))

This operator may be used to aggregate possible types within
a linear sequence where several type outcomes are possible:
    x <- y isNil ifTrue: [ nil ] ifFalse: [ 42 ].

In this case x will have composite type (nil, 42).

On the other hand, when dealing with loops we need some
kind of a reduction operator that will act as a conjunction:

    2 & 2 -> 2
    2 & 3 -> (SmallInt)
    2 & (SmallInt) -> (SmallInt)

    <any type> & * -> *
    (SmallInt) & (Object) -> *

This operator is used during induction run of the type analyzer
to prove that variable does not leave it's local type domain,
i.e it's type is not reduced to a *.

Issue: #17
0x7CFE added a commit that referenced this issue May 21, 2016
Meta info is very useful during type analysis.
It helps to make decisions based on graph structure.

In future, more flags will be added.

Issue: #17
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 21, 2016
It allows to compare any two types, store them in STL
container such as std::set, use them as a key in std::map
and use composition operators during type inference procedure.

Operator | is a disjunction-like operator used to get sum of
several possible types within a type.

For example:
    2 | 2 -> 2
    2 | 3 -> (2, 3)
    2 | * -> (2, *)
    (Object) | (SmallInt) -> ((Object), (SmallInt))

This operator may be used to aggregate possible types within
a linear sequence where several type outcomes are possible:
    x <- y isNil ifTrue: [ nil ] ifFalse: [ 42 ].

In this case x will have composite type (nil, 42).

On the other hand, when dealing with loops we need some
kind of a reduction operator that will act as a conjunction:

    2 & 2 -> 2
    2 & 3 -> (SmallInt)
    2 & (SmallInt) -> (SmallInt)

    <any type> & * -> *
    (SmallInt) & (Object) -> *

This operator is used during induction run of the type analyzer
to prove that variable does not leave it's local type domain,
i.e it's type is not reduced to a *.

Issue: #17
0x7CFE added a commit that referenced this issue May 21, 2016
Meta info is very useful during type analysis.
It helps to make decisions based on graph structure.

In future, more flags will be added.

Issue: #17
0x7CFE added a commit that referenced this issue May 21, 2016
0x7CFE added a commit that referenced this issue May 23, 2016
This code need to be refactored properly.
In case if both operands are literal, then
result may be defined as literal too.

Otherwise primitive should "fail" by allowing
control flow to pass further.

For literal calculation it is best to use existing
code for software VM.

Issue: #17
0x7CFE added a commit that referenced this issue May 25, 2016
0x7CFE added a commit that referenced this issue May 25, 2016
0x7CFE added a commit that referenced this issue May 25, 2016
0x7CFE added a commit that referenced this issue May 25, 2016
0x7CFE added a commit that referenced this issue May 25, 2016
0x7CFE added a commit that referenced this issue May 26, 2016
0x7CFE added a commit that referenced this issue Jun 18, 2016
0x7CFE added a commit that referenced this issue Jun 18, 2016
This information is very useful during type inference.
It may be gathered on-demand, but that would require
additional pass.

It is very cheap to collect meta info on the fly, so why not?

Issue: #17
0x7CFE added a commit that referenced this issue Jun 18, 2016
0x7CFE added a commit that referenced this issue Jul 1, 2016
For correct inference of the parallel paths in a graph
it is critical to perform traverse breadth-first.

Otherwise, phi nodes may have their incomings undefined.

Issue: #17
Issue: #92
0x7CFE added a commit that referenced this issue Jul 1, 2016
Some methods like Object>>error: do not return a value in a usual way.
In that case control flow is interrupted and all call chain aborts.

Type analyzer uses an empty composite type () to mark such special case.
Please note that () is not equal to * or ? types that still return a value.

Issue: #17
Issue: #92
0x7CFE added a commit that referenced this issue Jul 5, 2016
0x7CFE added a commit that referenced this issue Jul 5, 2016
0x7CFE added a commit that referenced this issue Jul 5, 2016
0x7CFE added a commit that referenced this issue Jul 9, 2016
0x7CFE added a commit that referenced this issue Jul 9, 2016
Previously if one includes inference.h and uses namespace type,
then it will result in a name collision if LLVM is in scope.

Issue: #17
Issue: #92
0x7CFE added a commit that referenced this issue Jul 10, 2016
0x7CFE added a commit that referenced this issue Jul 10, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 17, 2016
0x7CFE added a commit that referenced this issue Jul 21, 2016
Unlike statically inferred block, dynamic ones
need to create and carry their closure environment.

Two blocks with the same set of argument types may have
different closure types even if their closure context
is the same.

We add closure types to a dynamic block's cache key and
to block function name to disambiguate calls.

Issue: #17
Issue: #92
0x7CFE added a commit that referenced this issue Jul 21, 2016
Unlike statically inferred block, dynamic ones
need to create and carry their closure environment.

Two blocks with the same set of argument types may have
different closure types even if their closure context
is the same.

We add closure types to a dynamic block's cache key and
to block function name to disambiguate calls.

Issue: #17
Issue: #92
0x7CFE added a commit that referenced this issue Jul 21, 2016
0x7CFE added a commit that referenced this issue Jul 21, 2016
Issue: #17
Issue: #92
0x7CFE added a commit that referenced this issue Jul 21, 2016
0x7CFE added a commit that referenced this issue Jul 25, 2016
0x7CFE added a commit that referenced this issue Jul 25, 2016
0x7CFE added a commit that referenced this issue Aug 2, 2016
0x7CFE added a commit that referenced this issue Aug 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant