Disambiguation based on PartialOrd between choices of a certain node type. #42

eddyb · 2018-09-12T19:22:24Z

@eternaleye suggested that instead of having only "validators", that remove "invalid" choices from the SPPF, a more general approach involves PartialOrd, or more specifically, a function taking two Handle<X> and returning an Option<Ordering>, where disambiguation means "find the largest X".

When the ordering between two choices is equal, that means it's still ambiguous (i.e. the default for node types with no user-provided disambiguation logic), while None means neither choice is valid.

Validity can be embedded by mapping all possible pairs of validities to partial orderings:

impl<T: Validate> Disambiguate for T {
    fn disambiguate(a: Handle<Self>, b: Handle<Self>) -> Option<Ordering> {
        match (a.validate(), b.validate()) {
            (false, false) => None,
            (false, true) => Some(Ordering::Less),
            (true, false) => Some(Ordering::Greater),
            (true, true) => Some(Ordering::Equal),
        }
    }
}

The text was updated successfully, but these errors were encountered:

rljacobson · 2020-09-26T17:42:52Z

I think you might have to project into a partial order from a relation that lacks transitivity, because I feel like a rock-paper-scissors relationship is likely. The ideal properties would be a (upper-)semilattice, which has the additional property that for any two elements x and y, there exists a z such that z > x and z > y. (So z is a l.u.b. of {x,y}.)

If the goal is just to select a likely parse rather than the most likely parse, there probably isn't any harm in extending the partial order arbitrarily to a total order if it's convenient to do so. But that's somehow unsatisfying, isn't it...

A copy+paste of my note to Eddy: I am interested to know how a probabilistic model would perform. So you have a set of parse trees and want to find the "correct"/intended parse tree. Ahead of time you gather statistics about how language structures relate to each other locally by analyzing, say, all of crates.io. Check each parse tree against the statistical model to find the most likely. What I'm describing is syntactic, but it could in principle include other attributes like type, scope, some complexity metric, proximity... And your notion of validators is complementary, not mutually exclusive. In ML we tend to just dump all the factors into one big bucket and study the principle components of the resulting space. We can then remove any factors that appear useless or project to a lower-dimensional space. But, I mean, that's something you'd do, if you do it at all, only after you have a framework to implement any of this.

eddyb added the enhancement New feature or request label Sep 12, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disambiguation based on PartialOrd between choices of a certain node type. #42

Disambiguation based on PartialOrd between choices of a certain node type. #42

eddyb commented Sep 12, 2018 •

edited

Loading

rljacobson commented Sep 26, 2020 •

edited

Loading

Disambiguation based on PartialOrd between choices of a certain node type. #42

Disambiguation based on PartialOrd between choices of a certain node type. #42

Comments

eddyb commented Sep 12, 2018 • edited Loading

rljacobson commented Sep 26, 2020 • edited Loading

eddyb commented Sep 12, 2018 •

edited

Loading

rljacobson commented Sep 26, 2020 •

edited

Loading