OpenDF / Constraints #69

Open
meron-tl opened this issue May 12, 2022 · 2 comments

@meron-tl

Hello,

Thanks again for putting in the tremendous effort to develop and release SMCalFlow!

Following the motto "you don't understand it until you implement it", we tried to reproduce the basic functionality of a system able to execute some SMCalFlow dialogues. Doing this made two things clear: 1) how much work was put into building the full SMCalFlow application, and 2) that developing such a system involves many design decisions, which would result in different flavors of dataflow applications.

One central example of this is the way constraints are represented, handled, and executed. If you could give some more explanation, that would be much appreciated. For example, multiple constraints are often combined in a compositional pattern, or are surrounded by intension/extension markers. Does this mean you use lazy evaluation? And how do you handle contradictions between constraints?

We tried to simplify the expressions in our implementation to make them more understandable (somewhat along the lines of your macros in "Constrained Language Models Yield Few-Shot Semantic Parsers"), while keeping them executable. This is described in an upcoming paper at LREC 2022, "Simplifying Semantic Annotations of SMCalFlow", and in the corresponding package OpenDF - https://github.com/telepathylabsai/OpenDF, which includes the code to transform SMCalFlow expressions to a simplified form, and code to execute simplified expressions.

We hope OpenDF will increase interest in dataflow dialogues, and encourage development of new dialogue designs.

@hao-fang
Contributor

Thanks for developing the OpenDF package! We are excited to see such work happening!

For your questions: the “intension” can be viewed as a pointer to the existing node/computation on the dataflow graph (think of it as & in C), and “extension” is like accessing the node/computation (like * in C). The evaluation does not need to be lazy.
When there are contradictions between constraints, the resulting constraint allows no value. ReviseConstraint(c1, c2) is a special operator that analyzes c1 and c2 and uses c2 to overwrite the parts of c1 whose predicates are incompatible with it. For example, ReviseConstraint((w: Widget) => w.title == "foo" && w.description == "bar", (w: Widget) => w.title == "baz") would produce a new constraint like (w: Widget) => w.title == "baz" && w.description == "bar". This blog provides some information about how we designed constraints and revision.
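The Widget example can be sketched in Python under a strong simplifying assumption: a constraint is only a conjunction of field == value predicates. The names EqConstraint, conjoin, and revise_constraint are hypothetical and are not the SMCalFlow or OpenDF implementation; they only illustrate the difference between plain conjunction (which can allow no value) and revision (where the newer constraint wins on the fields it mentions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EqConstraint:
    """Hypothetical constraint: a conjunction of (field, value) equality predicates."""
    predicates: tuple

    def allows(self, obj: dict) -> bool:
        return all(obj.get(name) == value for name, value in self.predicates)


def conjoin(c1: EqConstraint, c2: EqConstraint) -> EqConstraint:
    # Plain conjunction: contradictory predicates are kept, so nothing may satisfy it.
    return EqConstraint(c1.predicates + c2.predicates)


def revise_constraint(old: EqConstraint, new: EqConstraint) -> EqConstraint:
    # Drop the predicates of `old` on fields that `new` mentions, then add `new`.
    new_fields = {name for name, _ in new.predicates}
    kept = tuple(p for p in old.predicates if p[0] not in new_fields)
    return EqConstraint(kept + new.predicates)


c1 = EqConstraint((("title", "foo"), ("description", "bar")))
c2 = EqConstraint((("title", "baz"),))

both = conjoin(c1, c2)               # title == "foo" && title == "baz": unsatisfiable
revised = revise_constraint(c1, c2)  # title == "baz" && description == "bar"
```

Real constraints are richer than equality tests, so deciding which predicates are "non-compatible" is the hard part that this sketch sidesteps by comparing field names only.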

@meron-tl
Author

meron-tl commented May 14, 2022

Thanks a lot for the encouragement and the answers!

Regarding intension/extension,
OpenDF has this feature - it's called the "view" mode of an input, which can be either VIEW_INT or VIEW_EXT.
(Strictly speaking there may be other view modes, corresponding to cases like deeper references in C - *x, **x, ***x..., but we haven't found any use cases needing that yet)

By default, all inputs of a node are used as VIEW_EXT, but it's possible to set them to VIEW_INT in two ways:

  1. In the signature of a node - this will set the specified view mode for the specified input for all instances of this node type.
  2. In the expression itself - there is syntax for setting the view mode of an input of a node instance (a specific computation).
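The two ways of choosing a view mode can be illustrated with a small Python sketch. Everything here (the View enum, the Node base class, signature_views, instance_views, and the Const/Add/Describe node types) is hypothetical and is not OpenDF's actual API; the sketch only assumes that an "extension" view hands a node the evaluated value of its input, while an "intension" view hands it the input node itself.

```python
from enum import Enum


class View(Enum):
    EXT = "ext"  # use the value the input evaluates to (like *p in C)
    INT = "int"  # use the input node itself (like &x in C)


class Node:
    signature_views = {}  # per node type: input name -> view mode

    def __init__(self, **inputs):
        self.inputs = inputs
        self.instance_views = {}  # per node instance: overrides the signature

    def evaluate(self):
        raise NotImplementedError

    def view_of(self, name):
        # Instance override first, then the type signature, then the EXT default.
        return self.instance_views.get(name, self.signature_views.get(name, View.EXT))

    def get_input(self, name):
        node = self.inputs[name]
        return node if self.view_of(name) is View.INT else node.evaluate()


class Const(Node):
    def __init__(self, value):
        super().__init__()
        self.value = value

    def evaluate(self):
        return self.value


class Add(Node):
    def evaluate(self):
        return self.get_input("a") + self.get_input("b")


class Describe(Node):
    signature_views = {"target": View.INT}  # way 1: set in the signature

    def evaluate(self):
        return type(self.get_input("target")).__name__


total = Add(a=Const(2), b=Const(3))
by_signature = Describe(target=total)
by_instance = Describe(target=total)
by_instance.instance_views["target"] = View.EXT  # way 2: set on one instance
```

With these definitions, by_signature sees the Add node itself, while by_instance sees the integer that the Add node evaluates to.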

Seeing the dataset, we expected that selecting the right "view" mode would be quite central in the dataflow expressions, but in fact we ended up not having to mention it explicitly in expressions.

For example, in the dataset, most occurrences of refer appear in the context "Execute(intension=refer())". It is not clear why this 'Execute' was necessary - during graph evaluation, is the refer not evaluated (executed) before the Execute is evaluated? Or does the Execute evaluate the result of the refer? (And what if the refer has multiple results...) - that brought delayed execution to mind...

For the revise -

Thanks for the link to the blog, that really helps set the background more clearly, and describes what we've been struggling with - detecting and resolving contradicting constraints. I wish we had read this blog when we started writing the system - it would have saved us from exploring many dead ends before getting to where the system is now :)

Implementing this (which is left to the developers, as mentioned in the blog) is one of the trickiest parts of the work.

For example, there are different "modes" of detecting (and resolving) contradiction or implication between constraints:

  1. doing it on specific object instances
  2. doing it "logically"/"symbolically"

A simple example - if we wanted to look for an Int, and had two constraints - 1. it's an even number, 2. it's an odd number - then one way is to look at the ints in the graph/DB and check that the intersection is empty. The other is to symbolically detect (e.g. using rules) that odd/even are in principle incompatible, and then discard one.
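The two modes can be put side by side in a short Python sketch. The names IntConstraint, INCOMPATIBLE, contradict_on_instances, and contradict_symbolically are made up for illustration, and the list known_ints stands in for "the ints in the graph/DB"; none of this is OpenDF code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class IntConstraint:
    name: str
    test: Callable[[int], bool]


EVEN = IntConstraint("even", lambda n: n % 2 == 0)
ODD = IntConstraint("odd", lambda n: n % 2 == 1)
DIV7 = IntConstraint("divisible_by_7", lambda n: n % 7 == 0)

# Symbolic knowledge: pairs of constraint names that can never hold together.
INCOMPATIBLE = {frozenset({"even", "odd"})}


def contradict_on_instances(c1: IntConstraint, c2: IntConstraint,
                            candidates: Iterable[int]) -> bool:
    # Mode 1: no known object satisfies both constraints.
    return not any(c1.test(n) and c2.test(n) for n in candidates)


def contradict_symbolically(c1: IntConstraint, c2: IntConstraint) -> bool:
    # Mode 2: a rule says the two constraints are incompatible in principle.
    return frozenset({c1.name, c2.name}) in INCOMPATIBLE


known_ints = [2, 4, 7, 9]  # stand-in for the ints currently in the graph/DB
```

On this data both modes agree that even and odd contradict, but they disagree on even and divisible_by_7: no known int satisfies both, although 14 would, so the instance-based check reports a contradiction that the symbolic check does not.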

The problem here is that if you have many such constraints (odd, even, prime, not divisible by 7, ...), they all need to know about each other in order to symbolically decide if/how they contradict.
You could try to "normalize" some such constraints, e.g. isDayOfWeek could be converted to OR(Mon, Tue, Wed, Thu, Fri), so you can use the simpler/base logic of a data type, but that has its limits and costs as well.
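A minimal sketch of this normalization idea, assuming the data type has a small finite domain: every constraint is rewritten as the set of values it allows, and contradiction becomes an empty intersection. The names WEEK, NORMAL_FORMS, normalize, and contradict, as well as the is_weekday/is_weekend constraints, are hypothetical.

```python
WEEK = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Hypothetical named constraints and the OR-of-values they normalize to.
NORMAL_FORMS = {
    "is_weekday": frozenset(WEEK[:5]),
    "is_weekend": frozenset(WEEK[5:]),
}


def normalize(constraint: str) -> frozenset:
    """Rewrite a named constraint, or a single day, as the set of days it allows."""
    if constraint in NORMAL_FORMS:
        return NORMAL_FORMS[constraint]
    if constraint in WEEK:
        return frozenset({constraint})
    raise ValueError(f"unknown constraint: {constraint}")


def contradict(c1: str, c2: str) -> bool:
    # After normalization the constraints no longer need to know each other:
    # they contradict exactly when no day is allowed by both.
    return not (normalize(c1) & normalize(c2))
```

The limit mentioned above shows up immediately: this works for seven weekdays, but enumerating allowed values is not an option for unbounded domains such as the integers.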

This is connected to pragmatics - deciding what a "typical" user would expect. With the same example - if a user says they want an even number, and in the next turn that they want an odd number, then the expected behaviour would be for the agent to understand that these are contradicting, and that the user changed their mind and now wants an odd number. But if the user first said the number should be divisible by 381, and then that it should be divisible by 153, it is not clear that the user changed their mind - maybe we should keep both (unless, of course, the user indicated a "clobber" in the turn's text).

This can get quite elaborate, e.g. for types where different subfields interact with each other - e.g. in the TimeSlot class in OpenDF, constraints about the start / end / and duration of an event affect each other...
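To make the start / end / duration interaction concrete, here is a rough Python sketch. SlotConstraint and resolve are hypothetical names and do not reflect how the TimeSlot class in OpenDF works; the sketch assumes exact values for each field, whereas real constraints may be ranges or open-ended.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SlotConstraint:
    start: Optional[datetime] = None
    end: Optional[datetime] = None
    duration: Optional[timedelta] = None


def resolve(slot: SlotConstraint) -> Optional[SlotConstraint]:
    """Derive the missing field when two are known; return None on contradiction."""
    start, end, duration = slot.start, slot.end, slot.duration
    if start is not None and end is not None:
        if duration is not None and end - start != duration:
            return None  # all three given, but they disagree
        duration = end - start
    elif start is not None and duration is not None:
        end = start + duration
    elif end is not None and duration is not None:
        start = end - duration
    if duration is not None and duration < timedelta(0):
        return None  # the slot would end before it starts
    return SlotConstraint(start, end, duration)


ten = datetime(2022, 5, 14, 10, 0)
eleven = datetime(2022, 5, 14, 11, 0)

filled = resolve(SlotConstraint(start=ten, duration=timedelta(hours=1)))
clash = resolve(SlotConstraint(start=ten, end=eleven, duration=timedelta(minutes=30)))
```

Even in this simplified form, a new constraint on one subfield can invalidate an older constraint on a different subfield, which is why revision cannot treat the subfields independently.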

Could you share some insights here?

Another question regarding constraint data type -
In OpenDF, we decided not to have a separate data type for constraints; instead, each Node has a flag ("constraint_level") which represents whether it's a simple object, a constraint[], a constraint[constraint[]], ...

So far there weren't any use cases (i.e. dialogues from the SMCalFlow dataset) where this was a problem. Is it missing anything?
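The constraint_level idea described above could look roughly like the following Python sketch. Only the flag name constraint_level comes from this thread; the Node fields, describe, and matches are hypothetical, and the real OpenDF Node class is certainly richer than this.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    type_name: str
    inputs: dict = field(default_factory=dict)
    # 0 = plain object, 1 = constraint on objects, 2 = constraint on constraints, ...
    constraint_level: int = 0


def describe(node: Node) -> str:
    """Spell out the type that a separate constraint data type would have carried."""
    text = node.type_name
    for _ in range(node.constraint_level):
        text = f"Constraint[{text}]"
    return text


def matches(constraint: Node, candidate: Node) -> bool:
    """A level-n constraint matches a level-(n-1) node of the same type
    whose inputs agree with every input the constraint specifies."""
    return (
        constraint.constraint_level == candidate.constraint_level + 1
        and constraint.type_name == candidate.type_name
        and all(candidate.inputs.get(k) == v for k, v in constraint.inputs.items())
    )


event = Node("Event", {"subject": "standup", "location": "room 1"})
wanted = Node("Event", {"subject": "standup"}, constraint_level=1)
nested = Node("Event", {}, constraint_level=2)
```

The same class then serves for objects and constraints, and the level is the only thing that distinguishes them.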

Thanks again!
