[Managed LU] Design for serialization & deserialization for dialog.lu file #733

Closed
hibrenda opened this issue Aug 21, 2019 · 6 comments
Labels
Area: LU (Belongs to LU feature area); Area: Shell; P1 (Painful if we don't fix, won't block releasing); R7 (Release 7 - December 10th, 2019)

Comments

@hibrenda
Contributor

hibrenda commented Aug 21, 2019

  • LU parser support for sections (CRUD)
@vishwacsena
Contributor

Support for nested intents

Nested intents help authors keep related intent and entity definitions together. They will also help tools like designer capture and represent the intents that are relevant to a specific conversation sub-flow.

In order to support this, we will leverage markdown's ability to represent sections and subsections.

# section
## intentDefinition1
- utterances
- utterances

@ simple entityName
> Additional entity or other LU constructs
> ...

## intentDefinition2
- utterances
- utterances

> additional entity definitions or other LU constructs
> ...

Rules for interpretation:

  • An .lu document that contains a section definition must include a parser declaration to turn on section parsing. By default, sections are not supported in the .lu format.
  • A section is identified by a single '#' section definition and must contain an intent definition with '##' as the first line within that section.
  • Section names must be unique within the context of an .lu file being parsed.
  • All intent definitions within that section must start with '##'.
  • Use of '##' to represent an intent outside of a section behaves the same way as it does today (treated as a regular intent definition)
  • A section ends when
    • Another section definition ('#') begins
    • Another regular intent definition ('#') begins
  • Sections defined across LU files are not collated, and conflicts result in a parser error. For example, if 1.lu defines section1 with intent1 and 2.lu defines section1 with intent2, running ludown parse on these two files results in a parser error: "Inconsistent definition found for section1 - sections cannot be defined across multiple .lu files".
  • The parser can do one of two things with sections:
    • [Default behavior with @sections parsing enabled] Return a LUIS model definition with all intents within a section merged into a single intent named after the section
    • Return a LUIS model definition with all intents within a section preserved as separate intents, each named '<section-name>.<intent-name>'
  • All entities and roles defined within all sections are de-duped and merged with the entities list in the LUIS application.
> Enable section parsing (Disabled by default)
> !# @sections 

> Direct parser output to merge intents to a single intent named after the section-name (Default behavior)
> !# @sections.mergeIntents

> Provide a different name for merged intents for each section
> !# @sections.mergeIntents = sectionName : <name-of-section>; intentName : <name-of-output-intent>

> Valid section definition
# section
## intent1

> start and end of section1
# section1
## intent1
## intent2
> end of section1
# intent3 

> Invalid section definition
# section
### intent1

> Invalid section definition
# section
@ simple entityName

> Invalid section definition - in this case, both section and intent1 are interpreted as just two regular intent definitions. 
# section
# intent1
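
The interpretation rules and the two output modes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ludown implementation: `parse_lu` and `to_intents` are hypothetical names, section parsing is assumed to be already enabled, blank lines and `>` comments are skipped before looking for the "first line" of a section, and a `###` line is simply treated as body text because the proposal does not say how it should be handled.

```python
import re

# One or two '#' followed by a name; '###' and deeper do not match (assumption).
HEADER = re.compile(r"^(#{1,2})(?!#)\s*(\S.*?)\s*$")


def parse_lu(text):
    """Return (sections, intents).

    sections maps section name -> {intent name: body lines};
    intents maps regular (non-section) intent name -> body lines.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    # Drop blank lines and '>' comments so "first line in the section" is well defined.
    lines = [ln for ln in lines if ln and not ln.startswith(">")]
    sections, intents = {}, {}
    section = None  # intents of the currently open section, if any
    body = None     # body of the most recently opened intent
    for i, line in enumerate(lines):
        m = HEADER.match(line)
        if not m:
            if body is not None:
                body.append(line)
            continue
        level, name = len(m.group(1)), m.group(2)
        if level == 1:
            nxt = HEADER.match(lines[i + 1]) if i + 1 < len(lines) else None
            if nxt and len(nxt.group(1)) == 2:
                # '#' immediately followed by '##' opens a section.
                if name in sections:
                    raise ValueError(f"Duplicate section: {name}")
                section = sections[name] = {}
                body = None
            else:
                # Any other '#' is a regular intent and closes the open section.
                section = None
                body = intents.setdefault(name, [])
        else:
            # '##' belongs to the open section, or is a regular intent outside one.
            target = section if section is not None else intents
            body = target.setdefault(name, [])
    return sections, intents


def _utterances(body):
    """Keep only '- utterance' lines, without the leading dash."""
    return [ln[1:].strip() for ln in body if ln.startswith("-")]


def to_intents(sections, intents, merge=True):
    """Flatten sections: merged into one intent per section (default) or '<section>.<intent>'."""
    out = {name: _utterances(body) for name, body in intents.items()}
    for sec, sec_intents in sections.items():
        for intent, body in sec_intents.items():
            key = sec if merge else f"{sec}.{intent}"
            out.setdefault(key, []).extend(_utterances(body))
    return out
```

With `merge=True` every utterance in a section lands under the section name; with `merge=False` each nested intent keeps its own entry under a dotted name. Entity de-duplication and the cross-file conflict check are left out of this sketch.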

Parser interface:
We will add a new parser interface to break down the content of a given .lu file by intent and section. The required interfaces need to be similar in capability to those offered by LG and should include the following:

  • Ability to parse the content of a given .lu file into intent name and body pairs. The body should include all text content between intent definitions, including any entity definitions and comments found within that block.
  • Each section is treated as if it were a separate .lu file, in that a nested structure of intent name and body is made available for the content within a section.
  • Provide an interface that supports find, read, and write operations on body content in a .lu file, given the intent name and optionally a section name.
  • Add two new methods to the parser interface. Both will throw an exception on errors (e.g. duplicate intent definition found, no section exists by the given name, no intent exists by the given name, etc.)
    • parser.extractBody(fileContent, intentName, sectionName = null)
    • parser.writeBody(fileContent, body, intentName, sectionName = null)
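
As a rough illustration of the two proposed methods, here is a minimal Python sketch. The names `extract_body` and `write_body` mirror `parser.extractBody` and `parser.writeBody` but are hypothetical, as are the helper functions and the choice of exception types. It assumes that a body runs from the line after an intent header to the next `#` or `##` header, and that trailing blank lines are not part of the body.

```python
import re

HEADER = re.compile(r"^(#{1,2})(?!#)\s*(\S.*?)\s*$")


def _meaningful(line):
    """True for lines that are neither blank nor '>' comments."""
    s = line.strip()
    return bool(s) and not s.startswith(">")


def _headers(lines):
    """Return [(line index, section name or None, intent name)] for every intent header."""
    found, section = [], None
    for i, line in enumerate(lines):
        m = HEADER.match(line)
        if not m:
            continue
        hashes, name = m.groups()
        if hashes == "#":
            nxt = next((l for l in lines[i + 1:] if _meaningful(l)), "")
            m2 = HEADER.match(nxt)
            if m2 and m2.group(1) == "##":
                section = name  # a section header has no body of its own
                continue
            section = None
        found.append((i, section, name))
    return found


def _span(lines, intent_name, section_name):
    """Return (start, end) line indices of the requested body."""
    headers = _headers(lines)
    if section_name is not None and all(sec != section_name for _, sec, _ in headers):
        raise KeyError(f"No section named {section_name!r}")
    hits = [i for i, sec, name in headers if sec == section_name and name == intent_name]
    if not hits:
        raise KeyError(f"No intent named {intent_name!r}")
    if len(hits) > 1:
        raise ValueError(f"Duplicate intent definition: {intent_name!r}")
    start = hits[0] + 1
    end = next((j for j in range(start, len(lines)) if HEADER.match(lines[j])), len(lines))
    while end > start and not lines[end - 1].strip():
        end -= 1  # leave trailing blank lines outside the body
    return start, end


def extract_body(file_content, intent_name, section_name=None):
    lines = file_content.split("\n")
    start, end = _span(lines, intent_name, section_name)
    return "\n".join(lines[start:end])


def write_body(file_content, body, intent_name, section_name=None):
    lines = file_content.split("\n")
    start, end = _span(lines, intent_name, section_name)
    lines[start:end] = body.split("\n")
    return "\n".join(lines)
```

Both functions take and return plain strings, so a caller can round-trip a file by calling `write_body` and then `extract_body` on the result. Whether the real interface should operate on strings or on a parsed resource is a separate design question.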

@vishwacsena
Contributor

@boydc2014 please review. Emilio on our side is doing some work to decouple the core parser into its own library, since it is now being used in several places (including the LUIS portal, ludown CLI, and bf CLI). We should coordinate this addition with him so we can get it into the right place (repo, branch).

@boydc2014
Contributor

Functionally, it all makes sense. A few pieces of feedback on the engineering design:

  1. Options/commands.

1.1 Usually, if you specify options/commands in a comment section, they must be at the beginning, like

> ! # options.a = true
# section or other intent
- contents

1.2 And it would be good to keep the format consistent like

>!# enableSections = true
>!# mergedSections = true

1.3 Comments are probably not a suitable place to specify a different merged intent name, because you have to duplicate the section name there. Perhaps we either just use the section name, or use sectionName: intentName later, or consider not having this feature for now.

  2. One important thing about the parser interface, whether sectioned or not: it should not return a JSON structure. Instead, it should return an LUResource (similar to LG). The LUResource interface has a few key points:
  • The LUResource will not contain comments or spaces
  • The LUResource will be read-only, because the nature of a parser is to parse, not to alter
  • The LUResource contains the original token stream, which allows it to be combined with the original file to support an editing experience.

So, the flow will always look like this:
original file => LUResource (sectioned/not sectioned) => a JSON structure
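
To make the proposed flow concrete, here is a small Python sketch of a read-only resource sitting between the file and the JSON output. Everything in it is an assumption for illustration: the `Token` and `LUResource` shapes, the `parse` function, and the simplified dictionary returned by `to_json`, which is not the actual LUIS schema. Sections are ignored here to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str  # "intent", "utterance", or "entity"
    text: str
    line: int  # 1-based line number in the original file


@dataclass(frozen=True)
class LUResource:
    # Token stream in file order; comments and blank lines are excluded.
    tokens: tuple

    def to_json(self):
        """Build a simplified JSON-like dict from the token stream."""
        intents, utterances, entities = [], [], []
        current = None
        for tok in self.tokens:
            if tok.kind == "intent":
                current = tok.text
                if current not in intents:
                    intents.append(current)
            elif tok.kind == "utterance":
                utterances.append({"text": tok.text, "intent": current})
            elif tok.kind == "entity" and tok.text not in entities:
                entities.append(tok.text)
        return {"intents": intents, "utterances": utterances, "entities": entities}


def parse(text):
    """original file => LUResource; the caller then asks for JSON explicitly."""
    tokens = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(">"):
            continue  # comments and blank lines never reach the resource
        if line.startswith("#"):
            tokens.append(Token("intent", line.lstrip("#").strip(), number))
        elif line.startswith("@"):
            tokens.append(Token("entity", line[1:].strip(), number))
        elif line.startswith("-"):
            tokens.append(Token("utterance", line[1:].strip(), number))
    return LUResource(tuple(tokens))
```

Because each token keeps its original line number, an editor could map a token back to the source text and apply edits to the file itself, while the resource stays immutable (the frozen dataclasses reject attribute assignment).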

@boydc2014
Contributor

So, to me, the first thing is to define an LUResource structure that serves as the base of any toJson operation and can support editing capability (like creating/updating an "_interrupted" intent).

@yochay
Contributor

yochay commented Oct 29, 2019

Moved to R7. Related to #690

@cwhitten added the P1 (Painful if we don't fix, won't block releasing) label and removed the P0 (Must Fix. Release-blocker) label on Nov 7, 2019
@hibrenda
Contributor Author

Closing this, as the new bf lu (ludown) has been implemented.
