Matching Multiple Templates
Tokenizer can match text against multiple templates, and use the template that is the best match to extract content. The best matching template is chosen using the following parameters:
- Template Tags
- Template Hints
- Number of tokens extracted from the text
A template can contain one or more tags. These can be used by code calling the TokenMatcher to restrict the matching process to only those templates that contain all the supplied tags. Tags can be defined in the template frontmatter, or programmatically added to the Template.Tags collection at runtime.
Using tags when matching against multiple templates can reduce processing time when you are able to restrict the candidate template set ahead of time.
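As a rough illustration of how tag filtering and token count could combine, here is a minimal standalone sketch. It is not Tokenizer's implementation: `Candidate` and `BestMatchSketch` are hypothetical names invented for this example, template hints are left out entirely, and the tie-breaking behaviour is an assumption.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one template's match outcome (not a Tokenizer type).
public record Candidate(string Name, string[] Tags, int TokensExtracted);

public static class BestMatchSketch
{
    // Keeps only candidates that carry every requested tag, then returns the
    // one that extracted the most tokens. Returns null when none qualify.
    public static Candidate? Select(IEnumerable<Candidate> candidates, string[] requiredTags)
    {
        return candidates
            .Where(c => requiredTags.All(tag => c.Tags.Contains(tag)))
            .OrderByDescending(c => c.TokensExtracted)
            .FirstOrDefault();
    }
}
```

Under these assumptions, a candidate tagged "extended" that extracted three tokens is never considered when the caller asks for "standard", so a candidate tagged "standard" with two tokens is selected instead.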
// Templates can contain one or more "tag" arguments in
// their frontmatter options.
var template1 = @"---
name: template1
tag: standard
outOfOrder: true
terminateOnNewLine: true
---
Name: {Name}
Age: {Age}";
var template2 = @"---
name: template2
tag: extended
outOfOrder: true
terminateOnNewLine: true
---
Name: {Name}
Age: {Age}
Address: {Address}";
// Create a TokenMatcher instance
var matcher = new TokenMatcher();
// Register our templates
matcher.RegisterTemplate(template1);
matcher.RegisterTemplate(template2);
var input = @"Name: Alice
Age: 30
Address: London";
// Matches all input against all templates containing the "standard" tag
var result = matcher.Match(input, new[] { "standard" });
// Get the best matching template
var match = result.BestMatch;
// Assert we matched template1, even though template2 would have been a better match
Assert.AreEqual("template1", match.Template.Name);
Assert.AreEqual("Alice", match.First("Name"));
Assert.AreEqual("30", match.First("Age"));
Template hints are defined in the template. A template can contain one or more hints, and each hint can be defined as either optional or required.