diff --git a/schemas/test-catalog.rnc b/schemas/test-catalog.rnc index bcbc7b4c..aa19b515 100644 --- a/schemas/test-catalog.rnc +++ b/schemas/test-catalog.rnc @@ -3,520 +3,697 @@ namespace unqualified = "" grammar { - # RNC grammar for test catalog. - # - # Revisions: - # 2023-03-13 : CMSMcQ : Add metadata for dependencies (and correct typos) - # 2022-05-31 : CMSMcQ : Make 'error' attribute obligatory, - # add 'wrong-error' as result. - # 2022-04-12 : CMSMcQ : Move base version of this to ixml repo - # 2022-04-11 : CMSMcQ : Add dynamic-error as expected result - # 2022-02-14 : CMSMcQ : Move metadata from attributes to elements - # 2022-02-06 : CMSMcQ : Add a quick and dirty report format. - # 2021-12-22 : CMSMcQ : Make 'created' optional on individual tests; - # notionally, let it be inherited from test set. - # 2021-11-11 : CMSMcQ : Revamp result to allow multiple results - # and include assert-not-a-grammar. - # Rewrite some comments. - # 2021-10-31 : CMSMcQ : Commit some changes: @name on test-case, - # allow at most one grammar for each test - # set (grammars may be inherited from ancestor - # test sets). - # 2021-01-25 : CMSMcQ : Sketch this out by hand. - # - # To do: - # - rewrite test-set to allow test cases only if a grammar is - # specified on the test-set or some ancestor. - # - allow description to be (p+ | xhtml:div+) HTML - # - supply types for tokenized attributes? - # - -# Notational convention: definitions starting in uppercase (e.g. -# Metadata, Grammar-spec) are for content-model expressions. -# Definitions starting in lowercase (e.g. test-catalog) are for -# individual elements, usually with the same name as the element. -# -# (Exception: element test-set has two definitions, test-set-0 -# and test-set-1.) - -# The normal starting points are test-catalog and test-report. -# But to allow individual test sets and tests to be reported -# separately, we also allow lower-level result elements as the -# start symbol. 
- - start = test-catalog | test-report - | test-set-results | grammar-result | test-result - -# test-catalog, test-report - - # A test catalog is a collection of test sets, with common - # metadata. - test-catalog = element test-catalog { - attribute name { text }, - attribute release-date { xsd:date }, - external-atts, - (Metadata - & - (test-set-0 | test-set-ref)*) - } - - # A test report is a collection of test set reports, with common - # metadata. - test-report = element test-report { - element metadata { - (element name { text }, - element report-date { xsd:date | xsd:dateTime }, - element processor { text }, - element processor-version { text }?, - element catalog-uri { text }, - element catalog-date { text }?) - & - Metadata - }, - external-atts, - (Metadata - & - test-set-results*) - } - -# Metadata - - # At various levels we allow metadata: prose descriptions, - # pointers to external documentation, or arbitrary XML - # elements ('application-specific information'), and - # miscellaneous technical details about dependencies of a test - # (or, usually, of the test result) and for a test result the - # environment within which a test was run. - - Metadata = (description | app-info | doc | dependencies)* - - # The 'description' element contains a prose description. - # Say what you think needs saying. - description = element description { - external-atts, - p* - } - - # The 'doc' element carries an 'href' attribute pointing to - # relevant external documentation. - doc = element doc { - external-atts, - attribute href { xsd:anyURI } - } - - # The 'app-info' is an escape hatch which can contain any XML - # at all. It can be used for processor-specific information. - # (Please document what you do!) 
- app-info = element app-info { - external-atts, - any-element* - } - - # The 'options' element (in the test-catalog namespace, but - # allowed only within app-info) is used to mark results which - # depend (for a given processor) on the options with which the - # processor was invoked. Options are assumed describable with - # name/value pairs encoded as namespace-qualified attributes. - # Typically the attribute name names the option, and the value - # says how to set it. Examples and some discussion are in - # ../tests/grammar-misc/test-catalog.xml - - # If all the option/setting pairs on any options element in - # the app-info element apply, then any of the results - # specified in that app-info element is acceptable. - - # So: for both the options elements and the results in the - # app-info there is an implicit disjunction: if any of the - # options elements applies, then any of the results is OK. - # For the various name/value pairs on an options element, - # there is an implicit conjunction: the options element - # applies if ALL of the name/value pairs apply. - - # N.B. The options element, and the method of handling options - # it represents, is to be regarded as experimental. - - options = element options { - external-atts, - empty - } - - # The environment element works much the same way as the - # options element; when results reported for a test depend on - # the environment (e.g. which version of Java is used, or - # which browser an in-browser processor uses, or ...), then - # the relevant information should be given on an 'environment' - # element wrapped in an 'app-info' element at the appropriate - # level of the test results. (Top level if applicable to all, - # test set if applicable only to that test set, test result if - # applicable to that result.) 
- - # The difference between options and environment is that - # options are assumed to be settable at parse time by whoever - # calls the ixml processor, and the environment is less likely - # to be settable that way. In case of gray areas, explain - # your usage in the test catalog. - - environment = element environment { - external-atts, - empty - } - - # The difference between options and environment is that - # options are assumed to be settable at parse time by whoever - # calls the ixml processor, and the environment is less likely - # to be settable that way. In case of gray areas, explain - # your usage in the test catalog. - - # The 'dependencies' element identifies conditions that must - # hold for the results given for a test to hold. Like - # 'options' and 'environment', it allows an arbitrary set of - # name/value pairs (namespace-qualified attributes). If all - # of them apply, the test result given is applicable. - - # Some dependencies are standardized: any processor must - # conform to some version of Unicode but we don't specify which, - # so the processor must specify. Test results must be labeled - # with the appropriate Unicode version(s). - - dependencies = element dependencies { - attribute Unicode-version { text }, - external-atts, - empty - } - - # The differences are: - # options - implementation-defined, typically settable by caller - # at parse time. Wrap in app-info to label results - # (often non-standard) which depend on how the - # processor was invoked. - # environment - relevant but not under implementation control. - # Wrap in app-info, use to label results which depend - # on the environment within which the processor is - # running (or within which a test result was obtained). - # dependencies - used to label test cases whose results - # depend on which version of another spec is applicable. 
- -# test-set, test-set-results - - # A test set is a collection of tests (or possibly subordinate - # test sets, or both) with common metadata and a common - # grammar. - - # Test cases are allowed only after a grammar is specified. - - # We keep track of whether an ancestor has specified a grammar - # by having two nonterminals for test sets: test-set-0 is used - # when no ancestor has specified a grammar, test-set-1 when - # at least one grammar has been specified. - - # If no ancestor has specified a grammar, test cases are allowed - # in this test set only if this test set does specify a grammar. - # Use test-set-0 or -1 to pass the news along. - - test-set-0 = element test-set { - attribute name { text }, - external-atts, - (Metadata - & - (History, - ( (test-set-0 | test-set-ref)* - | (Grammar-spec, (test-set-1 | test-set-ref | test-case)*) ))) - } - - # If an ancestor has specified a grammar, test cases are allowed - # in this test set even if there is no grammar at this level. - - test-set-1 = element test-set { - attribute name { text }, - external-atts, - (Metadata - & - (History, - Grammar-spec?, - (test-set-1 | test-set-ref | test-case)*)) - } - - test-set-results = element test-set-results { - attribute name { text }, - external-atts, - (Metadata - & - (Grammar-results?, - (test-set-results | test-result)*)) - } - - # Grammars can be in invisible XML or in visible XML. - # They can be inline or external. They can be marked - # as a grammar test or not. - - Grammar-data = (ixml-grammar - | vxml-grammar - | ixml-grammar-ref - | vxml-grammar-ref) - - Grammar-spec = (Grammar-data, grammar-test?) - - # In the results file, we may omit the grammar, or include - # it, possibly both reproducing the reference and giving - # the grammar inline. - Grammar-results = (Grammar-data*, grammar-result*) - - # Q. Why is the grammar optional? - # A. Because in a nested test set we may want to inherit the - # grammar from the parent test set. 
In a top-level test - # set with no direct test-case children, we may just be - # pointing to multiple test sets which each provide their - # own grammar. By the time we reach a test case we must - # have at least one grammar, but we don't need on at every - # level. - # Q. Why can't there be multiple grammars? - # A. First, it's error prone: it would work only if all of them - # were guaranteed equivalent. We don't want to have to check - # that, and we don't want the mess that will result if it - # turns out not to be true. Second, it complicates reporting - # unnecessarily. It's simpler when one test case is one - # grammar + input + result triple. - - test-set-ref = element test-set-ref { - external-atts, - attribute href { xsd:anyURI } - } - - # ixml-grammar: grammar in invisible-XML form - ixml-grammar = element ixml-grammar { - external-atts, - text - } - - ixml-grammar-ref = element ixml-grammar-ref { - external-atts, - attribute href { xsd:anyURI } - } - - # vxml-grammar: grammar in visible-XML form (either a parsed - # ixml grammar, translated into XML, or something created in - # XML) - # - # N.B. It is tempting to embed a schema for ixml grammars here - # to enforce the correct XML form. But we do not require a - # legal ixml grammar, because it may be a negative test case. - - vxml-grammar = element vxml-grammar { - external-atts, - any-element - } - - vxml-grammar-ref = element vxml-grammar-ref { - external-atts, - attribute href { xsd:anyURI } - } - - # grammar-test: signals that this grammar should be checked - # and either accepted or declined as a grammar. - - grammar-test = element grammar-test { - external-atts, - (Metadata & (History?, result)) - } - - grammar-result = element grammar-result { - attribute result { result-type }, - external-atts, - (Metadata & (result-report?)) - } - - -# test-case - - # test-case: describes one test case, with metadata, history, - # and expected result. 
- - test-case = element test-case { - attribute name { text }, - external-atts, - (Metadata & (History?, Test-string, result)) - } - - test-result = element test-result { - attribute name { text }, - attribute result { result-type }, - external-atts, - (Metadata & - (Grammar-data*, (Test-string)*, result-report?) - ) - } - - result-type = 'pass' # results are as expected - | 'fail' # results not as expected - | 'wrong-error' # right overall result, wrong error code - | 'wrong-state' # right overall result, wrong ixml:state value(s) - | 'not-run' - | 'other' - - # Test-string: in-line or external - - Test-string = (test-string | test-string-ref) - - test-string = element test-string { - external-atts, - text - } - - test-string-ref = element test-string-ref { - external-atts, - attribute href { xsd:anyURI } - } - -# result - - # result: specifies the expected result of a test; - # contains an assertion of some kind. - result = element result { - external-atts, - Assertion - } - - result-report = element result { - external-atts, - Assertion?, - Observation? - } - - -# Test assertions - - # Several kinds of result are possible. - # - # - In the common case we will have one expected XML result. We - # specify it with assert-xml or assert-xml-ref (inline or - # external). - # - # - For ambiguous sentences, we may and should specify several - # XML results, any of which is acceptable. So the XML - # assertions can repeat, with an implicit OR as their meaning. - # - # - In the case of infinite ambiguity, we can and should specify - # a finite subset of the expected results, which we add to as - # needed. - # - # - If the input is not be a sentence in the language defined - # by the grammar, we use assert-not-a-sentence. - # - # - If the grammar specified is not a conforming ixml grammar, - # then we use assert-not-a-grammar. 
- # - # - If the particular grammar + input pair would produce - # ill-formed output if the normal rules were followed, then - # we use assert-dynamic-error. - # - # Logically speaking, in the case of a grammar-test, there is no - # useful distinction between assert-not-a-sentence and - # assert-not-a-grammar. Casuists can argue over which makes - # more sense, but in practice they should be treated as - # equivalent. They are usefully different only for normal - # test cases. - # - # Since dynamic errors are allowed to be caught statically, - # some processors may return assert-not-a-grammar when the test - # catalog expects assert-dynamic-error. - # - # Errors in the grammar and dynamic errors may be associated - # with error codes. These are now required. - - Assertion = ((assert-xml-ref | assert-xml)+ - | assert-not-a-sentence - | assert-not-a-grammar - | assert-dynamic-error) - - Error-Code = attribute error-code { text } - - assert-xml-ref = element assert-xml-ref { - external-atts, - attribute href { xsd:anyURI } - } - assert-xml = element assert-xml { - external-atts, - any-element+ - } - assert-not-a-sentence = element assert-not-a-sentence { - external-atts, - Metadata - } - assert-not-a-grammar = element assert-not-a-grammar { - Error-Code, - external-atts, - Metadata - } - assert-dynamic-error = element assert-dynamic-error { - Error-Code, - external-atts, - Metadata - } - - Observation = ((reported-xml-ref | reported-xml)+ - | reported-not-a-sentence - | reported-not-a-grammar - | reported-dynamic-error) + # RNC grammar for test catalog. + # + # Revisions: + # 2024-05-01 : CMSMcQ : re-indent, use div for grouping, add + # double-hash comments for elements and + # important patterns + # 2023-03-13 : CMSMcQ : Add metadata for dependencies (and + # correct typos) + # 2022-05-31 : CMSMcQ : Make 'error' attribute obligatory, + # add 'wrong-error' as result. 
+ # 2022-04-12 : CMSMcQ : Move base version of this to ixml + # repo + # 2022-04-11 : CMSMcQ : Add dynamic-error as expected result + # 2022-02-14 : CMSMcQ : Move metadata from attributes to + # elements + # 2022-02-06 : CMSMcQ : Add a quick and dirty report format. + # 2021-12-22 : CMSMcQ : Make 'created' optional on individual + # tests; notionally, let it be + # inherited from test set. + # 2021-11-11 : CMSMcQ : Revamp result to allow multiple + # results and include + # assert-not-a-grammar. Rewrite some + # comments. + # 2021-10-31 : CMSMcQ : Commit some changes: @name on + # test-case, allow at most one grammar + # for each test set (grammars may be + # inherited from ancestor test sets). + # 2021-01-25 : CMSMcQ : Sketch this out by hand. + # + # To do: + # - allow description to be (p+ | xhtml:div+) HTML + # - supply types for tokenized attributes? + # + + # Notational convention: definitions starting in uppercase + # (e.g. Metadata, Grammar-spec) are for content-model + # expressions. Definitions starting in lowercase + # (e.g. test-catalog) are for individual elements, usually + # with the same name as the element. + # + # (Exception: element test-set has two definitions, + # test-set-0 and test-set-1.) + + # The normal starting points are test-catalog and + # test-report. But to allow individual test sets and tests + # to be reported separately, we also allow lower-level result + # elements as the start symbol: test-set-results, + # grammar-result, test-result. + + start = test-catalog | test-report + | test-set-results | grammar-result | test-result + + div { + + # test-catalog, test-report + + ## test-catalog: A test catalog is a collection of test + ## sets, with common metadata. + test-catalog = element test-catalog { + attribute name { text }, + attribute release-date { xsd:date }, + external-atts, + (Metadata + & + (test-set-0 | test-set-ref)*) + } + + ## test-report: A test report is a collection of test set + ## reports, with common metadata.
+ test-report = element test-report { + element metadata { + (element name { text }, + element report-date { + xsd:date | xsd:dateTime + }, + element processor { text }, + element processor-version { text }?, + element catalog-uri { text }, + element catalog-date { text }?) + & + Metadata + }, + external-atts, + (Metadata + & + test-set-results*) + } + } + div { + + # Metadata + + # At various levels we allow metadata: prose descriptions, + # pointers to external documentation, or arbitrary XML + # elements ('application-specific information'), and + # miscellaneous technical details about dependencies of a + # test (or, usually, of the test result) and for a test + # result the environment within which a test was run. + + ## Metadata: descriptions, documentation, dependencies, + ## or application-specific information + Metadata = (description | app-info | doc | dependencies)* + + ## description: a prose description of the item. + ## Say what you think needs saying. + description = element description { + external-atts, + p* + } + + ## doc: pointer to documentation relevant to the item. + ## The 'href' attribute gives the URI. + doc = element doc { + external-atts, + attribute href { xsd:anyURI } + } + + ## app-info: The 'app-info' element is an escape hatch which + ## can contain any XML at all. It can be used for + ## processor-specific information. (Please document what you + ## do!) + app-info = element app-info { + external-atts, + any-element* + } + + ## options: The 'options' element is embedded within app-info + ## to mark results which depend (for a given processor) on + ## the options with which the processor was invoked. + options = element options { + external-atts, + empty + + # N.B. The 'options' element is in the + # test-catalog namespace, but it is allowed + # only within app-info. + + # Options are assumed describable with + # name/value pairs encoded as + # namespace-qualified attributes. 
Typically + # the attribute name names the option, and the + # value says how to set it. + + # Examples and some discussion are in + # ../tests/grammar-misc/test-catalog.xml + + # If all the option/setting pairs on any + # options element in the app-info element + # apply, then any of the results specified in + # that app-info element is acceptable. + + # So: for both the options elements and the + # results in the app-info there is an implicit + # disjunction: if any of the options elements + # applies, then any of the results is OK. For + # the various name/value pairs on an options + # element, there is an implicit conjunction: + # the options element applies if ALL of the + # name/value pairs apply. + + # N.B. The options element, and the method of + # handling options it represents, is to be + # regarded as experimental. + + } + + + ## environment: describes possible dependency of a test result + ## on the environment within which the test is run. + environment = element environment { + external-atts, + empty + + # The 'environment' element works much + # the same way as the options element; + # when results reported for a test + # depend on the environment (e.g. which + # version of Java is used, or which + # browser an in-browser processor uses, + # or ...), then the relevant + # information should be given on an + # 'environment' element wrapped in an + # 'app-info' element at the appropriate + # level of the test results. (Top + # level if applicable to all, test set + # if applicable only to that test set, + # test result if applicable to that + # result.) + + + } + + # The difference between options and environment is that + # options are assumed to be settable at parse time by whoever + # calls the ixml processor, and the environment is less + # likely to be settable that way. In case of gray areas, + # explain your usage in the test catalog. + + ## dependencies: identifies conditions that must hold for the + ## results given for a test to hold. 
+ dependencies = element dependencies { + attribute Unicode-version { text }, + external-atts, + empty + + # Like 'options' and 'environment', it + # allows an arbitrary set of + # name/value pairs + # (namespace-qualified attributes). + # If all of them apply, the test + # result given is applicable. + + # Some dependencies are standardized: + # any processor must conform to some + # version of Unicode but we don't + # specify which, so the processor must + # specify. Test results must be + # labeled with the appropriate Unicode + # version(s). + + } + + # The differences among these three elements for describing + # when a test or a result is relevant are: + + # options - implementation-defined, typically settable by + # caller at parse time. Wrap in app-info to label + # results (often non-standard) which depend on how + # the processor was invoked. + + # environment - relevant but not under implementation + # control. Wrap in app-info, use to label results + # which depend on the environment within which the + # processor is running (or within which a test + # result was obtained). + + # dependencies - used to label test cases whose results + # depend on which version of another spec is + # applicable. + + } + div { + + # test-set, test-set-results + + # A test set is a collection of tests (or possibly + # subordinate test sets, or both) with common metadata and a + # common grammar. + + # Test cases are allowed only after a grammar is specified. + + # We keep track of whether an ancestor has specified a + # grammar by having two nonterminals for test sets: + # test-set-0 is used when no ancestor has specified a + # grammar, test-set-1 when at least one grammar has been + # specified. + + # If no ancestor has specified a grammar, test cases are + # allowed in this test set only if this test set does specify + # a grammar. Use test-set-0 or -1 to pass the news along. 
+ + ## test-set (pattern test-set-0): a test set with no + ## grammar inherited from any ancestor. + test-set-0 = element test-set { + attribute name { text }, + external-atts, + (Metadata + & + (History, + ( (test-set-0 | test-set-ref)* + | (Grammar-spec, + (test-set-1 + | test-set-ref + | test-case)*) ))) + } + + # If an ancestor has specified a grammar, test cases are allowed + # in this test set even if there is no grammar at this level. + + ## test-set (pattern test-set-1): a test set with a grammar + ## inherited from an ancestor. + test-set-1 = element test-set { + attribute name { text }, + external-atts, + (Metadata + & + (History, + Grammar-spec?, + (test-set-1 | test-set-ref | test-case)*)) + } + + ## test-set-ref: a reference to a test set located in + ## another test catalog; the 'href' attribute gives the URI. + test-set-ref = element test-set-ref { + external-atts, + attribute href { xsd:anyURI } + } + + ## test-set-results: contains reports of results from running + ## the test cases of a given test set. + test-set-results = element test-set-results { + attribute name { text }, + external-atts, + (Metadata + & + (Grammar-results?, + (test-set-results | test-result)*)) + } + + } + div { + + # Specifying the grammar for a set of tests + + # Grammars can be in invisible XML or in visible XML. They + # can be inline or external. They can be marked as a grammar + # test or not. + + ## Grammar-data: four ways to specify the grammar for a test + ## set. + Grammar-data = (ixml-grammar + | vxml-grammar + | ixml-grammar-ref + | vxml-grammar-ref) + + ## Grammar-spec: specification of the grammar for a test set, + ## optionally treating the grammar itself as a test case to + ## be parsed against the specification grammar. + Grammar-spec = (Grammar-data, grammar-test?) + + # In the results file, we may omit the grammar, or include + # it, possibly both reproducing the reference and giving + # the grammar inline. 
+ + ## Grammar-results: optional reproduction of the grammar used + ## for the test set, and reports of any grammar tests. + Grammar-results = (Grammar-data*, grammar-result*) + + # Q. Why is the grammar optional? + # A. Because in a nested test set we may want to inherit the + # grammar from the parent test set. In a top-level test + # set with no direct test-case children, we may just be + # pointing to multiple test sets which each provide their + # own grammar. By the time we reach a test case we must + # have at least one grammar, but we don't need one at + # every level. + + # Q. Why can't there be multiple grammars? + # A. First, it's error prone: it would work only if all of + # them were guaranteed equivalent. We don't want to have + # to check that, and we don't want the mess that will + # result if it turns out not to be true. Second, it + # complicates reporting unnecessarily. It's simpler when + # one test case is one grammar + input + result triple. + + ## ixml-grammar: a grammar in invisible-XML form, given + ## inline in the test catalog. + ixml-grammar = element ixml-grammar { + external-atts, + text + } + + ## ixml-grammar-ref: a reference to a grammar in + ## invisible-XML form located elsewhere. The 'href' + ## attribute says where. + ixml-grammar-ref = element ixml-grammar-ref { + external-atts, + attribute href { xsd:anyURI } + } + + ## vxml-grammar: grammar in visible-XML form (either a parsed + ## ixml grammar, translated into XML, or something created in + ## XML), given inline in the test catalog. + vxml-grammar = element vxml-grammar { + external-atts, + any-element + } + + # N.B. It is tempting to embed a schema for ixml grammars here + # to enforce the correct XML form. But we do not require a + # legal ixml grammar, because it may be a negative test case. 
+ + ## vxml-grammar-ref: reference to a grammar in visible-XML + ## form (either a parsed ixml grammar, translated into XML, + ## or something created in XML), given elsewhere (as + ## indicated by the 'href' attribute). + vxml-grammar-ref = element vxml-grammar-ref { + external-atts, + attribute href { xsd:anyURI } + } + + ## grammar-test: signals that this grammar should be checked + ## and either accepted or declined as a grammar. + grammar-test = element grammar-test { + external-atts, + (Metadata & (History?, result)) + } + + ## grammar-result: reports the result of a grammar test. + grammar-result = element grammar-result { + attribute result { result-type }, + external-atts, + (Metadata & (result-report?)) + } + } + div { + # test-case + + ## test-case: describes one test case, with metadata, + ## history, and expected result. + test-case = element test-case { + attribute name { text }, + external-atts, + (Metadata & (History?, Test-string, result)) + } + + ## test-result: reports the result of one test case. + test-result = element test-result { + attribute name { text }, + attribute result { result-type }, + external-atts, + (Metadata & + (Grammar-data*, + (Test-string)*, + result-report?) + ) + } + + ## result-type: keyword description of test result + result-type = ## results are as expected + 'pass' + | ## results not as expected + 'fail' + | ## right overall result, wrong error code + 'wrong-error' + | ## right overall result, wrong ixml:state value(s) + 'wrong-state' + | ## test case was not run (explain!) + 'not-run' + | ## none of the above + 'other' + + ## Test-string: input string, in-line or external + Test-string = (test-string | test-string-ref) + + ## test-string: this element contains the input string + ## for the test case. + test-string = element test-string { + external-atts, + text + } + + ## test-string-ref: this element carries a pointer to an + ## external resource which contains the input string for the + ## test case.
+ test-string-ref = element test-string-ref { + external-atts, + attribute href { xsd:anyURI } + } + } + div { + # result + + ## result: specifies the expected result of a test; contains + ## an assertion of some kind. + result = element result { + external-atts, + Assertion + } + + ## result-report: specifies the observed result of running a + ## test case. May repeat the assertion describing the + ## expected result, and may report what was actually observed + ## when the test was run. + result-report = element result { + external-atts, + Assertion?, + Observation? + } + + } + div { + # Test assertions + + # Several kinds of result are possible. + # + # - In the common case we will have one expected XML result. + # We specify it with assert-xml or assert-xml-ref (inline + # or external). + # + # - For ambiguous sentences, we may and should specify + # several XML results, any of which is acceptable. So the + # XML assertions can repeat, with an implicit OR as their + # meaning. + # + # - In the case of infinite ambiguity, we can and should + # specify a finite subset of the expected results, which we + # add to as needed. + # + # - If the input is not a sentence in the language defined + # by the grammar, we use assert-not-a-sentence. + # + # - If the grammar specified is not a conforming ixml + # grammar, then we use assert-not-a-grammar. + # + # - If the particular grammar + input pair would produce + # ill-formed output if the normal rules were followed, then + # we use assert-dynamic-error. + # + # Logically speaking, in the case of a grammar-test, there is + # no useful distinction between assert-not-a-sentence and + # assert-not-a-grammar. Casuists can argue over which makes + # more sense, but in practice they should be treated as + # equivalent. The two assertions are usefully different only + # for normal test cases.
+ # + # Since dynamic errors are allowed to be caught statically, + # some processors may return assert-not-a-grammar when the + # test catalog expects assert-dynamic-error. + # + # Errors in the grammar and dynamic errors may be associated + # with error codes. These are now required. + + ## Assertion: things a catalog can say about an expected test + ## result. + Assertion = ((assert-xml-ref | assert-xml)+ + | assert-not-a-sentence + | assert-not-a-grammar + | assert-dynamic-error) + + ## Error-Code: an attribute for specifying an error code + ## expected for a test case, or observed in a test. + Error-Code = attribute error-code { text } + + ## assert-xml-ref: asserts that the result of the test case + ## is expected to match the external XML document pointed to + ## by the 'href' attribute. + assert-xml-ref = element assert-xml-ref { + external-atts, + attribute href { xsd:anyURI } + } + + ## assert-xml: asserts that the result of the test case is + ## expected to match the XML contained. + assert-xml = element assert-xml { + external-atts, + any-element+ + } + + ## assert-not-a-sentence: asserts that the input string is + ## not a sentence in the language defined by the input + ## grammar. + assert-not-a-sentence = element assert-not-a-sentence { + external-atts, + Metadata + } + + ## assert-not-a-grammar: asserts that the input grammar given + ## is not a conforming ixml grammar. This may be because + ## it's not a sentence in the language defined by the ixml + ## specification grammar, or for other reasons. + assert-not-a-grammar = element assert-not-a-grammar { + Error-Code, + external-atts, + Metadata + } + + ## assert-dynamic-error: asserts that when the input string + ## is parsed against the input grammar and written out as + ## XML, a dynamic error is expected to result. Note that + ## processors are allowed to detect dynamic errors statically + ## and report with a 'reported-not-a-grammar'.
+ assert-dynamic-error = element assert-dynamic-error { + Error-Code, + external-atts, + Metadata + } + + ## Observation: things a test report can say about + ## an observed test result. + Observation = ((reported-xml-ref | reported-xml)+ + | reported-not-a-sentence + | reported-not-a-grammar + | reported-dynamic-error) + + ## reported-xml-ref: reports that when the test case + ## was run, the processor produced the XML document + ## pointed to by the 'href' attribute. + reported-xml-ref = element reported-xml-ref { + external-atts, + attribute href { xsd:anyURI } + } + + ## reported-xml: reports that when the test case was run, the + ## processor produced the XML output contained in the + ## element. + reported-xml = element reported-xml { + external-atts, + any-element+ + } + + ## reported-not-a-sentence: reports that when the test case + ## was run, the processor reported that parsing failed (i.e. + ## that the input string is not a sentence in the language + ## defined by the input grammar). + reported-not-a-sentence = element reported-not-a-sentence { + external-atts, + Metadata + } + + ## reported-not-a-grammar: reports that when the test case + ## was run, the processor reported that the input grammar + ## was not a conforming ixml grammar. + ## Note that this may be reported when the processor + ## detects that serializing the result would raise a + ## dynamic error. + reported-not-a-grammar = element reported-not-a-grammar { + Error-Code, + external-atts, + Metadata + } + + ## reported-dynamic-error: reports that when the test case + ## was run, the processor reported a dynamic error. 
+ reported-dynamic-error = element reported-dynamic-error { + Error-Code, + external-atts, + Metadata + } - reported-xml-ref = element reported-xml-ref { - external-atts, - attribute href { xsd:anyURI } - } - reported-xml = element reported-xml { - external-atts, - any-element+ - } - reported-not-a-sentence = element reported-not-a-sentence { - external-atts, - Metadata - } - reported-not-a-grammar = element reported-not-a-grammar { - Error-Code, - external-atts, - Metadata - } - reported-dynamic-error = element reported-dynamic-error { - Error-Code, - external-atts, - Metadata - } - - -# Common constructs - - # History: creation and modification history - History = (created, modified*) - - who-when = attribute by { text }, - attribute on { xsd:date } - - created = element created { - who-when - } - - modified = element modified { - who-when, - attribute change { text } - } - - # Elements for simple prose. - - p = element p { phrases } - - phrases = (text | emph | code)* - - emph = element emph { phrases } - - code = element code { text } - - # Arbitrary XML - - anything = (any-element | any-attribute | text)* - any-element = element * { anything } - any-attribute = attribute * { text } - - external-atts = nsq-att* - - nsq-att = attribute (* - unqualified:*) { text } - + } + div { + # Common constructs + + ## History: creation and modification history + History = (created, modified*) + + ## who-when: attributes for reporting who did + ## something and when they did it. + who-when = attribute by { text }, + attribute on { xsd:date } + + ## created: reports who created the item (test catalog, test + ## set, test case, ...) and when. + created = element created { + who-when + } + + ## modified: reports who changed the item (test catalog, test + ## set, test case, ...) and when. + modified = element modified { + who-when, + attribute change { text } + } + + # Elements for simple prose. + + ## p: a paragraph of simple prose. 
+ p = element p { phrases } + + ## phrases: possible content of a paragraph. + phrases = (text | emph | code)* + + ## emph: marks a phrase emphasized either rhetorically or + ## typographically or both. (Expected rendering: italic.) + emph = element emph { phrases } + + ## code: marks material from a machine-processable language + ## of some kind (e.g. a program). (Expected rendering: + ## monospaced.) + code = element code { text } + + # Arbitrary XML + + ## anything: a pattern matching arbitrary XML + anything = (any-element | any-attribute | text)* + + ## any-element: a pattern matching one well-formed XML + ## element. + any-element = element * { anything } + + ## any-attribute: a pattern matching one XML attribute. + any-attribute = attribute * { text } + + ## external-atts: a pattern matching zero or more + ## namespace-qualified attributes. + external-atts = nsq-att* + + ## nsq-att: a pattern matching one namespace-qualified + ## attribute. + nsq-att = attribute (* - unqualified:*) { text } + + } } diff --git a/tools/tsd/images/tsd-workflow.dot b/tools/tsd/images/tsd-workflow.dot new file mode 100644 index 00000000..80b7b282 --- /dev/null +++ b/tools/tsd/images/tsd-workflow.dot @@ -0,0 +1,27 @@ +digraph tsd_dfd { + // sketch of a data flow for managing tag set documentation + + subgraph { + node [shape=box]; + + rnc [label="RNC schema"]; + rng [label="RNG\n(auto-generated)"]; + auto [label="auto-TSD\n(tag-set description\n=Docbook refentry+\nauto-generated)"]; + manual [label="manual TSD\n(tag-set description\n=Docbook refentry+\neditable)" fontcolor=red]; + // tsd [label="TSD\n(partly editable)"]; + tsd [label="TSD\n(tag-set description:\nDocbook refentry+\nprose and auto-generated)"]; + + node [shape=oval]; + // editrnc [label="edit / regen"]; + edittsd [label="edit"]; + manual -> edittsd -> manual; + tsdxrng [label="auto-generate TSD"]; + merge [label="merge\nauto- and manual parts"]; + rnc -> trang -> rng -> tsdxrng -> auto -> merge -> tsd; + manual -> 
merge; + subgraph { rank = same; auto; manual; } + } + // rnc -> editrnc -> rnc [weight=0]; + tsd -> manual [style=dotted weight=0 label="Re-use"]; + +} diff --git a/tools/tsd/images/tsd-workflow.dot.png b/tools/tsd/images/tsd-workflow.dot.png new file mode 100644 index 00000000..6f9cb809 Binary files /dev/null and b/tools/tsd/images/tsd-workflow.dot.png differ diff --git a/tools/tsd/images/tsd-workflow.dot.svg b/tools/tsd/images/tsd-workflow.dot.svg new file mode 100644 index 00000000..e990ecce --- /dev/null +++ b/tools/tsd/images/tsd-workflow.dot.svg @@ -0,0 +1,139 @@ + + + + + + +tsd_dfd + + + +rnc + +RNC schema + + + +trang + +trang + + + +rnc->trang + + + + + +rng + +RNG +(auto-generated) + + + +tsdxrng + +auto-generate TSD + + + +rng->tsdxrng + + + + + +auto + +auto-TSD +(tag-set description +=Docbook refentry+ +auto-generated) + + + +merge + +merge +auto- and manual parts + + + +auto->merge + + + + + +manual + +manual TSD +(tag-set description +=Docbook refentry+ +editable) + + + +edittsd + +edit + + + +manual->edittsd + + + + + +manual->merge + + + + + +tsd + +TSD +(tag-set description: +Docbook refentry+ +prose and auto-generated) + + + +tsd->manual + + +Re-use + + + +edittsd->manual + + + + + +tsdxrng->auto + + + + + +merge->tsd + + + + + +trang->rng + + + + + diff --git a/tools/tsd/rng-to-TSD.xsl b/tools/tsd/rng-to-TSD.xsl new file mode 100644 index 00000000..a5dc3422 --- /dev/null +++ b/tools/tsd/rng-to-TSD.xsl @@ -0,0 +1,632 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Tag set documentation for + + + + + + + + + + + + + auto-generated from schema by rng-to-TSD.xsl + + + + + + + + + Introduction + + + This is a skeletal framework for documentation of + + . + It was generated automatically from the + schema by rng-to-TSD.xsl, + on + + at + + . 
+ + + + + + + + + Alphabetical list of elements and patterns + + + + + + + + + + + + + Alphabetical list of elements + + + + + + + + + + + + Alphabetical list of patterns + + + + + + + + + + + + + + + + + + + + Tag set documentation for + + + + + + [None. This tag set documentation + will not be distributed in this form.] + + + + + + + + Generated automatically from + + + + + + + + + Document auto-generated from schema by rng-to-TSD.xsl + + + + + + + + + + Reference documentation (skeleton) + + + for + + + + + + + + + + + + + + Introduction + + + This is a skeletal framework for documentation of + + . + It was generated automatically from the + schema by rng-to-TSD.xsl, + on + + at + + . + + + + + + + + + Alphabetical list of elements and patterns + + + + + + + + + + + + + Alphabetical list of elements + + + + + + + + + + + Alphabetical list of patterns + + + + + + + + + + + + + + + + + + + + Tag set documentation for + + + + + + + + + + + + + + + + + + + + + Reference documentation (skeleton) + for + + + + + + + + + + + Introduction + + + This is a skeletal framework for documentation of + + . + It was generated automatically from the + schema by rng-to-TSD.xsl, + on + + at + + . + + + + + + + + + Alphabetical list of elements and patterns + + + + + + + + + + + + + Alphabetical list of elements + + + + + + + + + + + Alphabetical list of patterns + + + + + + + + + + + + + + + + + + + + + + + + + + + + (element) + + + + + + + + + + + [Description to be supplied.] + + + + + + + + + + + + + + Remarks + + + ... + + + + + + + + + + + + + + + + + + [Description to be supplied.] + + + + + + + + + ... + + + + + + + + + + + + + + + + + + + + + + (element) + + + + + + + + + + [Description to be supplied.] + + + + + + + + + watch this space + + + + + + + + Remarks + + + ... + + + + + + + + + + + + + + + + + + + + + (pattern) + + + + + + + + + + + + + + + + + + Remarks + + + ... + + + + + + + + + + + + + + + + + + + + ... 
+ + + + + + + + + + + + + + + + + + + + + + (pattern) + + + + + + + + + + + + watch this space + + + + + + + + Remarks + + + ... + + + + + + + + + diff --git a/tools/tsd/tsd-planning.html b/tools/tsd/tsd-planning.html new file mode 100644 index 00000000..9bdddaaa --- /dev/null +++ b/tools/tsd/tsd-planning.html @@ -0,0 +1,700 @@ + + + + + + + +Tag set documentation project + + + + + +
+

Tag set documentation project

+
+

Table of Contents

+ +
+

+This document outlines a plan for a workflow to create and maintain +documentation for the XML vocabulary used for the XML form of ixml +grammars (here called VXML), and the XML vocabulary used for test +catalogs by the ixml Community Group. +

+ +

+In its current form this document is not complete and is binding on +no one. It is written to serve as a basis for discussion, and to +record some thoughts and expectations. +

+ +
+

1. Project overview

+
+
+
+

1.1. Primary deliverables

+
+

+The central deliverables are reference tag-set documentation (TSD) for +the XML vocabularies in question. +

+ +

+The tag-set documentation we wish to create consists of some +expository prose and reference pages for the element types and +widely used attributes. +

+ +

+The crucial delivery format is XHTML; other XML vocabularies may be +used for maintenance, but are not expected to be of interest to others. +

+
+
+ +
+

1.2. Requirements

+
+

+Known requirements and desiderata: +

+
    +
  • It must be possible to update the documentation more or less +conveniently as the schemas change.
  • +
  • When the schema changes, human-supplied prose must be carried +forward easily.
  • +
  • Information derivable from the schema should be provided +automatically. Specifically: declarations, lists of parents, +lists of children, lists of attributes.
  • +
  • When schema-derived information changes, it is desirable that +the user be warned, so that any relevant prose can also be +updated.
  • +
+
+
+ +
+

1.3. Workflow

+
+

+The intended workflow is described in this diagram: +

+ +
+

tsd-workflow.dot.png +

+

Figure 1: Workflow plan

+
+ +

+That is: +

+
    +
  • +The RNC/RNG schemas are maintained independently. +

    + +

    +The test catalog schema is maintained by hand in RNC; the ixml +schema is generated automatically in RNG from the ixml grammar, +which is maintained by hand. We use trang to make an RNG form of +the test catalog schema, and an RNC form of the ixml schema. +

    + +

    +Not shown: we use jing -s to create a 'simplified' version of the +RNG schema. In some cases, this may require some hand work. (Jing +aborts with an error message if asked to simplify some schemas with +recursive patterns. The simplified schema also uses some rather +opaque names for patterns introduced by Jing.) +

  • + +
  • +An XSLT stylesheet (rng-to-TSD.xsl) auto-generates tag-set +documentation for the schema. +

    + +

    +If names and short descriptions are provided in the RNG annotation +namespace (a:documentation elements), they should be carried over. +Otherwise, dummies should be provided. +

    + +

    +This stylesheet draws information about the vocabulary from both the +RNG schema and the simplified RNG schema. +

  • + +
  • An XSLT stylesheet (tsd-merger.xsl) reads the auto-generated +documentation and the previous hand-edited version of the same +documentation, and produces merged output. +
      +
    • For schema-derived information, the auto-generated documentation +is preferred; for other information (basically: the prose), +the hand-edited documentation is preferred.
    • +
    • If any schema-derived information differs between the two +sources, the stylesheet should report the fact to the user.
    • +
  • + +
  • An XSLT stylesheet (tsd-to-html.xsl) reads the merged tag-set +documentation and generates HTML with an appropriate stylesheet +(tsd.css).
  • +
+
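The merge step described above could be sketched roughly as follows. This is only an illustration in Python, not the planned tsd-merger.xsl: it assumes that the schema-derived part of a refentry is exactly its refsynopsisdiv, that entries carry the DocBook 5 namespace, and that entries are paired by xml:id. The function names are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"               # DocBook 5 namespace (assumed)
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
SYNOPSIS_DIV = f"{{{DB}}}refsynopsisdiv"


def canonical(elem):
    """Canonical (C14N) text of an element, or None if it is absent."""
    if elem is None:
        return None
    clone = copy.deepcopy(elem)
    clone.tail = None                              # ignore text after the element
    return ET.canonicalize(ET.tostring(clone, encoding="unicode"),
                           strip_text=True, rewrite_prefixes=True)


def merge_refentry(auto, manual):
    """Merge one auto-generated refentry with its hand-edited counterpart.

    Returns (merged, warnings). The schema-derived refsynopsisdiv is taken
    from `auto`; everything else (the prose) is taken from `manual`.
    """
    merged = copy.deepcopy(manual)
    warnings = []
    new_syn = auto.find(SYNOPSIS_DIV)
    old_syn = merged.find(SYNOPSIS_DIV)

    # Report any change in schema-derived information to the user.
    if canonical(new_syn) != canonical(old_syn):
        warnings.append(f"{auto.get(XML_ID)}: schema-derived synopsis changed; "
                        "check the prose")

    # Put the new synopsis where the old one was, else right after refnamediv.
    position = min(1, len(merged))
    if old_syn is not None:
        position = list(merged).index(old_syn)
        merged.remove(old_syn)
    if new_syn is not None:
        merged.insert(position, copy.deepcopy(new_syn))
    return merged, warnings
```

Comparing canonical forms rather than raw serializations keeps differences in indentation or namespace prefixes from triggering spurious warnings.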
+
+ +
+

1.4. Secondary deliverables

+
+

+We have some secondary deliverables, whose purpose is to occupy +the corresponding positions in the workflow. +

+ +
    +
  • Specification of the tag-set documentation vocabulary and +conventions to be used. Obvious candidates are Docbook, TEI P3, TEI +P5, and an ad hoc custom vocabulary.
  • + +
  • auto-tsd.xsl
  • + +
  • tsd-merger.xsl
  • + +
  • tsd-to-html.xsl
  • + +
  • tsd.css
  • +
+
+
+
+ +
+

2. Vocabulary for tag-set documentation

+
+

+We will use Docbook for the XML form of the tag-set documentation. +

+ +

+Because Docbook's reference entries are rather generic, it may be +helpful to specify the pattern to be followed there in more detail. +Each refentry element should contain: +

+ +
    +
  • refnamediv +
      +
    • refdescriptor containing either "(element)" or "(attribute)" or +"(pattern)"
    • +
    • refname with the element type name, attribute name, or pattern +name
    • +
    • refpurpose containing (a) an unabbreviated form of the element +type name, (b) a colon, and (c) a short description (typically one line) +of the meaning or use of the construct
    • +
  • +
  • refsynopsisdiv +
      +
    • +synopsis role="rng-raw" +

      +
        +
      • info containing the 'raw' Relax NG declaration for the +construct: +
          +
        • For elements, the rng:element element.
        • +
        • For attributes, the rng:attribute element.
        • +
        • For patterns, the rng:define element.
        • +
      • +
      +

      +For comparison: this is similar to inclusion of an element +declaration from a DTD with parameter entity references +unexpanded. +

    • +
    • synopsis role="rng-simplified" (for elements and attributes only) +containing the corresponding declaration from the 'simplified' +schema for the construct. +For comparison: this is similar to inclusion of an element +declaration from a DTD with all parameter entity references +expanded.
    • +
    • (optional, for elements) synopsis role="structured" containing a +structured description in English of the content model of the +element. Not required, because in the usual case an English +summary can be generated from the simplified RNG without trouble. +Not forbidden, because it may be better to do this upstream rather +than in the creation of the HTML delivery form.
    • +
  • +
  • (optional, for elements) refsection entitled "Contents" containing +a prose description in English of the allowed contents of the +element. Not required, because not always useful.
  • +
  • (for attributes) refsection entitled "Data description" +with informal prose description of the attribute's datatype.
  • +
  • (for elements) refsection entitled "Attributes" listing all +attributes defined for the element. If the attribute is used on +more than one element, then we want just the attribute name with a +hyperlink to the reference entry for the attribute; if the attribute +is used only on this element, or should be given custom +documentation for this parent, a version of the documentation +pattern for attributes (perhaps attenuated) should be given.
  • +
  • (optional) refsection entitled "Remarks" with prose describing +relevant information – whatever the user will need to know. For +elements and attributes this includes recognition criteria, +distinctions from similar elements or attributes, usage.
  • +
  • refsection entitled "Examples" with prose and examples. In +some cases, this may just consist of references to examples +given in other reference entries.
  • +
  • (optional) refsection entitled "Processing expectations".
  • +
+ +

+For example: +

+
+
<refentry xml:id="element.assert-xml">
+   <refnamediv>
+      <refdescriptor>(element)</refdescriptor>
+      <refname>assert-xml</refname>
+      <refpurpose>Assert-xml:  asserts that the expected
+      output of a conforming ixml processor will be (or,
+      in cases of ambiguity, may be) the child element
+      of the /assert-xml/ element.</refpurpose>
+   </refnamediv>
+   <refsynopsisdiv>
+     <synopsis role="rng-raw">
+       <info>
+         <element xmlns="http://relaxng.org/ns/structure/1.0"
+                  xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
+                  name="assert-xml">
+            <ref name="external-atts"/>
+            <oneOrMore>
+               <ref name="any-element"/>
+            </oneOrMore>
+         </element>
+       </info>
+     </synopsis>
+     <synopsis role="rng-simplified">
+       <info>
+         <element xmlns="http://relaxng.org/ns/structure/1.0"
+                  xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
+                  name="assert-xml">
+           <group>
+             <zeroOrMore>
+               <attribute>
+                 <anyName>
+                   <except>
+                     <nsName ns=""/>
+                   </except>
+                 </anyName>
+                 <text/>
+               </attribute>
+             </zeroOrMore>
+             <oneOrMore>
+               <ref name="_1"/>
+             </oneOrMore>
+           </group>
+         </element>
+       </info>
+     </synopsis>
+   </refsynopsisdiv>
+   <refsection>
+      <title>Contents</title>
+      <para>Any well-formed XML</para>
+   </refsection>
+   <refsection>
+      <title>Remarks</title>
+      <para>If the test catalog has a default namespace
+      declaration, it will be necessary to undeclare it in order
+      to avoid namespace capture of the asserted result. (IXML
+      output has no identified namespace.)</para>
+      <para>When comparing output of an ixml processor to the
+      asserted result, namespace declarations are to be
+      ignored.</para>
+   </refsection>
+</refentry>  
+
+
+
+
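The refentry pattern specified in this section lends itself to a mechanical check. The following is a minimal sketch in Python, offered only as an illustration: the function names are hypothetical, only the non-optional parts of the pattern are checked (refnamediv, the rng-raw synopsis, and the "Examples" section), and elements are matched by local name so that the presence or absence of a DocBook namespace declaration does not matter.

```python
import xml.etree.ElementTree as ET

KINDS = {"(element)", "(attribute)", "(pattern)"}


def local(elem):
    """Element name without any namespace part."""
    return elem.tag.rsplit("}", 1)[-1]


def children(parent, name):
    """Child elements of `parent` whose local name is `name`."""
    return [child for child in parent if local(child) == name]


def text_of(elems):
    """Stripped text of the first element in `elems`, or '' if none."""
    return (elems[0].text or "").strip() if elems else ""


def check_refentry(entry):
    """Return a list of deviations from the refentry pattern (empty if none)."""
    problems = []
    namedivs = children(entry, "refnamediv")
    if not namedivs:
        problems.append("missing refnamediv")
    else:
        namediv = namedivs[0]
        if text_of(children(namediv, "refdescriptor")) not in KINDS:
            problems.append("refdescriptor must be one of "
                            + ", ".join(sorted(KINDS)))
        if not text_of(children(namediv, "refname")):
            problems.append("missing refname")
        # refpurpose is expected to read "Full name: short description".
        if ":" not in text_of(children(namediv, "refpurpose")):
            problems.append("refpurpose lacks 'Full name: description' form")
    roles = {syn.get("role")
             for div in children(entry, "refsynopsisdiv")
             for syn in children(div, "synopsis")}
    if "rng-raw" not in roles:
        problems.append('missing synopsis role="rng-raw"')
    titles = {text_of(children(sec, "title"))
              for sec in children(entry, "refsection")}
    if "Examples" not in titles:
        problems.append('missing refsection "Examples"')
    return problems
```

A check of this kind could run after the merge step, so that hand-edited entries which drift from the pattern are noticed before the HTML is generated.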
+ +
+

3. Auto-generation of TSD (auto-tsd.xsl)

+
+

+[To be drafted.] +

+
+
+ +
+

4. Merger of TSDs (tsd-merger.xsl)

+
+

+[To be drafted.] +

+
+
+ +
+

5. HTML translation and display (tsd-to-html.xsl and tsd.css)

+
+

+[To be drafted.] +

+
+
+ +
+

6. Related work

+
+

+There has been reference documentation for SGML and XML tag sets for +about as long as there have been SGML and XML tag sets intended for +serious use, but there has been very little standardization on the +form of such documentation. Among the examples which have influenced +this work are: +

+ +
    +
+Formex (1985). Formex stands for "formalized exchange of electronic +publications". The manual includes expository prose; its reference +documentation includes two lists of data elements, one for a format +called CCF (common communications format, an implementation of +ISO 2709) and one for an SGML document type definition. The +reference page for each element type includes: +

    +
      +
    • A symbolic representation of the element's generic identifier +(element type name) and attributes e.g. <AB LA = ...> for +the AB (abstract) element with its LA (language) attribute.
    • +
    • A definition (or terse prose description) of the element.
    • +
    • A data description of the element, describing its content and +format.
    • +
    • A usage note specifying whether the element is mandatory or +optional, repeatable or non-repeatable.
    • +
    • A grouping note specifying what 'groups' the element is part +of (this appears to be a list of possible parents or possibly +higher-level containers).
    • +
    • An example (not always present).
    • +
    +

    +For each attribute, the reference page gives: +

    +
      +
    • A definition (or terse prose description) of the attribute.
    • +
    • A data description defining the set of possible attribute +values.
    • +
    +

    +Not listed here but prominent on each page are cross references to +the corresponding CCF data elements. +

  • + +
  • Maler and El Andaloussi (1996). Maler and El Andaloussi recommend +the following as the "minimal information" for element reference +documentation: +
      +
    • Short name or actual generic identifier (e.g. olist).
    • +
    • Full name: descriptive phrase that explains the short name +(e.g. "An ordered list of related items.").
    • +
    • Synopsis: rules for using the element, perhaps including tree +diagrams showing possible parents and children.
    • +
    • Description: purpose, how and where it should be used, +recognition criteria, etc.
    • +
    • Attributes: reference description for each attribute.
    • +
    • Contents and Contexts (if not already present in the description +and if not clearly conveyed by the synopsis).
    • +
    • Examples.
    • +
    • Processing notes, including notes on how to work around +shortcomings in current tools.
    • +
  • + +
  • +JATS documentation (current). This is a reasonably typical example +of the tag-set documentation supplied by at least some commercially +active SGML and XML consultants. Reference material for an element +includes: +

    +
      +
    • Generic identifier / element type name
    • +
    • Full name
    • +
    • Annotation (to specify which DTDs in a set contain the element)
    • +
    • Definition
    • +
    • Remarks
    • +
    • Related elements
    • +
    • Content model
    • +
    • Content description (prose)
    • +
    • Presentation information (expected styling)
    • +
    • Examples (with prose commentary)
    • +
    • Related resources (pointers to other relevant information)
    • +
    • Source (if adapted from some other tag set)
    • +
    • Module (in a multi-module vocabulary)
    • +
    • Revision history
    • +
    +

    +There are similar structures for attributes and parameter entities. +

  • + +
  • +TEI P3 (1994). The auxiliary document type for 'tag set +documentation' allows for each element type: +

    +
      +
    • generic identifier
    • +
    • full name
    • +
    • short description (typically a one-liner)
    • +
    • list of attributes (with reference information for each)
    • +
    • examples (with commentary and explanation)
    • +
    • remarks
    • +
    • information on the part of TEI where the element is defined, +the classes it belongs to, and the file(s) it is defined in
    • +
    • a data description (in prose)
    • +
    • a list of parents
    • +
    • a list of children
    • +
    • the text of the element's declaration
    • +
    • the text of the element's attribute-list declaration
    • +
    • hyperlinks to relevant documentation
    • +
    • a list of equivalent elements (in other vocabularies)
    • +
    +

    +TEI P5 (current) has modified the tag-set documentation of P3 and +made many of the elements less specific. +

  • +
+ +

+The reference material of Docbook has also had an obvious influence. +

+
+
+ +
+

7. References

+
+
    +
  • Formex: formalized exchange of electronic publications, +ed. C. Guittet. Luxembourg: Office for Official Publications of the +European Communities, 'New Technologies – Project Management' +Department, 1985. 243 pp.
  • + +
  • Maler, Eve, and Jeanne El Andaloussi. Developing SGML DTDs: from +text to model to markup. Upper Saddle River, NJ: Prentice Hall PTR, 1996.
  • +
+
+
+
+
+

Date: 28 May 2024

+

Author: CMSMcQ

+

Created: 2024-05-28 Tue 18:10

+


+
+ + diff --git a/tools/tsd/tsd-planning.org b/tools/tsd/tsd-planning.org new file mode 100644 index 00000000..5c5391ed --- /dev/null +++ b/tools/tsd/tsd-planning.org @@ -0,0 +1,332 @@ +#+title: Tag set documentation project +#+author: CMSMcQ +#+date: 28 May 2024 +#+ORG-IMAGE-ACTUAL-WIDTH: nil + +This document outlines a plan for a workflow to create and maintain +documentation for the XML vocabulary used for the XML form of ixml +grammars (here called VXML), and the XML vocabulary used for test +catalogs by the ixml Community Group. + +In its current form this document is not complete and /is binding on +no one/. It is written to serve as a basis for discussion, and to +record some thoughts and expectations. + +* Project overview +** Primary deliverables +The central deliverables are reference tag-set documentation (TSD) for +the XML vocabularies in question. + +The tag-set documentation we wish to create consists of some +expository prose and reference pages for the element types and +widely used attributes. + +The crucial delivery format is XHTML; other XML vocabularies may be +used for maintenance, but are not expected to be of interest to others. + +** Requirements +Known requirements and desiderata: +- It must be possible to update the documentation more or less + conveniently as the schemas change. +- When the schema changes, human-supplied prose must be carried + forward easily. +- Information derivable from the schema should be provided + automatically. Specifically: declarations, lists of parents, + lists of children, lists of attributes. +- When schema-derived information changes, it is desirable that + the user be warned, so that any relevant prose can also be + updated. + +** Workflow +The intended workflow is described in this diagram: +#+CAPTION: Workflow plan +#+ATTR_HTML: width: 25% +[[./images/tsd-workflow.dot.png]] + +That is: +- The RNC/RNG schemas are maintained independently. 
+ + The test catalog schema is maintained by hand in RNC; the ixml + schema is generated automatically in RNG from the ixml grammar, + which is maintained by hand. We use trang to make an RNG form of + the test catalog schema, and an RNC form of the ixml schema. + + Not shown: we use /jing -s/ to create a 'simplified' version of the + RNG schema. In some cases, this may require some hand work. (Jing + aborts with an error message if asked to simplify some schemas with + recursive patterns. The simplified schema also uses some rather + opaque names for patterns introduced by Jing.) + +- An XSLT stylesheet (/rng-to-TSD.xsl/) auto-generates tag-set + documentation for the schema. + + If names and short descriptions are provided in the RNG annotation + namespace (/a:documentation/ elements), they should be carried over. + Otherwise, dummies should be provided. + + This stylesheet draws information about the vocabulary from both the + RNG schema and the simplified RNG schema. + +- An XSLT stylesheet (/tsd-merger.xsl/) reads the auto-generated + documentation and the previous hand-edited version of the same + documentation, and produces merged output. + + For schema-derived information, the auto-generated documentation + is preferred; for other information (basically: the prose), + the hand-edited documentation is preferred. + + If any schema-derived information differs between the two + sources, the stylesheet should report the fact to the user. + +- An XSLT stylesheet (/tsd-to-html.xsl/) reads the merged tag-set + documentation and generates HTML with an appropriate stylesheet + (tsd.css). + +** Secondary deliverables + +We have some secondary deliverables, whose purpose is to occupy +the corresponding positions in the workflow. + +- Specification of the tag-set documentation vocabulary and + conventions to be used. Obvious candidates are Docbook, TEI P3, TEI + P5, and an ad hoc custom vocabulary. 
+ +- /auto-tsd.xsl/ + +- /tsd-merger.xsl/ + +- /tsd-to-html.xsl/ + +- /tsd.css/ + +* Vocabulary for tag-set documentation +We will use Docbook for the XML form of the tag-set documentation. + +Because Docbook's reference entries are rather generic, it may be +helpful to specify the pattern to be followed there in more detail. +Each /refentry/ element should contain: + +- /refnamediv/ + + /refdescriptor/ containing either "(element)" or "(attribute)" or + "(pattern)" + + /refname/ with the element type name, attribute name, or pattern + name + + /refpurpose/ containing (a) an unabbreviated form of the element + type name, (b) a colon, and (c) a short description (typically one line) + of the meaning or use of the construct +- /refsynopsisdiv/ + + /synopsis role="rng-raw"/ + - /info/ containing the 'raw' Relax NG declaration for the + construct: + + For elements, the /rng:element/ element. + + For attributes, the /rng:attribute/ element. + + For patterns, the /rng:define/ element. + For comparison: this is similar to inclusion of an element + declaration from a DTD with parameter entity references + unexpanded. + + /synopsis role="rng-simplified"/ (for elements and attributes only) + containing the corresponding declaration from the 'simplified' + schema for the construct. + For comparison: this is similar to inclusion of an element + declaration from a DTD with all parameter entity references + expanded. + + (optional, for elements) /synopsis role="structured"/ containing a + structured description in English of the content model of the + element. Not required, because in the usual case an English + summary can be generated from the simplified RNG without trouble. + Not forbidden, because it may be better to do this upstream rather + than in the creation of the HTML delivery form. +- (optional, for elements) /refsection/ entitled "Contents" containing + a prose description in English of the allowed contents of the + element. 
Not required, because not always useful. +- (for attributes) /refsection/ entitled "Data description" + with informal prose description of the attribute's datatype. +- (for elements) /refsection/ entitled "Attributes" listing all + attributes defined for the element. If the attribute is used on + more than one element, then we want just the attribute name with a + hyperlink to the reference entry for the attribute; if the attribute + is used only on this element, or should be given custom + documentation for this parent, a version of the documentation + pattern for attributes (perhaps attenuated) should be given. +- (optional) /refsection/ entitled "Remarks" with prose describing + relevant information -- whatever the user will need to know. For + elements and attributes this includes recognition criteria, + distinctions from similar elements or attributes, usage. +- /refsection/ entitled "Examples" with prose and examples. In + some cases, this may just consist of references to examples + given in other reference entries. +- (optional) /refsection/ entitled "Processing expectations". + +For example: +#+begin_src Docbook-xml + + + (element) + assert-xml + Assert-xml: asserts that the expected + output of a conforming ixml processor will be (or, + in cases of ambiguity, may be) the child element + of the /assert-xml/ element. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Contents + Any well-formed XML + + + Remarks + If the test catalog has a default namespace + declaration, it will be necessary to undeclare it in order + to avoid namespace capture of the asserted result. (IXML + output has no identified namespace.) + When comparing output of an ixml processor to the + asserted result, namespace declarations are to be + ignored. + + +#+end_src + +* Auto-generation of TSD (/auto-tsd.xsl/) +[To be drafted.] + +* Merger of TSDs (/tsd-merger.xsl/) +[To be drafted.] 
+ +* HTML translation and display (/tsd-to-html.xsl/ and /tsd.css/) +[To be drafted.] + +* Related work +There has been reference documentation for SGML and XML tag sets for +about as long as there have been SGML and XML tag sets intended for +serious use, but there has been very little standardization of the +form of such documentation. Among the examples which have influenced +this work are: + +- Formex (1985). Formex stands for "formalized exchange of electronic + publications". The manual includes expository prose; its reference + documentation includes two lists of data elements, one for a format + called CCF (common communications format, an implementation of + ISO 2709) and one for an SGML document type definition. The + reference page for each element type includes: + + A symbolic representation of the element's generic identifier + (element type name) and attributes, e.g. ~~ for + the AB (abstract) element with its LA (language) attribute. + + A definition (or terse prose description) of the element. + + A data description of the element, describing its content and + format. + + A usage note specifying whether the element is mandatory or + optional, repeatable or non-repeatable. + + A grouping note specifying what 'groups' the element is part + of (this appears to be a list of possible parents or possibly + higher-level containers). + + An example (not always present). + For each attribute, the reference page gives: + + A definition (or terse prose description) of the attribute. + + A data description defining the set of possible attribute + values. + Not listed here but prominent on each page are cross references to + the corresponding CCF data elements. + +- Maler and El Andaloussi (1996). Maler and El Andaloussi recommend + the following as the "minimal information" for element reference + documentation: + + Short name or actual generic identifier (e.g. ~olist~). + + Full name: descriptive phrase that explains the short name + (e.g. 
"An ordered list of related items."). + + Synopsis: rules for using the element, perhaps including tree + diagrams showing possible parents and children. + + Description: purpose, how and where it should be used, + recognition criteria, etc. + + Attributes: reference description for each attribute. + + Contents and Contexts (if not already present in the description + and if not clearly conveyed by the synopsis). + + Examples. + + Processing notes, including notes on how to work around + shortcomings in current tools. + +- JATS documentation (current). This is a reasonably typical example + of the tag-set documentation supplied by at least some commercially + active SGML and XML consultants. Reference material for an element + includes: + + Generic identifier / element type name + + Full name + + Annotation (to specify which DTDs in a set contain the element) + + Definition + + Remarks + + Related elements + + Content model + + Content description (prose) + + Presentation information (expected styling) + + Examples (with prose commentary) + + Related resources (pointers to other relevant information) + + Source (if adapted from some other tag set) + + Module (in a multi-module vocabulary) + + Revision history + There are similar structures for attributes and parameter entities. + +- TEI P3 (1994). 
The auxiliary document type for 'tag set + documentation' allows for each element type: + + generic identifier + + full name + + short description (typically a one-liner) + + list of attributes (with reference information for each) + + examples (with commentary and explanation) + + remarks + + information on the part of TEI where the element is defined, + the classes it belongs to, and the file(s) it is defined in + + a data description (in prose) + + a list of parents + + a list of children + + the text of the element's declaration + + the text of the element's attribute-list declaration + + hyperlinks to relevant documentation + + a list of equivalent elements (in other vocabularies) + TEI P5 (current) has modified the tag-set documentation of P3 and + made many of the elements less specific. + +The reference material of Docbook has also had an obvious influence. + +* References + +- /Formex: formalized exchange of electronic publications/, + ed. C. Guittet. Luxembourg: Office for Official Publications of the + European Communities, 'New Technologies -- Project Management' + Department, 1985. 243 pp. + +- Maler, Eve, and Jeanne El Andaloussi. /Developing SGML DTDs: from + text to model to markup./ Upper Saddle River, NJ: Prentice Hall PTR, + 1996.