Fixes to allow for regeneration of pysnmp base MIBs #8

dcvmoole · 2024-11-01T11:32:36Z

pysnmp comes with a set of built-in "base" MIBs, in pysnmp/smi/mibs/. By far most of those base MIBs have been compiled by pysmi, with a subset of the resulting Python modules modified by hand afterward. However, it has been quite a while since pysmi was last used to generate those base MIBs, and the current versions have a few problems (e.g., some missing access levels) that regeneration would resolve. Of course, such issues can also be resolved by hand, but I would say that for the overall maintenance of pysnmp, it will be a good thing to re-learn how such base MIB regeneration can be done at all: that way, when future updates are made to either the source MIBs themselves or to the structure of the generated Python code, such updates no longer all have to be applied to pysnmp by hand.

I believe I have indeed succeeded in establishing the process of regenerating the base MIBs. That also proved to be a good quality check for pysmi. An analysis of the differences in both generated code and resulting MIB tree details exposed by pysnmp, revealed a few bugs and deficiencies in pysmi. As before, most of these can be traced back to pysmi's switch to Jinja2 templates; the pysnmp base MIBs had not been regenerated since then.

The primary purpose of this PR is to resolve those differences, thereby improving pysmi's level of quality to the point that it can indeed be used to regenerate the pysnmp base MIBs again, without losing anything (at least, to the extent that I can tell!) in the process. Specifically: the first four of the commits in this PR resolve functional differences, and the last one makes the generated contents look a bit more similar and sensible. The fifth commit ('Improve on the use of symbol "Pythonization"') is a related improvement of pysmi's own code (and test set), although it has no effect on the generated base MIBs.

Once again, these changes have been tested with the script from PR #3. The statistics respectively before and after (against lextudio/mibs.pysnmp.com@aa51f794) are as follows:

MIB results: 11924 total, 10034 successful, 1014 failed to compile, 709 failed to load, 100 failed on dependency, 67 failed for other reasons
MIB results: 11924 total, 10034 successful, 1012 failed to compile, 709 failed to load, 102 failed on dependency, 67 failed for other reasons

As can be seen, the quantitative impact of this branch is minimal. As a result of the DEFVAL commit, two MIBs that previously triggered an exception during compilation, now fail in the next phase (dependencies) instead.

In terms of quality however, there are more differences: most notably, 2065 of the compiled MIBs now have extra default values that were previously missing, and 2847 of the compiled MIBs now have their ModuleCompliance classes contain the intended lists of objects. Note that not all of those affected MIBs can be loaded (it is a bit too much work to filter those out from these statistics).

As part of this PR, the pysmi test set is extended from 266 to 393 tests.

If and when this PR is accepted, my plan is to submit a subsequent PR for pysnmp that indeed updates the base MIBs, including instructions on how to do that again later.

P.S. This PR deliberately does not change pysmi to apply the recent pysnmp PEP-8 update from importSymbols to import_symbols and exportSymbols to export_symbols. Doing so would break all pysmi-generated code for use by pysnmp versions before that PEP-8 update, and I am not sufficiently familiar with the versioning strategy to tell whether that would be a problem. That means that for now, the pysmi-generated MIBs (base or not) will continue to use importSymbols/exportSymbols, and changing the base MIBs to use import_symbols/export_symbols will remain a manual step afterward for now (but with e.g. sed, a rather easy one!).

The pysnmp 'loadTexts' setting controls whether text-based metadata should be assigned to objects at load time of compiled MIBs. Currently, in a few cases, text-based metadata is being assigned to objects even when 'loadTexts' is disabled. Before the switch to Jinja2, those cases were however properly covered by a 'loadTexts' check, so that appears to have been a regression. This commit changes the assignments such that they are once again behind a 'loadTexts' check. As part of this commit, the test set is expanded to test that the resulting 'loadTexts' behavior is as intended, both for more cases in which 'loadTexts' is enabled (completing the extensions introduced in git-1e4172e), and many new cases in which 'loadTexts' is disabled. Please note that pysmi's own 'genTexts' setting adds an extra dimension with respect to (not) loading texts. Neither the pysmi change nor the test extensions in this commit address 'genTexts' in any way; that is left to future work. While working on the test set, also rename the "traptype" test from "smiv2" to "smiv1", as it uses exclusively SMIv1 constructions.

If an object type uses an enumerated-INTEGER syntax, and also specifies a DEFVAL (with a value from the enumeration), then that default value is also properly assigned to the object type's class in the generated Python code. That is the expected behavior. However, if an object type uses an "indirect" syntax instead, using a type (e.g., a TEXTUAL-CONVENTION) that in turn is based on an enumerated INTEGER, then the same case results in the default value being omitted from the generated Python code. That is a bug. The bug is a result of the Jinja2 template being given the enumeration *name* of the default value, and having to do the name-to-value lookup itself. That lookup assumed that the object type was also the entity that defined the enumeration, which (as per the second case above) is not always true. This commit changes the information handed to the Jinja2 template to the enumeration *value* of the default value (i.e., a number, as string), so that the template no longer needs to do a lookup at all. The intermediate code is in a better position to do the proper name-to-value lookup, without making assumptions on where the enumeration is defined. The overall result is that many more object types from many MIBs get their intended default values assigned to them. The test set is extended to cover the most typical occurrences of these aspects, including the case that the type that defines the enumeration comes from another MIB altogether (which itself already worked as intended, but would be easy to break in the future). Also included is a test case for a related minor bug that was automatically resolved as part of this change, even though it is based on an illegal construction (namely, of range-subtyping an enumerated INTEGER). Please note that this commit changes the pysmi JSON output format.

When an OBJECT-TYPE declaration subtypes a syntax, uses a BITS syntax, or uses a DEFVAL clause, the MIB definitions template generates an intermediate syntax type class named _Xyz_Type for object Xyz. For the first two of those three cases, the template then overrides the class's __name__ field to reflect the name of the original syntax. However, for cases where only the third (DEFVAL) case applies, it does not. The result is that users that obtain getSyntax().__class__.__name__ will often, but not always, find a proper syntax type name; in the DEFVAL case, they will get the internal "_Xyz_Type" name instead. This change overrides the intermediate syntax type class's __name__ field in all of the three cases, so that __name__ always yields a name that is indeed a name of a valid syntax type class higher up in the class hierarchy of the corresponding object. A corresponding test set is added to cover various local (intra-MIB) and import (inter-MIB) cases.

ModuleCompliance instances were erroneously not assigned any objects, meaning that getObjects() would always return an empty list. The reason for that problem is the use of a wrong field name in the Jinja2 template for pysnmp code generation. This commit fixes that bug, and adds unit tests for the classes that support objects but did not yet have unit tests for those.

Given that pysmi uses many ASN.1 symbol names directly as Python symbol names in generated code, it must convert some of those symbols to make for usable Python names. For example, hyphens are converted to underscores, and reserved Python keywords (such as "if" and "global") are prefixed with "_pysmi_" to avoid collisions. Such "Pythonization" of ASN.1 symbol names is done internally by trans_opers(). For by far most symbols, trans_opers() need not apply changes. That has resulted in the situation that pysmi has many bugs where trans_opers() is erroneously (not) invoked. One result of that is that users of pysnmp are exposed to the Pythonized names rather than the original ASN.1 names in various cases, making those unusable for subsequent pysnmp API calls. This commit rectifies the use of trans_opers() for the following cases: o Object identifier type DEFVALs. Use of OID names that required "Pythonization" would result in failed compilation. o Object lists of NotificationType, ObjectGroup, NotificationGroup, TrapType, and ModuleCompliance classes. Those would previously contain Pythonized names, as returned by their getObjects() APIs. o Table columns. Columns with names that required "Pythonization" would not be exposed as MibTableColumn subclasses. o External table indices. Indices that required "Pythonization" would be listed as coming from the local MIB rather than the MIB from which they are imported, as returned by getIndexNames(). o Table index augmentation registrations. Those would previously contain Pythonized names. Note that there is (still) no pysnmp API to obtain such registrations. Additional clean-up removes a case where trans_opers() was called twice on the same name, and removes the unused tracking of columns from the intermediate code generation stage. A few more comments are added to clarify which internal names are Pythonized and which are not. The test set is extended to cover all of the above cases. In the future, the "Pythonization" should be subjected to a more thorough overhaul, as the current approach is hard to follow and maintain. Hopefully, the new tests will make such an overhaul a bit easier to get right.

This commit combines two changes that have only a non-functional impact on generated Python code: o The enumeration lists (namedValues) are sorted by numeric value rather than by name. The new sorting order is more in line with what is found in most MIBs, usually easier to read, and more consistent with the sorting order before the introduction of Jinja2 templates. o The static list of imported constraint classes is rearranged to be alphabetic. The previous order made little sense. Seeing as neither aspect should be relied upon, no tests are added here.

lextm · 2024-11-01T18:35:47Z

This PR deliberately does not change pysmi to apply the recent pysnmp PEP-8 update from importSymbols to import_symbols and exportSymbols to export_symbols.

We can leave that to release 2.0 of PySMI.

As can be seen, the quantitative impact of this branch is minimal.

Thanks for taking good care of those files. Many are broken due to good reasons, so your changes are not the cause.

the current versions have a few problems (e.g., some missing access levels) that regeneration would resolve

We believe there are more issues to be discovered, so we can increase test coverage along the way. An interesting fact is that even though many pieces are not in the perfect state, the basic functionality remains working (such as OID translation).

MIB/agent support usually only matters to a limited set of SNMP users, so we still have enough time to clean up the things.

lextm · 2024-11-01T18:50:08Z

BTW,

it will be a good thing to re-learn how such base MIB regeneration can be done

We have been evaluating the possibility to move some key types to be static in PySNMP, instead of dynamically loading. That might one day allow us to add full annotations to them to address a few challenges during development/testing.

As before, most of these can be traced back to pysmi's switch to Jinja2 templates; the pysnmp base MIBs had not been regenerated since then.

It was an unfortunate chapter in our history when the transition to Jinja2 was mainly an experimental initiative led by Ilya, resulting in lots of unfinished tasks. Despite the instability of this branch, we proceeded to release version 1.3+ to synchronize with PySNMP, a decision that came with both positive and negative outcomes.

Recent commits from you were tremendously helpful for the community, so we appreciate your time and effort.

dcvmoole · 2024-11-02T13:33:45Z

Thank you for the merge and comments!

We believe there are more issues to be discovered, so we can increase test coverage along the way. An interesting fact is that even though many pieces are not in the perfect state, the basic functionality remains working (such as OID translation).

Hehe well, by total coincidence, lextudio/pysnmp#141 points out an omission (once again introduced by the Jinja2 conversion) that I missed in that very department and that actually needs to be fixed as well. I will create a new PR for the pysmi side of that issue ASAP.

We have been evaluating the possibility to move some key types to be static in PySNMP, instead of dynamically loading. That might one day allow us to add full annotations to them to address a few challenges during development/testing.

That sounds very reasonable, although I'd guess that affects mainly the stuff from SNMPv2-SMI, or? That is the one MIB from the base set that was never generated in the first place and will not be regenerated either. Either way, I will go ahead with the regeneration efforts (this does require some more work on my part, mostly with respect to unit tests for pysnmp), but of course please do feel free to point out anything that might create conflicts for you, so that we can work something out there.

It was an unfortunate chapter in our history when the transition to Jinja2 was mainly an experimental initiative led by Ilya, resulting in lots of unfinished tasks. Despite the instability of this branch, we proceeded to release version 1.3+ to synchronize with PySNMP, a decision that came with both positive and negative outcomes.

I for one think you made the right choice. This way, Ilya's initial work has not been wasted, and in the long run, the overall end result will be better.

Recent commits from you were tremendously helpful for the community, so we appreciate your time and effort.

My pleasure, glad to help!

lextm · 2024-11-03T02:37:50Z

Just shipped pysnmp 7.1.9 that includes the updated built-in MIB modules based on pysmi 1.5.8 compiled results. All pysnmp existing test cases remain passed, so we know at least things don't go worse. It also gives us a lesson in how to incorporate such changes to the pysnmp codebase, so we should be able to sync more closely with future pysmi releases.

dcvmoole · 2024-11-03T10:59:15Z

Oh, excellent, thank you!

dcvmoole added 6 commits October 30, 2024 17:19

lextm merged commit c489422 into lextudio:main Nov 1, 2024
17 checks passed

lextm added the enhancement New feature or request label Nov 1, 2024

dcvmoole mentioned this pull request Nov 2, 2024

Fixed Length Strings As Index to Table Includes Length in OID lextudio/pysnmp#141

Closed

dcvmoole mentioned this pull request Nov 18, 2024

DEFVAL improvements #11

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes to allow for regeneration of pysnmp base MIBs #8

Fixes to allow for regeneration of pysnmp base MIBs #8

dcvmoole commented Nov 1, 2024

lextm commented Nov 1, 2024

lextm commented Nov 1, 2024

dcvmoole commented Nov 2, 2024

lextm commented Nov 3, 2024

dcvmoole commented Nov 3, 2024

Fixes to allow for regeneration of pysnmp base MIBs #8

Fixes to allow for regeneration of pysnmp base MIBs #8

Conversation

dcvmoole commented Nov 1, 2024

lextm commented Nov 1, 2024

lextm commented Nov 1, 2024

dcvmoole commented Nov 2, 2024

lextm commented Nov 3, 2024

dcvmoole commented Nov 3, 2024