Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): Store information about available checkers to the database #4089

Merged
merged 2 commits into from
Mar 1, 2024

Conversation

whisperity
Copy link
Contributor

@whisperity whisperity commented Nov 17, 2023

This patch contains the back-end developments and solutions to collecting the information needed for #4049.

Summary

In #4049, it was deemed important that losing the information about which checkers executed during analysis is confusing to developers who have access only to the web-based interface without anything from the analyze command (such as CI logs). The confusion stems from checkers that were enabled but did not produce any reports not showing up anywhere on the interface. This is exasperated by the fact that the "analysis command" that is currently shown on the Web UI often contains references to random temporary files created by CI systems without seeing their contents.

The goal of this initiative is to deterministically show the knowledge gathered from the metadata.jsons (if applicable) about which analysers' which checkers executed. Having this list at hand is useful from an auditing point of view, as well, to ensure that it can be seen whether the executed analysers contained a particular checker or not. Thus, we move from the following distinction:

  • "checker that produced a report"
  • "checker that did not produce a report"

over to a more descriptive one:

  • "checker that was disabled"
  • "checker that was enabled" (and produced a report)
  • "checker that was enabled" (but did not produce a report)

Description

This patch implements the back-end changes required to support this new feature. No front-end changes available, yet.

  1. First, a new table, checkers is created, which deterministically stores a unique ID (per database) for each (analyser_name, checker_name) pair. The uniqueness of IDs to, henceforth, checkers is enforced on the database level with a compound UNIQUE KEY.
  2. checkers is filled with the names of all checkers identified during a run's store in a transactionally separated way, preceding the actual Report storage process. This ensures that during a (potentially concurrent) storage process, constraint violations do not tear down the progress of an entire run's contents. (A generously timed retry logic is implemented to ease this process, but it is not guaranteed to be infallible.)
  3. analysis_info, which previously stored only the analyze command-line is extended to JOIN onto the checkers table and stores a single BOOL to say whether the joined checker was enabled in the joined analysis_info.
  4. As there is a unique analysis_info for all run_histories, this list will trivially work in the same way as retrieving the analyze command-line for a past snapshot of a Run currently work.
    • During incremental schema migration, this information is NOT retroactively conjured. This is because there is currently no way of telling which checkers were "enabled" in an analysis, as potentially deleted reports of existing Runs might skew these results. It's safer (or at least less misleading) just to say "No information." for runs that were stored prior to the upgrade.
  5. The AnalysisInfo type on the Thrift API is extended to contain the information (map<analyser_name: string, map<checker_name: string, enabled: bool>>) now stored in the database as of point 3..
  6. Several checker-related aspects that were previously stored for every Report instance individually, most importantly the Severity, are moved into checkers instead. This simplifies the storage and server-start clean-up logic somewhat while complicating matters elsewhere during querying. The following columns are removed (essentially, uniqued out) from Reports: checker_id (string, "name"), analyzer_name, severity.
    • During incremental schema migration, the data from reports is pre-collected and dumped to checkers, with the new FOREIGN KEYs appropriately established as well. This means that for a product with a diverse set of checkers that all produced warnings, the vast majority of the "potential checker ID universe" will practically be allocated during the schema upgrade.
    • During schema downgrades, there is a clean way of doing this in the opposite direction, and the contents of these Report columns are restored appropriately.
  7. The checker_cat(-egory, string) and bug_type (string) columns of Reports is dropped irrevocably, and not restored during a downgrade. It was never clear whether these values were uniquely the same for a particular checker, so moving these to the checkers table might not be a feasible solution. However, as these columns contained generic data that was never exposed over the API, the storage of them is deemed unnecessary. The semantics for whatever was available in these columns were never actually defined and were very analyser-specific or specific to how the report-converter dealt with the raw output of an analyser.

Miscellaneous

This patch also contains the following quality improvements that are not directly related to the feature implementation but were designed and developed in parallel to this feature's development.

  • This patch changed some of the code related to how a product database is connected during schema migration. There were two connections here, one for the product database (under migration), and around that, to the config database (to get the connection string for the product). This is to ensure that a long migration (such as the one present due to this patch) will not cause nasty time-outs on the config connection. If that happens, the resulting exception (of no meaningful value, because the config database is only read, and only once per product under migration) previously escaped and killed the entire server, which is not nice for potential overnight migrations. The new version guards that such exceptions will not escape, the server stays alive, and even if the migration of one product fails, the subsequent product migrations are still attempted by the same server execution, so there is no need for any restarts or auto-restart services.
  • Change from the deprecated Binary type to LargeBinary consistently over the database schema. This means that this patch supersedes [fix] Replace Binary with LargeBinary #3736. To answer @vodorok's question added to #3736, SQLAlchemy's source code shows the established inheritance relationship: _Binary ${&lt;:}$ LargeBinary ${&lt;:}$ Binary, and Binary.__init__() only delegates LargeBinary.__init__(). Thus, it is clear that the two types behave the exact same across all database dialects, and the changes of both [fix] Replace Binary with LargeBinary #3736 and this patch are safe without additional schema migration commands.
  • Added the possibility of logging output during schema migrations. This was necessitated by the complexity of the steps of the migration script c3dad71f8e6b_... of this patch. We will be updating multiple millions of rows in chunks, in multiple passes (one read, one collate, one write), so the accountability of where the migration is going benefits a lot from these improvements.
  • The in-memory data structures that represent parsed metadata.jsons previously stored the checkers "list" in a hard-to-understand and unwieldy-to-use data structure: checkers: Dict[checker_name: str, _: Union[enabled_checkers: List[name: str] | _: Dict[name: str, enabled: bool]]]. This made accessing actual data from this troublesome, as every client code (including new things I was writing for this patch) had to branch on whether the actual available data was "old-style" (only the list of enabled checkers) or "new-style" (the list of all found checkers, marked with its enabledness bit). What's more, the tests were conflating what it means to be a v1 metadata.json. The original implementation, up to 7254d05 in September 2017, contained the former List[str] grouped by the analyser. In March 2020, bd775d6 introduced v2 files, but at this point, v1 was already doing Dict[str, bool]s. In between the two points, in December 2019, 0cd28ac changed the format without the concept of "format versions", turning the list into Dict[str, bool]. Nevertheless, several test files that do not indicate the version still test whether the "real" v1 is understood by the system. I fixed all these issues by transitioning to the current v2 format in memory as soon as possible, creating an appropriate checkers: Dict[str, Dict[str, bool]] data structure irrespective of the format of the metadata.json in the input.

`Binary` is deprecated in SQLAlchemy and was removed in 1.4
@whisperity whisperity added enhancement 🌟 database 🗄️ Issues related to the database schema. server 🖥️ labels Nov 17, 2023
@whisperity whisperity added this to the release 6.23.0-rc2 milestone Nov 17, 2023
@bruntib bruntib modified the milestones: release 6.23.0-rc2, release 6.23.0 Nov 17, 2023
@whisperity whisperity added the WIP 💣 Work In Progress label Nov 22, 2023
@whisperity whisperity force-pushed the feat/show-zero-checkers/patch branch 2 times, most recently from 3395d97 to 835ea00 Compare December 7, 2023 15:57
@whisperity whisperity force-pushed the feat/show-zero-checkers/patch branch 3 times, most recently from 3a97692 to 76b9910 Compare December 10, 2023 13:57
@whisperity whisperity changed the title feat(server): Store information about disabled checkers to the database feat(server): Store information about available checkers to the database Dec 10, 2023
@whisperity whisperity marked this pull request as ready for review December 10, 2023 14:24
@whisperity whisperity requested a review from dkrupp December 10, 2023 14:25
@whisperity whisperity removed the WIP 💣 Work In Progress label Dec 11, 2023
Copy link
Contributor

@vodorok vodorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During database migration, the server gets trapped in a cyclic restart with the following error:

Traceback (most recent call last):
  File "/codechecker/lib/python3/codechecker_common/cli.py", line 209, in main
    sys.exit(args.func(args))
  File "/codechecker/lib/python3/codechecker_server/cmd/server.py", line 425, in __handle
    main(args)
  File "/codechecker/lib/python3/codechecker_server/cmd/server.py", line 1009, in main
    server_init_start(args)
  File "/codechecker/lib/python3/codechecker_server/cmd/server.py", line 936, in server_init_start
    __db_migration(cfg_sql_server, context.run_migration_root,
  File "/codechecker/lib/python3/codechecker_server/cmd/server.py", line 609, in __db_migration
    ret = db.upgrade()
  File "/codechecker/lib/python3/codechecker_server/database/database.py", line 326, in upgrade
    command.upgrade(cfg, "head")
  File "/usr/local/lib/python3.9/site-packages/alembic/command.py", line 294, in upgrade
    script.run_env()
  File "/usr/local/lib/python3.9/site-packages/alembic/script/base.py", line 490, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/usr/local/lib/python3.9/site-packages/alembic/util/pyfiles.py", line 97, in load_python_file
    module = load_module_py(module_id, path)
  File "/usr/local/lib/python3.9/site-packages/alembic/util/compat.py", line 182, in load_module_py
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/codechecker/lib/python3/codechecker_server/migrations/report/env.py", line 120, in <module>
    run_migrations_online()
  File "/codechecker/lib/python3/codechecker_server/migrations/report/env.py", line 114, in run_migrations_online
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/usr/local/lib/python3.9/site-packages/alembic/runtime/environment.py", line 813, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/usr/local/lib/python3.9/site-packages/alembic/runtime/migration.py", line 560, in run_migrations
    step.migration_fn(**kw)
  File "/codechecker/lib/python3/codechecker_server/migrations/report/versions/c3dad71f8e6b_store_information_about_enabled_and_disabled_checkers_for_a_run.py", line 235, in upgrade
    upgrade_analysis_info()
  File "/codechecker/lib/python3/codechecker_server/migrations/report/versions/c3dad71f8e6b_store_information_about_enabled_and_disabled_checkers_for_a_run.py", line 62, in upgrade_analysis_info
    _, new_analyzer_command = recompress_zlib_as_tagged_exact_ratio(
  File "/codechecker/lib/python3/codechecker_server/migrations/common.py", line 89, in recompress_zlib_as_tagged_exact_ratio
    data = raw_zlib_decode_buf(value)
  File "/codechecker/lib/python3/codechecker_server/migrations/common.py", line 26, in raw_zlib_decode_buf
    return zlib.decompress(value)
TypeError: a bytes-like object is required, not 'NoneType'

@whisperity
Copy link
Contributor Author

During database migration, the server gets trapped in a cyclic restart with the following error:

@vodorok I have identified the cause of the problem for that behaviour and fixed it. Turns out I made the bogus assumption that analysis_info.analyzer_command would be NON NULL in every case. The fix was trivial (if ... is None: continue 😏), I will update the patch soon.

Unfortunately, I encountered some pretty weird behaviour that I need to track down before this patch is touched...

@whisperity whisperity force-pushed the feat/show-zero-checkers/patch branch from 76b9910 to 0fe1134 Compare January 17, 2024 13:04
@whisperity whisperity added the API change 📄 Content of patch changes API! label Jan 17, 2024
@whisperity whisperity force-pushed the feat/show-zero-checkers/patch branch 3 times, most recently from 79cda3f to 682f4ae Compare January 22, 2024 14:41
@whisperity whisperity requested a review from vodorok February 21, 2024 16:58
@whisperity whisperity force-pushed the feat/show-zero-checkers/patch branch from 979967b to afe550f Compare February 23, 2024 13:58
Copy link
Contributor

@vodorok vodorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do a measurement of the store time of Xerxes-C analyzed with --enable-all. If the store time hit is not severe (under 5%), the patch can be merged. My other comments are questions, or suggestions.

web/server/codechecker_server/api/mass_store_run.py Outdated Show resolved Hide resolved
web/server/codechecker_server/api/mass_store_run.py Outdated Show resolved Hide resolved
web/server/codechecker_server/api/mass_store_run.py Outdated Show resolved Hide resolved
s_ver = None
prod_status[pd.endpoint] = (status, s_ver, package_schema,
db_location)
except Exception:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a concrete exception occuring here? Shouldn't we expect for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main exception here was from the absolutely unnecessary sess.commit() timing out. After removing that line from here and moving it before the products are iterated, I was not able to produce an exception here. However, the entire server, in principle, should NOT die just because one product had an issue, so I think a catch-all case is appropriate here. Keeping the server alive should be paramount.

@whisperity
Copy link
Contributor Author

Please do a measurement of the store time of Xerxes-C analyzed with --enable-all.

@vodorok Thanks for the mention of the potential issue. Indeed, there were problems with the added flush() call. Here are some comparative measurements, with the original report directory created with a 6.23 release at 2.8 GiB. Without this patch, the store (as logged by the timing of massStoreRun()) took 1140.93 s, when using --verbose debug on both server and clientside.

With this patch, this had gone up to 1731.29 s. However, undoing the flush() calls and changing the logic around it slightly to remove the original need they were introduced for only lessened this to 1644.56 s.

So I've gone around and changed a few more things, because it turns out doing a SELECT id FROM checkers WHERE column1 = <string> AND column2 = <string> is not a good idea... 😋 With the upcoming changes to the patch, we will be down to 956 s. (At first this was indeed strange for me, so I reproduced it three more times each time flushing and re-filling (to almost the brink) the system RAM to evade/prevent OS-level caching and spawning new database containers as well), and this reproduces within 2-3 s of error.

If the store time hit is not severe (under 5%), the patch can be merged.

Unfortunately, it looks like this is a 16.2% reduction in time required on this test. I hope that's not a problem. 😏

@whisperity whisperity force-pushed the feat/show-zero-checkers/patch branch 2 times, most recently from 5dc9caa to 2519a2f Compare March 1, 2024 15:37
This patch allows users to gather whether a checker was enabled or
disabled during the analysis, irrespective of whether a checker produced
any reports (which might have been deleted from the server since!). This
improves auditing capabilities as the definite knowledge on which
checkers were _available_ is kept in the database, and not lost after
the analysis temporaries are cleaned up from the analysing client.

Features:

 - Create a new table, `checkers`, to store unique ID (per product
   database) for a checker's name.
 - Add information about checkers and enabledness to the database, based
   on the `metadata.json`, if available.
 - Extend the `AnalysisInfo` API object to report the collected
   information to the client.

Refactoring:

 - Normalise the use of the `checkers` table by lifting additional
   checker-unique information (`severity`) from `reports`, leaving only
   a `FOREIGN KEY` in the `reports` table.
 - Ensure that all versions of `metadata.json` is represented the same
   way in memory once the `MetadataInfoParser` succeeded.
 - Ensure that a long migration of a report database does not result in
   time-outs for the connection of the configuration database, and that
   the failure in the migration of one product does not kill the entire
   server.
@whisperity whisperity force-pushed the feat/show-zero-checkers/patch branch from 5dc9caa to f4a8089 Compare March 1, 2024 16:17
@whisperity whisperity requested a review from vodorok March 1, 2024 16:43
Copy link
Contributor

@vodorok vodorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in this case, I think I can make an exception and allow a deviation from the 5% run time change :). Great work!
LGTM!

@vodorok vodorok merged commit b544c58 into Ericsson:master Mar 1, 2024
7 of 8 checks passed
@whisperity whisperity deleted the feat/show-zero-checkers/patch branch March 2, 2024 18:28
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 7, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 7, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 8, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 8, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 8, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 8, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 11, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 12, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Mar 25, 2024
…their statistics

It is a new feature based on Ericsson#4089. The new Analysis statitics tab on the Statistics page is able to list all enabled checkers for runs that are selected (or for all runs if no run selected) in the report filter. The table lists all checkers that were enabled in at least one of runs according to the latest analysis. It also shows checker severity, status, number of closed and outstanding reports.

Status can inform the user that the specific checker was "Enabled in all" runs or "Enabled in all runs except these" where "runs" and "these" words are links to list appropriate runs.

Closed and outstanding report counts depend on review and detection status. These statistics represent the number of closed and outstanding reports that belong to runs that were created with new DB schema.
cservakt added a commit to cservakt/codechecker that referenced this pull request Jul 12, 2024
Fixing report sorting on unique mode. After modifying the DB schema in Ericsson#4089 PR, the unique mode query of getRunResults endpoint has been changed, therefore, the report sorting is not working properly.

Now, the unique mode query is redesigned and it use row_number() function to filter unique reports correctly. Where clause is also modified. It is getting rid of report annotation filter. Filtering annotation remains in having clause.
cservakt added a commit to cservakt/codechecker that referenced this pull request Aug 1, 2024
Fixing report sorting on unique mode. After modifying the DB schema in Ericsson#4089 PR, the unique mode query of getRunResults endpoint has been changed, therefore, the report sorting is not working properly.

Now, the unique mode query is redesigned and it use row_number() function to filter unique reports correctly. Where clause is also modified. It is getting rid of report annotation filter. Filtering annotation remains in having clause.
cservakt added a commit to cservakt/codechecker that referenced this pull request Aug 1, 2024
Fixing report sorting on unique mode. After modifying the DB schema in Ericsson#4089 PR, the unique mode query of getRunResults endpoint has been changed, therefore, the report sorting is not working properly.

Now, the unique mode query is redesigned and it use row_number() function to filter unique reports correctly. Where clause is also modified. It is getting rid of report annotation filter. Filtering annotation remains in having clause.
bruntib pushed a commit that referenced this pull request Aug 23, 2024
Fixing report sorting on unique mode. After modifying the DB schema in #4089 PR, the unique mode query of getRunResults endpoint has been changed, therefore, the report sorting is not working properly.

Now, the unique mode query is redesigned and it use row_number() function to filter unique reports correctly. Where clause is also modified. It is getting rid of report annotation filter. Filtering annotation remains in having clause.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change 📄 Content of patch changes API! database 🗄️ Issues related to the database schema. enhancement 🌟 server 🖥️
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants