Feature/fix allmatch #687
Conversation
This pull request fixes 1 alert when merging 9243b2e into cba4246 - view on LGTM.com.
I put a comment in CLAM-2083.
Force-pushed from f4a21e8 to a944aeb
Just did a rebase to get things up to date, resolve the merge conflicts, and squash the commits that needed squashing.
This pull request fixes 1 alert when merging a944aeb into 0301808 - view on LGTM.com.
Force-pushed from a944aeb to fe86c44
This pull request fixes 1 alert when merging fe86c44 into 0301808 - view on LGTM.com.
Force-pushed from fe86c44 to a2740e2
This pull request fixes 1 alert when merging a2740e2 into bafb26b - view on LGTM.com.
This pull request fixes 1 alert when merging cc001bc into bafb26b - view on LGTM.com.
Force-pushed from cc001bc to 60b6837
This pull request fixes 1 alert when merging 60b6837 into bafb26b - view on LGTM.com.
Force-pushed from 5bcefe6 to ca6c589
This pull request fixes 1 alert when merging ca6c589 into f720b00 - view on LGTM.com.
Force-pushed from 637b0a1 to 774700c
This pull request fixes 1 alert when merging 774700c into f720b00 - view on LGTM.com.
Force-pushed from 774700c to a4a0d35
This pull request fixes 1 alert when merging a4a0d35 into f720b00 - view on LGTM.com.
Force-pushed from a4a0d35 to 95369d3
I found that the section offset commit increased RAM usage a bit (50-ish MB?) and also increased load time by a lot (60 seconds, give or take).
This pull request fixes 1 alert when merging 95369d3 into 197113c - view on LGTM.com.
Force-pushed from 95369d3 to f2f4a91
I just removed the "reduce RAM usage" commit and moved it to a separate branch so we can test and review it separately. I found that it had some false positive issues relating to offset sigs during regression testing: https://github.com/micahsnyder/clamav-micah/tree/CLAM-2128-reduce-ac-patt-mem-usage Since I found a different way to bring RAM usage in this PR back to normal levels (moving the section offset bug fix to a separate branch as well), this change is not as important to merge at the same time.
This pull request fixes 1 alert when merging f2f4a91 into 197113c - view on LGTM.com.
And fix issue where critical errors in magic_scan done by pdf_scan_contents() may be ignored.
And fix issue where call to magic_scan may not propagate critical errors.
It appears the result from the _x509_to_pem() function was accidentally hardcoded to `0` (success).
This does not change the ABI because enums are ints, though it may add compiler warnings.
Fix issue where critical errors from magic_scan may not be propagated. Fix issue where file name and descriptor may be leaked if keeptmp is enabled.
And fix issue where magic_scan() critical error may be ignored.
And fix issue where magic_scan() critical error may be ignored.
And fixed a couple of conditions where critical scan errors may be ignored.
Also fixed a number of conditions where magic_scan() critical errors may be ignored. To ensure that the scan truly aborts for signature matches (not in allmatch mode) and for timeouts, the `ctx->abort` option is now set in these two conditions, and checked in several spots in magic_scan(). Additionally, I've consolidated some of the "scan must halt" type of checks (mostly large switch statements) into a function so that we can use the exact same logic in a few places in magic_scan(). I've also fixed a few minor warnings and code format issues.
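A minimal sketch of what such a consolidated "scan must halt" check might look like (the structure and field names are illustrative, not the actual magic_scan() internals):

```c
#include <clamav.h> /* cl_error_t and the CL_* status codes */

/* Illustrative scan-state stand-in; only the abort flag matters here. */
typedef struct {
    int abort_scan; /* set on a non-allmatch match or on a timeout */
} scan_state;

/* One place that decides whether the scan must stop for a given status,
 * so every call site in the scanner applies exactly the same rules. */
static int scan_must_halt(const scan_state *state, cl_error_t status)
{
    if (state->abort_scan)
        return 1;

    switch (status) {
        case CL_ETIMEOUT: /* scan time limit exceeded */
        case CL_EMEM:     /* out of memory */
        case CL_VIRUS:    /* match found and not in allmatch mode */
            return 1;
        default:
            return 0;
    }
}
```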
The header parsing / executable metadata collecting functions for the PE, ELF, and Mach-O file types were using `int` for the return type. Mostly they were returning 0 for success and -1, -2, -3, or -4 for failure, but in some cases they were returning cl_error_t enum values for failure. Regardless, the function using them was treating 0 as success and non-zero as failure, which it stored as -1 ... every time. This commit switches them all to use cl_error_t. I am continuing to store the final result as 0 / -1 in the `peinfo` struct, but outside of that everything has been made consistent.

While I was working on that, I got a tad sidetracked. I noticed that the target type isn't an enum, or even a set of #defines. So I made an enum and then changed the code that uses target types to use the enum. I also removed the `target` parameter from a number of functions that don't actually use it at all; some recursion was masking the fact that it was an unused parameter, which is why there was no warning about it.
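As a rough illustration of that change, a target-type enum might look like the sketch below. The enumerator names are hypothetical; the numeric values follow the target types documented for ClamAV logical signatures.

```c
/* Hypothetical sketch; enumerator names are illustrative. The numeric values
 * correspond to the target types used by logical signatures. */
typedef enum {
    TARGET_GENERIC  = 0,  /* any file */
    TARGET_PE       = 1,  /* Portable Executable */
    TARGET_OLE2     = 2,
    TARGET_HTML     = 3,  /* normalized HTML */
    TARGET_MAIL     = 4,
    TARGET_GRAPHICS = 5,
    TARGET_ELF      = 6,
    TARGET_ASCII    = 7,  /* normalized ASCII text */
    TARGET_NOT_USED = 8,  /* reserved / unused */
    TARGET_MACHO    = 9,
    TARGET_PDF      = 10,
    TARGET_FLASH    = 11,
    TARGET_JAVA     = 12
} target_type_t;
```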
The cli_bc_ctx->outfd struct member was not properly initialized to -1. Perhaps previous developers figured 0 was invalid-enough. All of the checks for that file descriptor assumed 0 was the invalid value, going so far as to explicitly set outfd to 0 if `open()` returned -1. I didn't know this, so when I cleaned up the error handling in `cli_unpackelf()` and `cli_unpackmacho()`, I had it `close(outfd)` when not -1. That of course ended up closing stdin... and then all subsequent file scans opened the file as fd `0`, which interestingly caused `read()` and `stat()` errors, but only after scanning a Mach-O or ELF file first. Anyways... this commit fixes the issue by properly initializing outfd to -1, and by changing any checks from 0 to -1.

I also found that the bytecode timeout apparently wasn't being applied to bytecode functions associated with logical signatures (that is, those run by `cli_bytecode_runlsig()`). What I see is that `ctx->bytecode_timeout` is only set to a non-zero value in `cli_bytecode_context_alloc()`, but for `cli_bytecode_runlsig()` the bytecode context sits on the stack and is memset instead. To resolve this, and to ensure the bytecode context is properly initialized, I created a new function that does the initialization and had it do the memset instead of using a calloc in the allocation function. I also removed the `bytecode_context_clear()` function because it simply called `bytecode_context_reset()` and then did a memset. The memset is unnecessary, especially since in most cases it's memsetting a stack structure immediately before a return.
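A minimal sketch of the corrected descriptor handling (illustrative names, not the actual bytecode context code):

```c
#include <unistd.h>

/* Illustrative only: initialize the output fd to -1 (not 0, which is stdin)
 * and always test against -1 before closing. */
struct bc_output {
    int outfd;
};

static void bc_output_init(struct bc_output *out)
{
    out->outfd = -1; /* -1 is the conventional "no descriptor" value */
}

static void bc_output_cleanup(struct bc_output *out)
{
    /* Never compare against 0: fd 0 is a valid descriptor (stdin). */
    if (out->outfd != -1) {
        close(out->outfd);
        out->outfd = -1;
    }
}
```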
Because of an oversight, the `ctx->abort_scan` flag was propagating timeout and perhaps other error types to the final verdict. Fixed this in the `result_should_goto_done()` function. Some non-fatal errors (notably CL_ETIMEOUT) may appear between the call to the magic scan and the end of scan_common(), so we need to filter these status results at this level as well to prevent that from happening. I also fixed a warning accidentally introduced earlier, caused by trying to use a string after free'ing it and setting it to NULL. And I explicitly set the status to CL_SUCCESS if the cache check comes back clean, rather than trusting that no one had messed with the `ret` variable since it was last set (to CL_SUCCESS). It's more readable and less prone to future bugs.
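A rough sketch of the kind of status filtering described (the function name is hypothetical; only the CL_* codes are real):

```c
#include <clamav.h>

/* Illustrative only: decide what status to report to the caller once the
 * magic scan has finished. A non-fatal condition such as a timeout should
 * not leak into the final verdict as an error. */
static cl_error_t finalize_scan_status(cl_error_t status)
{
    switch (status) {
        case CL_VIRUS:    /* a real detection is always reported */
        case CL_SUCCESS:
            return status;
        case CL_ETIMEOUT: /* scan time limit hit: treat as a clean, partial scan */
            return CL_SUCCESS;
        default:
            return status; /* genuine errors still propagate */
    }
}
```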
Fix a pretty silly double-free, and fix an issue where 'extracted_file' was not being set to true. Also rename the `uncompressed` variable to `uncompressed_block`, because there is another variable with the same name above.
The `cli_memcpy` function is a version of memcpy that will catch exceptions on Windows if the memcpy fails because the memory-mapped file fails to read (for example, if a network drive is disconnected). It presently returns an `int`. Since it is only used in one spot, I've swapped it over to a `cl_error_t` and created a unique variable so we're not piggybacking on the `cli_file_t` enum used to check the result of a filetyping scan.
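As a rough illustration of the pattern (not the actual cli_memcpy implementation), a wrapper that traps the Windows read fault and reports it as a cl_error_t might look like this:

```c
#include <string.h>
#include <clamav.h> /* cl_error_t, CL_SUCCESS, CL_EREAD */
#ifdef _WIN32
#include <windows.h>
#endif

/* Illustrative sketch only. On Windows, reading from a memory-mapped file can
 * raise a structured exception (for example when a network drive disappears);
 * the wrapper converts that into an error code instead of crashing. */
static cl_error_t safe_memcpy(void *dst, const void *src, size_t n)
{
#ifdef _WIN32
    __try {
        memcpy(dst, src, n);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        return CL_EREAD; /* the mapped page could not be read */
    }
#else
    memcpy(dst, src, n);
#endif
    return CL_SUCCESS;
}
```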
Error handling for bytecode sig evaluation, as well as logical and yara sig evaluation, only considers CL_VIRUS or CL_SUCCESS and fails to consider terminal errors such as CL_ETIMEOUT. This commit fixes the error handling in those functions so we properly abort instead of continuing to evaluate more sigs. This commit also adds scan time limit checks (see the sketch below):
1. shortly after a bytecode time limit is exceeded.
2. after evaluating each lsig or yara sig.
3. when processing unzip central directory file headers.
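A simplified sketch of the second kind of check, a time-limit test inside a signature-evaluation loop (the helper and its parameters are hypothetical):

```c
#include <time.h>
#include <clamav.h>

/* Illustrative sketch only: bail out of a long signature-evaluation loop as
 * soon as the scan time budget is spent, instead of evaluating every
 * remaining signature. */
static int scan_time_exceeded(time_t scan_start, unsigned int max_scantime_ms)
{
    if (max_scantime_ms == 0)
        return 0; /* no limit configured */
    return (unsigned int)(difftime(time(NULL), scan_start) * 1000) >= max_scantime_ms;
}

static cl_error_t evaluate_lsigs(size_t num_lsigs, time_t scan_start,
                                 unsigned int max_scantime_ms)
{
    for (size_t i = 0; i < num_lsigs; i++) {
        if (scan_time_exceeded(scan_start, max_scantime_ms))
            return CL_ETIMEOUT; /* stop before evaluating the next signature */

        /* ... evaluate logical signature i here ... */
    }
    return CL_SUCCESS;
}
```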
In the logical sig evaluation function, we were calling the bytecode-run function even if the lsig doesn't depend on bytecode. That function immediately returns CL_ENULLARG, which previously was ignored because we were only checking for CL_VIRUS. Now that we're checking for anything that is NOT success, the error propagates and scanning of that layer ends early. The above bug is actually in some duplicate code, so while fixing it I went ahead and refactored the function to remove the duplication, and added additional comments and readability improvements.
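Conceptually, the fix amounts to only invoking the bytecode runner when the lsig actually has bytecode attached. A hedged sketch with hypothetical types and helpers, not the real matcher internals:

```c
#include <clamav.h>

/* Hypothetical stand-ins for the real matcher types and helpers. */
typedef struct { unsigned int bc_func; /* 0 = no bytecode attached */ } lsig_t;
typedef struct scan_ctx scan_ctx;

int        lsig_expression_matches(const lsig_t *lsig, scan_ctx *ctx);
cl_error_t report_match(const lsig_t *lsig, scan_ctx *ctx);
cl_error_t run_lsig_bytecode(const lsig_t *lsig, scan_ctx *ctx);

static cl_error_t eval_one_lsig(const lsig_t *lsig, scan_ctx *ctx)
{
    cl_error_t ret;

    if (!lsig_expression_matches(lsig, ctx))
        return CL_SUCCESS;              /* logical expression not satisfied */

    if (lsig->bc_func == 0)             /* no bytecode attached to this lsig */
        return report_match(lsig, ctx); /* the match stands on its own */

    ret = run_lsig_bytecode(lsig, ctx); /* only called when there is bytecode to run */
    if (ret != CL_SUCCESS && ret != CL_VIRUS)
        return ret;                     /* propagate real errors such as CL_ETIMEOUT */

    return ret;
}
```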
Some files that take longer to scan may exceed the scan time limit by a fair bit during the actual pattern-matching part of the scan (as opposed to decompressing archives, running bytecode functions, and so on). ClamAV does pattern matching and hashing of files in chunks. This change adds a time check before each chunk is matched.
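A simplified sketch of the idea (hypothetical helper and chunk size, not the real matcher code):

```c
#include <clamav.h>
#include <stddef.h>
#include <time.h>

#define SCAN_CHUNK_SIZE (1024u * 1024u) /* illustrative chunk size */

/* Hypothetical stand-in for the real per-chunk matcher. */
cl_error_t match_chunk(const unsigned char *buf, size_t len);

/* Illustrative sketch only: check the scan deadline before matching each
 * chunk, so a slow pattern-matching pass cannot run far past the limit. */
static cl_error_t scan_buffer_in_chunks(const unsigned char *data, size_t size,
                                        time_t deadline)
{
    for (size_t off = 0; off < size; off += SCAN_CHUNK_SIZE) {
        size_t len = (size - off < SCAN_CHUNK_SIZE) ? (size - off) : SCAN_CHUNK_SIZE;

        if (deadline != 0 && time(NULL) >= deadline)
            return CL_ETIMEOUT; /* time budget spent: stop before the next chunk */

        cl_error_t ret = match_chunk(data + off, len);
        if (ret != CL_SUCCESS)
            return ret; /* includes CL_VIRUS when not in allmatch mode */
    }
    return CL_SUCCESS;
}
```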
Force-pushed from f10a037 to 155ca1b
Rebased and fixed a massive merge conflict caused by the archive inspection PR.
This pull request fixes 1 alert when merging 155ca1b into cf81299 - view on LGTM.com.
Some small files that appear to include archive content (zip entries, RAR archives, etc.) later in the file are no longer set to that SFX type initially. However, the current way we skip those drops the initial type detection on the floor, meaning that something like a small image may end up as CL_TYPE_ANY instead of CL_TYPE_PNG. This commit fixes that by changing where in the process the SFX types are ignored.
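A hypothetical sketch of the intended outcome (the types and helper are illustrative, not ClamAV's real filetyping code):

```c
#include <stddef.h>

/* Hypothetical file-type values mirroring the CL_TYPE_* constants above. */
typedef enum { TYPE_ANY, TYPE_PNG, TYPE_ZIPSFX /* ... */ } filetype_t;

/* Illustrative only: when a small file also contains embedded-archive (SFX)
 * markers, keep the initial detection instead of discarding it. */
static filetype_t choose_final_type(filetype_t initial_type, filetype_t sfx_type,
                                    size_t file_size, size_t small_file_limit)
{
    if (file_size <= small_file_limit) {
        /* A small image stays an image (e.g. PNG) rather than falling back
         * to the SFX type or to "any". */
        return initial_type;
    }
    return (sfx_type != TYPE_ANY) ? sfx_type : initial_type;
}
```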
This pull request fixes 1 alert when merging efa8a01 into cf81299 - view on LGTM.com.
Overhaul of the allmatch system. In allmatch mode, CL_VIRUS is not returned until the very end. This removes the need for allmatch checks scattered throughout the codebase.
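As a rough sketch of the approach (hypothetical structure and names), detections are recorded as they are found and only surfaced as CL_VIRUS once the scan completes:

```c
#include <clamav.h>
#include <stddef.h>

/* Hypothetical sketch of the allmatch idea: record detections as the scan
 * proceeds and only convert them into a CL_VIRUS verdict at the very end,
 * so individual matchers no longer need allmatch special-casing. */
typedef struct {
    int    allmatch;       /* allmatch mode enabled? */
    size_t num_detections; /* detections recorded so far */
} verdict_ctx;

static cl_error_t record_detection(verdict_ctx *ctx)
{
    ctx->num_detections++;
    if (ctx->allmatch)
        return CL_SUCCESS; /* keep scanning; report everything at the end */
    return CL_VIRUS;       /* not allmatch: stop the scan now */
}

static cl_error_t final_verdict(const verdict_ctx *ctx)
{
    return (ctx->num_detections > 0) ? CL_VIRUS : CL_SUCCESS;
}
```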