Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash Module Memory Leaks while using WinCrypt APIs on Windows #1703

Closed
dlaktionov1 opened this issue May 8, 2022 · 1 comment
Closed

Hash Module Memory Leaks while using WinCrypt APIs on Windows #1703

dlaktionov1 opened this issue May 8, 2022 · 1 comment

Comments

@dlaktionov1
Copy link

When libyara is built with HAVE_WINCRYPT_H = 1, the library uses WinCrypt APIs in order to calculate MD5, SHA1 and SHA256 hashes: CryptCreateHash, CryptHashData, CryptGetHashParam and CryptDestroyHash. As can be seen in crypto.h#L69, these API routines are wrapped using generic C macros, such as yr_md5_init, yr_md5_update and yr_md5_final for MD5 hashes, and similar macros for SHA1 and SHA256 hashes.
The documentation for CryptCreateHash API clearly states that:

When you have finished using the hash object, release the handle by calling the CryptDestroyHash function.

This statement poses an important restriction while using the above hash-calculation macros: whenever the yr_hash_init macro is used, one must always call the yr_hash_final macro in order to release the memory resources used by WinCrypt (or more precisely by the corresponding cryptographic providers). Unfortunately, this restriction is not being followed by the implementation of the hash module, due to a logical hash-caching related bug: to avoid calculating the same hashes for the same regions of the scanned sample, the hash module caches the hash calculation results using a module-specific YR_HASH_TABLE (see add_to_cache and get_from_cache). Whenever the hash.md5, hash.sha1 or hash.sha256 entry points are used over non-string data (data_md5, data_sha1, data_sha256), the module follows the same logical pattern:

  • Initializing the hash calculation context using the corresponding yr_hash_init macro.
  • Performing a cache lookup using get_from_cache and if a cache hit occurs, immediately returning the available hash value.
  • Calculating the requested hash value, using the yr_hash_update macro.
  • Finalizing the hash calculation using the yr_hash_final macro.
  • Caching the calculated hash using add_to_cache.

The problem with this implementation pattern is now obvious: whenever libyara is given a YARA ruleset containing multiple rules which use the same hash-based conditions over the same data regions of the scanned sample, a cache hit will occur and the library will leak all the WinCrypt-related memory allocations which were made by the corresponding yr_hash_init macro calls. In situations where libyara is linked and used by a long-running critical process (such as a Windows service which implements an antimalware scanning solution), this memory leaking bug will end up consuming the entire virtual memory of the service process until the machine physical memory limit is reached and either the service or the entire machine restarts due to lack of memory.
The solution to the problem is obvious as well: once yr_hash_init macro is used, the yr_hash_final macro must be called on every execution branch which returns to the caller (the cache hit scenario and any other error-handling code blocks).
The issue affects all libyara releases since 4.0.x.

plusvic added a commit that referenced this issue May 10, 2022
Fix memory leaks described in #1703.
plusvic added a commit that referenced this issue May 10, 2022
Fix memory leaks described in #1703.
@plusvic
Copy link
Member

plusvic commented May 10, 2022

Fixed in #1705

@plusvic plusvic closed this as completed May 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants