ssh-keygen: generating new host keys: DILITHIUM_3 malloc(): corrupted top size #111

Closed
vt-alt opened this issue Aug 22, 2021 · 5 comments

vt-alt commented Aug 22, 2021

After upgrading to OQS-OpenSSH-snapshot-2021-08, ssh-keygen crashes, which in turn makes sshd.service fail.

# /usr/bin/ssh-keygen -A
ssh-keygen: generating new host keys: DILITHIUM_3 malloc(): corrupted top size
Aborted (core dumped)
# gdb -q --args /usr/bin/ssh-keygen -A
Reading symbols from /usr/bin/ssh-keygen...
Reading symbols from /usr/lib/debug/usr/bin/ssh-keygen.debug...
(gdb) r
Starting program: /usr/bin/ssh-keygen -A
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
ssh-keygen: generating new host keys: DILITHIUM_3 malloc(): corrupted top size

Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49        return ret;
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x00007ffff72c6538 in __GI_abort () at abort.c:79
#2  0x00007ffff731ee77 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff742f399 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007ffff732695c in malloc_printerr (str=str@entry=0x7ffff742d634 "malloc(): corrupted top size") at malloc.c:5389
#4  0x00007ffff732a354 in _int_malloc (av=av@entry=0x7ffff7461a00 <main_arena>, bytes=bytes@entry=200) at malloc.c:4135
#5  0x00007ffff732c161 in __libc_calloc (n=n@entry=1, elem_size=elem_size@entry=200) at malloc.c:3448
#6  0x000055555556f6a8 in sshkey_new (type=19) at sshkey.c:689
#7  0x00005555555713e2 in sshkey_from_private (k=0x5555555e4ff0, pkp=pkp@entry=0x7fffffffd698) at sshkey.c:2190
#8  0x00005555555650f5 in do_gen_all_hostkeys (pw=pw@entry=0x5555555e47b0) at ssh-keygen.c:1247
#9  0x0000555555560797 in main (argc=0, argv=0x7fffffffe4f8) at ssh-keygen.c:3788
(gdb)

Running under valgrind does not crash, but it produces multiple warnings:

/etc/openquantumsafe-openssh# rm ./ssh_host_dilithium3_key
rm: remove regular file './ssh_host_dilithium3_key'? y
/etc/openquantumsafe-openssh# /usr/bin/ssh-keygen  -A
ssh-keygen: generating new host keys: DILITHIUM_3 malloc(): corrupted top size
Aborted (core dumped)
/etc/openquantumsafe-openssh# valgrind /usr/bin/ssh-keygen  -A
==14133== Memcheck, a memory error detector
==14133== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==14133== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==14133== Command: /usr/bin/ssh-keygen -A
==14133==
ssh-keygen: generating new host keys: DILITHIUM_3 ==14133== Invalid write of size 1
==14133==    at 0x4AE9DFD: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:896)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c10 is 0 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9E0E: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:899)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c12 is 2 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9E16: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:898)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c11 is 1 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9D70: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:880)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c13 is 3 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9D7E: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:882)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c14 is 4 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9D8D: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:883)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c15 is 5 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9D9F: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:885)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c16 is 6 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9DB7: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:887)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c17 is 7 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9DBB: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:888)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c18 is 8 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9DC9: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:890)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c19 is 9 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9DD7: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:891)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c1a is 10 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9DE6: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:893)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c1b is 11 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==
==14133== Invalid write of size 1
==14133==    at 0x4AE9DF9: pqcrystals_dilithium3_avx2_polyt0_pack (poly.c:895)
==14133==    by 0x4AEB750: pqcrystals_dilithium3_avx2_keypair (sign.c:164)
==14133==    by 0x12523C: sshkey_generate (sshkey.c:2066)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==  Address 0x5618c1c is 12 bytes after a block of size 4,000 alloc'd
==14133==    at 0x483C79B: malloc (vg_replace_malloc.c:380)
==14133==    by 0x124FA6: sshkey_generate (sshkey.c:2012)
==14133==    by 0x1190DD: do_gen_all_hostkeys (ssh-keygen.c:1243)
==14133==    by 0x114796: main (ssh-keygen.c:3788)
==14133==

==14133==
==14133== HEAP SUMMARY:
==14133==     in use at exit: 303 bytes in 8 blocks
==14133==   total heap usage: 894 allocs, 886 frees, 353,146 bytes allocated
==14133==
==14133== LEAK SUMMARY:
==14133==    definitely lost: 248 bytes in 2 blocks
==14133==    indirectly lost: 44 bytes in 5 blocks
==14133==      possibly lost: 0 bytes in 0 blocks
==14133==    still reachable: 11 bytes in 1 blocks
==14133==         suppressed: 0 bytes in 0 blocks
==14133== Rerun with --leak-check=full to see details of leaked memory
==14133==
==14133== For lists of detected and suppressed errors, rerun with: -s
==14133== ERROR SUMMARY: 16 errors from 13 contexts (suppressed: 0 from 0)

baentsch commented Aug 27, 2021

Can you please share more information on how to reproduce this issue? Platform? Build options? liboqs version?

Edit: Just rebuilt everything on a local Linux machine (Mint 19/Ubuntu Bionic, x86_64, liboqs current main (0.7.0-dev)) and everything works just fine (after explicitly deleting only the already successfully built Dilithium3 key):

> ~/git/oqs/openssh$ ./ssh-keygen -A
ssh-keygen: generating new host keys: DILITHIUM_3 
> ~/git/oqs/openssh$ ./ssh -V
OQS-OpenSSH_8.6-2021-08_p1, OpenSSL 1.1.1  11 Sep 2018
> ~/git/oqs/openssh$ ./ssh-keygen --version
unknown option -- -
usage: ssh-keygen [-q] [-a rounds] [-b bits] [-C comment] [-f output_keyfile]
                  [-m format] [-N new_passphrase] [-O option]
                  [-t dsa | ecdsa | ecdsa-sk | ed25519 | ed25519-sk | rsa |
                  OQS-fork added algorithms (see README.md) ]
                  [-w provider] [-Z cipher]
...
> ~/git/oqs/openssh$ ldd ./ssh-keygen
	linux-vdso.so.1 (0x00007ffccfed6000)
	libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fede8ecd000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fede8cc9000)
	libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fede8aaf000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fede86be000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fede849f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fede9843000)

--> If the issue persists for you, please share the output of the same commands in your environment.


vt-alt commented Aug 28, 2021

I think I found the cause of the problem. (I build the OQS packages for ALT Linux.)

It seems that when I installed the new openquantumsafe-openssh, the liboqs package was not updated to the latest version, and a subtle ABI difference causes the crash for DILITHIUM.
So this seems to be my fault. We have an automatic dependency resolver, which I was relying on, but it works by comparing the lists of library symbols and cannot detect ABI differences. I will make the dependency on liboqs versioned so that this does not happen in the future.
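
To illustrate why a check that only compares exported symbol names can miss this kind of breakage, here is a minimal Python sketch. It is a toy model, not how the ALT Linux resolver or liboqs actually work: the symbol names, the dictionaries, and the 4096 figure are invented for illustration (only the 4000-byte block size is taken from the valgrind log above).

```python
# Toy model: each "library build" maps an exported symbol name to the number
# of bytes that function writes into a caller-supplied buffer.
# All names are hypothetical; 4000 is from the valgrind log, 4096 is invented.
OLD_BUILD = {"keypair": 4000, "sign": 64, "verify": 0}
NEW_BUILD = {"keypair": 4096, "sign": 64, "verify": 0}


def same_symbol_list(old, new):
    """What a symbol-name-based dependency check effectively compares."""
    return set(old) == set(new)


def abi_differences(old, new):
    """Symbols exported by both builds whose behaviour (here: size) differs."""
    return sorted(name for name in set(old) & set(new) if old[name] != new[name])


if __name__ == "__main__":
    # The symbol lists are identical, so the check sees no reason to upgrade.
    print("same symbols:", same_symbol_list(OLD_BUILD, NEW_BUILD))
    # A caller that sized its buffer for one build can still overflow it
    # when the other build is loaded at run time.
    print("ABI differences:", abi_differences(OLD_BUILD, NEW_BUILD))
```

In this model the name-based check passes while the size written by `keypair` differs between the two builds, which is the kind of mismatch that only a versioned dependency (or a changed soname) would catch.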

Thanks for the patience and your work!

vt-alt closed this as completed Aug 28, 2021

vt-alt commented Aug 29, 2021

I have been thinking this problem over, and I think it is partly caused by incorrect library versioning in liboqs. Currently, the library version just mirrors the project version, but library versioning (or 'API versioning') is a different thing and should be maintained separately. See https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
CMake has the SOVERSION property for this too.

liboqs library versions go 0.5.0, 0.6.0, 0.7.0, which means (for all these releases) that only new functions were added to its API and no backward-incompatible changes happened. That seems incorrect.

More links on the subject:
https://stackoverflow.com/questions/12637841/what-is-the-soname-option-for-building-shared-libraries-for
https://cmake.org/cmake/help/latest/prop_tgt/SOVERSION.html
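
The update rules from the libtool manual linked above can be sketched in a few lines of Python. This is only an illustration of those rules as I understand them, not something liboqs does today; the class and function names are made up for the example.

```python
from typing import NamedTuple


class VersionInfo(NamedTuple):
    """libtool-style version-info triple (current:revision:age)."""
    current: int
    revision: int
    age: int


def update_version_info(v, *, source_changed, added, removed_or_changed):
    """Apply the libtool update rules for one public release."""
    current, revision, age = v
    if source_changed:
        revision += 1          # any code change bumps revision
    if added or removed_or_changed:
        current += 1           # any interface change bumps current...
        revision = 0           # ...and resets revision
    if added:
        age += 1               # additions keep old clients working
    if removed_or_changed:
        age = 0                # removals/changes break old clients
    return VersionInfo(current, revision, age)


def linux_soname_major(v):
    """On Linux, libtool uses current - age as the soname major number."""
    return v.current - v.age


if __name__ == "__main__":
    start = VersionInfo(0, 0, 0)
    add_only = update_version_info(
        start, source_changed=True, added=True, removed_or_changed=False)
    add_and_remove = update_version_info(
        start, source_changed=True, added=True, removed_or_changed=True)
    print(add_only, linux_soname_major(add_only))
    print(add_and_remove, linux_soname_major(add_and_remove))
```

Under these rules, a release that only adds functions keeps the same soname major number, while a release that removes or changes functions resets age to 0 and therefore gets a new major number.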

There is a tool to compare ABIs (it may not catch all subtle changes, though):

$ abipkgdiff liboqs-0.4.0-alt1.x86_64.rpm --d1 liboqs-debuginfo-0.4.0-alt1.x86_64.rpm \
             liboqs-0.7.0-alt1.x86_64.rpm --d2 liboqs-debuginfo-0.7.0-alt1.x86_64.rpm
================ changes of 'liboqs.so.0.5.0-dev'===============
  Functions changes summary: 2 Removed, 0 Changed (45 filtered out), 2 Added functions
  Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
  Function symbols changes summary: 0 Removed, 12 Added function symbols not referenced by debug info
  Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info

  2 Removed functions:

    [D] 'function OQS_CPU_EXTENSIONS OQS_get_available_CPU_extensions()'    {OQS_get_available_CPU_extensions}
    [D] 'function const char* OQS_get_cpu_extension_name(unsigned int)'    {OQS_get_cpu_extension_name}

  2 Added functions:

    [A] 'function int OQS_CPU_has_extension(OQS_CPU_EXT)'    {OQS_CPU_has_extension}
    [A] 'function int OQS_MEM_secure_bcmp(void*, void*, size_t)'    {OQS_MEM_secure_bcmp}

  12 Added function symbols not referenced by debug info:

    [A] KeccakF1600_FastLoop_Absorb_avx2
    [A] KeccakP1600_12rounds_FastLoop_Absorb_avx2
    [A] KeccakP1600_AddByte_avx2
    [A] KeccakP1600_AddBytes_avx2
    [A] KeccakP1600_ExtractAndAddBytes_avx2
    [A] KeccakP1600_ExtractBytes_avx2
    [A] KeccakP1600_Initialize_avx2
    [A] KeccakP1600_OverwriteBytes_avx2
    [A] KeccakP1600_OverwriteWithZeroes_avx2
    [A] KeccakP1600_Permute_12rounds_avx2
    [A] KeccakP1600_Permute_24rounds_avx2
    [A] KeccakP1600_Permute_Nrounds_avx2

================ end of changes of 'liboqs.so.0.5.0-dev'===============

Certainly, removing functions is a backward-incompatible change to the library, so the major library version should have been incremented.


baentsch commented Aug 29, 2021

Thanks very much for the pointers and education. I tend to agree that we're indeed not following reasonable practices.

> Certainly, removing functions is a backward-incompatible change to the library, so the major library version should have been incremented.

Personally I tend to agree and we'd need to improve. The function removal pointed out above also comes as a bit of a surprise to me.

--> Would you want to rename (and re-open) this issue or create a separate new one only highlighting this library versioning issue? Edit: Disregard: This belongs in liboqs as a separate issue.


vt-alt commented Aug 29, 2021

JFYI: to fix the problem with our package, I just increased the SOVERSION of liboqs to 1 (see the patch below) and rebuilt both liboqs 0.7.0 and OQS OpenSSH 8.6p1.202108, so that they are linked properly and get updated together. (That solution looked better than manually setting a version dependency.)

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index a68c5619..01cf1e73 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -84,7 +84,7 @@ set_target_properties(oqs
     ARCHIVE_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib"
     LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib"
     VERSION ${OQS_VERSION_TEXT}
-    SOVERSION 0
+    SOVERSION 1
     # For Windows DLLs
     RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin")
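
As a rough sketch of why bumping SOVERSION forces both packages to be updated together: on Linux, CMake derives the library's soname from SOVERSION (so liboqs.so.0 becomes liboqs.so.1), and a binary records the soname it was linked against. The Python model below is deliberately simplified (the real dynamic loader also deals with search paths, caches, and symbol versions), and the helper names are hypothetical.

```python
def soname(name, soversion):
    """Soname CMake gives a shared library on Linux, e.g. liboqs.so.1."""
    return f"lib{name}.so.{soversion}"


def missing_libraries(needed, installed):
    """Sonames a binary requires (its DT_NEEDED entries) that are not installed."""
    return [n for n in needed if n not in installed]


if __name__ == "__main__":
    installed = {soname("oqs", 1)}        # liboqs rebuilt with SOVERSION 1
    old_ssh_keygen = [soname("oqs", 0)]   # binary linked before the bump
    new_ssh_keygen = [soname("oqs", 1)]   # binary rebuilt after the bump
    print(missing_libraries(old_ssh_keygen, installed))
    print(missing_libraries(new_ssh_keygen, installed))
```

With the bump, a binary linked against the old soname no longer finds a matching library, so the mismatch shows up as a missing dependency rather than as heap corruption at run time.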
