mysqld failed while attempting to check config on arm 64 #338

Closed
jmburges opened this issue Dec 28, 2020 · 43 comments
Labels
need feedback Need feedback from user.

Comments

@jmburges

Hello,

I ran sudo docker run --name some-mariadb -e MYSQL_ROOT_PASSWORD=my-secret-pw arm64v8/mariadb:latest to test out MariaDB on my ODroid-C2 and received the following error:

2020-12-28 01:47:19+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 1:10.5.8+maria~focal started.
2020-12-28 01:47:20+00:00 [ERROR] [Entrypoint]: mysqld failed while attempting to check config
        command was: mysqld --verbose --help --log-bin-index=/tmp/tmp.55u0o1Cg8L

It's under sudo so I don't think there should be any permissions problems. Any ideas?

@tianon
Contributor

tianon commented Dec 28, 2020

Interesting, sounds like the mysqld binary is failing for some reason -- can you try something like the following to try to help narrow this down?

$ docker pull arm64v8/mariadb:latest
$ docker run -it --rm --entrypoint bash arm64v8/mariadb:latest
root@b7403e3d732e:/# mysqld --version
root@b7403e3d732e:/# mysqld --verbose --help > /dev/null

@jmburges
Author

Thanks for the fast response! Here is the output. It looks like it's not starting at all. Weird.

root@30c94cb0cb16:/# mysqld --version
Illegal instruction (core dumped)
root@30c94cb0cb16:/# mysqld --verbose --help > /dev/null
Illegal instruction (core dumped)
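
To tell this kind of failure apart from a permissions or configuration problem in a script, it helps to look at how the process died rather than at the entrypoint message. The following is a minimal Python sketch, not part of the image or its entrypoint; the helper name died_from_sigill is hypothetical. It relies on two conventions: subprocess reports a child killed by signal N as return code -N, and shells (and docker run) report it as 128 + N, which is 132 for SIGILL.

```python
import signal
import subprocess
import sys


def died_from_sigill(argv):
    """Run argv and report whether it was terminated by SIGILL.

    Hypothetical helper: accepts both the subprocess convention (-N) and
    the shell convention (128 + N) for "killed by signal N".
    """
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode in (-signal.SIGILL, 128 + signal.SIGILL)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python3 check_sigill.py mysqld --version
    if died_from_sigill(sys.argv[1:]):
        print("killed by SIGILL (illegal instruction)")
    else:
        print("not killed by SIGILL")
```

Run against mysqld --version, a True result points at the binary executing an instruction that this CPU or kernel rejects, which is what the output above shows.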

@tianon
Contributor

tianon commented Dec 28, 2020

Bizarre - sounds like something is either wrong with the mysqld binary published by MariaDB or the ODroid-C2 chip/kernel. 😕

@grooverdan
Member

Can you include some hardware information: the output of LD_SHOW_AUXV=1 /bin/true and of lscpu?

If you look at the dmesg output, what address did this occur at? Is this for the mariadb-10.5.8 version?

Writing an upstream bug report on https://jira.mariadb.org would be appreciated.

@grooverdan
Member

Note https://jira.mariadb.org/browse/MDEV-23495: there was a bug in 10.5 that is fixed from 10.5.7 onward. If your version predates that fix, try an update.

@fauust
Collaborator

fauust commented Feb 19, 2021

Hi,
I am not able to reproduce this on:

  • CentOS 8 ARM (virtual machine, openstack, QEMU CPU);
  • Raspberry Pi 4.

@jmburges, for comparison, here is the information requested by @grooverdan.

  • CentOS 8 ARM:
LD_SHOW_AUXV=1 /bin/true
AT_SYSINFO_EHDR: 0xffff8b780000
AT_??? (0x33): 0x1270
AT_HWCAP:        917fff
AT_PAGESZ:       65536
AT_CLKTCK:       100
AT_PHDR:         0xaaaac7110040
AT_PHENT:        56
AT_PHNUM:        9
AT_BASE:         0xffff8b790000
AT_FLAGS:        0x0
AT_ENTRY:        0xaaaac71116e0
AT_UID:          0
AT_EUID:         0
AT_GID:          0
AT_EGID:         0
AT_SECURE:       0
AT_RANDOM:       0xffffdc145838
AT_EXECFN:       /bin/true
AT_PLATFORM:     aarch64
lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        1
Vendor ID:           0x48
Model:               0
Stepping:            0x1
CPU max MHz:         2400.0000
CPU min MHz:         2400.0000
BogoMIPS:            200.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            512K
L3 cache:            32768K
NUMA node0 CPU(s):   0-7
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
  • Raspberry Pi 4:
LD_SHOW_AUXV=1 /bin/true
AT_SYSINFO_EHDR:      0xffff93227000
AT_??? (0x33): 0x1270
AT_HWCAP:             887
AT_PAGESZ:            4096
AT_CLKTCK:            100
AT_PHDR:              0xaaaaac030040
AT_PHENT:             56
AT_PHNUM:             9
AT_BASE:              0xffff931f7000
AT_FLAGS:             0x0
AT_ENTRY:             0xaaaaac031710
AT_UID:               1000
AT_EUID:              1000
AT_GID:               1000
AT_EGID:              1000
AT_SECURE:            0
AT_RANDOM:            0xffffdbf07d98
AT_HWCAP2:            0x0
AT_EXECFN:            /bin/true
AT_PLATFORM:          aarch64
lscpu
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       ARM
Model:                           3
Model name:                      Cortex-A72
Stepping:                        r0p3
CPU max MHz:                     1500.0000
CPU min MHz:                     600.0000
BogoMIPS:                        108.00
NUMA node0 CPU(s):               0-3
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Vulnerable
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm crc32 cpuid

@jmburges you could also try to build the container and use it directly.
Something like this should do it:

git clone https://github.com/docker-library/mariadb && cd mariadb/10.5
docker build . -t mariadb-test
docker run -it -e MYSQL_ROOT_PASSWORD=my-secret-pw mariadb-test:latest

@grooverdan
Member

@jmburges. We'd really like to solve this on your hardware. Can you please include the hardware information?

Also, can you run the following in a container:

# add-apt-repository 'deb [arch=amd64,arm64,ppc64el] https://download.nus.edu.sg/mirror/mariadb/repo/10.5/ubuntu focal main/debug'
# apt-get update -y
# apt-get install mariadb-server-10.5-dbg-sym
# gdb --args mysqld --verbose
gdb> bt full

This will help us find where in the codebase the problem could be.

@grooverdan grooverdan added the need feedback Need feedback from user. label Mar 2, 2021
@grooverdan
Member

Cannot proceed without information.

@stoinov

stoinov commented Oct 7, 2021

I can confirm the same issue with an Odroid-C2.
Here are my hardware specs:

❯ LD_SHOW_AUXV=1 /bin/true
AT_SYSINFO_EHDR: 0x7f911f0000
AT_HWCAP:        83
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0x5581496040
AT_PHENT:        56
AT_PHNUM:        8
AT_BASE:         0x7f911c7000
AT_FLAGS:        0x0
AT_ENTRY:        0x5581497a2c
AT_UID:          0
AT_EUID:         0
AT_GID:          0
AT_EGID:         0
AT_SECURE:       0
AT_RANDOM:       0x7fe3652678
AT_EXECFN:       /bin/true
AT_PLATFORM:     aarch64

❯ lscpu
Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
Model:                 4
CPU max MHz:           1536.0000
CPU min MHz:           100.0000
BogoMIPS:              2.00
Flags:                 fp asimd crc32
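
The AT_HWCAP values in the three reports (917fff, 887 and 83, printed as hex without a 0x prefix) are bitmasks, so they can be compared mechanically. Below is a minimal Python sketch that decodes them; the function name decode_hwcap is hypothetical, and the bit order is the arm64 HWCAP_* order from the Linux kernel headers, truncated to the first 24 bits since that covers every value quoted here.

```python
# arm64 HWCAP_* flags by bit position (bit 0 = fp, bit 1 = asimd, ...).
# Only the first 24 bits are listed; that is enough for the values above.
ARM64_HWCAP_NAMES = [
    "fp", "asimd", "evtstrm", "aes", "pmull", "sha1", "sha2", "crc32",
    "atomics", "fphp", "asimdhp", "cpuid", "asimdrdm", "jscvt", "fcma", "lrcpc",
    "dcpop", "sha3", "sm3", "sm4", "asimddp", "sha512", "sve", "asimdfhm",
]


def decode_hwcap(value_hex):
    """Turn an AT_HWCAP hex string (as printed by LD_SHOW_AUXV) into flag names."""
    value = int(value_hex, 16)
    return [
        name
        for bit, name in enumerate(ARM64_HWCAP_NAMES)
        if (value >> bit) & 1
    ]


if __name__ == "__main__":
    for machine, hwcap in [
        ("CentOS 8 ARM", "917fff"),
        ("Raspberry Pi 4", "887"),
        ("Odroid-C2", "83"),
    ]:
        print(f"{machine}: {' '.join(decode_hwcap(hwcap))}")
```

Each decoded list matches the Flags line that lscpu prints for the same machine. The Odroid-C2 reports only fp, asimd and crc32, so it has none of the crypto extensions (aes, pmull, sha1, sha2) that the other two CPUs expose in part or in full.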

I also tried building from source and got the same error when running the custom image.

In order to run add-apt-repository on the latest Ubuntu image, I followed this article.
Unfortunately, I got an error when I ran apt-get install mariadb-server-10.5-dbg-sym:
E: Unable to locate package mariadb-server-10.5-dbg-sym

@grooverdan
Member

grooverdan commented Oct 7, 2021

@stoinov I've created some prebuilt images with debug symbols installed - https://quay.io/repository/mariadb-foundation/mariadb-debug?tab=tags

Use, for example, quay.io/mariadb-foundation/mariadb-debug:10.5 as the image name.

Can you try one of those?

@stoinov

stoinov commented Oct 7, 2021

I've used the image you provided and I still get the same error:

[ERROR] [Entrypoint]: mysqld failed while attempting to check config
        command was: mysqld --verbose --help --log-bin-index=/tmp/tmp.vJJK0kpbmZ

When I enter the container with docker run --name test -it --entrypoint /bin/bash quay.io/mariadb-foundation/mariadb-debug:10.5, I still get E: Unable to locate package mariadb-server-10.5-dbg-sym.
Also, for gdb --args mysqld --verbose, the gdb command is not found.

@grooverdan
Member

Debug symbols are already installed:

$ podman  run --name test -it --entrypoint /bin/bash --rm  quay.io/mariadb-foundation/mariadb-debug:10.5
root@9b22202a5296:/# dpkg -l | grep sym
ii  libmariadb3-dbgsym:amd64        1:10.5.13+maria~focal             amd64        debug symbols for libmariadb3
ii  libnettle7:amd64                3.5.1+really3.5.1-2ubuntu0.2      amd64        low level cryptographic library (symmetric and one-way cryptos)
ii  mariadb-backup-dbgsym           1:10.5.13+maria~focal             amd64        debug symbols for mariadb-backup
ii  mariadb-client-10.5-dbgsym      1:10.5.13+maria~focal             amd64        debug symbols for mariadb-client-10.5
ii  mariadb-client-core-10.5-dbgsym 1:10.5.13+maria~focal             amd64        debug symbols for mariadb-client-core-10.5
ii  mariadb-server-10.5-dbgsym      1:10.5.13+maria~focal             amd64        debug symbols for mariadb-server-10.5
ii  mariadb-server-core-10.5-dbgsym 1:10.5.13+maria~focal             amd64        debug symbols for mariadb-server-core-10.5

So, as a first option, see if running without specifying an entrypoint gives a decent backtrace.

Option 2:

$ podman  run --name test -it  --rm  -p 2345:2345 --cap-add CAP_SYS_PTRACE --security-opt seccomp=unconfined  quay.io/mariadb-foundation/mariadb-debug:10.5 gosu mysql gdbserver :2345 mariadbd
Process mariadbd created; pid = 11
Listening on port 2345

And use gdb compiled with this patch https://sourceware.org/pipermail/gdb-patches/2021-August/181718.html to get the debug symbols to connect - https://jira.mariadb.org/browse/MDEV-26727

Option 3:

FROM  quay.io/mariadb-foundation/mariadb-debug:10.5
RUN apt-get update && apt-get install -y gdb

Then, after building this, use docker run (as above) new_image gosu mysql gdb --args mariadbd

Options 2 and 3 both assume that it crashes early, without needing a datadir. If it doesn't, provide /var/lib/mysql as a volume containing a created datadir (even if copied from elsewhere).

@grooverdan
Member

For options 2 and 3: while gdb is waiting to run the server, you can docker exec {container} sh -c "chown -R mysql: && mariadb-install -u mysql" to install the instance (assuming it doesn't crash during the install, which seems likely for an architecture bug).

@stoinov

stoinov commented Oct 7, 2021

  1. Running the container gives the initial one line error.
  2. Was not sure how to do it
  3. Built an image and started it as shown. Got this prompt waiting for input:
Reading symbols from mariadbd...
Reading symbols from /usr/lib/debug/.build-id/27/53d3e03d00ed21daecfe736a4aaf2689218226.debug...
(gdb)

I tried running the chown command inside the docker but got: chown: missing operand after 'mysql:'

Running just docker exec test sh -c "mariadb-install -u mysql" returned mariadb-install: not found

@grooverdan
Member

grooverdan commented Oct 7, 2021

Sorry, I was assuming too much. Let me give a more complete example of a modified option 3, without typos:

Running (with podman; substituting docker should be equivalent). Here we install gdb at the prompt.

$    podman run -ti  --cap-add CAP_SYS_PTRACE --security-opt seccomp=unconfined  quay.io/mariadb-foundation/mariadb-debug:10.5 bash
root@fe15c31b5a31:/# apt-get update && apt-get install -y gdb
Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB] 
...

Change permissions on volume:

root@fe15c31b5a31:/# chown -R mysql: /var/lib/mysql

Install basic datadir:

root@fe15c31b5a31:/# gosu mysql mariadb-install-db
Installing MariaDB/MySQL system tables in '/var/lib/mysql' ...
OK

This may crash for you; if so, just ignore it and go on.

Start mariadbd under gdb:

root@fe15c31b5a31:/# gosu mysql gdb --args  mariadbd
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from mariadbd...
Reading symbols from /usr/lib/debug/.build-id/6e/0a874dca5a7ff831396ddc0785d939a192efe3.debug...
(gdb) 

At the gdb prompt, set a few options, and then enter r and press return to run.

(gdb) set pagination off
(gdb) set print frame-arguments all
(gdb) r
Starting program: /usr/sbin/mariadbd 
[Thread debugging using libthread_db enabled]
...
2021-10-07 22:48:08 0 [Note] Reading of all Master_info entries succeeded
2021-10-07 22:48:08 0 [Note] Added new Master_info '' to hash table
2021-10-07 22:48:08 0 [Note] /usr/sbin/mariadbd: ready for connections.
Version: '10.5.13-MariaDB-1:10.5.13+maria~focal' as '10.5.13-MariaDB-4eb7217ec33fef8d23f2dda0c97b442508c81b1d'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution

I'm assuming it's crashing at startup, but if you need to apply a workload, do this now.

After it gets SIGILL or some stopping signal, enter thread apply all bt full:

Thread 1 "mariadbd" received signal SIGILL, Illegal instruction.
0x00007ffff7664aff in __GI___poll (fds=fds@entry=0x7fffffffe2a0, nfds=nfds@entry=2, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29	../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) thread apply all bt full

Then capture that output and attach it to a new https://jira.mariadb.org/ ticket.

@grooverdan grooverdan reopened this Oct 7, 2021
@stoinov

stoinov commented Oct 8, 2021

Thanks for the detailed breakdown. The "Install basic datadir" step crashes as predicted:

Error log
root@8c72417e2d36:/# gosu mysql mariadb-install-db
Installing MariaDB/MySQL system tables in '/var/lib/mysql' ...
Illegal instruction

Installation of system tables failed!  Examine the logs in
/var/lib/mysql for more information.

The problem could be conflicting information in an external
my.cnf files. You can ignore these by doing:

    shell> /usr/bin/mariadb-install-db --defaults-file=~/.my.cnf

You can also try to start the mysqld daemon with:

    shell> /usr/sbin/mysqld --skip-grant-tables --general-log &

and use the command line tool /usr/bin/mysql
to connect to the mysql database and look at the grant tables:

    shell> /usr/bin/mysql -u root mysql
    mysql> show tables;

Try 'mysqld --help' if you have problems with paths.  Using
--general-log gives you a log in /var/lib/mysql that may be helpful.

The latest information about mysql_install_db is available at
https://mariadb.com/kb/en/installing-system-tables-mysql_install_db
You can find the latest source at https://downloads.mariadb.org and
the maria-discuss email list at https://launchpad.net/~maria-discuss

Please check all of the above before submitting a bug report
at https://mariadb.org/jira
Side note - I also get this exact error message from the linuxserver image. I have a [separate issue](https://github.com/linuxserver/docker-mariadb/issues/59#issuecomment-937348999) with them about it. As there, no logs appear in the mentioned `/var/lib/mysql` folder. I'm not sure whether the two are related, but it seems worth checking too.

Continuing with your steps.

gdb log:
Reading symbols from mariadbd...
Reading symbols from /usr/lib/debug/.build-id/27/53d3e03d00ed21daecfe736a4aaf2689218226.debug...
(gdb) set pagination off
(gdb) set print frame-arguments all
(gdb) r
Starting program: /usr/sbin/mariadbd
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
0x0000007fb7b253a8 in ?? () from /lib/aarch64-linux-gnu/libcrypto.so.1.1
(gdb) thread apply all bt full

Thread 1 (Thread 0x7fb7fef870 (LWP 444)):
#0  0x0000007fb7b253a8 in ?? () from /lib/aarch64-linux-gnu/libcrypto.so.1.1
No symbol table info available.
#1  0x0000007fb7d065f0 in ?? () from /lib/aarch64-linux-gnu/libcrypto.so.1.1
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

This seems rather underwhelming so let me know if it's enough for a jira ticket or we can expand on it somehow to get more data.

@grooverdan
Member

grooverdan commented Oct 8, 2021

Thanks @stoinov. From the above we see that the SIGILL is in /lib/aarch64-linux-gnu/libcrypto.so.1.1. This is the OpenSSL crypto library from Ubuntu (20.04), which we base our image on.

From openssl/openssl#14897 it looks like openssl uses SIGILL to determine features available.

When you get to this step in gdb, enter c at the gdb prompt to continue. The next fault (I'm hoping) is the one I'm more interested in. This will reference some MariaDB code.

Additionally, include the debug symbols for libssl1.1 (note that the first command spans two lines):

root@627b02963823:/# echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse" |  tee  -a /etc/apt/sources.list.d/ddebs.list
root@627b02963823:/# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F2EDC64DC5AEE1F6B9C621F0C8CAB6595FDFF622
root@627b02963823:/# apt-get update && apt-get install -y  libssl1.1-dbgsym

Looks like a bt full will be sufficient for a bug report rather than for all threads.

@stoinov

stoinov commented Oct 9, 2021

So I redid your previous instructions, adding the libssl1.1 step before running mariadbd. After executing the gdb commands, here is what I get now:

MariaDB log
Starting program: /usr/sbin/mariadbd
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
_armv7_tick () at crypto/arm64cpuid.S:20
20      crypto/arm64cpuid.S: No such file or directory.
(gdb) thread apply all bt full

Thread 1 (Thread 0x7fb7fef870 (LWP 1099)):
#0  _armv7_tick () at crypto/arm64cpuid.S:20
No locals.
#1  0x0000007fb7b211ac in OPENSSL_cpuid_setup () at ../crypto/armcap.c:207
        e = <optimized out>
        ill_oact = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0, 20, 0, 1, 0, 545460846593, 0, 545460846593, 548547814656, 549755810976, 548542746740, 548544734704, 548544734704, 1, 549755811624, 549755811640}}, sa_flags = 0, sa_restorer = 0x0}
        ill_act = {__sigaction_handler = {sa_handler = 0x7fb7b254a8 <ill_handler>, sa_sigaction = 0x7fb7b254a8 <ill_handler>}, sa_mask = {__val = {18446744067267099431, 18446744073709551615 <repeats 15 times>}}, sa_flags = 0, sa_restorer = 0x0}
        oset = {__val = {0, 548539255392, 4294967295, 548547788656, 4294967295, 548547814656, 4294967295, 548547801088, 548537830448, 548547801088, 548547788960, 549755811424, 548547727316, 549755811440, 548547727316, 549755811456}}
        trigger = 1
#2  0x0000007fb7fdb83c in ?? () from /lib/ld-linux-aarch64.so.1
No symbol table info available.
#3  0x0000007fb7fdb93c in ?? () from /lib/ld-linux-aarch64.so.1
No symbol table info available.
#4  0x0000007fb7fce144 in ?? () from /lib/ld-linux-aarch64.so.1
No symbol table info available.
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
my_timer_init (mti=0x5556c2c260 <sys_timer_info>) at ./mysys/my_rdtsc.c:393
393     ./mysys/my_rdtsc.c: No such file or directory.
Is this expected? Or should we change something?

@grooverdan
Member

That's good. With this we can focus on how my_timer_init is implemented on ARM64 and how that maps to the capabilities of the Odroid-C2. It may not be the last one, but it's a good start.

@stoinov

stoinov commented Oct 10, 2021

Just a side note that I'm using kernel 3.16.85+ which might be of interest in troubleshooting this. Not sure if I can update to a newer one on my current distro - DietPi.

@grooverdan
Member

I think it's highly unlikely that it's a kernel problem. Line 393 corresponds to the my_timer_cycles inline function. This was changed in MariaDB/server@c76b45a to use the CNTVCT_EL0 register, which I suspect isn't available on the A53. Further documentation checks are welcome.

If you could test a base docker.io/library/mariadb:10.5.4 image to see if it starts, no gdb required, that would confirm it's this code. Alternatively, test the sample test code in https://jira.mariadb.org/browse/MDEV-23249?focusedCommentId=160673&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-160673

@stoinov

stoinov commented Oct 10, 2021

Great catch! 10.5.4 does indeed start up and create files in the folder, but now I have a different error:

mysql_error.log:
211010 22:32:06 [ERROR] mysqld got signal 4 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.5.4-MariaDB-1:10.5.4+maria~focal
key_buffer_size=33554432
read_buffer_size=3145728
max_used_connections=0
max_threads=12
thread_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 119059 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x30)[0x55730b1310]
Printing to addr2line failed
/usr/sbin/mysqld(handle_fatal_signal+0x45c)[0x5572b5c9bc]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7fb6000510]
/usr/sbin/mysqld(crc32c_aarch64+0x4b4)[0x55730c8adc]
/usr/sbin/mysqld(+0xcf68ec)[0x5572f908ec]
/usr/sbin/mysqld(+0xcf7274)[0x5572f91274]
/usr/sbin/mysqld(+0xcf8fa8)[0x5572f92fa8]
/usr/sbin/mysqld(+0xcf9f50)[0x5572f93f50]
/usr/sbin/mysqld(+0xcfb198)[0x5572f95198]
/usr/sbin/mysqld(+0x6090f0)[0x55728a30f0]
/usr/sbin/mysqld(+0xb67cc8)[0x5572e01cc8]
/usr/sbin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x6c)[0x5572b5f7f4]
/usr/sbin/mysqld(+0x7057dc)[0x557299f7dc]
/usr/sbin/mysqld(_Z11plugin_initPiPPci+0x864)[0x55729a08b4]
/usr/sbin/mysqld(+0x63f270)[0x55728d9270]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x40c)[0x55728deadc]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0x7fb568f090]
/usr/sbin/mysqld(+0x639d90)[0x55728d3d90]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        unlimited            unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             unlimited            unlimited            processes 
Max open files            1048576              1048576              files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       7875                 7875                 signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
Core pattern: core

2021-10-10 22:32:11 0 [ERROR] InnoDB: Corrupted page [page id: space=0, page number=0] of datafile './ibdata1' could not be found in the doublewrite buffer.
2021-10-10 22:32:11 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption
2021-10-10 22:32:11 0 [ERROR] Plugin 'InnoDB' init function returned error.
2021-10-10 22:32:11 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2021-10-10 22:32:11 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2021-10-10 22:32:11 0 [ERROR] Aborting
Then it keeps repeating this.

I found that this has already been solved in 10.5.5, which is very unfortunate 😄

I tried running with 10.4.21 but I get:

2021-10-10 22:44:12+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.4.21+maria~focal started.
2021-10-10 22:44:12+00:00 [ERROR] [Entrypoint]: mysqld failed while attempting to check config
	command was: mysqld --verbose --help --log-bin-index=/tmp/tmp.0RiFQm5vEb

@grooverdan
Member

10.4.14+ include the same patch. We can't revert MDEV-23249 as it still would be a bug, so what other counter registers are available?

FYI @tsahee, @mysqlonarm, @dr-m

@stoinov

stoinov commented Oct 11, 2021

I can confirm that 10.4.13 starts correctly:
2021-10-11 0:00:14 0 [Note] mysqld: ready for connections.
Version: '10.4.13-MariaDB-1:10.4.13+maria~focal' socket: '/var/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution
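
The version strings in these logs can be checked against the cut-off stated above (10.4.14+ includes the patch). The following is a small Python sketch under that assumption; the function names are hypothetical, and it deliberately answers only for the 10.4 series, because this thread gives no exact first-patched release for other series.

```python
import re

# First 10.4 release carrying the MDEV-23249 timer change, per the comment above.
FIRST_PATCHED_10_4 = (10, 4, 14)


def parse_version(text):
    """Extract (major, minor, patch) from e.g. '1:10.4.21+maria~focal'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_timer_change_10_4(text):
    """True or False for 10.4.x versions; None for any other series."""
    version = parse_version(text)
    if version[:2] != (10, 4):
        return None
    return version >= FIRST_PATCHED_10_4


if __name__ == "__main__":
    for seen in ["1:10.4.13+maria~focal", "1:10.4.21+maria~focal"]:
        print(seen, has_timer_change_10_4(seen))
```

The two results line up with what was observed on the Odroid-C2: 10.4.13 (without the change) starts, while 10.4.21 (with it) fails the config check.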

@mysqlonarm

@tsahee

tsahee commented Oct 11, 2021

The CNTVCT_EL0 register should be available on the A53, and on any processor implementing Armv8-A:

https://developer.arm.com/documentation/ddi0595/2020-12/AArch64-Registers/CNTVCT-EL0--Counter-timer-Virtual-Count-register?lang=en

A kernel error in 3.16.85 sounds quite plausible (trapping access to that register and handling it wrongly). Tagging @geoffreyblake

@stoinov

stoinov commented Oct 11, 2021

@mysqlonarm I tried installing clang++ on my device OS (not in a container) and could only find clang 11.0.
Trying to compile using clang -std=c++11 -stdlib=libc++ timer.cc returned timer.cc:1:10: fatal error: 'iostream' file not found, which seems to be an issue of using C instead of C++.

I haven't done any compiling before so excuse me if this is something obvious.

@grooverdan
Member

@tsahee , @geoffreyblake, any comments on if this is a kernel error? Any workarounds?

@geoffreyblake

@stoinov @grooverdan , the compiler error above is likely from not having clang installed properly. As for a workaround to the CNTVCT_EL0 register, the best I can supply is writing a small kernel module to check the contents of CNTKCTL_EL1 and look to see if bit 1 is set to 1, and if not, setting it to 1 on all cores. https://developer.arm.com/documentation/ddi0595/2021-03/AArch64-Registers/CNTKCTL-EL1--Counter-timer-Kernel-Control-register

If that bit is set to 0, then access to CNTVCT_EL0 from user space traps into the kernel.
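
Once a raw CNTKCTL_EL1 value is available, the check described above is a single bit test. Here is a minimal Python sketch of it; the function names are hypothetical, and the bit positions are taken from the Arm register description linked above (EL0PCTEN at bit 0, EL0VCTEN at bit 1, EL0VTEN at bit 8, EL0PTEN at bit 9).

```python
# CNTKCTL_EL1 bits that control EL0 (user space) access to the generic timer.
CNTKCTL_EL0_ACCESS_BITS = {
    "EL0PCTEN": 0,  # physical counter, CNTPCT_EL0
    "EL0VCTEN": 1,  # virtual counter, CNTVCT_EL0
    "EL0VTEN": 8,   # virtual timer registers
    "EL0PTEN": 9,   # physical timer registers
}


def decode_cntkctl(value):
    """Map each EL0 access-control field to True (enabled) or False (traps)."""
    return {
        name: bool((value >> bit) & 1)
        for name, bit in CNTKCTL_EL0_ACCESS_BITS.items()
    }


def user_space_can_read_cntvct(value):
    """True if EL0 reads of CNTVCT_EL0 are allowed rather than trapped."""
    return decode_cntkctl(value)["EL0VCTEN"]


if __name__ == "__main__":
    for raw in (0x0, 0x2):
        print(hex(raw), user_space_can_read_cntvct(raw))
```

A value with bit 1 clear on any core would be consistent with the SIGILL seen at my_timer_init, since a trapped access that the kernel does not emulate reaches the process as an illegal instruction.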

@grooverdan
Member

grooverdan commented Nov 23, 2021

Makes sense. @geoffreyblake, given that DietPi seems to use the HardKernel fork for its odroidc2, should that be the place to patch?

Is setting it to 1 here absurdly crude?

@geoffreyblake

The link above points at code for the KVM hypervisor; touching code there will not have any impact on the host OS itself.

You can write a small driver like below to print out the value of the reg by simply insmod'ing it, just have the kernel headers on hand:

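#include <asm/io.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/smp.h>

/* Runs on each CPU: read CNTKCTL_EL1 and print it to the kernel log. */
static void test_each(void *info)
{
  u64 cntkctl_el1;
  int cpu = smp_processor_id();

  /* s3_0_c14_c1_0 is the generic system-register encoding of CNTKCTL_EL1. */
  asm volatile("mrs %0, s3_0_c14_c1_0" : "=r" (cntkctl_el1));
  printk(KERN_INFO "%d: cntkctl_el1=%#llx\n", cpu, cntkctl_el1);
}

static int __init start(void)
{
  /* wait=1: return only after every CPU has run test_each(). */
  on_each_cpu(test_each, NULL, 1);

  return 0;
}

/* Nothing to undo on unload; the module only prints at load time. */
static void __exit end(void)
{}

module_init(start);
module_exit(end);
MODULE_LICENSE("GPL v2");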

Sample Makefile:

obj-m += print-cntkctl_el1.o

BUILD_KERNEL ?= $(shell uname -r)

all:
	make -C /lib/modules/$(BUILD_KERNEL)/build M=$(CURDIR) modules

clean:
	make -C /lib/modules/$(BUILD_KERNEL)/build M=$(CURDIR) clean

@grooverdan
Member

@stoinov are you ok to build and load the module code that @geoffreyblake (many thanks) has provided?

@stoinov

stoinov commented Nov 29, 2021

Sure. I just need detailed instructions on how to do this, as I am not familiar enough with Linux to do it on my own.

@grooverdan
Member

grooverdan commented Nov 29, 2021

Kernel docs.

You should have a gcc compiler and make installed.

In a new directory, put the Makefile as Makefile, and the C source code as print-cntkctl_el1.c.

Download the kernel source. I don't know if DietPi has a source package, but failing that:

git clone --single-branch --branch odroidc2-v3.16.y --depth 10  https://github.com/hardkernel/linux.git

If the path /lib/modules/$(uname -r) exists:

  1. ensure /lib/modules/$(uname -r)/build is a symlink to your source
  2. from the module directory, run make -C /lib/modules/$(uname -r)/build M=$PWD

otherwise:

  1. cd linux
  2. make modules_prepare
  3. In your module directory: make -C /path/to/linux M=$PWD

From your module directory:

make -C {same as before} modules_install
modprobe print-cntkctl_el1

This should have the module loaded. Look at dmesg output to see the printed output.
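
The choice between the two build paths above can be written down as a small helper. This is only a sketch of the decision, assuming the layout described in these steps; the function name module_build_command and its parameters are hypothetical, and it only assembles the make command line without running it.

```python
import os
from pathlib import Path


def module_build_command(module_dir, kernel_release, kernel_src=None,
                         modules_root="/lib/modules"):
    """Return the make argv for building an out-of-tree module.

    Prefers <modules_root>/<kernel_release>/build when it exists (Path.exists
    follows symlinks), otherwise falls back to an explicit kernel source tree.
    """
    build_link = Path(modules_root) / kernel_release / "build"
    if build_link.exists():
        kernel_dir = build_link
    elif kernel_src is not None:
        kernel_dir = Path(kernel_src)
    else:
        raise FileNotFoundError(
            f"{build_link} is missing and no kernel source tree was given"
        )
    return ["make", "-C", str(kernel_dir), f"M={module_dir}"]


if __name__ == "__main__":
    # Uses the running kernel's release string, like $(uname -r) in the steps.
    try:
        print(" ".join(module_build_command(os.getcwd(), os.uname().release)))
    except FileNotFoundError as error:
        print(error)
```

When the fallback tree is used, it still needs make modules_prepare to have been run in it first, as in the second list above.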

@stoinov

stoinov commented Dec 1, 2021

So I ran apt install for make and gcc, getting the latest versions available.
I do have /lib/modules/3.16.85+/, so I cloned the provided repo with the specified branch and then ran ln -s /root/maria/linux/ /lib/modules/3.16.85+/build
Then, in the maria folder, I tried make -C /lib/modules/$(uname -r)/build M=$PWD and got this error:

make: Entering directory '/root/maria/linux'

  ERROR: Kernel configuration is invalid.
         include/generated/autoconf.h or include/config/auto.conf are missing.
         Run 'make oldconfig && make prepare' on kernel src to fix it.


  WARNING: Symbol version dump ./Module.symvers
           is missing; modules will have no dependencies and modversions.

  CC [M]  /root/maria/print-cntkctl_el1.o
In file included from <command-line>:
././include/linux/kconfig.h:4:10: fatal error: generated/autoconf.h: No such file or directory
    4 | #include <generated/autoconf.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: *** [scripts/Makefile.build:264: /root/maria/print-cntkctl_el1.o] Error 1
make: *** [Makefile:1363: _module_/root/maria] Error 2
make: Leaving directory '/root/maria/linux'

Reading up on this error, I saw a wide variety of suggested fixes, the most obvious being apt install --reinstall linux-headers-$(uname -r). After running make again I got:

make: Entering directory '/usr/src/linux-headers-3.16.85+'
  CC [M]  /root/maria/print-cntkctl_el1.o
/root/maria/print-cntkctl_el1.c: In function ‘test_each’:
/root/maria/print-cntkctl_el1.c:8:49: error: ‘actlr2’ undeclared (first use in this function)
    8 |   asm volatile("mrs %0,  s3_0_c14_c1_0" : "=r" (actlr2));
      |                                                 ^~~~~~
/root/maria/print-cntkctl_el1.c:8:49: note: each undeclared identifier is reported only once for each function it appears in
In file included from include/linux/printk.h:5,
                 from include/linux/kernel.h:13,
                 from include/asm-generic/bug.h:13,
                 from arch/arm64/include/generated/asm/bug.h:1,
                 from include/linux/bug.h:4,
                 from include/linux/thread_info.h:11,
                 from include/asm-generic/preempt.h:4,
                 from arch/arm64/include/generated/asm/preempt.h:1,
                 from include/linux/preempt.h:18,
                 from include/linux/spinlock.h:50,
                 from include/linux/mm_types.h:8,
                 from include/asm-generic/pgtable.h:7,
                 from ./arch/arm64/include/asm/pgtable.h:429,
                 from ./arch/arm64/include/asm/io.h:29,
                 from /root/maria/print-cntkctl_el1.c:1:
/root/maria/print-cntkctl_el1.c: At top level:
include/linux/init.h:337:6: warning: ‘init_module’ specifies less restrictive attribute than its target ‘start’: ‘cold’ [-Wmissing-attributes]
  337 |  int init_module(void) __attribute__((alias(#initfn)));
      |      ^~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:23:1: note: in expansion of macro ‘module_init’
   23 | module_init(start);
      | ^~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:12:12: note: ‘init_module’ target declared here
   12 | int __init start(void)
      |            ^~~~~
In file included from include/linux/printk.h:5,
                 from include/linux/kernel.h:13,
                 from include/asm-generic/bug.h:13,
                 from arch/arm64/include/generated/asm/bug.h:1,
                 from include/linux/bug.h:4,
                 from include/linux/thread_info.h:11,
                 from include/asm-generic/preempt.h:4,
                 from arch/arm64/include/generated/asm/preempt.h:1,
                 from include/linux/preempt.h:18,
                 from include/linux/spinlock.h:50,
                 from include/linux/mm_types.h:8,
                 from include/asm-generic/pgtable.h:7,
                 from ./arch/arm64/include/asm/pgtable.h:429,
                 from ./arch/arm64/include/asm/io.h:29,
                 from /root/maria/print-cntkctl_el1.c:1:
include/linux/init.h:343:7: warning: ‘cleanup_module’ specifies less restrictive attribute than its target ‘end’: ‘cold’ [-Wmissing-attributes]
  343 |  void cleanup_module(void) __attribute__((alias(#exitfn)));
      |       ^~~~~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:24:1: note: in expansion of macro ‘module_exit’
   24 | module_exit(end);
      | ^~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:20:13: note: ‘cleanup_module’ target declared here
   20 | void __exit end(void)
      |             ^~~
/root/maria/print-cntkctl_el1.c: In function ‘test_each’:
/root/maria/print-cntkctl_el1.c:8:3: error: invalid lvalue in ‘asm’ output 0
    8 |   asm volatile("mrs %0,  s3_0_c14_c1_0" : "=r" (actlr2));
      |   ^~~
make[1]: *** [scripts/Makefile.build:264: /root/maria/print-cntkctl_el1.o] Error 1
make: *** [Makefile:1363: _module_/root/maria] Error 2
make: Leaving directory '/usr/src/linux-headers-3.16.85+'

@grooverdan
Member

This looks like a @geoffreyblake typo. Replace actlr2 with cntkctl_el1 in the code.

@stoinov

stoinov commented Dec 2, 2021

After the fix, the build succeeded:

make: Entering directory '/usr/src/linux-headers-3.16.85+'
  CC [M]  /root/maria/print-cntkctl_el1.o
In file included from include/linux/printk.h:5,
                 from include/linux/kernel.h:13,
                 from include/asm-generic/bug.h:13,
                 from arch/arm64/include/generated/asm/bug.h:1,
                 from include/linux/bug.h:4,
                 from include/linux/thread_info.h:11,
                 from include/asm-generic/preempt.h:4,
                 from arch/arm64/include/generated/asm/preempt.h:1,
                 from include/linux/preempt.h:18,
                 from include/linux/spinlock.h:50,
                 from include/linux/mm_types.h:8,
                 from include/asm-generic/pgtable.h:7,
                 from ./arch/arm64/include/asm/pgtable.h:429,
                 from ./arch/arm64/include/asm/io.h:29,
                 from /root/maria/print-cntkctl_el1.c:1:
include/linux/init.h:337:6: warning: ‘init_module’ specifies less restrictive attribute than its target ‘start’: ‘cold’ [-Wmissing-attributes]
  337 |  int init_module(void) __attribute__((alias(#initfn)));
      |      ^~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:23:1: note: in expansion of macro ‘module_init’
   23 | module_init(start);
      | ^~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:12:12: note: ‘init_module’ target declared here
   12 | int __init start(void)
      |            ^~~~~
In file included from include/linux/printk.h:5,
                 from include/linux/kernel.h:13,
                 from include/asm-generic/bug.h:13,
                 from arch/arm64/include/generated/asm/bug.h:1,
                 from include/linux/bug.h:4,
                 from include/linux/thread_info.h:11,
                 from include/asm-generic/preempt.h:4,
                 from arch/arm64/include/generated/asm/preempt.h:1,
                 from include/linux/preempt.h:18,
                 from include/linux/spinlock.h:50,
                 from include/linux/mm_types.h:8,
                 from include/asm-generic/pgtable.h:7,
                 from ./arch/arm64/include/asm/pgtable.h:429,
                 from ./arch/arm64/include/asm/io.h:29,
                 from /root/maria/print-cntkctl_el1.c:1:
include/linux/init.h:343:7: warning: ‘cleanup_module’ specifies less restrictive attribute than its target ‘end’: ‘cold’ [-Wmissing-attributes]
  343 |  void cleanup_module(void) __attribute__((alias(#exitfn)));
      |       ^~~~~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:24:1: note: in expansion of macro ‘module_exit’
   24 | module_exit(end);
      | ^~~~~~~~~~~
/root/maria/print-cntkctl_el1.c:20:13: note: ‘cleanup_module’ target declared here
   20 | void __exit end(void)
      |             ^~~
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /root/maria/print-cntkctl_el1.mod.o
  LD [M]  /root/maria/print-cntkctl_el1.ko
make: Leaving directory '/usr/src/linux-headers-3.16.85+'

On the next step I got an error, though:

make -C /lib/modules/3.16.85+/build modules_install
make: Entering directory '/usr/src/linux-headers-3.16.85+'
cp: cannot stat './modules.order': No such file or directory
make: *** [Makefile:1112: _modinst_] Error 1
make: Leaving directory '/usr/src/linux-headers-3.16.85+'

I can see there is a modules.order in the /lib/modules/3.16.85+ folder but not in /usr/src/linux-headers-3.16.85+. The symlink is still in place and resolves properly, but make ends up in this other headers folder.

@geoffreyblake

@stoinov, you have the built module. Since it has no dependencies, you can load it with sudo insmod print-cntkctl_el1.ko; there is no need for modules_install.

@stoinov

stoinov commented Dec 6, 2021

Thanks @geoffreyblake, here's the resulting output from dmesg:

[1031565.459946] 0: cntkclt_el1=0x0
[1031565.459952] 3: cntkclt_el1=0x0
[1031565.459957] 1: cntkclt_el1=0x0
[1031565.460030] 2: cntkclt_el1=0x0
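
For reference, the value printed on each core can be decoded bit by bit. The sketch below is a plain user-space helper, not part of the module. The macro and function names are made up for illustration, and the bit positions are the ones the Arm architecture documents for CNTKCTL_EL1 (bit 0 is EL0PCTEN, bit 1 is EL0VCTEN).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions in CNTKCTL_EL1 (macro names here are illustrative). */
#define CNTKCTL_EL0PCTEN ((uint64_t)1 << 0) /* EL0 may read the physical counter */
#define CNTKCTL_EL0VCTEN ((uint64_t)1 << 1) /* EL0 may read the virtual counter  */

/* Returns 1 if user space (EL0) may read the physical counter, else 0. */
int el0_can_read_physical_counter(uint64_t cntkctl)
{
    return (cntkctl & CNTKCTL_EL0PCTEN) != 0;
}

/* Returns 1 if user space (EL0) may read the virtual counter, else 0. */
int el0_can_read_virtual_counter(uint64_t cntkctl)
{
    return (cntkctl & CNTKCTL_EL0VCTEN) != 0;
}

/* Writes a one-line summary into buf and returns snprintf's result. */
int describe_cntkctl(uint64_t cntkctl, char *buf, size_t len)
{
    return snprintf(buf, len, "cntkctl_el1=0x%llx pct=%d vct=%d",
                    (unsigned long long)cntkctl,
                    el0_can_read_physical_counter(cntkctl),
                    el0_can_read_virtual_counter(cntkctl));
}
```

Passing the 0x0 from the dmesg lines gives pct=0 and vct=0, meaning neither counter is readable from user space on any of the four cores.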

@geoffreyblake

CNTKCTL_EL1 is 0, which would explain the unhandled trap, @stoinov. You can try modifying your kernel, or just modify this driver code to execute the following when it's loaded:

cntkctl_el1 = 0x2;
asm volatile("msr s3_0_c14_c1_0, %0" : : "r" (cntkctl_el1));

At that point, I would assume things will work.
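
A slightly more cautious variant of the two lines above would set only bit 1 and leave the other bits of the register alone. This is a sketch, not tested on the ODroid-C2: the helper names are hypothetical, and it assumes the module would run the function on every core with on_each_cpu(), since the register is per-CPU (the dmesg output shows one line per core).

```c
#ifdef __KERNEL__
#include <linux/types.h>
#include <linux/smp.h>
#else
#include <stdint.h>
#endif

/* Bit 1 of CNTKCTL_EL1: lets EL0 read the virtual counter. */
#define CNTKCTL_EL0VCTEN ((uint64_t)1 << 1)

/* Pure helper: returns the register value with EL0VCTEN set and all
 * other bits unchanged (hypothetical name, for illustration only). */
uint64_t cntkctl_with_el0_vct(uint64_t current)
{
    return current | CNTKCTL_EL0VCTEN;
}

#if defined(__KERNEL__) && defined(__aarch64__)
/* Kernel-side part: read, modify, write back on the current CPU.
 * From the module's init function this could be invoked as
 *     on_each_cpu(enable_el0_vct, NULL, 1);
 */
static void enable_el0_vct(void *unused)
{
    uint64_t v;

    asm volatile("mrs %0, s3_0_c14_c1_0" : "=r" (v));
    v = cntkctl_with_el0_vct(v);
    asm volatile("msr s3_0_c14_c1_0, %0" : : "r" (v));
    asm volatile("isb");
}
#endif
```

Starting from the 0x0 reported above, this produces the same 0x2 as the direct assignment. The difference only matters if some other bit was already set.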

@grooverdan
Member

Thank you very much @geoffreyblake for assisting @stoinov.

grooverdan added a commit to alexfanqi/server that referenced this issue Jan 5, 2022
As reported in MariaDB/mariadb-docker#338
and later hardkernel/linux#423.

While modern kernels support this, it seems older hardware may be
stuck at kernel versions without this initialization.
grooverdan added a commit to MariaDB/server that referenced this issue Jan 5, 2022
As reported in MariaDB/mariadb-docker#338
and later hardkernel/linux#423.

While modern kernels support this, it seems older hardware may be
stuck at kernel versions without this initialization.
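
To check from user space whether a given board and kernel are affected, one option is to probe the counter read and catch the resulting signal. The sketch below is an illustration only and is not taken from the referenced commits. It assumes the crash comes from a user-space read of cntvct_el0 that the kernel reports as SIGILL, and the function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
static sigjmp_buf probe_env;

/* SIGILL handler: jump back into the probe instead of crashing. */
static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}
#endif

/* Returns 1 if user space can read cntvct_el0 (value stored in *value),
 * 0 if the read raised SIGILL, and -1 if the probe could not run
 * (not built for aarch64, or the handler could not be installed). */
int probe_cntvct_el0(uint64_t *value)
{
#if defined(__aarch64__)
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* savemask=1 so the signal mask is restored after siglongjmp. */
    if (sigsetjmp(probe_env, 1) == 0) {
        uint64_t v;
        __asm__ volatile("mrs %0, cntvct_el0" : "=r" (v));
        if (value)
            *value = v;
        ok = 1;
    }

    sigaction(SIGILL, &old, NULL);
    return ok;
#else
    (void)value;
    return -1;
#endif
}
```

A return value of 0 would indicate that this kernel does not let user space read the counter, which matches the Illegal instruction reported at the top of this issue.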
@peiandsky

Set the maximum memory allocated to the database to be less than 8GB

@grooverdan
Member

Set the maximum memory allocated to the database to be less than 8GB

What is this? A request for help? Instructions related to arm64?

See https://github.com/MariaDB/mariadb-docker#getting-help if this is a request for help.
