mysqld failed while attempting to check config on arm 64 #338
Interesting, sounds like the
$ docker pull arm64v8/mariadb:latest
$ docker run -it --rm --entrypoint bash arm64v8/mariadb:latest
root@b7403e3d732e:/# mysqld --version
root@b7403e3d732e:/# mysqld --verbose --help > /dev/null
|
Thanks for the fast response! Here is the output. It looks like it's not starting at all. Weird.
|
Bizarre - sounds like something is either wrong with the |
Can you include some hardware information? If you look at the dmesg output, what address did this occur at? Is this the mariadb-10.5.8 version? Writing an upstream bug report on https://jira.mariadb.org would be appreciated. |
Note https://jira.mariadb.org/browse/MDEV-23495: there was a bug in 10.5 that is fixed from 10.5.7 onward. If your version is a 10.5 release older than 10.5.7, try an update. |
Hi,
@jmburges, for comparison, here is the information requested by @grooverdan.
❯ LD_SHOW_AUXV=1 /bin/true
AT_SYSINFO_EHDR: 0xffff8b780000
AT_??? (0x33): 0x1270
AT_HWCAP: 917fff
AT_PAGESZ: 65536
AT_CLKTCK: 100
AT_PHDR: 0xaaaac7110040
AT_PHENT: 56
AT_PHNUM: 9
AT_BASE: 0xffff8b790000
AT_FLAGS: 0x0
AT_ENTRY: 0xaaaac71116e0
AT_UID: 0
AT_EUID: 0
AT_GID: 0
AT_EGID: 0
AT_SECURE: 0
AT_RANDOM: 0xffffdc145838
AT_EXECFN: /bin/true
AT_PLATFORM: aarch64
❯ lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): 1
Vendor ID: 0x48
Model: 0
Stepping: 0x1
CPU max MHz: 2400.0000
CPU min MHz: 2400.0000
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 32768K
NUMA node0 CPU(s): 0-7
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
❯ LD_SHOW_AUXV=1 /bin/true
AT_SYSINFO_EHDR: 0xffff93227000
AT_??? (0x33): 0x1270
AT_HWCAP: 887
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR: 0xaaaaac030040
AT_PHENT: 56
AT_PHNUM: 9
AT_BASE: 0xffff931f7000
AT_FLAGS: 0x0
AT_ENTRY: 0xaaaaac031710
AT_UID: 1000
AT_EUID: 1000
AT_GID: 1000
AT_EGID: 1000
AT_SECURE: 0
AT_RANDOM: 0xffffdbf07d98
AT_HWCAP2: 0x0
AT_EXECFN: /bin/true
AT_PLATFORM: aarch64
❯ lscpu
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: ARM
Model: 3
Model name: Cortex-A72
Stepping: r0p3
CPU max MHz: 1500.0000
CPU min MHz: 600.0000
BogoMIPS: 108.00
NUMA node0 CPU(s): 0-3
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Vulnerable
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm crc32 cpuid
@jmburges you could also try to build the container and use it directly.
git clone https://github.com/docker-library/mariadb && cd mariadb/10.5
docker build . -t mariadb-test
docker run -it -e MYSQL_ROOT_PASSWORD=my-secret-pw mariadb-test:latest |
@jmburges, we'd really like to solve this on your hardware. Can you please include the hardware information? Also, can you run the following in a container:
This will help us find where in the codebase the problem could be. |
Cannot proceed without information. |
I can confirm the same issue with an Odroid-C2
I also tried building from source and got the same error when running the custom image. In order to run |
@stoinov I've created some prebuilt images with debug symbols installed - https://quay.io/repository/mariadb-foundation/mariadb-debug?tab=tags Use, for example, quay.io/mariadb-foundation/mariadb-debug:10.5 as the image name. Can you try one of those? |
I've used the image you provided and I still get the same error:
When I enter the container with |
Debug symbols are already installed:
So first option, see if running without specifying an entrypoint gives a decent backtrace. Option 2:
And use gdb compiled with this patch https://sourceware.org/pipermail/gdb-patches/2021-August/181718.html to get the debug symbols to connect - https://jira.mariadb.org/browse/MDEV-26727 Option 3:
Then, after building this, use
Both of the above options assume that it crashes early without a datadir. If it doesn't, provide /var/lib/mysql as a volume containing a created datadir (even if copied from elsewhere). |
option 2,3; while gdb is waiting to run the server, you can |
I tried running the chown command inside the docker but got: Running just |
Sorry, I was assuming too much. Let me give a more complete example of a modified option 3, without typos. Running (with podman; substituting docker should be equivalent). Here we install gdb at the prompt.
Change permissions on volume:
Install basic datadir:
This may crash for you; if so, just ignore it and go on. Start mariadbd under gdb:
On the
I'm assuming it's crashing at startup, but if you need to apply a workload, do this now. After it gets
Then capture that output and attach it to a new https://jira.mariadb.org/ ticket. |
Thanks for the detailed breakdown. The install basic datadir step crashes as predicted. Error log:
Continuing with your steps. gdb log:
This seems rather underwhelming so let me know if it's enough for a jira ticket or we can expand on it somehow to get more data. |
Thanks @stoinov. From the above we see that the SIGILL is in /lib/aarch64-linux-gnu/libcrypto.so.1.1. This is the OpenSSL crypto library from Ubuntu (20.04), which we base our image on. From openssl/openssl#14897 it looks like OpenSSL uses SIGILL to determine which features are available. When you get to this step in gdb, enter
Additionally, include the debug symbols from libssl1.1 (note the command is two lines)
Looks like a |
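For readers unfamiliar with the SIGILL-based feature probing mentioned above, here is a minimal sketch of the general technique in C. It is not OpenSSL's actual code: `probe_instruction`, `demo_supported`, and `demo_unsupported` are hypothetical names, and the "unsupported instruction" is simulated with `raise(SIGILL)` so that the sketch runs on any POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used to escape from the SIGILL handler. */
static sigjmp_buf probe_env;

static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);   /* abandon the faulting instruction */
}

/* Returns 1 if try_fn() ran to completion, 0 if it raised SIGILL. */
int probe_instruction(void (*try_fn)(void))
{
    struct sigaction sa, old;
    volatile int supported = 0; /* volatile: read after siglongjmp */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    /* savemask=1 so the signal mask is restored when the handler jumps back. */
    if (sigsetjmp(probe_env, 1) == 0) {
        try_fn();
        supported = 1;
    }

    sigaction(SIGILL, &old, NULL); /* restore the previous handler */
    return supported;
}

/* Hypothetical stand-ins for "instruction exists" and "instruction traps". */
void demo_supported(void) { }
void demo_unsupported(void) { raise(SIGILL); }
```

A library probing this way handles the signal itself, but a debugger will typically still stop on it, which is why a SIGILL inside libcrypto during startup is not necessarily the crash being hunted.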
So I redid your previous instructions while adding the libssl1.1 step before running mariadbd. After executing the gdb commands, here is what I get now: MariaDB log
|
That's good. With this we can focus on how |
Just a side note that I'm using kernel 3.16.85+ which might be of interest in troubleshooting this. Not sure if I can update to a newer one on my current distro - DietPi. |
I think it's highly unlikely that it's the kernel. Problem: line 393 corresponds to the my_timer_cycles inline. This was changed in MariaDB/server@c76b45a to use the CNTVCT_EL0 register, which I suspect isn't on the A53. Further documentation checks welcome. If you could test a base docker.io/library/mariadb-10.5.4 to see if this starts (no gdb required), that would confirm it's this code. Alternatively, test the sample test code in https://jira.mariadb.org/browse/MDEV-23249?focusedCommentId=160673&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-160673 |
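As a rough illustration of the kind of register read being discussed, here is a minimal user-space sketch. It is neither MariaDB's my_timer_cycles nor the sample code from the Jira comment: `read_virtual_counter` is a hypothetical name, and the non-aarch64 fallback is an assumption added only so the sketch compiles on other machines.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read a monotonically increasing tick counter.
 * On aarch64 this reads CNTVCT_EL0 directly from user space; if the kernel
 * has not enabled EL0 access to it, the read traps instead of returning. */
uint64_t read_virtual_counter(void)
{
#if defined(__aarch64__)
    uint64_t ticks;
    __asm__ volatile("mrs %0, cntvct_el0" : "=r"(ticks));
    return ticks;
#else
    /* Assumed fallback for other architectures: nanoseconds since boot. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
#endif
}
```

Calling `read_virtual_counter()` from a small `main` on the affected board should either print a value or die with an illegal-instruction signal, which separates a register-access problem from anything MariaDB-specific.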
Great catch! 10.5.4 does indeed start up and create files in the folder, but now I have a different error: mysql_error.log:
I found that this has already been solved in 10.5.5, which is very unfortunate 😄 I tried running with 10.4.21 but I get:
|
10.4.14+ include the same patch. We can't revert MDEV-23249 as it still would be a bug, so what other counter registers are available? FYI @tsahee, @mysqlonarm, @dr-m |
I can confirm that 10.4.13 starts correctly: |
@stoinov given the different hardware, can you try the following code/example on it? |
The CNTVCT_EL0 register should be available on the A53, and on any processor implementing Armv8-A: A kernel error in 3.16.85 sounds quite plausible (trapping access to that register and handling it wrong). Tagging |
@mysqlonarm I tried installing clang++ on my device OS (not the container) and I could only find
I haven't done any compiling before, so excuse me if this is something obvious. |
@tsahee , @geoffreyblake, any comments on if this is a kernel error? Any workarounds? |
@stoinov @grooverdan , the compiler error above is likely from not having clang installed properly. As for a workaround to the CNTVCT_EL0 register, the best I can supply is writing a small kernel module to check the contents of CNTKCTL_EL1 and look to see if bit 1 is set to 1, and if not, setting it to 1 on all cores. https://developer.arm.com/documentation/ddi0595/2021-03/AArch64-Registers/CNTKCTL-EL1--Counter-timer-Kernel-Control-register If that bit is set to 0, then access to CNTVCT_EL0 from user space traps into the kernel. |
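To make the bit check concrete, here is a small sketch that decodes a CNTKCTL_EL1 value as it would be printed by a kernel module. The helper names are hypothetical; the bit positions follow the Arm register description linked above (bit 0 is EL0PCTEN, bit 1 is EL0VCTEN).

```c
#include <stdbool.h>
#include <stdint.h>

/* CNTKCTL_EL1 bits that gate user-space (EL0) access to the counters. */
#define CNTKCTL_EL0PCTEN (UINT64_C(1) << 0) /* physical counter, CNTPCT_EL0 */
#define CNTKCTL_EL0VCTEN (UINT64_C(1) << 1) /* virtual counter, CNTVCT_EL0  */

/* True if reading CNTVCT_EL0 from user space will not trap. */
bool el0_can_read_cntvct(uint64_t cntkctl_el1)
{
    return (cntkctl_el1 & CNTKCTL_EL0VCTEN) != 0;
}

/* True if reading CNTPCT_EL0 from user space will not trap. */
bool el0_can_read_cntpct(uint64_t cntkctl_el1)
{
    return (cntkctl_el1 & CNTKCTL_EL0PCTEN) != 0;
}
```

A register value of 0 therefore means both counters trap when read from user space.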
Makes sense. @geoffreyblake given the DietPi seems to use the HardKernel fork for its odroidc2 should that be the place to patch? Is setting to 1 here absurdly crude? |
The link above is looking at code for the KVM hypervisor; touching code there will not have any impact on the host OS itself. You can write a small driver like the one below to print out the value of the register by simply insmod'ing it. Just have the kernel headers on hand:
#include <asm/io.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/smp.h>

/* Runs on every CPU: read CNTKCTL_EL1 (encoded as s3_0_c14_c1_0) and log it. */
static void test_each(void *info)
{
	u64 cntkctl_el1;
	int cpu = smp_processor_id();

	/* The output operand must be the variable declared above. */
	asm volatile("mrs %0, s3_0_c14_c1_0" : "=r" (cntkctl_el1));
	printk(KERN_INFO "%d: cntkctl_el1=%#llx\n", cpu, cntkctl_el1);
}

static int __init start(void)
{
	/* Wait (last argument 1) until every CPU has printed its value. */
	on_each_cpu(test_each, NULL, 1);
	return 0;
}

static void __exit end(void)
{}

module_init(start);
module_exit(end);
MODULE_LICENSE("GPL v2");
Sample Makefile:
|
@stoinov are you ok to build and load the module code that @geoffreyblake (many thanks) has provided? |
Sure. I just need detailed instructions how to do this as I am not familiar with Linux to do it on my own. |
You should have a gcc compiler and make installed. In a new directory, put the Makefile as Change
If the path exists "/lib/modules/
otherwise:
From your module directory:
This should have the module loaded. Look at |
so I did
reading up on this error, I saw a lot of variability in fixes, the most obvious being
|
This looks like a @geoffreyblake typo. Replace |
after the fix I got success:
On the next step I got an error tho:
I can see there is a |
@stoinov , you have the built module, since it has no dependencies, you can do: |
Thanks @geoffreyblake, here's the resulting output from
|
CNTKCTL_EL1 is 0, which explains the unhandled trap, @stoinov. You can try to modify your kernel, or just modify this driver code to execute the following when it's loaded:
At that point, I would assume things will work. |
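A minimal sketch of what such a load-time change could look like, offered as an assumption rather than the code originally posted. `cntkctl_enable_el0_vct`, `enable_each`, `enable_start`, and `enable_end` are hypothetical names. The module part is compiled only in a kernel build; the bit manipulation is kept in a plain helper so it can be checked in user space.

```c
#ifdef __KERNEL__
#include <linux/init.h>
#include <linux/module.h>
#include <linux/smp.h>
#include <linux/types.h>
typedef u64 reg_t;
#else
#include <stdint.h>
typedef uint64_t reg_t;
#endif

/* Bit 1 of CNTKCTL_EL1 (EL0VCTEN) lets user space read CNTVCT_EL0. */
#define CNTKCTL_EL0VCTEN ((reg_t)1 << 1)

/* Return the register value with EL0 virtual-counter access enabled,
 * leaving every other bit untouched. */
static inline reg_t cntkctl_enable_el0_vct(reg_t value)
{
    return value | CNTKCTL_EL0VCTEN;
}

#ifdef __KERNEL__
/* Runs on each CPU: read-modify-write CNTKCTL_EL1 (s3_0_c14_c1_0). */
static void enable_each(void *info)
{
    u64 v;

    asm volatile("mrs %0, s3_0_c14_c1_0" : "=r" (v));
    v = cntkctl_enable_el0_vct(v);
    asm volatile("msr s3_0_c14_c1_0, %0" : : "r" (v));
    asm volatile("isb" : : : "memory"); /* make the new setting take effect */
}

static int __init enable_start(void)
{
    on_each_cpu(enable_each, NULL, 1);
    return 0;
}

static void __exit enable_end(void)
{}

module_init(enable_start);
module_exit(enable_end);
MODULE_LICENSE("GPL v2");
#endif
```

Note that a change made this way does not survive a reboot, and the kernel may rewrite the register later (for example when a CPU is brought back online), so a fix in the kernel itself is the more durable option.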
Thank you very much @geoffreyblake for assisting @stoinov. |
As reported in MariaDB/mariadb-docker#338 and later hardkernel/linux#423. While modern kernels support this, it seems older hardware may be stuck at kernel versions without this initialization.
Set the maximum memory allocated to the database to be less than 8GB |
What is this? A request for help? Instructions related to arm64? See https://github.com/MariaDB/mariadb-docker#getting-help if this is a request for help. |
Hello,
I ran
sudo docker run --name some-mariadb -e MYSQL_ROOT_PASSWORD=my-secret-pw arm64v8/mariadb:latest
to test out mariadb on my ODroid-C2 and received the following error:
It's under sudo, so I don't think there should be any permissions problems. Any ideas?