Can not load or initialize libmscordaccore.so when debugging a coredump in lldb #42

@mikem8361

@JoeStead commented on Tue May 01 2018

We have a coredump from our application that was built using the .NET Core 2.0.3 SDK, a standalone app. The application was running on CentOS7.

To perform some analysis on the dump, I've exported the file onto an Ubuntu 16.04 VM, with lldb-3.6 installed, and the same .NET Core SDK Version that was used to build the application.

I'm launching lldb like this:

lldb-3.6 /usr/bin/dotnet --core ./mydumpfilehere

The dump is loaded, and I'm able to use the standard lldb commands, so far, so good. Now I want to load the sos plugin, which I do by running:

plugin load /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.3/libsosplugin.so

Now, if I type help, I see the commands from the plugin.

Finally, I run setclrpath /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.3 and, to my knowledge, I should be good to go.

Now, as soon as I use any sos command, I get the following message:

Failed to load data access DLL, 0x80004005
Can not load or initialize libmscordaccore.so. The target runtime may not be initialized.

I thought that perhaps the SDK versions weren't quite matching up, so I built the coreclr repo and used the plugin from there instead, still with no luck.

Finally, I thought I might be able to use the plugin that's actually shipped with our self-contained app; again, that does not work.

The only thing I can think of is that I need to analyse this on a CentOS 7 machine, but my (limited) understanding is that it shouldn't matter?


@JoeStead commented on Tue May 01 2018

Thought I'd give 2.1 a go too. Downloaded the 2.1 preview, built a standalone app for Centos, which just printed hello world and threw an exception. Used (still on Ubuntu 16.04) lldb-3.9 to load the dump, and load the plugin from the 2.1 install, exactly the same issue.


@RussKeldorph commented on Tue May 01 2018

@dotnet/dotnet-diag


@mikem8361 commented on Tue May 01 2018

I think the problem might be creating the dump on Centos but using the Ubuntu runtime/DAC to load it, even though it should be the same runtime unless it is a slightly different build. It does look like you are using 2.0.3 on both Centos and Ubuntu. I'll try creating a dump on Centos and loading it on my Ubuntu 14.04 machine to see what might be the problem.

Another problem you will run into is using lldb 3.6 to load the core dump. SOS doesn't work with core dumps on lldb 3.6; you need 3.9. The problem is that the libsosplugin.so for 2.0.x is built for 3.6 on Centos/Ubuntu. Attached is a 3.9 libsosplugin.so if you want to try that.

For .NET Core 2.1 libsosplugin.so is built for lldb 3.9.
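
A minimal sketch of checking the lldb version requirement described above before loading a core dump. The helper name lldb_version_ok is hypothetical (it is not part of lldb or SOS), and it assumes the caller already has a purely numeric version string such as "3.9.1"; how that string is obtained from a given lldb build is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given lldb version
# string is at least 3.9, the version the thread says SOS needs for core dumps.
# Assumes a numeric dotted version such as "3.9" or "3.9.1".
lldb_version_ok() {
    ver=$1
    major=${ver%%.*}    # text before the first dot
    rest=${ver#*.}      # text after the first dot
    minor=${rest%%.*}   # text before the next dot, if any
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 9 ]; }
}
```

For example, lldb_version_ok 3.6 fails, while lldb_version_ok 3.9.1 succeeds.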

libsosplugin.zip


@JoeStead commented on Wed May 02 2018

Thanks for the plugin

If there is a problem with creating a dump on Centos and loading on Ubuntu, am I right in thinking I could compile lldb on Centos and hopefully everything will work? Something I've been keen to avoid because it takes so long to build, but needs must :)


@mikem8361 commented on Wed May 02 2018

I have repro'ed the problem of creating a core dump on Centos 7 and attempting to load it on Ubuntu (14.04). I'm currently investigating.


@mikem8361 commented on Wed May 02 2018

I found a simple workaround for now: copy /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.3/libcoreclr.so to the directory containing the core dump, which I assume is the current directory. That seems to fix the problem. I'll continue to investigate an actual fix.


@mikem8361 commented on Wed May 02 2018

The libcoreclr.so needs to be in the same directory as the host ("dotnet" or "corerun") used to run the .NET Core program used to create the dump. When you load a core dump lldb needs the host program as part of the --core command line.

lldb-3.9 --core coredumpfile dotnet
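
The workaround above could be sketched as a small staging step that puts the host and libcoreclr.so side by side before lldb is started. The function name stage_host and its argument order are hypothetical, chosen only for illustration; the sketch copies files and does nothing to verify that libcoreclr.so matches the one recorded in the dump.

```shell
#!/bin/sh
# Hypothetical helper: copy the host binary and libcoreclr.so into one
# directory so that libcoreclr.so sits beside the host passed to lldb.
# Arguments: <host binary> <runtime dir containing libcoreclr.so> <target dir>
stage_host() {
    host=$1
    runtime_dir=$2
    target=$3
    # Fail early with a message if either input file is missing.
    [ -f "$host" ] || { echo "host not found: $host" >&2; return 1; }
    [ -f "$runtime_dir/libcoreclr.so" ] || {
        echo "libcoreclr.so not found in $runtime_dir" >&2
        return 1
    }
    mkdir -p "$target" &&
    cp "$host" "$target/" &&
    cp "$runtime_dir/libcoreclr.so" "$target/"
}
```

With the paths mentioned in this thread, a call might look like stage_host /usr/share/dotnet/dotnet /usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.3 ./dump, followed by lldb-3.9 --core coredumpfile ./dump/dotnet.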


@JoeStead commented on Thu May 03 2018

I've not been able to use the workaround you've suggested. Just so I'm not misunderstanding, the steps are:

Use lldb-3.9 (with the 2.0.3 SDK)
Use the version of the plugin you included above
Copy the libcoreclr.so to the same directory as the coredump file
Load the plugin included above
set the clr path?

I'm getting a little confused with the latter steps I think, and it's definitely my fault, this is a whole new world for me


@mikem8361 commented on Thu May 03 2018

You have the correct steps (yes on setclrpath). The only thing that seems to be missing in your above comments is the host ("dotnet" or "corerun") used to run your test program. lldb doesn't seem to find any of the other modules if I don't add the host program to the --core command line.

Here is the script I use to load my core dump on Ubuntu:

BUILD_DIR=$HOME/coreclr/bin/Product/Linux.x64.Debug
DOTNET=/usr/share/dotnet/shared/Microsoft.NETCore.App/2.0.3
HOST=$HOME/save/centosdump/dotnet
$LLDB_PATH -o "plugin load $BUILD_DIR/libsosplugin.so" -o "setclrpath $DOTNET" --core $HOME/save/centosdump/core.43690 $HOST

If you only have 2.0.3 installed then the HOST path should be HOST=/usr/share/dotnet/dotnet

$ ls
core.43690*  dotnet*  loadcore*

Hopefully that is clearer.


@mikem8361 commented on Fri May 04 2018

I investigated the problem some more and it is an lldb problem rather than a plugin/SOS one. The lldb APIs that SOS uses to initialize the DAC do not work unless the actual "libcoreclr.so" file matching the one in the core dump can be loaded by lldb. Hence the requirement that libcoreclr.so be in the directory where the "host" is, which is how lldb finds and loads the module/symbol info for all the modules in the core dump. I've tried all the different symbol and module related settings with no luck. There seems to be no programmatic way to fix this either, but I'll continue to look at this.
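
Given that requirement, a quick pre-flight check before launching lldb might look like the sketch below. The function name host_has_coreclr is hypothetical, and the check only confirms that a libcoreclr.so file is present beside the host; it cannot tell whether that file matches the one in the core dump.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if the host binary exists and a
# libcoreclr.so file sits in the same directory as that host.
# This tests presence only, not that the file matches the dump.
host_has_coreclr() {
    host=$1
    host_dir=$(dirname "$host")
    [ -f "$host" ] && [ -f "$host_dir/libcoreclr.so" ]
}
```

A script could call host_has_coreclr "$HOST" and print a hint about copying libcoreclr.so when the check fails.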


@raffaeler commented on Mon Jun 04 2018

I tried with the 2.1.300 on Ubuntu18 + lldb-3.9 and the problem is still there.
The easiest workaround is to work with a self contained deployment.
But even in this case, certain commands such as clrstack make lldb crash randomly and you have to restart the debugging from the beginning.


@mikem8361 commented on Mon Jun 04 2018

@raffaeler Can you give me some more details? Are you debugging a core dump? Do you have any details on how commands like clrstack make lldb crash randomly (repro steps, etc.)?

I'm currently working on addressing problems that make SOS/lldb unreliable or problematic by increasing the test coverage, etc. in the new diagnostics repo.


@raffaeler commented on Mon Jun 04 2018

It is a console app that throws right after pressing a key from the console.
Instead of throwing manually, I iterate over a string's characters past its length.
Lldb catches the exception and I can use sos commands only because I published a self-contained deployment.
Commands like pe work, but clrstack made lldb crash.
Ubuntu is version 18.04 LTS.


@raffaeler commented on Mon Jun 04 2018

BTW I attach the process instead of running it through corerun.


@raffaeler commented on Mon Jun 04 2018

@mikem8361
I love the fact there is a repo entirely for diagnostics... good luck ;)
Do you want me to open an issue in the new diagnostic repo? It looks more appropriate...


@mikem8361 commented on Tue Jun 05 2018

@raffaeler opening a new issue in the diagnostics repo sounds good, especially since your problem seems to have nothing to do with this original issue.


@raffaeler commented on Tue Jun 05 2018

Done: #22


@eshenem commented on Mon Jun 25 2018

I am getting the same error. I have lldb-3.9 installed on a debian VM (the core dump was generated in a docker container based on alpine). I copied libcoreclr.so to the directory where the core dump file exists and that did not help. Also, since I wasn't sure whether the build was based on .NET Core 2.1.0 or 2.1.1, I installed both and tried both, to no avail. Any ideas?


@raffaeler commented on Mon Jun 25 2018

@eshenem I would suggest you post a comment with the details in the new thread I opened


@mikem8361 commented on Mon Jun 25 2018

If you are loading a core dump on debian that was generated on alpine, you need the alpine runtime bits (libcoreclr.so, etc.) because the runtime installed on debian will not match them. Alpine is one of the distros that does not run the "portable" .NET Core runtime (RID: linux-x64). See
https://www.microsoft.com/net/download/linux.

And it gets more complicated when it comes to using SOS if you are loading this alpine core dump on debian because the libsos.so and some of the runtime modules it depends on (libmscordaccore.so) won't work on debian either.

I would suggest loading this core dump in an alpine docker container, but I've found that the alpine lldb works poorly or not at all, and I'm still working through that. I'm currently working on docs/directions to build an lldb for alpine that works, but it will be a few weeks or so before it is all ready (see diagnostics repo).


@eshenem commented on Tue Jun 26 2018

Thanks @mikem8361 - will keep an eye on diagnostics repo for docs / direction.
