new virtualgl seems to break steam with bumblebee #16

Closed
bouloumag opened this issue Feb 22, 2016 · 36 comments

Comments

@bouloumag

On Arch Linux, after upgrading lib32-virtualgl and virtualgl to version 2.5, Steam no longer works with Bumblebee.

% optirun steam

Running Steam on arch rolling 64-bit
STEAM_RUNTIME is enabled automatically
Installing breakpad exception handler for appid(steam)/version(1454620878)
[VGL] ERROR: Could not load GLX/OpenGL functions
[2016-02-22 13:50:23] Startup - updater built Feb 4 2016 12:22:08

Note that Steam does work if I start it without optirun (though it then uses the Intel GPU, which is slow).

Downgrading virtualgl to 2.4.1 works around the problem.

@dcommander
Member

Yeah, well, sorry to rant, but Arch Linux is incredibly difficult to support, because it is a constantly moving target, and it ticks me off that apparently they didn't release the beta version to give people a chance to test this stuff before the final VGL 2.5 release. All I can do is try my best to repro this with a more sane distribution.

@dcommander
Member

It doesn't fail in the same way on Ubuntu, but I can confirm that 2.5 + Steam is broken but 2.4.1 + Steam works. Will look into it.

@dcommander
Member

Correction: the failure I was experiencing (segfault) was due to a mismatch between my dev version and the system-installed version (i.e. LD_LIBRARY_PATH wasn't properly picking up my dev build.) Installing the 2.5 packages onto the system cleared up the segfault, and Steam launches. I cannot get past the login screen, because I am not a gamer and do not have a Steam account, but it seems that the failure you're observing is occurring before that point. So I guess I can't reproduce this. Not really much I can do unless this is reproducible with another distro. I've tried several times without luck to get Arch Linux up and running. Again-- moving target. Basically impossible to support from a software manager's point of view. If you can make this fail in the same way using another distro (Ubuntu 10-15, SUSE 11, CentOS 5-7, Fedora 22, Mint 17.2 are all things I can readily test), then I'm happy to look into it further. Otherwise, there isn't much I can do at this point except wait and hope that the problem manifests in some other way that I can reproduce.

@dcommander
Member

This might be related to the introduction of the nVidia GL vendor-neutral dispatch library in Arch. https://bugs.archlinux.org/task/48109. Seems it may not be fully baked, although I am able to install and use the GLVND just fine on CentOS (so Arch's implementation may be what isn't fully baked, not nVidia's.)

@kaseiwang

I have this problem too. My Wine reports "err:wgl:has_opengl glAccum not found in libGL, disabling OpenGL.". I have tried both with and without GLVND. In both cases 2.5 doesn't work, but 2.4.1 works.

@dcommander
Member

@kasei-wang are you also using Arch Linux? This issue is tracking the failure with VGL 2.5 and the Linux version of Steam under Arch Linux. #15 is tracking the failure with VGL 2.5 and WINE under Arch Linux, but the issues may have the same root cause.

@kaseiwang

Yes, I'm using Arch. I also found a friend who is using VGL 2.5 on Arch without this failure. We are trying to find the cause. I may report more information tomorrow or the day after tomorrow (UTC+8).

@dcommander
Member

In reading https://bugs.archlinux.org/task/48109, it seems that it may be related to upgrading both VGL and the nVidia drivers at the same time. Apparently something about the nVidia driver upgrade seems to have broken applications that load OpenGL symbols manually from libGL (as opposed to linking directly with it and letting the dynamic linker handle the symbol loading.) That includes VirtualGL, since that is how we load symbols from libGL as well.

@kaseiwang

I rebuilt VGL 2.5 and Bumblebee. It still doesn't work, so that doesn't seem to be the cause.

@romanlex

I have the same problem on Manjaro Linux :(
virtualgl 2.5-1

err:wgl:has_opengl glAccum not found in libGL, disabling OpenGL.

@dcommander
Member

Confirmed that this is an Arch-specific packaging issue. They appear to have duplicated libvglfaker-nodl.so as libvglfaker.so, so effectively every time you invoke vglrun, it is as if you invoked it with -nodl. That means that applications like WINE and Steam that rely on dlopen()/dlsym() to load OpenGL functions will never work. Please inform the Arch maintainers of this and ask them to fix their [fancy] software.

In the future, I will not be fielding requests for support on Arch, unless the issue can be reproduced on another platform. It took the better part of a day for me to just install it and get it limping along enough to test with it. It is completely non-intuitive for me, and this issue proves that it is rather thrown-together and not well tested.

@dcommander
Member

Temporary workaround:

Run Steam with VGL_GLLIB=/usr/lib32/libGL.so.1 vglrun steam. Normally, VirtualGL attempts to load the OpenGL/GLX symbols using dlsym(RTLD_NEXT, ...), and since libvglfaker.so is linked against libGL (or at least it's supposed to be-- the crux of the issue on Arch is that their version of the faker isn't), VirtualGL should be able to pick up symbols from libGL even if the application itself isn't linking with libGL (that is, if the application is using dlopen()/dlsym() to load libGL symbols.) Specifying VGL_GLLIB causes VirtualGL to directly load symbols from the specified library instead of trying to use RTLD_NEXT.
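A minimal sketch of wrapping this workaround in a script, so that VGL_GLLIB points at whichever libGL actually exists on the machine. The helper name first_existing is hypothetical, and the candidate paths in the usage comment are simply the ones quoted in this thread; they may differ on other systems.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that names an existing file.
# Returns non-zero if none of the candidates exist.
first_existing() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible usage (paths are assumptions taken from this thread):
#   gllib=$(first_existing /usr/lib32/nvidia/libGL.so /usr/lib32/libGL.so.1) \
#       && VGL_GLLIB="$gllib" vglrun steam
```

Picking the path at run time only avoids hard-coding one location; it does not change how VirtualGL uses VGL_GLLIB.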

@bouloumag
Author

For the record:

Starting steam with this command

LD_PRELOAD=/usr/lib32/nvidia/libGL.so __GLVND_DISALLOW_PATCHING=1 optirun steam

seems to work correctly on Arch Linux, with virtualgl 2.5 and Bumblebee (Intel/closed-source NVIDIA driver).

@dcommander
Member

I am not sure what __GLVND_DISALLOW_PATCHING does exactly, but it makes sense that preloading libGL would work around the issue as well, because the root cause is that Arch isn't packaging the VirtualGL fakers correctly. Since they are only including the -nodl faker (twice), what happens is that VirtualGL tries to use dlsym(RTLD_NEXT, ...) to load symbols from libGL, but it can't do that, because the -nodl faker isn't linked with libGL, and neither is the 3D application in this case. The -nodl faker is designed for use only with "simple" OpenGL applications that link directly with libGL (it's mainly included as a means of testing compatibility issues-- there are very few applications that actually require it), so if you use it with an application like Steam that doesn't link directly with libGL, then libGL will never be loaded into the process-- unless you explicitly load it (with LD_PRELOAD) or explicitly tell VirtualGL to load it (with VGL_GLLIB.)
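One way to check an installed faker for this condition is to look for libGL among its dynamic dependencies. The following is a minimal sketch that reads ldd-style output on stdin; the function names links_libgl and report_faker are made up for illustration and are not part of VirtualGL.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd`-style output on stdin and succeeds
# (exit status 0) only if some dependency line names libGL.so.
links_libgl() {
    grep -q 'libGL\.so'
}

# Hypothetical helper: prints a one-line verdict.
# $1 is only a label for the message; the ldd output is read from stdin.
report_faker() {
    if links_libgl; then
        echo "$1: linked against libGL"
    else
        echo "$1: NOT linked against libGL (behaves like the -nodl faker)"
    fi
}

# Possible usage (path is an assumption taken from this thread):
#   ldd /usr/lib32/libvglfaker.so | report_faker libvglfaker.so
```

Note that ldd also lists indirect dependencies, so a stricter check would look only at the library's own NEEDED entries, for example with readelf -d.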

@Nebucatnetzer

I can understand that you find Arch difficult to use.
However, would Antergos perhaps help here a bit?
https://antergos.com/
It's a distro based on Arch which uses all the Arch repos, unlike Manjaro.
Thanks for your work, keep it up :)

@dcommander
Member

I finally got Arch up and running, so that's no longer the problem. But I'm just one person maintaining three rather complicated OSS projects. I cannot support every single distro on the planet. The problems with Arch, from my point of view:

  1. It's a moving target. As you guys discovered, it can work one day and break the next, due to no fault of the upstream developers such as me.
  2. It works like no other distro I've ever encountered. It's completely non-intuitive and "bare-metal" and, in general, made me feel like I was stuck in the 1990's trying to install BSD on a 386.
  3. As far as I'm aware, it is not used by any large-scale VirtualGL deployments. Thus, I have no way of getting paid for debugging issues affecting Arch unless they also affect another platform, or unless the Arch user reporting the problem is willing to pay for the labor (unlikely.)
  4. Downstream commercial products that ship VirtualGL (such as Exceed On Demand) also don't support Arch, so I will get little or no help from other developers in diagnosing the issue unless the issue also affects another platform.

In the future, if a bug is reported in VirtualGL on Arch, the first thing I will do is attempt to diagnose it on another distro of similar vintage, such as the latest Fedora or Ubuntu. Failing that, I will request that the submitter attempt to reproduce the bug using our official binary package (to eliminate downstream packaging bugs such as the one that prompted this issue.) Failing that, I will assume the bug to be Arch-specific and will not devote any significant time to diagnosing or fixing it until/unless it can be shown to affect another platform.

Sorry, guys. This is unfortunately not the first time that I've been sent on a wild goose chase that turned out to be a downstream bug. I'm an independent developer, so time is money.

@Nebucatnetzer

Absolutely understandable, and I'm sure Arch users can live with that solution (lots of DIY minds there).
Checking whether it happens on another platform is already quite nice; you could easily say that you don't care about Arch at all.
I was just pointing out an easier way to install Arch in case you need it at some point :).

@NecroKote

@dcommander I also use Arch and ran into this issue, so I decided to rebuild the package myself to rule out any possibility of a maintainer's mistake. Let me share my results here.
I obtained the PKGBUILD from here,
ran $ makepkg,
and got a libvglfaker.so in /pkg/usr/lib32 that is NOT linked with libGL.so, just like libvglfaker-nodl.so.
Then I tried to build the library by hand, and again got a library that was not linked...

After going through tons of cmake logs step by step, I found that changing line 93 in
server/CMakeLists.txt (original sources)
from if(${fakerlib} STREQUAL ${VGL_FAKER_NAME}) to if(${fakerlib} STREQUAL "vglfaker")
fixes compilation (I got a properly linked libvglfaker.so) and packaging.

I don't know much about all that cmake stuff, but I think this issue could be related to some change in how cmake handles STREQUAL.

PS: sorry for terrible grammar.

@dcommander
Member

Both 32-bit and 64-bit builds work fine for me using either CMake 2.8.x or the latest CMake (3.5) on a non-Arch platform. I don't think it's a CMake issue. I would suggest adding

message(STATUS "VGL_FAKER_NAME = ${VGL_FAKER_NAME}")

right above

if(${fakerlib} STREQUAL ${VGL_FAKER_NAME})

and see what the value of VGL_FAKER_NAME actually is. I don't see anything suspicious in the PKGBUILD file, so I can't imagine why that if() test would be failing, unless perhaps VGL_FAKER_NAME is being set to a non-default value.

@NecroKote

I've added debug output of ${fakerlib} and ${VGL_FAKER_NAME}, and this is what I got:

...
-- Building VirtualGL server components
-- VGL_INCDIR = /usr/share/include
-- VGL_LIBDIR = /usr/lib32
-- Using in-tree version of FLTK
-- fakerlib = vglfaker
-- VGL_FAKER_NAME = vglfaker
-- fakerlib = vglfaker-nodl
-- VGL_FAKER_NAME = vglfaker
-- Configuring done
-- Generating done
...

So everything looks correct, but I still get a .so that is not linked against libGL:

$ ldd pkg/usr/lib32/libvglfaker.so
linux-gate.so.1 (0xf7724000)
libdl.so.2 => /usr/lib32/libdl.so.2 (0xf7641000)
libturbojpeg.so.0 => /usr/lib32/libturbojpeg.so.0 (0xf75ea000)
libXv.so.1 => /usr/lib32/libXv.so.1 (0xf75e4000)
libX11.so.6 => /usr/lib32/libX11.so.6 (0xf7495000)
libXext.so.6 => /usr/lib32/libXext.so.6 (0xf747f000)
libpthread.so.0 => /usr/lib32/libpthread.so.0 (0xf7462000)
libm.so.6 => /usr/lib32/libm.so.6 (0xf740d000)
libc.so.6 => /usr/lib32/libc.so.6 (0xf7258000)
/usr/lib/ld-linux.so.2 (0x5662f000)
libxcb.so.1 => /usr/lib32/libxcb.so.1 (0xf7231000)
libXau.so.6 => /usr/lib32/libXau.so.6 (0xf722c000)
libXdmcp.so.6 => /usr/lib32/libXdmcp.so.6 (0xf7225000)

$ cmake --version
cmake version 3.5.0

@dcommander
Member

Try replacing

if(${fakerlib} STREQUAL ${VGL_FAKER_NAME})

with

if(${fakerlib} STREQUAL "${VGL_FAKER_NAME}")

@NecroKote

I just came to the same conclusion. I tried it, but still no luck
(I removed the build directory before trying, so no cache is involved).

Also, I added a debug message inside that if(), and it showed up correctly.

@dcommander
Member

That doesn't make sense, though. If you're seeing that debug statement, then obviously the if() statement is working correctly, so I don't understand why replacing it with if(${fakerlib} STREQUAL "vglfaker") would have made any difference.

Try make VERBOSE=1 and look at the line where it's linking libvglfaker.so. The PKGBUILD script is overriding OPENGL_gl_LIBRARY with /usr/lib32/libGL.so, which should be OK, but maybe something is fishy with how it's being linked.
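Verbose build logs are long, so it can help to filter out just the link command for the faker before inspecting it. A minimal sketch, assuming the link command carries a -soname option as CMake-generated shared-library link lines typically do; the function name link_line_for is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a verbose build log on stdin and print only the
# command line(s) that link the shared library named in $1, matched by the
# fixed string "-soname,<name> " (so libvglfaker.so does not also match
# libvglfaker-nodl.so).
link_line_for() {
    grep -F -e "-soname,$1 "
}

# Possible usage: list any libGL argument on the faker's link line.
#   make VERBOSE=1 2>&1 | link_line_for libvglfaker.so | tr ' ' '\n' | grep libGL
```

Seeing libGL on the link line only shows that it was passed to the linker; whether the finished library still depends on it is a separate question.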

@NecroKote

Here is the part with linking:

[ 79%] Linking CXX shared library ../lib/libvglfaker.so
cd /home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build/server && /usr/bin/cmake -E cmake_link_script CMakeFiles/vglfaker.dir/link.txt --verbose=1
make[2]: Entering directory '/home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build'
make[2]: Nothing to be done for 'server/CMakeFiles/vgltransut.dir/build'.
make[2]: Leaving directory '/home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build'
[ 80%] Built target x11transut
make -f client/CMakeFiles/vglclient.dir/build.make client/CMakeFiles/vglclient.dir/depend
make[2]: Entering directory '/home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build'
[ 81%] Built target vgltransut
make[2]: Entering directory '/home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build'
cd /home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5 /home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/client /home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build /home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build/client /home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build/client/CMakeFiles/vglclient.dir/DependInfo.cmake --color=
/usr/bin/g++ -m32 -fPIC -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4 -O3 -DNDEBUG -z defs -Wl,--version-script,/home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build/server/faker-mapfile -Wl,-O1,--sort-common,--as-needed,-z,relro -L/home/necrokote/pkg/lib32-virtualgl-manual/src/VirtualGL-2.5/build/staticlib -static-libgcc -shared -Wl,-soname,libvglfaker.so -o ../lib/libvglfaker.so CMakeFiles/vglfaker.dir/ConfigHash.cpp.o CMakeFiles/vglfaker.dir/ContextHash.cpp.o CMakeFiles/vglfaker.dir/DisplayHash.cpp.o CMakeFiles/vglfaker.dir/faker.cpp.o CMakeFiles/vglfaker.dir/faker-gl.cpp.o CMakeFiles/vglfaker.dir/faker-glx.cpp.o CMakeFiles/vglfaker.dir/faker-sym.cpp.o CMakeFiles/vglfaker.dir/faker-x11.cpp.o CMakeFiles/vglfaker.dir/faker-xcb.cpp.o CMakeFiles/vglfaker.dir/XCBConnHash.cpp.o CMakeFiles/vglfaker.dir/fakerconfig.cpp.o CMakeFiles/vglfaker.dir/GlobalCriticalSection.cpp.o CMakeFiles/vglfaker.dir/GLXDrawableHash.cpp.o CMakeFiles/vglfaker.dir/glxvisual.cpp.o CMakeFiles/vglfaker.dir/PixmapHash.cpp.o CMakeFiles/vglfaker.dir/ReverseConfigHash.cpp.o CMakeFiles/vglfaker.dir/TransPlugin.cpp.o CMakeFiles/vglfaker.dir/VirtualDrawable.cpp.o CMakeFiles/vglfaker.dir/VirtualPixmap.cpp.o CMakeFiles/vglfaker.dir/VirtualWin.cpp.o CMakeFiles/vglfaker.dir/VisualHash.cpp.o CMakeFiles/vglfaker.dir/WindowHash.cpp.o CMakeFiles/vglfaker.dir/X11Trans.cpp.o CMakeFiles/vglfaker.dir/vglconfigLauncher.cpp.o CMakeFiles/vglfaker.dir/VGLTrans.cpp.o CMakeFiles/vglfaker.dir/XVTrans.cpp.o ../lib/libvglcommon.a ../lib/libfbx-faker.a ../lib/libfbxv.a ../lib/libvglsocket.a -lm -ldl /usr/lib32/libGL.so.1 -lturbojpeg -lXv -lX11 -lXext ../lib/libvglutil.a -lpthread

@dcommander
Member

Not sure why it's trying to link with /usr/lib32/libGL.so.1. Perhaps CMake is being overly clever and is following the symlink for /usr/lib32/libGL.so (which it shouldn't do!) Perhaps try another debug statement and see what the value of OPENGL_gl_LIBRARY is. Next thing I'd try is removing -DOPENGL_gl_LIBRARY=/usr/lib32/libGL.so from the CMake command line.

@NecroKote

Here it is: an additional debug statement inside the if() section, with -DOPENGL_gl_LIBRARY removed.

-- fakerlib = vglfaker
-- -- would link with libGL ! yay ! OPENGL_gl_LIBRARY = /usr/lib32/libGL.so
-- fakerlib = vglfaker-nodl
-- Configuring done

@dcommander
Member

OK, so does the faker get linked properly when you remove -DOPENGL_gl_LIBRARY from the PKGBUILD file?

@NecroKote

Unfortunately, no. I'm going to clean up my whole system and run the package build again.

@kaseiwang

The libGL dependency of /usr/lib/libvglfaker.so was removed by the --as-needed flag in LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro" in makepkg.conf on Arch Linux. That's why it works when you compile without Arch's makepkg.
I will report this to the maintainer.
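This matches how GNU ld treats --as-needed: a library named on the link line is recorded as a dependency only if it resolves a symbol that the output actually references. Since VirtualGL looks up the OpenGL/GLX functions at run time with dlsym(), as described earlier in this thread, nothing references libGL at link time and the dependency is dropped. A minimal sketch of one possible packaging-side workaround, which filters the flag out of an LDFLAGS-style string before the build; the function name strip_as_needed is hypothetical, and this is not confirmed to be the fix Arch applied.

```shell
#!/bin/sh
# Hypothetical helper: print an LDFLAGS-style string with --as-needed removed.
# The three sed expressions handle, in order: a bare "-Wl,--as-needed" at the
# end of the string, a bare "-Wl,--as-needed" followed by another flag, and
# "--as-needed" inside a longer comma-separated -Wl list.
strip_as_needed() {
    printf '%s\n' "$1" | sed \
        -e 's/-Wl,--as-needed$//' \
        -e 's/-Wl,--as-needed //g' \
        -e 's/,--as-needed//g'
}

# Possible usage before configuring the build:
#   LDFLAGS=$(strip_as_needed "$LDFLAGS")
```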

@danielausparis

Catching up on these multiple developments and findings... thanks to you all! The issue I posted in Arch's bug reporting system is https://bugs.archlinux.org/task/48403?project=5... It has drawn no attention so far, but you might report your findings there. Daniel.

@danielausparis

Apparently there's a new package from Arch: virtualgl/lib32-virtualgl 2.5-2. Unfortunately, I cannot test it at the moment.

@kaseiwang

I have tested virtualgl/lib32-virtualgl 2.5-2, and it works for me.

@svenstaro

We fixed the issue in Arch. A bit less hate from upstream towards our project would be appreciated. We actually try to be very compatible with upstream and very rarely apply downstream patches.

@dcommander
Member

@svenstaro I do this for a living. Maintaining VGL and other related open source projects is my only income source, so time is literally money for me. This issue, which turned out to be your problem and not mine, took approximately 15 hours of labor, for which I was not compensated. I seriously doubt that you spent 15 hours on the same issue, even though it was ultimately your bug. I simply cannot afford to do that kind of pro bono work unless it is ultimately benefiting the upstream project. VirtualGL has numerous large deployments around the world, but most of them are running RHEL, Ubuntu, or SLES. If I can't repro an issue on one of those platforms, then I can't get paid for fixing it. That doesn't mean I'm unwilling to fix issues that affect Arch, but it means that I need you guys to be really certain that the issue is mine and not yours. I frankly don't think you did due diligence on this.

I'm not expressing "hate" toward you or Arch. I'm expressing frustration. I realize that Arch is out on the bleeding edge, so sometimes bugs will surface in Arch before they surface elsewhere. I realize that mistakes will be made, and that's fine. If you prefer Arch, that's fine too. But Arch is a "get-your-hands-dirty" sort of distro, so I expect that its users and maintainers will get their hands dirty. I expect that you will be the first line of support for your downstream package, not me. If an issue is reproducible with my packages (the official VirtualGL binaries), and particularly if it's reproducible on multiple platforms, then I'm happy to get involved.

@dcommander
Member

And note that about 10 of those 15 hours were spent simply figuring out how to install and configure Arch. And this is coming from someone who has nearly 20 years of professional experience with Linux programming and administration, someone who maintains virtual machines with about a dozen different distros. Some of the documented procedures for configuring certain things in Arch (I recall having a lot of trouble configuring a static IP, for instance) simply didn't work as advertised.

@danielausparis

Just came home from abroad and tested the new package, and it works like a charm! I would like to warmly thank all involved: dcommander, of course, who put huge effort into investigating this issue, all the others who helped track it down, and of course svenstaro, who is responsible for the Arch package! Cheers to all, all the best!
