OpenGL problem on Bookworm #2264

Closed
SebKuzminsky opened this issue Jan 12, 2023 · 30 comments

Comments

@SebKuzminsky
Collaborator

On an up-to-date install of Bookworm, with linuxcnc 2.9 built into a deb and installed, linuxcnc fails to start with this error:

LINUXCNC - 2.9.0~pre1
Machine configuration directory is '/home/seb/linuxcnc/configs/sim.axis'
Machine configuration file is 'axis.ini'
Starting LinuxCNC...
linuxcnc TPMOD=tpmod HOMEMOD=homemod EMCMOT=motmod
Note: Using POSIX non-realtime
Found file(lib): /usr/share/linuxcnc/hallib/core_sim.hal
Found file(lib): /usr/share/linuxcnc/hallib/sim_spindle_encoder.hal
Found file(lib): /usr/share/linuxcnc/hallib/axis_manualtoolchange.hal
Found file(lib): /usr/share/linuxcnc/hallib/simulated_home.hal
Found file(lib): /usr/share/linuxcnc/hallib/check_xyz_constraints.hal
link (updating variable file): No such file or directory
Traceback (most recent call last):
  File "/usr/bin/axis", line 26, in <module>
    from OpenGL.GLUT import *
  File "/usr/lib/python3/dist-packages/OpenGL/GLUT/__init__.py", line 5, in <module>
    from OpenGL.GLUT.fonts import *
  File "/usr/lib/python3/dist-packages/OpenGL/GLUT/fonts.py", line 20, in <module>
    p = platform.getGLUTFontPointer( name )
  File "/usr/lib/python3/dist-packages/OpenGL/platform/baseplatform.py", line 350, in getGLUTFontPointer
    raise NotImplementedError( 
NotImplementedError: Platform does not define a GLUT font retrieval function
Shutting down and cleaning up LinuxCNC...
task: 386 cycles, min=0.000009, max=0.006394, avg=0.001097, 0 latency excursions (> 10x expected cycle time of 0.001000s)
Note: Using POSIX non-realtime
LinuxCNC terminated with an error.  You can find more information in the log:
    /home/seb/linuxcnc_debug.txt
and
    /home/seb/linuxcnc_print.txt
as well as in the output of the shell command 'dmesg' and in the terminal

The issue has been discussed on the forum here, without resolution: https://forum.linuxcnc.org/9-installing-linuxcnc/47468-python-issues-on-bookworm

The issue is reproducible without involving LinuxCNC at all:

$ python3 -c "from OpenGL.GLUT import *"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/OpenGL/GLUT/__init__.py", line 5, in <module>
    from OpenGL.GLUT.fonts import *
  File "/usr/lib/python3/dist-packages/OpenGL/GLUT/fonts.py", line 20, in <module>
    p = platform.getGLUTFontPointer( name )
  File "/usr/lib/python3/dist-packages/OpenGL/platform/baseplatform.py", line 350, in getGLUTFontPointer
    raise NotImplementedError( 
NotImplementedError: Platform does not define a GLUT font retrieval function

From MPI-IS/mesh#49, here's a workaround that gets past the GLUT import error, but linuxcnc still fails shortly thereafter:

$ export PYOPENGL_PLATFORM=osmesa
$ python3 -c "from OpenGL.GLUT import *"
$ linuxcnc
LINUXCNC - 2.9.0~pre1
Machine configuration directory is '/home/seb/linuxcnc/configs/sim.axis'
Machine configuration file is 'axis.ini'
Starting LinuxCNC...
linuxcnc TPMOD=tpmod HOMEMOD=homemod EMCMOT=motmod
Note: Using POSIX non-realtime
Found file(lib): /usr/share/linuxcnc/hallib/core_sim.hal
Found file(lib): /usr/share/linuxcnc/hallib/sim_spindle_encoder.hal
Found file(lib): /usr/share/linuxcnc/hallib/axis_manualtoolchange.hal
Found file(lib): /usr/share/linuxcnc/hallib/simulated_home.hal
Found file(lib): /usr/share/linuxcnc/hallib/check_xyz_constraints.hal
Traceback (most recent call last):
  File "/usr/bin/axis", line 24, in <module>
    from OpenGL.GL import *
  File "/usr/lib/python3/dist-packages/OpenGL/GL/__init__.py", line 4, in <module>
    from OpenGL.GL.VERSION.GL_1_1 import *
  File "/usr/lib/python3/dist-packages/OpenGL/GL/VERSION/GL_1_1.py", line 14, in <module>
    from OpenGL.raw.GL.VERSION.GL_1_1 import *
  File "/usr/lib/python3/dist-packages/OpenGL/raw/GL/VERSION/GL_1_1.py", line 7, in <module>
    from OpenGL.raw.GL import _errors
  File "/usr/lib/python3/dist-packages/OpenGL/raw/GL/_errors.py", line 4, in <module>
    _error_checker = _ErrorChecker( _p, _p.GL.glGetError )
AttributeError: 'NoneType' object has no attribute 'glGetError'
Shutting down and cleaning up LinuxCNC...
task: 322 cycles, min=0.000007, max=0.004341, avg=0.001073, 0 latency excursions (> 10x expected cycle time of 0.001000s)
Note: Using POSIX non-realtime
LinuxCNC terminated with an error.  You can find more information in the log:
    /home/seb/linuxcnc_debug.txt
and
    /home/seb/linuxcnc_print.txt
as well as in the output of the shell command 'dmesg' and in the terminal
@SebKuzminsky
Collaborator Author

SebKuzminsky commented Jan 12, 2023

The error goes away if I switch from the default Gnome session (wayland) to the "GNOME On X11" session type, so that echo $XDG_SESSION_TYPE says x11.

This screenshot is from Buster but it works the same on Bookworm:
[screenshot: gnome-on-x11]

@petterreinholdtsen
Collaborator

petterreinholdtsen commented Jan 12, 2023 via email

@SebKuzminsky
Collaborator Author

I can also work around this problem (when running on Wayland) by setting the environment variable PYOPENGL_PLATFORM to x11 before launching linuxcnc:

$ echo $XDG_SESSION_TYPE
wayland

$ python3 -c 'from OpenGL.GLUT import *'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/OpenGL/GLUT/__init__.py", line 5, in <module>
    from OpenGL.GLUT.fonts import *
  File "/usr/lib/python3/dist-packages/OpenGL/GLUT/fonts.py", line 20, in <module>
    p = platform.getGLUTFontPointer( name )
  File "/usr/lib/python3/dist-packages/OpenGL/platform/baseplatform.py", line 350, in getGLUTFontPointer
    raise NotImplementedError( 
NotImplementedError: Platform does not define a GLUT font retrieval function

$ PYOPENGL_PLATFORM=x11 python3 -c 'from OpenGL.GLUT import *'

$ PYOPENGL_PLATFORM=x11 linuxcnc
LINUXCNC - 2.9.0~pre1
Machine configuration directory is '/home/seb/linuxcnc/configs/sim.axis'
Machine configuration file is 'axis.ini'
Starting LinuxCNC...
linuxcnc TPMOD=tpmod HOMEMOD=homemod EMCMOT=motmod
Note: Using POSIX non-realtime
Found file(lib): /home/seb/linuxcnc-hacking/linuxcnc-dev/lib/hallib/core_sim.hal
Found file(lib): /home/seb/linuxcnc-hacking/linuxcnc-dev/lib/hallib/sim_spindle_encoder.hal
Found file(lib): /home/seb/linuxcnc-hacking/linuxcnc-dev/lib/hallib/axis_manualtoolchange.hal
Found file(lib): /home/seb/linuxcnc-hacking/linuxcnc-dev/lib/hallib/simulated_home.hal
Found file(lib): /home/seb/linuxcnc-hacking/linuxcnc-dev/lib/hallib/check_xyz_constraints.hal
note: MAXV     max: 5.000 units/sec 300.000 units/min
note: LJOG     max: 5.000 units/sec 300.000 units/min
note: LJOG default: 0.250 units/sec 15.000 units/min
note: jog_order='XYZ'
note: jog_invert=set()

What does it all mean?
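
One way to compare platform settings side by side is to script the import check from the session above. The following is a minimal sketch, not part of LinuxCNC: the helper names `import_succeeds` and `probe` are made up for illustration, and the only PyOpenGL behaviour it relies on is the `PYOPENGL_PLATFORM` environment variable already used in this thread.

```python
import os
import subprocess
import sys


def import_succeeds(module, platform=None):
    """Return True if `module` imports cleanly in a fresh interpreter.

    `platform`, if given, is exported as PYOPENGL_PLATFORM for the child
    process only; the parent's environment is left untouched.
    """
    env = dict(os.environ)
    if platform is None:
        env.pop("PYOPENGL_PLATFORM", None)  # let PyOpenGL choose by itself
    else:
        env["PYOPENGL_PLATFORM"] = platform
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def probe(module="OpenGL.GLUT", platforms=(None, "x11", "osmesa")):
    """Map each platform setting to whether the import worked."""
    return {p: import_succeeds(module, p) for p in platforms}
```

Each check runs in its own interpreter because the platform choice is made once at import time, so a failed import cannot be retried with a different setting inside the same process.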

@SebKuzminsky
Collaborator Author

This patch makes linuxcnc's OpenGL stuff work for me on Wayland on Bookworm:

diff --git a/scripts/linuxcnc.in b/scripts/linuxcnc.in
index f8d5f18471..1f2b5aa424 100644
--- a/scripts/linuxcnc.in
+++ b/scripts/linuxcnc.in
@@ -22,6 +22,8 @@ if test "xyes" = "x@RUN_IN_PLACE@"; then
     fi
 fi
 
+export PYOPENGL_PLATFORM=x11
+
 ################################################################################
 # 0. Values that come from configure
 ################################################################################
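
The same idea can be sketched from the Python side, for a script that imports PyOpenGL directly rather than going through the `linuxcnc` launcher. This is an illustration only: `prefer_glx` is a hypothetical helper, and unlike the patch above it leaves an existing user setting alone instead of overwriting it.

```python
import os


def prefer_glx(environ=os.environ):
    """Default PYOPENGL_PLATFORM to "x11" unless the user already chose one.

    PyOpenGL reads this variable when its platform module is first
    imported, so this has to run before any `import OpenGL...` statement.
    """
    environ.setdefault("PYOPENGL_PLATFORM", "x11")
    return environ["PYOPENGL_PLATFORM"]
```

A script would call `prefer_glx()` at the very top, ahead of `from OpenGL.GL import *`.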

@rodw-au
Contributor

rodw-au commented Jan 13, 2023

Nice to know there is a solution to this recent development.
I think as they bury Xorg, they are removing compatibility features.
Does your patch let Gmoccapy work? Please check.
Using Xfce is also a good solution, as it's an Xorg environment.

If the environment is changed in linuxcnc, I'd like to see if my chromebook will run it again.

@rodw-au
Contributor

rodw-au commented Jan 13, 2023

Compiled v2.9 as RIP on my chromebook and set this environment variable:
export PYOPENGL_PLATFORM=x11
The chromebook runs a version of Bullseye with kernel 5.10.
Axis failed with an OpenGL error.
I ran this script, which is deployed with linuxcnc:
~/linuxcnc-dev/lib/python/qtvcp/designer/install_script
and is mentioned in the docs here:
http://linuxcnc.org/docs/2.9/html/plasma/qtplasmac.html#qt-dependency
I then ran qtplasmac, which is a qtvcp config. The program opened but complained about a missing dependency:
Is python3-gst1.0 installed?
That package does not exist, but python3-gst-1.0 does exist in Bullseye, so I installed it.
This resolved the missing dependency.
I ran Axis again and it opened perfectly!

So a request: Can the qtvcp dependencies be added to the list linuxcnc knows about when you run
dpkg-checkbuilddeps
as per the docs here
http://linuxcnc.org/docs/2.9/html/code/building-linuxcnc.html#Satisfying-Build-Dependencies
Surely qtvcp should be considered part of the main line of linuxcnc?

Anyway, it's great I can now use my chromebook to run sims to test stuff. It's been broken (by the same issue, it seems) for over 12 months.

@SebKuzminsky
Collaborator Author

The tip of master works on Bullseye but fails as described above on Bookworm. The important difference seems to be that Bullseye has python3-opengl 3.1.5 (which works), but Bookworm has python3-opengl 3.1.6 (which fails).

3.1.6 went into debian in mid-November, so I expect it's been broken since then.

If I install 3.1.5 from snapshots (http://snapshot.debian.org/binary/python3-opengl/) on Bookworm, Axis runs again. (I had to install it with dpkg -i --force-depends, because python3-opengl 3.1.5 Depends on freeglut3, which in Bookworm has transitioned to libglut3.12).

The important difference between python3-opengl 3.1.5 (which works) and 3.1.6 (which doesn't work) is in the detection and selection of the "platform" it uses. 3.1.5 selects the "GLX" platform, but 3.1.6 selects the "EGL" platform:

3.1.5:

3.1.6:

Just like the error message says, the EGL platform lacks the getGLUTFontPointer function: https://github.com/mcfletch/pyopengl/blob/3e9791ffb4cd4831dae261d6bea3049ce9e78f01/OpenGL/platform/egl.py

Unlike the GLX platform, which has that function: https://github.com/mcfletch/pyopengl/blob/3e9791ffb4cd4831dae261d6bea3049ce9e78f01/OpenGL/platform/glx.py#L97
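
The failure mode can be modelled in a few lines. This is a simplified stand-in, not PyOpenGL's code: the class names and `load_fonts` are hypothetical, and only the method name `getGLUTFontPointer` and the error text are taken from the traceback above.

```python
class BasePlatform:
    """Stand-in for a platform base class: optional hooks raise by default."""

    def getGLUTFontPointer(self, name):
        raise NotImplementedError(
            "Platform does not define a GLUT font retrieval function"
        )


class GLXPlatform(BasePlatform):
    """Overrides the hook, so font lookups succeed."""

    def getGLUTFontPointer(self, name):
        return f"<pointer to {name}>"  # placeholder for a real symbol lookup


class EGLPlatform(BasePlatform):
    """Does not override the hook, so the base class error surfaces."""


def load_fonts(platform, names=("GLUT_BITMAP_8_BY_13",)):
    """Mimic an import-time loop that resolves every font eagerly."""
    return {name: platform.getGLUTFontPointer(name) for name in names}
```

Because the lookup happens while the GLUT module is being imported, selecting a platform without the hook makes the import itself fail, before any drawing code runs.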

After digging around for a bit, it's not totally surprising that this bug made it into pyopengl, and hasn't been noticed or fixed yet -- the pyopengl project is even more starved for developers than LinuxCNC. This is the most recent email on the pyopengl developers' mailing list: https://sourceforge.net/p/pyopengl/mailman/message/37278387/

This all makes me more willing to go with the fix/workaround in #2267 - it just restores the selection of the working GLX platform from 3.1.5.

@petterreinholdtsen
Collaborator

petterreinholdtsen commented Jan 16, 2023 via email

@SebKuzminsky
Collaborator Author

Reported upstream here: mcfletch/pyopengl#89

Reported to Debian here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1029011

@andypugh
Collaborator

https://xkcd.com/2347/

@SebKuzminsky
Collaborator Author

With @swt2c's fix in mcfletch/pyopengl#91 I now get a little further in launching Axis:

$ echo $XDG_SESSION_TYPE
wayland

$ linuxcnc -l
LINUXCNC - 2.10.0~pre0
Machine configuration directory is '/home/seb/linuxcnc/configs/sim.axis'
Machine configuration file is 'axis.ini'
Starting LinuxCNC...
linuxcnc TPMOD=tpmod HOMEMOD=homemod EMCMOT=motmod
Note: Using POSIX non-realtime
Found file(lib): /usr/share/linuxcnc/hallib/core_sim.hal
Found file(lib): /usr/share/linuxcnc/hallib/sim_spindle_encoder.hal
Found file(lib): /usr/share/linuxcnc/hallib/axis_manualtoolchange.hal
Found file(lib): /usr/share/linuxcnc/hallib/simulated_home.hal
Found file(lib): /usr/share/linuxcnc/hallib/check_xyz_constraints.hal
Traceback (most recent call last):
  File "/usr/bin/axis", line 62, in <module>
    from rs274.OpenGLTk import *
  File "/usr/lib/python3/dist-packages/rs274/OpenGLTk.py", line 16, in <module>
    import _togl
ImportError: /usr/lib/python3/dist-packages/_togl.cpython-310-x86_64-linux-gnu.so: undefined symbol: glXDestroyContext
Shutting down and cleaning up LinuxCNC...
task: 634 cycles, min=0.000015, max=0.005903, avg=0.001098, 0 latency excursions (> 10x expected cycle time of 0.001000s)
Note: Using POSIX non-realtime
LinuxCNC terminated with an error.  You can find more information in the log:
    /home/seb/linuxcnc_debug.txt
and
    /home/seb/linuxcnc_print.txt
as well as in the output of the shell command 'dmesg' and in the terminal

@SebKuzminsky
Collaborator Author

A couple of thoughts here, from me who doesn't know the first thing about OpenGL:

  1. It looks like we're now using a mix of EGL and GLX, is that ok? Seems wrong.
  2. We have an old, old fork of the togl source in our repo, we should probably look into rewriting our _toglmodule & related build infrastructure to use debian's packaged libtogl and libtogl-dev instead.

@swt2c

swt2c commented Jan 26, 2023

Sorry to butt in here, but since I'm here... :-)

> A couple of thoughts here, from me who doesn't know the first thing about OpenGL:
>
> 1. It looks like we're now using a mix of EGL and GLX, is that ok? Seems wrong.

Yes, that's probably not going to work. If you want to work natively on Wayland, you're going to have to use EGL. Otherwise, you could force things back to X11 and use GLX.

> 2. We have an old, old fork of the togl source in our repo, we should probably look into rewriting our _toglmodule & related build infrastructure to use debian's packaged libtogl and libtogl-dev instead.

Assuming that _togl.cpython-310-x86_64-linux-gnu.so is your fork of togl, then yes, it appears to be linked with GLX.
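
Whether a compiled module expects GLX entry points can be checked by listing its undefined dynamic symbols with GNU `nm`. The sketch below assumes binutils is installed; the function names are hypothetical, and the parser only handles the plain two-column `U name` lines that `nm -D --undefined-only` prints for undefined symbols.

```python
import subprocess


def undefined_glx_symbols(nm_output):
    """Extract undefined glX* symbol names from `nm -D --undefined-only` text."""
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries have no address: just the type letter and the name.
        if len(fields) == 2 and fields[0] == "U":
            name = fields[1].split("@", 1)[0]  # drop any symbol-version suffix
            if name.startswith("glX"):
                symbols.append(name)
    return sorted(symbols)


def scan_shared_object(path):
    """Run nm on `path` and report the GLX entry points it expects to find."""
    result = subprocess.run(
        ["nm", "-D", "--undefined-only", path],
        check=True, capture_output=True, text=True,
    )
    return undefined_glx_symbols(result.stdout)
```

Calling `scan_shared_object()` on the `_togl` path from the traceback should list `glXDestroyContext` among the results if the module references it the way the ImportError suggests.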

@SebKuzminsky
Collaborator Author

Hi, nice to see you here! Thanks for the pyopengl fix, and for your advice on our OpenGL mess :-)

_togl.cpython-310-x86_64-linux-gnu.so is built from https://github.com/LinuxCNC/linuxcnc/blob/master/src/emc/usr_intf/axis/extensions/_toglmodule.c, which hilariously #includes our fork of togl.c...

@swt2c

swt2c commented Jan 26, 2023

I'm sorry that I looked. ;)

Your togl code would probably need to grow EGL support, if you wanted to go that route. I looked quickly at Debian's togl and it doesn't look much better/newer. togl project seems to be dead as far as I can see.

@SebKuzminsky
Collaborator Author

SebKuzminsky commented Jan 27, 2023

We don't have the expertise or volunteer-hours available to switch our whole world from GLX to EGL currently, so it looks like I should reopen #2267 and advocate for that as our workaround for the near-term future.

Does that sound like the least-worst solution to you, @swt2c?

@petterreinholdtsen
Collaborator

petterreinholdtsen commented Jan 28, 2023 via email

@SebKuzminsky
Collaborator Author

Here's a link-heavy overview of the OpenGL/GLX/EGL landscape on Unix, I found it useful: https://utcc.utoronto.ca/~cks/space/blog/linux/EGLAndGLXAndOpenGL?showcomments#comments

It sounds like we should immediately force LinuxCNC back to running on GLX (like we have been forever), instead of inconsistently trying to run partially on GLX and partially on EGL (like we accidentally started doing back in November).

We should then hope that one of us has the spoons to clean up our OpenGL mess and switch us from GLX to EGL, since that seems to be the way the future is going.

And maybe at the same time switch from OpenGL to OpenGL ES, to run better on tiny ARM machines which sometimes don't implement OpenGL but do implement OpenGL ES.

SebKuzminsky added a commit that referenced this issue Jan 29, 2023
On Bookworm and newer pyopengl detects if it's running on Wayland, and
chooses EGL there.  This is problematic because the rest of LinuxCNC is
still on GLX, and (as i understand it) one should not mix GLX and EGL
within a single application.

See <#2264> for some details.

The correct fix is probably to teach our code to either use GLX everywhere
(for when running on X, and like it does after this commit) or EGL
everywhere (for when running on Wayland).

This commit fixes Axis on Wayland on Bookworm for me.
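
The "one binding everywhere" rule from the commit message can be written down as a small check. This is a sketch under stated assumptions: the function names and the component labels are hypothetical, and the session type is taken to be the value of `XDG_SESSION_TYPE` as shown earlier in this thread.

```python
def backend_for_session(session_type):
    """Pick one windowing-system binding for the whole application."""
    return "egl" if session_type == "wayland" else "glx"


def check_consistent(components):
    """Raise if the given {component: backend} mapping mixes backends."""
    backends = set(components.values())
    if len(backends) > 1:
        detail = ", ".join(f"{k}={v}" for k, v in sorted(components.items()))
        raise RuntimeError(f"mixed OpenGL bindings: {detail}")
    return backends.pop() if backends else None
```

For example, `check_consistent({"pyopengl": "egl", "togl": "glx"})` raises, which is the situation described above, while forcing both entries to `"glx"` passes.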
@swt2c

swt2c commented Jan 29, 2023

> We don't have the expertise or volunteer-hours available to switch our whole world from GLX to EGL currently, so it looks like I should reopen #2267 and advocate for that as our workaround for the near-term future.
>
> Does that sound like the least-worst solution to you, @swt2c?

Yes.

@petterreinholdtsen
Collaborator

petterreinholdtsen commented Jan 30, 2023 via email

@JetForMe
Contributor

I just installed the Jan 23 Debian 12 "testing" build (pretty sure it was dated Jan 23), and am running into these same issues. I'm willing to try to help out with this, but first, how do I properly install pyopengl mentioned here?

@swt2c

swt2c commented Jan 31, 2023

> I just installed the Jan 23 Debian 12 "testing" build (pretty sure it was dated Jan 23), and am running into these same issues. I'm willing to try to help out with this, but first, how do I properly install pyopengl mentioned here?

Just update your Debian testing. That fix is now in testing.

@JetForMe
Contributor

Is that this?

$ dpkg -l | grep -i opengl
…
ii  python3-opengl                          3.1.6+dfsg-2                    all          Python bindings to OpenGL (Python 3)

I have no other updates available.

@swt2c

swt2c commented Jan 31, 2023 via email

@JetForMe
Contributor

Thank you for the confirmation. I'm experiencing a strange behavior where I get the "Platform does not define a GLUT font retrieval function" when logged into the VM console (I'm doing this in a Parallels VM on my Mac), but not when I'm logged in via ssh -Y. The environment is slightly different (e.g. WAYLAND_DISPLAY= is set on the console but not ssh), but I don't know this stuff well enough to understand.
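
To see exactly how the two logins differ, the output of `env` from each session can be saved and compared on the display-related variables only. This is a rough sketch: the helper names and the choice of variables in `DISPLAY_VARS` are assumptions, and `parse_env` does not handle values that span multiple lines.

```python
DISPLAY_VARS = (
    "DISPLAY",
    "WAYLAND_DISPLAY",
    "XDG_SESSION_TYPE",
    "PYOPENGL_PLATFORM",
    "GDK_BACKEND",
)


def parse_env(text):
    """Turn the output of `env` (NAME=value per line) into a dict."""
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {name: value for name, value in pairs}


def display_diff(console_env, ssh_env, names=DISPLAY_VARS):
    """Return {name: (console_value, ssh_value)} for variables that differ."""
    diff = {}
    for name in names:
        a, b = console_env.get(name), ssh_env.get(name)
        if a != b:
            diff[name] = (a, b)
    return diff
```

A variable that is missing on one side shows up as `None` in the tuple, which makes a case like `WAYLAND_DISPLAY` being set on the console but not over ssh easy to spot.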

@NTULINUX
Contributor

NTULINUX commented Jan 31, 2023

PYOPENGL_PLATFORM=x11 linuxcnc
export PYOPENGL_PLATFORM=x11 ; linuxcnc

Nothing is working here for me..

edit: Happens on both LXQt and XFCE.

edit2: Note: Using Gentoo here; no Wayland, only X server.

@JetForMe
Contributor

@NTULINUX Do you have the fix to python3-opengl? That fixed the error this issue is about. But to get it to work all the way I also had to set export GDK_BACKEND=x11.
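
Both variables can be applied together from a small wrapper that starts the program with a modified copy of the environment. This is only a sketch of that idea: `x11_environment` and `launch` are hypothetical names, and it assumes `linuxcnc` is on the `PATH`.

```python
import os
import subprocess


def x11_environment(base=None):
    """Copy `base` (default: the current environment) with X11 forced."""
    env = dict(os.environ if base is None else base)
    env["PYOPENGL_PLATFORM"] = "x11"  # PyOpenGL: use the GLX platform
    env["GDK_BACKEND"] = "x11"        # GTK/GDK: talk to the X server
    return env


def launch(command=("linuxcnc",)):
    """Start `command` with the X11-forcing environment and wait for it."""
    return subprocess.run(list(command), env=x11_environment()).returncode
```

Working on a copy keeps the calling shell's environment unchanged, so other programs started from the same session still pick their own backend.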

@SebKuzminsky
Collaborator Author

This issue is "fixed" for now, by a combination of python3-opengl 3.1.6+dfsg-2 and #2314.
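
For anyone checking which side of the fix a machine is on, the version reports in this thread can be condensed into a small classifier. It is a sketch based only on what was reported here: it assumes version strings shaped like `3.1.6+dfsg-2`, assumes anything older than 3.1.6 behaves like 3.1.5, and makes no claim about later releases. The function name `classify` is hypothetical.

```python
import re


def classify(debian_version):
    """Classify a python3-opengl version string per the reports in this thread.

    Returns "pre-regression", "regression", "fixed", or "unknown".
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:\+dfsg)?-(\d+)", debian_version)
    if not match:
        return "unknown"
    major, minor, patch, revision = map(int, match.groups())
    upstream = (major, minor, patch)
    if upstream < (3, 1, 6):
        return "pre-regression"   # 3.1.5 was reported to select GLX
    if upstream == (3, 1, 6):
        return "fixed" if revision >= 2 else "regression"
    return "unknown"              # later releases are not discussed here
```

The installed version string can be obtained with `dpkg-query -W -f='${Version}' python3-opengl` and passed to `classify()`.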

Many thanks to @swt2c for the pyopengl fix and for lending his expertise here, and thanks to @jepler for the #2314 workaround!

Gird your loins, linuxcnc hackers: we have lots of OpenGL work to do in the near future...

@andypugh
Collaborator

andypugh commented Feb 9, 2023

Loins girded, I am trying to clear my pressing projects to make room.

@jepler
Contributor

jepler commented Feb 9, 2023

Thank you Andy!

8 participants