OpenMPI 1.8.3 crashing in hwloc 1.7.2 on ChromeOS #339

Closed
Moldoteck opened this issue Dec 8, 2018 · 2 comments

Comments

@Moldoteck

lstopo 1.11.5

un hwloc (no description available)
ii hwloc-nox 1.11.5-1 amd64 Hierarchical view of the machine - non-X version of u
un libhwloc-contrib-plugins (no description available)
ii libhwloc-dev:amd64 1.11.5-1 amd64 Hierarchical view of the machine - static libs and he
ii libhwloc-plugins 1.11.5-1 amd64 Hierarchical view of the machine - plugins
un libhwloc0 (no description available)
un libhwloc1 (no description available)
un libhwloc2 (no description available)
un libhwloc3 (no description available)
un libhwloc4 (no description available)
ii libhwloc5:amd64 1.11.5-1 amd64 Hierarchical view of the machine - shared libs

Which operating system and hardware are you running on?

Chrome OS, Linux penguin 4.14.67-07156-gc116f2c8c400 #1 SMP PREEMPT Sun Sep 9 14:28:13 PDT 2018 x86_64 GNU/Linux

  • On Unix-like systems, run uname -a so that we know which operating system, distribution, and kernel version you are using.

  • Post the output of lstopo - if it works
    Machine (1413MB) + Package L#0 + L2 L#0 (1024KB)
    L1d L#0 (24KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
    L1d L#1 (24KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)

  • Describe the machine and the processors it contains, as well as any memory and/or peripherals that matter to your issue
    Lenovo Chromebook N22-20, Intel celeron n3050, 2GB ram

Details of the problem

  • What happened?

I am running the mpirun command and it crashes. The root of the problem seems to be in hwloc. As far as I can see, somebody has already reported a problem like this: sebhtml/ray#235 (comment)

  • How did you start your process?
    mpirun -np 1 a.out
    mpirun --help

Additional information

sebhtml/ray#235 (comment)
If your issue consists in a wrong topology detection, we also need the following for debugging remotely:

Machine (P#0 local=1446596KB total=1446596KB Backend=Linux LinuxCgroup=/ hwlocVersion=1.11.5 ProcessName=lstopo-no-graph$
Package L#0 (P#0 CPUModel=06/4c)
L2Cache L#0 (size=1024KB linesize=64 ways=16)
L1dCache L#0 (size=24KB linesize=64 ways=6)
L1iCache L#0 (size=32KB linesize=64 ways=8)
Core L#0 (P#0)
PU L#0 (P#0)
L1dCache L#1 (size=24KB linesize=64 ways=6)
L1iCache L#1 (size=32KB linesize=64 ways=8)
Core L#1 (P#0)
PU L#1 (P#1)
depth 0: 1 Machine (type #1)
depth 1: 1 Package (type #3)
depth 2: 1 L2Cache (type #4)
depth 3: 2 L1dCache (type #4)
depth 4: 2 L1iCache (type #4)
depth 5: 2 Core (type #5)
depth 6: 2 PU (type #6)
Topology not from this system

@bgoglin
Contributor

bgoglin commented Dec 8, 2018

Your link seems to say you're using OpenMPI 1.8.3, which is very old and seems to contain hwloc 1.7.2. We have fixed so many bugs since that release in 2013 that I don't know where to start.

You should:

  1. Build your own hwloc 1.7.2 and check whether lstopo works there. If it doesn't work, then the bug is already fixed in 1.11.5. If it works, we'd need to figure out why it fails in OpenMPI and not in lstopo.
  2. Rebuild OpenMPI using your hwloc 1.11.5 instead of its own embedded 1.7.2 (--with-hwloc=/path/to/your/hwloc/1.11.5/install/directory).
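
The two steps above could look roughly like the sketch below. It is a dry run: it only prints the commands so they can be reviewed before running anything. The source directories, install prefixes, and helper function names are assumptions made up for illustration; only the --with-hwloc option comes from the comment above.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for both steps instead of executing them.
# All paths and function names here are hypothetical; adjust them to your setup.

# Step 1: build a standalone hwloc 1.7.2 into its own prefix, then run its lstopo.
# $1 = unpacked hwloc source directory, $2 = install prefix
hwloc_test_cmds() {
    printf 'cd %s && ./configure --prefix=%s && make && make install\n' "$1" "$2"
    printf '%s/bin/lstopo\n' "$2"
}

# Step 2: rebuild OpenMPI against an external hwloc 1.11.5 install.
# $1 = unpacked OpenMPI source directory, $2 = hwloc install prefix,
# $3 = OpenMPI install prefix
ompi_rebuild_cmds() {
    printf 'cd %s && ./configure --with-hwloc=%s --prefix=%s && make && make install\n' "$1" "$2" "$3"
}

hwloc_test_cmds "$HOME/src/hwloc-1.7.2" "$HOME/opt/hwloc-1.7.2"
ompi_rebuild_cmds "$HOME/src/openmpi-1.8.3" "$HOME/opt/hwloc-1.11.5" "$HOME/opt/openmpi-1.8.3"
```

If the printed commands look right, they can be run by hand. After step 2, rerunning `mpirun -np 1 a.out` with the rebuilt OpenMPI should show whether the crash is still there.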

@bgoglin bgoglin changed the title Problem causing mpirun crash OpenMPI 1.8.3 crashing in hwloc 1.7.2 on ChromeOS Dec 8, 2018
@bgoglin

bgoglin commented Oct 4, 2019

Closing this old bug. If the same problem occurs with a recent hwloc, please reopen.

@bgoglin bgoglin closed this as completed Oct 4, 2019