lustre: dkms build fails on freshly installed CentOS 6 #3996

Closed
moschlar opened this issue Nov 6, 2015 · 8 comments

@moschlar

moschlar commented Nov 6, 2015

I followed the instructions from http://zfsonlinux.org/lustre.html to install ZFS and Lustre on a freshly installed vanilla CentOS 6 machine.

The kernel is 2.6.32-573.7.1.el6.x86_64, spl and zfs are 0.6.5.3, and lustre and lustre-osd-zfs are 2.5.3-1zfs.

ZFS and SPL compiled without problems through dkms, but Lustre seems to be having a problem.
dkms build lustre/2.5.3 says:

[...]
CC:            gcc
LD:            /usr/bin/ld -m elf_x86_64
CPPFLAGS:      -include /var/lib/dkms/lustre/2.5.3/build/config.h -I/var/lib/dkms/lustre/2.5.3/build/libcfs/include -I/var/lib/dkms/lustre/2.5.3/build/lnet/include -I/var/lib/dkms/lustre/2.5.3/build/lustre/include 
LLCPPFLAGS:    -D__arch_lib__ -D_LARGEFILE64_SOURCE=1
CFLAGS:        -g -O2 -Werror
EXTRA_KCFLAGS: -include /var/lib/dkms/lustre/2.5.3/build/config.h  -g -I/var/lib/dkms/lustre/2.5.3/build/libcfs/include -I/var/lib/dkms/lustre/2.5.3/build/lnet/include -I/var/lib/dkms/lustre/2.5.3/build/lustre/include
LLCFLAGS:      -g -Wall -fPIC -D_GNU_SOURCE

Type 'make' to build Lustre.

Building module:
cleaning build area...
make KERNELRELEASE=2.6.32-573.7.1.el6.x86_64................................................................(bad exit status: 2)
Error! Bad return status for module build on kernel: 2.6.32-573.7.1.el6.x86_64 (x86_64)
Consult /var/lib/dkms/lustre/2.5.3/build/make.log for more information.

and make.log ends with

[...]
  CC [M]  /var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.o
cc1: warnings being treated as errors
/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.c: In function ‘udmu_objset_statfs’:
/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.c:280: error: left shift count is negative
/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.c:280: error: duplicate case value
/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.c:280: error: previously used here
/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.c:281: error: left shift count is negative
/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.c:285: error: left shift count is negative
make[6]: *** [/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs/udmu.o] Error 1
make[5]: *** [/var/lib/dkms/lustre/2.5.3/build/lustre/osd-zfs] Error 2
make[4]: *** [/var/lib/dkms/lustre/2.5.3/build/lustre] Error 2
make[3]: *** [_module_/var/lib/dkms/lustre/2.5.3/build] Error 2
make[3]: Leaving directory `/usr/src/kernels/2.6.32-573.7.1.el6.x86_64'
make[2]: *** [modules] Error 2
make[2]: Leaving directory `/var/lib/dkms/lustre/2.5.3/build'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/lib/dkms/lustre/2.5.3/build'
make: *** [all] Error 2

Does anyone have an idea what could be causing these errors?
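
When a DKMS build fails like this, the useful lines are usually a small fraction of make.log. A minimal sketch for pulling out just the compiler diagnostics follows. The function name `show_build_errors` is made up for illustration and is not part of dkms, and the pattern assumes GCC-style `file:line: error:` messages like the ones quoted above.

```shell
# Hypothetical helper (not part of dkms): print only the GCC-style
# "file:line: error: message" lines from a build log.
show_build_errors() {
    # $1 is the path to the log file to scan.
    grep -E '^[^ ]+:[0-9]+: error: ' "$1"
}

# Usage on the machine from this report (commented out so the snippet
# has no side effects when sourced):
#   show_build_errors /var/lib/dkms/lustre/2.5.3/build/make.log
```

This drops the `make[N]: ***` cascade and the `In function` context lines, leaving only the lines that name a source file and line number.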

@torkilt

torkilt commented Nov 24, 2015

I hit the same snag, any idea for a fix or workaround?

@moschlar
Author

From reading https://jira.hpdd.intel.com/browse/LU-6452, I thought that it might work with the older version 0.6.3 of ZFS and SPL.

And I can actually confirm that it works...

But that is crucial information that should be noted on the documentation page.
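
Until the documentation says so, a pre-flight check can flag the mismatch before the build starts. This is only a sketch: the thread reports two data points (0.6.5.3 fails to build with Lustre 2.5.3, 0.6.3 builds), so treating 0.6.3 as the ceiling is an assumption, not a documented compatibility limit. The helper name `version_le` is made up, and it relies on GNU `sort -V`.

```shell
# Hypothetical helper: succeed if version $1 is less than or equal to
# version $2, using GNU sort's version ordering.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage on an RPM-based system (commented out so the snippet has no
# side effects when sourced; 0.6.3 is the version reported to work above):
#   installed=$(rpm -q --qf '%{VERSION}\n' zfs)
#   if ! version_le "$installed" 0.6.3; then
#       echo "zfs $installed is newer than 0.6.3; the Lustre 2.5.3 build may fail" >&2
#   fi
```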

@moschlar
Author

And, of course, I want to use the newest Lustre (2.7.0) with the newest ZFS version (0.6.5.3) on a kernel that's getting security updates.
What is the best strategy for that?

@torkilt

torkilt commented Nov 27, 2015

Lustre with 0.6.3 doesn't work either on my test machines. Everything seems to install and zfs works fine, but lustre does not.

[root@MDS1 local]# service lustre start lustre-MDT0000
Mounting tank/mgsmdt on /mnt/lustre/local/lustre-MDT0000
mount.lustre: mount tank/mgsmdt at /mnt/lustre/local/lustre-MDT0000 failed: No such device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems

I find it odd that it should be this hard to get the community version up and running for a quick test.

@moschlar
Author

Yep, it's a total PITA.

But as for your problem, are the lustre modules loaded? What are lsmod and dmesg saying?
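
A quick way to answer that is to look for the modules in `lsmod` output and for the filesystem types in `/proc/filesystems`, the file the mount error points at. A minimal sketch, assuming the module names of interest are `spl`, `zfs`, `lnet`, and `lustre`; the helper name `module_listed` is made up for illustration.

```shell
# Hypothetical helper: succeed if module $1 appears in lsmod-style
# output read from stdin (first column, header line skipped).
module_listed() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Usage on a live system (commented out so the snippet has no side
# effects when sourced):
#   for m in spl zfs lnet lustre; do
#       if lsmod | module_listed "$m"; then
#           echo "$m: loaded"
#       else
#           echo "$m: not loaded"
#       fi
#   done
#   grep -E 'lustre|zfs' /proc/filesystems
#   dmesg | tail -n 50
```

Matching on the first column exactly avoids false positives from the "Used by" column, where a module name can appear as a dependent of another module.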

@torkilt

torkilt commented Nov 27, 2015

No modules loaded, and no mention of them in dmesg.

I eventually got the latest maintenance release from the Intel site going with SPL/ZFS 0.6.3-1.1 from here, though.

@skull-squadron

Installing Lustre was a PITA at every step, so we didn't use it at Stanford: it was too expensive in time and support to deploy on IB/FC clusters for any sort of reliable setup (we ended up using Panasas and DDN solutions). Alternatives like Ceph were much easier to get going on commodity enterprise and consumer gear. Btw, IB gear that is 2-3 generations old is super cheap and has much lower latency than 10GbE. HTH; this gives you a Plan B if it doesn't work out, but I hope it does work, because a viable, usable cluster fs is better for users.

@behlendorf
Contributor

Closing. Lustre build issues against ZFS are tracked on Lustre's issue tracker.
