Skip to content

zyxar/hdfs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Package

Go bindings for libhdfs, for manipulating files on Hadoop distributed file system.

Types

  • hdfs.Fs: file system handle
  • hdfs.File: file handle
  • hdfs.FileInfo: file metadata structure, represented within Go

Methods

see go doc

Usage

Prerequisite

  • JVM
  • HDFS: c bindings for libhdfs, java binary packages
  • HDFS: configured cluster

Tips for building libhdfs on OS X

Based on hadoop-1.0.4.

  1. change <error.h> to <err.h> in src/c++/libhdfs/hdfsJniHelper.c
  2. change md5sum to md5 in src/saveVersion.sh
  3. run chmod +x src/c++/libhdfs/install-sh
  4. run ant -Dcompile.c++=true -Dlibhdfs=true compile-c++-libhdfs to build libhdfs.
  5. Upon successful building, libraries have been installed in build/c++/Mac_OS_X-x86_64-64/lib; Makefile in build/c++-build/Mac_OS_X-x86_64-64/libhdfs is very helpful for later-on re-compilation. it is ok the build ends up with installation errors if you can already find compiled libs in build/c++-build/Mac_OS_X-x86_64-64/libhdfs/.libs or so
  6. change install_name for libhdfs: install_name_tool -id /usr/lib/java/libhdfs.0.0.0.dylib libhdfs.0.0.0.dylib
  7. put libhdfs*.dylib in /usr/lib/java

Tips for building libhadoop on OS X

Based on hadoop-1.0.1; libhadoop would be loaded by util.NativeCodeLoader when accessing local file system.

  1. java: change -ljvm to -framework JavaVM in both Makefile.am and Makefile.in

  2. libz: apply patch to acinclude.m4:

     elif test ! -z "`which otool | grep -v 'no otool'`"; then
         ac_cv_libname_$1=\"`otool -L conftest | grep $1 | sed -e 's/^[  ]*//' -e 's/ .*//' -e 's/.*\/\(.*\)$/\1/'`\";
    

    and configure:

     elif test ! -z "`which otool | grep -v 'no otool'`"; then
         ac_cv_libname_z=\"`otool -L conftest | grep z | sed -e 's/^  *//' -e 's/ .*//' -e 's/.*\/\(.*\)$/\1/'`\";
    
  3. apply patch to source code src/org/apache/hadoop/security/JniBasedUnixGroupsNetgroupMapping.c.

  4. run ant compile-native

  5. put the compiled library libhadoop.1.0.0.dylib and its symbolic links in /usr/lib/java, which is one of the default element of java.library.path.

  6. change install_name for libhadoop: sudo install_name_tool -id /usr/lib/java/libhadoop.1.dylib /usr/lib/java/libhadoop.1.0.0.dylib

Prepare

  1. put .jar from hadoop in .libs/javalibs; conf/ in .libs; see mktest.sh for details, or you can modify it to accommodate your environment.

  2. set LD_LIBRARY_PATH for Linux:

     export LD_LIBRARY_PATH=./lib:/opt/jdk/jre/lib/amd64/server
    

    make sure libhdfs.so and libjvm are declared in LD_LIBRARY_PATH

    You don't have to do this on OS X. You can always use install_name_tool to set or change a library's install name, also jvm on OS X is a system framework, so that it is not necessory to add jvm's path, while the only thing in step 3 is providing hdfs.h header path for #cgo.

  3. correct the #cgo header in hdfs.go, according to your enviornment.

Test

  • After the preparation, correct the constants in hdfs_test.go.
  • run ./mktest.sh.

Known Issues

  1. Currently connecting to local file system is not handled correctly. So Connect("", 0) would lead to error. It is okay now to access to local file system.
  2. errno in libhdfs is not handled precisely. For example, invokeMethod() would probably sets errno to 2 in a lot of routines.

About

go bindings for libhdfs

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published