Skip to content

oss-tsukuba/gfarm-hadoop

Repository files navigation

Introduction
============

You can use Hadoop, an open-source MapReduce framework, on Gfarm with this
Hadoop-Gfarm plugin.

Install
=======

1. Installing Gfarm
Please read INSTALL.en or INSTALL.ja contained in Gfarm package.

2. Installing Hadoop
Please read the documents on Hadoop site (http://hadoop.apache.org/core/).

3. Installing Hadoop-Gfarm

3.1 Building Hadoop-Gfarm
To build Hadoop-Gfarm, edit JAVA_HOME, HADOOP_HOME and GFARM_HOME in build.sh and type
the following commands.

$ ./build.sh

3.2 Configuring Hadoop
Now hadoop-gfarm.jar and libGfarmFSNative.so is created and copied to Hadoop's library directory.
hadoop-gfarm.jar is copied to ${HADOOP_HOME}/lib
libGfarmFSNative.so is copied to ${HADOOP_HOME}/lib/native/Linux-amd64-64 and 
${HADOOP_HOME}/lib/native/Linux-i386-32

Finally, please add this configuration to the core-site.xml.

  <property>
    <name>fs.gfarm.impl</name>
    <value>org.apache.hadoop.fs.gfarmfs.GfarmFileSystem</value>
    <description>The FileSystem for gfarm: uris.</description>
  </property>

3.3 Access Gfarm through Hadoop
If installation is done, you can access Gfarm's root directory through Hadoop
like following.

$ ${HADOOP_HOME}/bin/hadoop dfs -ls gfarm:///

Or you can run sample MapReduce programs like this.

$ ${HADOOP_HOME}/bin/hadoop jar hadoop-${VERSION}-examples.jar wordcount gfarm:///input gfarm:///output

3.4 Cofiguring Hadoop-Gfarm
You can configure your working directory by the following property in core-site.xml.

  <property>
    <name>fs.gfarm.workingDir</name>
    <value>/home/${user.name}</value>
    <description>The working directory on gfarm file system.</description>
  </property>

Hadoop Versions
===============

Release 1.0.1 of Gfarm-Hadoop plugin supports the following Hadoop versions:

0.23.X
1.2.X