Install giraph in hadoop node

From Notes_Wiki
Revision as of 15:57, 14 October 2013 by Saurabh

  1. Set up Hadoop on a single CentOS node as explained in Install hadoop in a single Cent-OS node
  2. Create a directory for temporary files, such as /opt/hadoop/tmp, and add the following to the 'conf/core-site.xml' file:
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
    </property>
  3. Edit the conf/mapred-site.xml file and add the following configuration to allow 4 mappers to run in parallel:
    <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
    </property>
    <property>
    <name>mapred.map.tasks</name>
    <value>4</value>
    </property>
  4. Edit conf/hdfs-site.xml and add:
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description></description>
    </property>
    This configures HDFS to keep only one copy of the data, effectively disabling replication.
  5. Format the namenode using ./bin/hadoop namenode -format, only if it has not been formatted already.
  6. Start all services using ./bin/start-all.sh
  7. Install Maven using:
    sudo yum -y install maven
    Verify that the installed version is 3.0 or later using mvn --version
  8. Download the latest stable Giraph release from https://www.apache.org/dyn/closer.cgi/giraph/
  9. Extract the Giraph source into the /opt/hadoop/giraph folder
  10. Make sure the Giraph files are owned by hadoop:hadoop
  11. Edit ~/.bash_profile for the hadoop user and add:
    export GIRAPH_HOME=/opt/hadoop/giraph
  12. Log out of the hadoop user and log in again. Verify that the variable is set using:
    set | grep GIRAPH
  13. Build Giraph using:
    cd $GIRAPH_HOME
    mvn package
    To skip running the tests during the build, use:
    mvn package -DskipTests
  14. If the build succeeds, the folder 'giraph-core/target' should contain a file named 'giraph-<ver>-for-hadoop-<ver>-jar-with-dependencies.jar'. The folder 'giraph-examples/target/' should likewise contain a similarly named jar file for the examples.
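
The check in the last step can be scripted. The following is a minimal sketch, not part of the original steps: the function names find_giraph_jar and check_giraph_build are hypothetical helpers, and the glob pattern assumes that the examples jar also ends in '-jar-with-dependencies.jar', as the similar naming suggests.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Giraph or Hadoop).

# Print the first jar-with-dependencies found under <module dir>/target.
# Returns non-zero if no such jar exists.
find_giraph_jar() {
    # $1: module directory, e.g. "$GIRAPH_HOME/giraph-core"
    for jar in "$1"/target/giraph-*-jar-with-dependencies.jar; do
        # If the glob matches nothing it stays literal, so test for a real file.
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    return 1
}

# Check both modules named in the last step; report each result.
# Returns non-zero if either jar is missing.
check_giraph_build() {
    # $1: Giraph source root; defaults to GIRAPH_HOME set in ~/.bash_profile
    root="${1:-$GIRAPH_HOME}"
    status=0
    for module in giraph-core giraph-examples; do
        if jar=$(find_giraph_jar "$root/$module"); then
            echo "OK: $jar"
        else
            echo "MISSING: no jar-with-dependencies in $root/$module/target"
            status=1
        fi
    done
    return $status
}
```

After sourcing these functions as the hadoop user, running check_giraph_build with no arguments inspects the tree under $GIRAPH_HOME. A different source directory can be passed as the first argument.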

Steps learned from https://giraph.apache.org/quick_start.html

