# COMP5349A1

Analyse a big dataset with Hadoop MapReduce.

Refer to assignment1_handout.pdf for the detailed requirements.

## How to run

### Requirements

Hadoop 2.6.0

### Steps

  1. Create an HDFS directory named place in your HDFS home and upload place.txt into it.
  2. Create another HDFS directory named photo in your HDFS home and upload n01.txt into it.
  3. Set the A1_HOME environment variable to the directory that will store the intermediate output of each job.
  4. In the directory containing pom.xml, run: mvn package
  5. cd to the directory containing task1.sh and task2.sh, making sure the scripts are in the same directory as MRDriverTask1.class (MRDriverTask2.class).
  6. Pass an integer argument to task1.sh (or task2.sh) indicating which job to start from; on a first run the argument is always 1 (see the command sketch after these steps).
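
The steps above can be run roughly as follows. This is a minimal sketch, assuming a working Hadoop 2.6.0 installation, the default HDFS home directory for your user, and a writable local path for A1_HOME (the example path is an illustrative placeholder, not something defined by this repository).

```bash
# Sketch of the workflow described above; paths are illustrative assumptions.

# Steps 1-2: create the HDFS input directories and upload the input files
hdfs dfs -mkdir place
hdfs dfs -put place.txt place/
hdfs dfs -mkdir photo
hdfs dfs -put n01.txt photo/

# Step 3: point A1_HOME at a directory for intermediate job output
export A1_HOME=/path/to/intermediate/output   # any writable location you choose

# Step 4: build the project from the directory containing pom.xml
mvn package

# Steps 5-6: run the task scripts from the directory holding the compiled
# driver classes, passing the job number to start from (1 on a first run)
./task1.sh 1
./task2.sh 1
```

On reruns, passing a number greater than 1 would skip the earlier jobs and resume from that job, which is why the scripts take the starting job as an argument.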
