3 Configuring MapReduce in Eclipse

1. Configuring the plugin

  • Copy hadoop-eclipse-plugin-1.2.1.jar into Eclipse's plugins directory and restart Eclipse.
  • DFS Locations will now appear in Eclipse's Project Explorer on the left. Open the Map/Reduce perspective via Window -> Perspective -> Open Perspective -> Other....
  • In the panel at the bottom, create a new Hadoop location.
  • Fill in the parameters: Location name can be anything. The Port under Map/Reduce Master should agree with mapred-site.xml (either 9001 or 50020 seems to work); the Port on the right should agree with core-site.xml, i.e. 9000 (see the configuration sketch after this list).
  • After starting the cluster with start-all.sh, you can work with the DFS through the plugin.
  • Under hadoop-wsj, create the directories input/wc and output, and upload a file into wc whose words will be counted; output will hold the results (the same preparation can also be scripted, as sketched after this list).
    Note: create only output, not output/wc, because wc is generated automatically when the job runs; creating it in advance actually causes an error.
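For reference, the ports in this dialog simply mirror the Hadoop configuration files. Below is a minimal sketch of the two relevant properties, assuming a pseudo-distributed setup on localhost as in the official Hadoop 1.x single-node guide; host names and ports may differ on your cluster.

  <!-- core-site.xml: the DFS Master port (9000) comes from fs.default.name -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>

  <!-- mapred-site.xml: the Map/Reduce Master port (9001) comes from mapred.job.tracker -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>

Leaving hadoop.tmp.dir at its default (/tmp/hadoop-${user.name}) is also what makes the /tmp/hadoop-wsj paths used further down valid.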
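If you prefer not to prepare the directories through the plugin, the same setup can be scripted with the HDFS Java API. This is only a sketch using the same configuration paths as the demo below; the class name PrepareInput and the local file /home/wsj/words.txt are hypothetical.

package test0;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: prepares the HDFS input directory used by the demo.
public class PrepareInput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/core-site.xml"));
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        fs.mkdirs(new Path("/tmp/hadoop-wsj/input/wc"));       // input directory
        fs.mkdirs(new Path("/tmp/hadoop-wsj/output"));         // output parent only, not output/wc
        fs.copyFromLocalFile(new Path("/home/wsj/words.txt"),  // hypothetical local file
                             new Path("/tmp/hadoop-wsj/input/wc/"));
        fs.close();
    }
}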

2. Creating a Map/Reduce project

  • Note that when creating the project you must specify the Hadoop installation path.

Finally, the MapReduce demo itself:
It consists of three classes: McMapper, WcReducer, and JobRun.

  • McMapper.java:
package test0;

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Generic parameters fix the input key/value and output key/value types
public class McMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // map() is called once for each line of the input split; key is the offset
    // of the line within the file, value is the line itself
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            String word = st.nextToken();
            context.write(new Text(word), new IntWritable(1)); // emit (word, 1)
        }
    }
}
  • WcReducer.java:
package test0;

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    // reduce() is called once per key; values holds every count emitted for that word
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable i : values) {
            sum = sum + i.get();
        }
        context.write(key, new IntWritable(sum)); // emit (word, total count)
    }
}
  • JobRun.java:
package test0;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobRun {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        /*
         * The next two lines matter: they make the job address the HDFS file
         * system rather than a local path. This only works if core-site.xml
         * and hdfs-site.xml follow the official documentation exactly; in
         * particular, do not change hadoop.tmp.dir, or the job will fail.
         */
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/core-site.xml"));
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/hdfs-site.xml"));
        try {
            Job job = new Job(conf);
            job.setJarByClass(JobRun.class);
            job.setMapperClass(McMapper.class);
            job.setReducerClass(WcReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);
            job.setNumReduceTasks(1);

            FileInputFormat.addInputPath(job, new Path("/tmp/hadoop-wsj/input/wc"));
            FileOutputFormat.setOutputPath(job, new Path("/tmp/hadoop-wsj/output/wc"));

            try {
                System.exit(job.waitForCompletion(true) ? 0 : 1);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
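To try the demo, run JobRun (for example through the plugin's Run As -> Run on Hadoop, or as a plain Java application, since the configuration files are loaded explicitly). With one reduce task the counts land in /tmp/hadoop-wsj/output/wc/part-r-00000, which can also be browsed from DFS Locations. The sketch below, with the hypothetical class name PrintResult, just prints that file; remember to delete output/wc before re-running, because FileOutputFormat refuses to write into an existing output directory.

package test0;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: prints the word counts produced by the demo job.
public class PrintResult {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/core-site.xml"));
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        // One reduce task -> a single output file named part-r-00000
        Path result = new Path("/tmp/hadoop-wsj/output/wc/part-r-00000");
        BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(result)));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);   // each line is "word<TAB>count"
        }
        reader.close();
        fs.close();
    }
}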