A single-table join problem in MapReduce

lty369963 2014-08-28 10:53:59

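I wrote a single-table join program, but it produces no output other than the header row, and I can't figure out why (the fields on each input line are separated by tabs). Any advice would be appreciated.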

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class STjoin {

    public static int time = 0;

    public static class Map extends Mapper<Object, Text, Text, Text> {

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String[] strs = line.split("	"); // a literal tab character sits between the quotes

            context.write(new Text(strs[1]), new Text("1" + "-" + strs[0])); // emit as left table
            context.write(new Text(strs[0]), new Text("2" + "-" + strs[1])); // emit as right table
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {

        public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            if (time == 0) { // write the header row once
                context.write(new Text("grandchild"), new Text("grandparent"));
                time++;
            }

            int grandchildNum = 0;
            String grandchild[] = new String[20];
            int grandparentNum = 0;
            String grandparent[] = new String[20];

            Iterator<Text> iter = values.iterator();
            while (iter.hasNext()) {
                String record = iter.next().toString();
                String[] st = record.split("-");
                if (st[0].equals("1")) {
                    grandchild[grandchildNum] = st[1];
                    grandchildNum++;
                } else if (st[0].equals("2")) {
                    grandparent[grandparentNum] = st[1];
                    grandparentNum++;
                }
            }

            // Cartesian product of the grandchild and grandparent arrays
            if (grandchildNum != 0 && grandparentNum != 0) {
                for (int m = 0; m < grandchildNum; m++) {
                    for (int n = 0; n < grandparentNum; n++) {
                        context.write(new Text(grandchild[m]), new Text(grandparent[n]));
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage:...");
            System.exit(2);
        }

        Job job = new Job(conf, "single table join");
        job.setJarByClass(STjoin.class);
        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : -1);
    }
}


SG90 2014-09-04


[quote=Reply #11 by wulinshishen]
[quote=Reply #10 by lty369963] [quote=Reply #7 by wulinshishen] [quote=Reply #2 by lty369963] Yes, I tried that too, but the result was still the same. [/quote] First, String[] strs= line.split(" "); should be written as String[] strs= line.split("\t"); otherwise it may cause problems. Looking more closely, the real culprit should be job.setCombinerClass(Reduce.class); remove the Combiner setting. [/quote] That was exactly the error, thank you so much! Could you explain why? [/quote]

[quote=Reply #8 by Imbyr] What goes wrong if a combine step runs in between? [/quote]

A Combiner aggregates the values of each key locally. You can think of it as a local reduce that cuts the amount of data sent over the network and so improves performance. Used appropriately, a Combiner speeds up the job; used inappropriately, it makes the output incorrect. Because the Combiner's output becomes the Reducer's input, it must never change the final result. For example, a Combiner is fine for sums and counts, but it is not suitable for computing an average.
[/quote]

Thanks for the explanation.

人生偌只如初见 2014-09-03


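[quote=Reply #10 by lty369963] That was exactly the error, thank you so much! Could you explain why? [/quote]

[quote=Reply #8 by Imbyr] What goes wrong if a combine step runs in between? [/quote]

A Combiner aggregates the values of each key locally. You can think of it as a local reduce that cuts the amount of data sent over the network and so improves performance. Used appropriately, a Combiner speeds up the job; used inappropriately, it makes the output incorrect. Because the Combiner's output becomes the Reducer's input, it must never change the final result. For example, a Combiner is fine for sums and counts, but it is not suitable for computing an average.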

lty369963 2014-09-03


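[quote=Reply #7 by wulinshishen] First, String[] strs= line.split(" "); should be written as String[] strs= line.split("\t"); otherwise it may cause problems. Looking more closely, the real culprit should be job.setCombinerClass(Reduce.class); remove the Combiner setting. [/quote]

That was exactly the error, thank you so much! Could you explain why?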

majy 2014-09-02


Which Hadoop version are you using? Could you post the complete source code?

SG90 2014-08-30


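[quote=Reply #7 by wulinshishen] First, String[] strs= line.split(" "); should be written as String[] strs= line.split("\t"); otherwise it may cause problems. Looking more closely, the real culprit should be job.setCombinerClass(Reduce.class); remove the Combiner setting. [/quote]

What goes wrong if a combine step runs in between?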

人生偌只如初见 2014-08-29


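[quote=Reply #2 by lty369963] Yes, I tried that too, but the result was still the same. [/quote]

First, String[] strs= line.split(" "); should be written as String[] strs= line.split("\t"); otherwise it may cause problems. Looking more closely, the real culprit should be job.setCombinerClass(Reduce.class); remove the Combiner setting.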
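Why removing the Combiner helps can be seen in a small plain-Java simulation of the job, sketched here under several assumptions: no Hadoop is involved, the class name `CombinerJoinDemo` and the sample rows are hypothetical, the header row is left out, and the combiner is modeled as exactly one pass of the reduce logic over all map output (the real framework decides if and how often a combiner runs). The point is that the reduce logic expects values tagged `1-` or `2-`, but its own output is untagged, so once it has run as a combiner the real reducer finds nothing to join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CombinerJoinDemo {
    static void emit(Map<String, List<String>> out, String key, String value) {
        out.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Mirrors the posted map(): each child<TAB>parent row is emitted twice,
    // tagged "1-" (left table) and "2-" (right table).
    static void map(String line, Map<String, List<String>> out) {
        String[] strs = line.split("\t");
        emit(out, strs[1], "1-" + strs[0]);
        emit(out, strs[0], "2-" + strs[1]);
    }

    // Mirrors the posted reduce() without the header: cross product of tagged values.
    static void reduce(List<String> values, Map<String, List<String>> out) {
        List<String> grandchildren = new ArrayList<>();
        List<String> grandparents = new ArrayList<>();
        for (String record : values) {
            String[] st = record.split("-");
            if (st[0].equals("1")) grandchildren.add(st[1]);
            else if (st[0].equals("2")) grandparents.add(st[1]);
            // Untagged values (such as combiner output) match neither branch and are dropped.
        }
        for (String gc : grandchildren)
            for (String gp : grandparents)
                emit(out, gc, gp);
    }

    static Map<String, List<String>> reduceAll(Map<String, List<String>> in) {
        Map<String, List<String>> out = new TreeMap<>();
        for (List<String> values : in.values()) reduce(values, out);
        return out;
    }

    static Map<String, List<String>> run(List<String> lines, boolean useReduceAsCombiner) {
        Map<String, List<String>> mapped = new TreeMap<>();
        for (String line : lines) map(line, mapped);
        Map<String, List<String>> reducerInput = useReduceAsCombiner ? reduceAll(mapped) : mapped;
        return reduceAll(reducerInput);
    }

    public static void main(String[] args) {
        // Hypothetical input: Tom's parent is Lucy, Lucy's parent is Mary.
        List<String> lines = Arrays.asList("Tom\tLucy", "Lucy\tMary");
        System.out.println(run(lines, false)); // {Tom=[Mary]}
        System.out.println(run(lines, true));  // {}
    }
}
```

In the real driver this corresponds to deleting the `job.setCombinerClass(Reduce.class);` line. With the combiner in place, the only record the reducer still writes is the header, which matches the symptom described in the question.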