Hive on Spark query problem
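The problem is roughly as follows.
I have set up a Hadoop cluster with 8 virtual CPUs, 16 GB of memory, and 100 GB of physical disk.
About 70 GB of data has already been imported into the cluster, and I now need to run the following SQL query: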
SELECT
    loid AS userid,
    COUNT(1) AS count_all,
    COUNT(IF(PONRx > 0, 1, NULL)) AS count_notnull_pon_down,
    COUNT(IF(PONTx > 0, 1, NULL)) AS count_notnull_pon_up,
    AVG(PONRx) AS pon_speed_avg_down,
    MAX(PONRx) AS pon_speed_max_down,
    AVG(PONTx) AS pon_speed_avg_up,
    MAX(PONTx) AS pon_speed_max_up,
    COUNT(IF(lan1Rx > 0, 1, NULL)) AS count_notnull_lan1_down,
    COUNT(IF(lan1Tx > 0, 1, NULL)) AS count_notnull_lan1_up,
    AVG(lan1Rx) AS lan1_speed_avg_down,
    MAX(lan1Rx) AS lan1_speed_max_down,
    AVG(lan1Tx) AS lan1_speed_avg_up,
    MAX(lan1Tx) AS lan1_speed_max_up,
    COUNT(IF(lan2Rx > 0, 1, NULL)) AS count_notnull_lan2_down,
    COUNT(IF(lan2Tx > 0, 1, NULL)) AS count_notnull_lan2_up,
    AVG(lan2Rx) AS lan2_speed_avg_down,
    MAX(lan2Rx) AS lan2_speed_max_down,
    AVG(lan2Tx) AS lan2_speed_avg_up,
    MAX(lan2Tx) AS lan2_speed_max_up,
    SUM(PONRx) AS sum_stat_pon_down,
    SUM(PONTx) AS sum_stat_pon_up,
    SUM(lan1Rx) AS sum_stat_lan1_down,
    SUM(lan1Tx) AS sum_stat_lan1_up,
    SUM(lan2Rx) AS sum_stat_lan2_down,
    SUM(lan2Tx) AS sum_stat_lan2_up,
    COUNT(IF(PONRx > 1, 1, NULL)) AS count_speed_pon_1M,
    COUNT(IF(PONRx > 10, 1, NULL)) AS count_speed_pon_10M,
    COUNT(IF(PONRx > 50, 1, NULL)) AS count_speed_pon_50M,
    COUNT(IF(lan1Rx > 1, 1, NULL)) AS count_speed_lan1_1M,
    COUNT(IF(lan1Rx > 10, 1, NULL)) AS count_speed_lan1_10M,
    COUNT(IF(lan1Rx > 50, 1, NULL)) AS count_speed_lan1_50M,
    COUNT(IF(lan2Rx > 4, 1, NULL)) AS count_speed_lan1_4M
FROM test.flow
GROUP BY loid;
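I run the query from the master host using Hive on Spark, but every time the query is nearly finished, it fails with an error.
What I would like to know is whether I need to upgrade the cluster's hardware or add more nodes, or whether something is wrong in my configuration files.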
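In case it helps narrow this down, the settings below are the ones that are probably most relevant to memory and parallelism for a Hive on Spark job. This is only a sketch of how they could be printed from the Hive session; the choice of properties is an assumption about what matters here, and no values are shown because they were not recorded in this post.

```sql
-- Issuing "set <property>;" without a value prints the current setting.
-- Which engine the session actually uses (expected: spark)
set hive.execution.engine;
-- Where Spark runs (for example yarn)
set spark.master;
-- Per-executor memory and cores, and the number of executors
set spark.executor.memory;
set spark.executor.cores;
set spark.executor.instances;
-- Memory for the Spark driver
set spark.driver.memory;
```

Comparing these values against the 8 virtual CPUs and 16 GB of memory available might show whether the executors are asking for more than the cluster can provide.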