Hive on Spark query problem

java_emmm 2018-11-22 12:37:26
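The problem is roughly as follows.
I have set up a Hadoop cluster with 8 virtual CPUs, 16 GB of memory, and 100 GB of physical disk.
70 GB of data has already been imported into the cluster, and I now need to run the following SQL query: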
SELECT loid AS userid,
  count(1) AS count_all,
  count(if(PONRx > 0,1,null)) AS count_notnull_pon_down,
  count(if(PONTx > 0,1,null)) AS count_notnull_pon_up,
  AVG(PONRx) AS pon_speed_avg_down,
  MAX(PONRx) AS pon_speed_max_down,
  AVG(PONTx) AS pon_speed_avg_up,
  MAX(PONTx) AS pon_speed_max_up,
  count(IF(lan1Rx > 0, 1, NULL)) AS count_notnull_lan1_down,
  count(IF(lan1Tx > 0, 1, NULL)) AS count_notnull_lan1_up,
  AVG(lan1Rx) AS lan2_speed_avg_down,
  MAX(lan1Rx) AS lan2_speed_max_down,
  AVG(lan1Tx) AS lan2_speed_avg_up,
  MAX(lan1Tx) AS lan2_speed_max_up,
  count(IF(lan2Rx > 0, 1, NULL)) AS count_notnull_lan2_down,
  count(IF(lan2Tx > 0, 1, NULL)) AS count_notnull_lan2_up,
  AVG(lan2Rx) AS lan2_speed_avg_down,
  MAX(lan2Rx) AS lan2_speed_max_down,
  AVG(lan2Tx) AS lan2_speed_avg_up,
  MAX(lan2Tx) AS lan2_speed_max_up,
  sum(PONRx) AS sum_stat_pon_down,
  sum(PONTx) AS sum_stat_pon_up,
  sum(lan1Rx) AS sum_stat_lan1_down,
  sum(lan1Tx) AS sum_stat_lan1_up,
  sum(lan2Rx) AS sum_stat_lan2_down,
  sum(lan2Tx) AS sum_stat_lan2_up,
  count(IF(PONRx > 1, 1, NULL)) AS count_speed_pon_1M,
  count(IF(PONRx > 10, 1, NULL)) AS count_speed_pon_10M,
  count(IF(PONRx > 50, 1, NULL)) AS count_speed_pon_50M,
  count(IF(lan1Rx > 1, 1, NULL)) AS count_speed_lan1_1M,
  count(IF(lan1Rx > 10, 1, NULL)) AS count_speed_lan1_10M,
  count(IF(lan1Rx > 50, 1, NULL)) AS count_speed_lan1_50M,
  count(IF(lan2Rx > 4, 1, NULL)) AS count_speed_lan1_4M
FROM test.flow
GROUP BY loid;
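I run the query on the main host with Hive on Spark, but every time the query is about to finish, it fails with an error.
What I would like to know is whether I need to upgrade the cluster's hardware or add more nodes, or whether something is wrong in my configuration files.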
2 replies
Check the error log to pin down the problem.
4qw 2018-11-23
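First take a small portion of the data and verify whether your SQL is written correctly.
I noticed that some of the aliases are duplicated; check those yourself, and also verify that the SQL syntax you use is valid and supported.
Write a few simple SQL statements to test the syntax.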
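The duplicate-alias check mentioned above can be done mechanically rather than by eye. Below is a minimal Python sketch, not part of the original thread. The helper name `find_duplicate_aliases` and the shortened `QUERY_EXCERPT` are illustrative assumptions, and the regex simply looks for `AS <name>`, so it would also pick up the type in a `CAST(x AS int)` expression (the query in the question has none). Aliases are compared in lowercase because Hive treats column names as case-insensitive.

```python
import re
from collections import Counter
from typing import List

# Matches "AS alias" (case-insensitive). Assumption: the query has no
# CAST(... AS type) expressions, which this simple pattern would also match.
ALIAS_PATTERN = re.compile(r"\bAS\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def find_duplicate_aliases(sql: str) -> List[str]:
    """Return aliases used more than once, in order of first appearance."""
    aliases = [name.lower() for name in ALIAS_PATTERN.findall(sql)]
    counts = Counter(aliases)
    return [name for name, n in counts.items() if n > 1]


# Shortened excerpt of the query from the question (hypothetical trimming,
# the column expressions and aliases are copied from the original).
QUERY_EXCERPT = """
SELECT loid AS userid,
       AVG(lan1Rx) AS lan2_speed_avg_down,
       MAX(lan1Rx) AS lan2_speed_max_down,
       AVG(lan2Rx) AS lan2_speed_avg_down,
       MAX(lan2Rx) AS lan2_speed_max_down
FROM test.flow
GROUP BY loid
"""

if __name__ == "__main__":
    for alias in find_duplicate_aliases(QUERY_EXCERPT):
        print("duplicate alias:", alias)
```

Run against the full query from the question, this kind of check would flag the four `lan2_speed_*` aliases that are assigned to both the `lan1` and `lan2` columns. Whether those duplicates are what makes the job fail is not established in the thread; the error log is still needed to confirm the cause.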
