Slow queries after Hive-HBase integration

chuckpu 2014-05-27 04:08:48
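After integrating Hive with HBase, I found that for
select count(*) from hbase_test
the map phase stays at 0% for more than 20 minutes before it starts to make any progress.
The HBase table holds about 8.5 million rows.
When I import the same data directly into a native Hive table, this query takes only about 30 seconds.
Does anyone know what is going on?
Thanks.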
2014-05-27 16:04:46,122 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 927.95 sec
2014-05-27 16:04:47,128 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 927.95 sec
2014-05-27 16:04:48,134 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 927.95 sec
2014-05-27 16:04:49,140 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 927.95 sec
2014-05-27 16:04:50,146 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 927.95 sec
2014-05-27 16:04:51,153 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:52,158 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:53,166 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:54,172 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:55,178 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:56,183 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:57,189 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:58,195 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec
2014-05-27 16:04:59,201 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.69 sec

vah101 2014-05-28
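Configure hbase.client.scanner.caching. When you create the HBase table, pre-split it into regions. Preferably use the rowkey as the query condition. Also, HBase's data redundancy and read performance are bottlenecks here, and the performance gap compared with the ORCFile approach is large. You can refer to this: http://doc.okbase.net/superlxw1234/archive/48484.html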
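
As a rough illustration of the first and third suggestions, here is a minimal HiveQL sketch. It is not a verified fix for this cluster. The caching value of 1000 is only an illustrative starting point, and the column name `key` and the rowkey bounds are hypothetical: they assume the Hive table maps the HBase rowkey (`:key`) to a string column called `key`.

```sql
-- Ask the HBase client to fetch more rows per scanner RPC for this session.
-- 1000 is an assumed example value; tune it against row size and mapper memory.
SET hbase.client.scanner.caching=1000;

-- Restrict the scan by rowkey instead of counting the whole table.
-- `key` and the bounds below are hypothetical placeholders.
SELECT count(*)
FROM hbase_test
WHERE key >= 'row0001000'
  AND key <  'row0002000';
```

Whether the rowkey range is actually pushed down to HBase as a range scan depends on the Hive version and the key type, so it is worth checking the job's input splits after running it. Pre-splitting regions is done on the HBase side when the table is created and is not shown here.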
