Query fails after changing a column type on a Hive table
I have a Hive table. Part of its show create table output is:
isnewuser int,
livesection array<struct<dim_sectionid:int,sectionid:int,title:string>>,
tvsection array<struct<dim_sectionid:int,sectionid:int,title:string>>,
dim_livetv int,
cloudmapsnumber int,
cloudmapsstatus int)
PARTITIONED BY (
dt string,
hour string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
'hdfs://ddiv-namenode:9000/bip/hive_warehouse/fact/cl_play'
TBLPROPERTIES (
'numPartitions'='43',
'numFiles'='44',
'last_modified_by'='pplive',
'last_modified_time'='1389775563',
'transient_lastDdlTime'='1389775563',
'numRows'='0',
'totalSize'='661209295',
'rawDataSize'='0')
Here tvsection has type array<struct<dim_sectionid:int,sectionid:int,title:string>>.
At this point I load data: LOAD DATA INPATH '/bip/hive_warehouse/fact/cl_play/_dt=131202/hour=17/cl.Play.14-01-08-23-r-00001-0.1327914' into table cl_play partition (dt=131202,hour=17);
Running the query select count(1),dt from cl_play where dt=131202 and hour=17 group by dt; works fine.
However, if I then change the type of tvsection to string: ALTER TABLE cl_Play CHANGE `TVSection` `TVSection` String;
and run the same select again, it fails with this error:
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:136)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.<init>(ObjectInspectorConverters.java:301)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:138)
at org.apache.hadoop.hive.ql.exec.MapOperator.initObjectInspector(MapOperator.java:274)
at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:486)
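From my tests, the rough conclusion is that the data's types are fixed at load data time, and that changing a column's type on the Hive table after the load makes select fail. But as I understand it, load data only copies the files into place, and type checking happens only when the data is read at query time. That contradicts the conclusion above, and I don't know whether it has anything to do with the RCFile storage format.
My team lead has asked me to find the cause of this behavior, but I have made no progress so far. Any pointers would be much appreciated!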
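One check that might narrow this down is to compare the column types recorded for the table with the ones recorded for the partition that was loaded before the ALTER. This is only a sketch. It assumes a Hive version in which DESCRIBE FORMATTED accepts a PARTITION clause, and it quotes the partition values because dt and hour are declared as string. If the partition still lists tvsection as array<struct<...>> while the table lists string, that would point at the metadata rather than at the RCFile data itself, though I have not confirmed this.

```sql
-- Column types recorded at the table level
-- (tvsection is expected to show as string after the ALTER).
DESCRIBE FORMATTED cl_play;

-- Column types recorded for the partition loaded before the ALTER.
-- Assumption: values are quoted because dt and hour are string columns.
DESCRIBE FORMATTED cl_play PARTITION (dt='131202', hour='17');
```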