Spark is slow when loading data from Elasticsearch
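The Elasticsearch index holds roughly 1 billion documents in total. I take the most recent month of data (about 50 million documents), load it with Spark SQL, and then run the related business logic on it. The load is extremely slow. If anyone has done similar tuning, I would appreciate it if you could share your experience. The loading code is attached below: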
val vehpassDataFrame = sparkSession.sqlContext.read.format("org.elasticsearch.spark.sql").options(options).load("alias_veh_pass/doc")
vehpassDataFrame.select("hphm","hpzl","jgsj","gctp1","gcbh","lhy_syxz").createTempView("alias_veh_pass")
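
For anyone answering, here is a rough sketch of the reader settings that are usually worth checking first in a case like this. It is only an illustration under assumptions, not a confirmed fix: it assumes `jgsj` is mapped as a date field in Elasticsearch, the object name `EsReadOptions` and its helpers are made up for this example, and the scroll size of 5000 is an arbitrary starting value. The option keys (`es.query`, `es.read.field.include`, `es.scroll.size`) are settings of the elasticsearch-hadoop connector. The idea is to let Elasticsearch do the one-month filtering and the field selection, so that Spark does not scroll through all 1 billion documents.

```scala
object EsReadOptions {
  // Field names copied from the select(...) call in the question.
  val fields: Seq[String] =
    Seq("hphm", "hpzl", "jgsj", "gctp1", "gcbh", "lhy_syxz")

  // Query DSL that limits the scan to roughly the last month.
  // Assumption: the time field is mapped as a date, so date math works.
  def lastMonthQuery(timeField: String): String =
    s"""{"query":{"range":{"$timeField":{"gte":"now-1M/d"}}}}"""

  // Merge the tuning settings into the existing connector options.
  def build(base: Map[String, String]): Map[String, String] =
    base ++ Map(
      // Filter on the Elasticsearch side instead of in Spark.
      "es.query" -> lastMonthQuery("jgsj"),
      // Only fetch the fields that are actually used.
      "es.read.field.include" -> fields.mkString(","),
      // Documents per scroll request; 5000 is an arbitrary value to tune.
      "es.scroll.size" -> "5000"
    )
}
```

With this in place, the read in the question would pass `EsReadOptions.build(options)` to `.options(...)` instead of `options`. If the one-month filter is currently applied in a later SQL statement, it may also be worth checking with `explain()` whether it is actually pushed down to Elasticsearch.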