Why do you want to access another partition's data?
mapPartitions(func) and mapPartitionsWithIndex(func) exist as a performance optimization: they let your function run once PER partition instead of once per element. That is why the function type must be Iterator<T> => Iterator<U> (for mapPartitionsWithIndex, the function also receives the partition index as its first argument). Inside the function you can read the whole partition's data through one iterator, but you cannot, and should not, access other partitions' data.
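To make that concrete, here is a minimal sketch in Scala, assuming a local Spark installation. The object name `PartitionSums`, the helper names `sumPartition` and `tagWithIndex`, and the sample data are made up for illustration. Each helper only ever sees the iterator of the partition it is called on; anything that needs values from several partitions is combined afterwards from the per-partition results.

```scala
import org.apache.spark.sql.SparkSession

object PartitionSums {
  // Hypothetical helper: called once per partition, sees only that partition's elements.
  def sumPartition(iter: Iterator[Int]): Iterator[Int] =
    Iterator(iter.sum)

  // Hypothetical helper: same idea, but also receives the partition index.
  def tagWithIndex(index: Int, iter: Iterator[Int]): Iterator[(Int, Int)] =
    iter.map(x => (index, x))

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch.
    val spark = SparkSession.builder()
      .appName("PartitionSums")
      .master("local[2]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Sample data spread over 3 partitions.
    val rdd = sc.parallelize(1 to 10, numSlices = 3)

    // One output value per partition.
    val partitionSums = rdd.mapPartitions(iter => sumPartition(iter)).collect()

    // Every element paired with the index of the partition that holds it.
    val tagged = rdd.mapPartitionsWithIndex((idx, iter) => tagWithIndex(idx, iter)).collect()

    println(partitionSums.mkString(", "))
    println(tagged.mkString(", "))

    // Cross-partition work happens on the combined results, not inside the function.
    println(partitionSums.sum)

    spark.stop()
  }
}
```

If you really do need data from more than one partition at a time, that usually means the job should be expressed with an operation that combines partitions (a reduce or a shuffle-based transformation) rather than with mapPartitions.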