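a. Does "MMAC/MHz/DSP with scalar threads" mean MACs per cycle on the scalar unit? For the scalar side's vector operations the vector is 64 bits wide, so one instruction performs four 16-bit MACs. The figure listed here is 8; is that because there are 2 slots?
Also, v66 states that "Scalar MAC and Floating Point capability is doubled, with each cluster now having its own dedicated units", so shouldn't the v66 figure be 16?
I read "with HVX threads" as MACs per cycle on the vector unit. The vector is 1024 bits wide, so one instruction performs 64 16-bit MACs. That should then also be multiplied by 2, giving 128, to be consistent with the scalar figure above, but the listed number does not match. Why?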
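b. MFLOP/MHz = floating-point operations per cycle. Single precision is supported, and the scalar unit processes 2 floats per instruction. The v65 figure is also multiplied by 2, which matches 2 slots, while v66 is 8, presumably because the two clusters no longer share resources?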
c. Does MMOPS/MHz mean operations per cycle? What exactly does it measure?
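d. v65 has 2 HVX units and v66 has 4 HVX units. Does that mean the number of vector registers should be multiplied by 2 and by 4 respectively, and that load bandwidth should likewise be computed as 1024 bits * 2 and 1024 bits * 4?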
e. Also, for the SM8150 the 16-bit "MMAC/MHz/DSP w/ HVX threads" figure is 64, but other documents I have seen list it as 128?
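
To make the arithmetic behind questions a, d and e explicit, here is a minimal Python sketch of the model I am assuming: peak MACs per cycle = (vector width / element width) * number of issue slots. The function names and the slot counts are my own assumptions for illustration, not values taken from Qualcomm documentation, so this only restates my reasoning and does not confirm which figure is correct.

```python
def macs_per_cycle(vector_bits: int, element_bits: int, issue_slots: int) -> int:
    """Peak MACs per cycle under the assumed model: lanes * issue slots."""
    lanes = vector_bits // element_bits  # MACs performed by one instruction
    return lanes * issue_slots


def load_bits_per_cycle(vector_bits: int, hvx_units: int) -> int:
    """Load bandwidth if it scaled linearly with HVX units (assumption from question d)."""
    return vector_bits * hvx_units


# Scalar side: 64-bit vector, 16-bit elements.
scalar_one_slot = macs_per_cycle(64, 16, 1)    # 4 MACs per instruction
scalar_two_slots = macs_per_cycle(64, 16, 2)   # 8, if 2 slots is the right reading

# HVX side: 1024-bit vector, 16-bit elements.
hvx_one_slot = macs_per_cycle(1024, 16, 1)     # 64, the figure in this document
hvx_two_slots = macs_per_cycle(1024, 16, 2)    # 128, the figure seen elsewhere

# Question d: load bandwidth under the linear-scaling assumption.
v65_load_bits = load_bits_per_cycle(1024, 2)   # 2 HVX units
v66_load_bits = load_bits_per_cycle(1024, 4)   # 4 HVX units

print(scalar_one_slot, scalar_two_slots, hvx_one_slot, hvx_two_slots)
print(v65_load_bits, v66_load_bits)
```

Under this model the scalar figure of 8 corresponds to 2 slots, while the HVX figure of 64 corresponds to only 1 slot, which is exactly the inconsistency I am asking about in a and e.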