CUFFT Library in Practice (2)

adagio_chen 2014-07-14 09:26:03
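To use a library proficiently, especially a math-related one, it is worth getting familiar with the principles behind it. If you only look at the bundled samples and imitate them by rote, you will probably know what works without knowing why, and when a new problem comes up you will not know how to adjust the algorithm and its parameters.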

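Regarding the discrete Fourier transform formula, some books say that "it actually has no physical meaning and exists only for the sake of engineering computation." So what do the values produced by an FFT routine mean? Are they frequencies, and if so, frequencies in what range?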

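Let us look at the Fourier series formulas. Let x(t) be a one-dimensional signal defined on [0, T]. Then:

$$x(t) = \sum_{n=-\infty}^{\infty} c_n \, e^{2\pi i n t / T} \tag{1}$$

$$c_n = \frac{1}{T} \int_0^T x(t) \, e^{-2\pi i n t / T} \, dt \tag{2}$$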

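As can be seen, if we write formula (2) as a Riemann sum over equally spaced samples and take n = 0, 1, 2, 3, ..., we obtain the (unnormalized) discrete Fourier transform formula, up to a constant factor. In practice, the spectrum values we need are often sampled from [-N, N]. In that case the spectrum has to be translated, which is done by multiplying the signal by a complex exponential before the transform.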
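The two steps above can be written out as a short derivation sketch. The notation is introduced here and is not taken from the post: N is the number of equally spaced samples, x_k = x(kT/N), X_n is the unnormalized DFT, and d is an integer shift in frequency bins.

```latex
% Riemann sum of the coefficient integral with t_k = kT/N and \Delta t = T/N
c_n = \frac{1}{T}\int_0^T x(t)\, e^{-2\pi i n t/T}\, dt
\;\approx\; \frac{1}{T}\sum_{k=0}^{N-1} x_k\, e^{-2\pi i n k/N}\,\frac{T}{N}
\;=\; \frac{1}{N}\, X_n,
\qquad
X_n = \sum_{k=0}^{N-1} x_k\, e^{-2\pi i n k/N}.

% Shift theorem: modulating the signal translates the spectrum by d bins
\sum_{k=0}^{N-1} \left( x_k\, e^{2\pi i k d/N} \right) e^{-2\pi i n k/N}
\;=\; \sum_{k=0}^{N-1} x_k\, e^{-2\pi i (n-d) k/N}
\;=\; X_{n-d}.
```

Under these assumptions, bin n corresponds to the frequency n/T, and because X_n is periodic in n with period N, the bins above N/2 stand for negative frequencies. Choosing d = N/2 (for even N) turns the factor into (-1)^k and moves the zero-frequency bin to the middle of the array.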



The same principle applies in higher dimensions. Here is a three-dimensional example:


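// Multiplies every sample of an N x N x N complex volume by a phase factor,
// which translates its spectrum (by an amount set by d) once the FFT is taken.
// The indexing implies a launch such as <<<dim3(N, N), N>>>, so N must not
// exceed the per-block thread limit.
// ComplexMul and ExpPI are the author's helpers; the post does not show them.
__global__
void TranslateSignal(int N, float3 d, float2 *signal)
{
    int i = threadIdx.x;   // fastest-varying index
    int j = blockIdx.x;
    int k = blockIdx.y;
    int index = k * N * N + j * N + i;
    // Normalized coordinates in [0, 1), written per component because plain
    // CUDA defines no float3 / int operator.
    float3 t = make_float3((float)i / N, (float)j / N, (float)k / N);
    float phase = t.x * d.x + t.y * d.y + t.z * d.z;   // dot(t, d)

    signal[index] = ComplexMul(signal[index], ExpPI(phase));
}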
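The phase-multiplication idea behind the kernel can be sanity-checked on the host before moving to the GPU. Below is a minimal 1D sketch in plain host-side C++ (which nvcc also accepts). The names NaiveDft, Modulate and MaxShiftError are hypothetical and introduced only for illustration, and the explicit factor exp(2*pi*i*k*d/N) is an assumption, since the post does not show how ExpPI is defined.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

using Complex = std::complex<double>;

static const double kTwoPi = 2.0 * std::acos(-1.0);

// Naive O(N^2) unnormalized DFT: X[n] = sum_k x[k] * exp(-2*pi*i*n*k/N).
std::vector<Complex> NaiveDft(const std::vector<Complex>& x) {
    const int N = static_cast<int>(x.size());
    std::vector<Complex> X(N);
    for (int n = 0; n < N; ++n) {
        Complex sum(0.0, 0.0);
        for (int k = 0; k < N; ++k) {
            sum += x[k] * std::polar(1.0, -kTwoPi * n * k / N);
        }
        X[n] = sum;
    }
    return X;
}

// Multiplies x[k] by exp(+2*pi*i*k*d/N). This is an assumed 1D analogue of
// the kernel's phase factor, not the author's ExpPI.
std::vector<Complex> Modulate(const std::vector<Complex>& x, int d) {
    const int N = static_cast<int>(x.size());
    std::vector<Complex> y(N);
    for (int k = 0; k < N; ++k) {
        y[k] = x[k] * std::polar(1.0, kTwoPi * k * d / N);
    }
    return y;
}

// Returns max over n of |Y[n] - X[(n - d) mod N]|, where X is the DFT of x
// and Y is the DFT of the modulated signal. The shift theorem predicts ~0.
double MaxShiftError(const std::vector<Complex>& x, int d) {
    assert(!x.empty());
    const int N = static_cast<int>(x.size());
    const std::vector<Complex> X = NaiveDft(x);
    const std::vector<Complex> Y = NaiveDft(Modulate(x, d));
    double err = 0.0;
    for (int n = 0; n < N; ++n) {
        const int src = ((n - d) % N + N) % N;  // wrap n - d into [0, N)
        err = std::max(err, std::abs(Y[n] - X[src]));
    }
    return err;
}
```

Calling MaxShiftError(x, N / 2) on an even-length signal should return a value at rounding-error level, which is the 1D version of centering the spectrum. The same comparison against a slow reference is one way to validate a GPU kernel on small inputs.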
不可触碰 2014-07-29
I can't really follow this, but bumping the thread anyway. Porting a CPU program to the GPU feels really hard.
This document describes CUFFT, the NVIDIA® CUDA™ Fast Fourier Transform (FFT) library. The FFT is a divide-and-conquer algorithm for efficiently computing discrete Fourier transforms of complex or real-valued data sets. It is one of the most important and widely used numerical algorithms in computational physics and general signal processing. The CUFFT library provides a simple interface for computing parallel FFTs on an NVIDIA GPU, which allows users to leverage the floating-point power and parallelism of the GPU without having to develop a custom, CUDA FFT implementation.

FFT libraries typically vary in terms of supported transform sizes and data types. For example, some libraries only implement radix-2 FFTs, restricting the transform size to a power of two. The CUFFT Library aims to support a wide range of FFT options efficiently on NVIDIA GPUs. This version of the CUFFT library supports the following features:

  • Complex and real-valued input and output
  • 1D, 2D, and 3D transforms
  • Batch execution for doing multiple transforms of any dimension in parallel
  • Transform sizes up to 64 million elements in single precision and up to 128 million elements in double precision in any dimension, limited by the available GPU memory
  • In-place and out-of-place transforms
  • Double-precision (64-bit floating point) on compatible hardware (sm1.3 and later)
  • Support for streamed execution, enabling asynchronous computation and data movement
  • FFTW compatible data layouts
  • Arbitrary intra- and inter-dimension element strides
  • Thread-safe API that can be called from multiple independent host threads
