SCFFT3D(3S)SCFFT3D(3S)NAME
SCFFT3D, DZFFT3D, CSFFT3D, ZDFFT3D - Applies a three-dimensional real-
to-complex Fast Fourier Transform (FFT)
SYNOPSIS
Single precision -> Single precision complex
Fortran:
CALL SCFFT3D (isign, n1, n2, n3, scale, x, ldx, ldx2, y, ldy,
ldy2, table, work, isys)
C/C++:
#include <scsl_fft.h>
int scfft3d (int isign, int n1, int n2, int n3, float scale,
float *x, int ldx, int ldx2, scsl_complex *y, int ldy, int
ldy2, float *table, float *work, int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int scfft3d (int isign, int n1, int n2, int n3, float scale,
float *x, int ldx, int ldx2, complex<float> *y, int ldy, int
ldy2, float *table, float *work, int *isys);
Double precision -> Double precision complex
Fortran:
CALL DZFFT3D (isign, n1, n2, n3, scale, x, ldx, ldx2, y, ldy,
ldy2, table, work, isys)
C/C++:
#include <scsl_fft.h>
int dzfft3d (int isign, int n1, int n2, int n3, double scale,
double *x, int ldx, int ldx2, scsl_zomplex *y, int ldy, int
ldy2, double *table, double *work, int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int dzfft3d (int isign, int n1, int n2, int n3, double scale,
double *x, int ldx, int ldx2, complex<double> *y, int ldy, int
ldy2, double *table, double *work, int *isys);
Single precision complex -> Single precision
Fortran:
CALL CSFFT3D (isign, n1, n2, n3, scale, x, ldx, ldx2, y, ldy,
ldy2, table, work, isys)
C/C++:
#include <scsl_fft.h>
int csfft3d (int isign, int n1, int n2, int n3, float scale,
Page 1
SCFFT3D(3S)SCFFT3D(3S)
scsl_complex *x, int ldx, int ldx2, float *y, int ldy, int
ldy2, float *table, float *work, int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int csfft3d (int isign, int n1, int n2, int n3, float scale,
complex<float> *x, int ldx, int ldx2, float *y, int ldy, int
ldy2, float *table, float *work, int *isys);
Double precision complex -> Double precision
Fortran:
CALL ZDFFT3D (isign, n1, n2, n3, scale, x, ldx, ldx2, y, ldy,
ldy2, table, work, isys)
C/C++:
#include <scsl_fft.h>
int zdfft3d (int isign, int n1, int n2, int n3, double scale,
scsl_zomplex *x, int ldx, int ldx2, double *y, int ldy, int
ldy2, double *table, double *work, int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int zzfft3d (int isign, int n1, int n2, int n3, double scale,
complex<double> *x, int ldx, int ldx2, double *y, int ldy, int
ldy2, double *table, double *work, int *isys);
IMPLEMENTATION
These routines are part of the SCSL Scientific Library and can be loaded
using either the -lscs or the -lscs_mp option. The -lscs_mp option
directs the linker to use the multi-processor version of the library.
When linking to SCSL with -lscs or -lscs_mp, the default integer size is
4 bytes (32 bits). Another version of SCSL is available in which integers
are 8 bytes (64 bits). This version allows the user access to larger
memory sizes and helps when porting legacy Cray codes. It can be loaded
by using the -lscs_i8 option or the -lscs_i8_mp option. A program may use
only one of the two versions; 4-byte integer and 8-byte integer library
calls cannot be mixed.
The C and C++ prototypes shown above are appropriate for the 4-byte
integer version of SCSL. When using the 8-byte integer version, the
variables of type int become long long and the <scsl_fft_i8.h> header
file should be included.
DESCRIPTION
SCFFT3D/DZFFT3D computes the three-dimensional Fast Fourier Transform
(FFT) of the real matrix X, and it stores the results in the complex
matrix Y. CSFFT3D/ZDFFT3D computes the corresponding inverse transform.
Page 2
SCFFT3D(3S)SCFFT3D(3S)
In FFT applications, it is customary to use zero-based subscripts; the
formulas are simpler that way. First, the function of SCFFT3D is
described. Suppose the arrays are declared as follows:
Fortran:
REAL X(0:ldx-1, 0:ldx2-1, 0:n3-1)
COMPLEX Y(0:ldy-1, 0:ldy2-1, 0:n3-1)
C/C++:
float x[n3][ldx2][ldx];
scsl_complex y[n3][ldy2][ldy];
C++ STL:
float x[n3][ldx2][ldx];
complex<float> y[n3][ldy2][ldy];
SCFFT3D computes the formula:
Y(k1,k2,k3) =
n1-1 n2-1 n3-1
scale * Sum Sum Sum [X(j1,j2,j3)*w1**(j1*k1)*w2**(j2*k2)*w3**(j3*k3)]
j1=0 j2=0 j3=0
for k1 = 0, ..., n1/2,
k2 = 0, ..., n2 - 1,
k3 = 0, ..., n3 - 1,
where:
w1 = exp(isign*2*pi*i/n1),
w2 = exp(isign*2*pi*i/n2),
w3 = exp(isign*2*pi*i/n3),
i = + sqrt(-1)
pi = 3.14159...
isign = +1 or -1
Different authors use different conventions for which of the transforms,
isign = +1 or isign = -1, is the forward or inverse transform, and what
the scale factor should be in either case. You can make these routines
compute any of the various possible definitions, however, by choosing the
Page 3
SCFFT3D(3S)SCFFT3D(3S)
appropriate values for isign and scale.
The relevant fact from FFT theory is this: If you take the FFT with any
particular values of isign and scale, the mathematical inverse function
is computed by taking the FFT with -isign and 1/(n1 * n2 * n3 * scale).
In particular, if you use isign = +1 and scale = 1.0 for the forward FFT,
you can compute the inverse FFT by isign = -1 and
scale = 1.0/(n1*n2*n3).
SCFFT3D is very similar in function to CCFFT3D(3S), but it takes the
real-to-complex transform in the first dimension, followed by the
complex-to-complex transform in the second and third dimensions.
CSFFT3D does the reverse. It takes the complex-to-complex FFT in the
third and second dimensions, followed by the complex-to-real FFT in the
first dimension.
See the INTRO_FFT(3S) man page for more information about real-to-complex
and complex-to-real FFTs. The three dimensional analog of the conjugate
formula is as follows:
Yk1,k2,k3 = conjg Y n1 - k1, n2 - k2, n3 - k3
for n1/2 < k1 <= n1 - 1
0 <= k2 <= n2 - 1
0 <= k3 <= n3 - 1
where the notation conjg(z) represents the complex conjugate of z.
Thus, you have to compute only (slightly more than) half out the output
values, namely:
Yk1,k2,k3
for 0 <= k1 <= n1/2
0 <= k2 <= n2 - 1
0 <= k3 <= n3 - 1
Page 4
SCFFT3D(3S)SCFFT3D(3S)
See the NOTES section of this man page for information about the
interpretation of the data types described in the following arguments.
These routines have the following arguments:
isign Integer. (input)
Specifies whether to initialize the table array or to do the
forward or inverse Fourier transform, as follows:
If isign = 0, the routine initializes the table array and
returns. In this case, the only arguments used or checked are
isign, n1, n2, n3, and table.
If isign = +1 or -1, the value of isign is the sign of the
exponent used in the FFT formula.
n1 Integer. (input)
Transform size in the first dimension. If n1 is not positive,
the routine returns without computing a transform.
n2 Integer. (input)
Transform size in the second dimension. If n2 is not positive,
the routine returns without computing a transform.
n3 Integer. (input)
Transform size in the third dimension. If n3 is not positive,
the routine returns without computing a transform.
scale Scale factor. (input)
SCFFT3D: Single precision.
DZFFT3D: Double precision.
CSFFT3D: Single precision.
ZDFFT3D: Double precision.
Each element of the output array is multiplied by scale after
taking the Fourier transform, as defined previously.
x Array of dimensions (ldx, ldx2, n3). (input)
SCFFT3D: Single precision array.
DZFFT3D: Double precision array.
CSFFT3D: Single precision complex array.
ZDFFT3D: Double precision complex array.
Array of values to be transformed.
ldx Integer. (input)
The first dimension of x, as it was declared in the calling
program (the leading dimension of x).
SCFFT3D, DZFFT3D: ldx >= MAX(n1, 1).
CSFFT3D, ZDFFT3D: ldx >= MAX(n1/2 + 1, 1).
Page 5
SCFFT3D(3S)SCFFT3D(3S)
ldx2 Integer. (input)
The second dimension of x, as it was declared in the calling
program. ldx2 >= MAX(n2, 1).
y Array of dimensions (ldy, ldy2, n3). (output)
SCFFT3D: Single precision complex array.
DZFFT3D: Double precision complex array.
CSFFT3D: Single precision array.
ZDFFT3D: Double precision array.
Output array of transformed values. The output array can be
the same as the input array, in which case, the transform is
done in place; that is, the input array is overwritten with the
transformed values. In this case, it is necessary that the
following equalities hold:
SCFFT3D, DZFFT3D: ldx = 2 * ldy, and ldx2 = ldy2.
CSFFT3D, ZDFFT3D: ldy = 2 * ldx, and ldx2 = ldy2.
ldy Integer. (input)
The first dimension of y, as it was declared in the calling
program; that is, the leading dimension of y.
SCFFT3D, DZFFT3D: ldy >= MAX(n1/2 + 1, 1).
CSFFT3D, ZDFFT3D: ldy >= MAX(n1 + 2, 1).
In the complex-to-real routine, two extra elements are in the
first dimension (that is, ldy >= n1 + 2, rather than just ldy
>= n1). These elements are needed for intermediate storage
during the computation. On exit, their value is undefined.
ldy2 Integer. (input)
The second dimension of y, as it was declared in the calling
program. ldy2 >= MAX(n2, 1).
table Array of dimension (n1 + NFR) + (2 * n2 + NF) + (2 * n3 + NF)
(input or output)
SCFFT3D, CSFFT3D: Single precision array.
DZFFT3D, ZDFFT3D: Double precision array.
Table of factors and roots of unity. See the description of
the isys argument for the value of NF and NFR.
If isign = 0, the routine initializes table (table is output
only).
If isign = +1 or -1, the values in table are assumed to be
initialized already by a prior call with isign = 0 *table is
input only).
Page 6
SCFFT3D(3S)SCFFT3D(3S)
work Array of dimension n1 + 4 * n3
SCFFT3D, CSFFT3D: Single precision array.
DZFFT3D, ZDFFT3D: Double precision array.
Work array. This is a scratch array used for intermediate
calculations. Its address space must be different from that of
the input and output arrays.
isys Integer array dimensioned 0..isys(0).
An array that gives implementation-specific information. All
features and functions of the FFT routines specific to any
particular implementation are confined to this isys array.
In the Origin series implementation, isys(0)=0 and isys(0)=1
are supported. In SCSL versions prior to 1.3, only isys(0)=0
was allowed. For isys(0)=0, NF=30 and NFR=15, and for
isys(0)=1, NF=NFR=256. The NF(R) words of storage in the table
array contain a factorization of the length of the transform.
The smaller values of NF and NFR for isys(0)=0 are historical.
They are too small to store all the required factors for the
highest performing FFT, so when isys(0)=0, extra space is
allocated when the table array is initialized. To avoid memory
leaks, this extra space must be deallocated when the table
array is no longer needed. The SCFFT3DF routine is used to
release this memory. Due to the potential for memory leaks, the
use of isys(0)=0 should be avoided.
For isys(0)=1, the value of NF and NFR are large enough so that
no extra memory needs to be allocated, and there is no need to
call SCFFT3DF to release memory. If called, it does nothing.
NOTE: isys(0)=1 means that isys is an integer array with two
elements. The second element, isys(1), will not be accessed.
NOTES
The following data types are described in this documentation:
Term Used Data type
Fortran:
Array dimensioned 0..n-1 x(0:n-1)
Array of dimensions (m,n) x(m,n)
Array of dimensions (m,n,p) x(m,n,p)
Integer INTEGER (INTEGER*8 for -lscs_i8[_mp])
Page 7
SCFFT3D(3S)SCFFT3D(3S)
Single precision REAL
Double precision DOUBLE PRECISION
Single precision complex COMPLEX
Double precision complex DOUBLE COMPLEX
C/C++:
Array dimensioned 0..n-1 x[n]
Array of dimensions (m,n) x[m*n] or x[n][m]
Array of dimensions (m,n,p) x[m*n*p] or x[p][n][m]
Integer int (long long for -lscs_i8[_mp])
Single precision float
Double precision double
Single precision complex scsl_complex
Double precision complex scsl_zomplex
C++ STL:
Array dimensioned 0..n-1 x[n]
Array of dimensions (m,n) x[m*n] or x[n][m]
Array of dimensions (m,n,p) x[m*n*p] or x[p][n][m]
Integer int (long long for -lscs_i8[_mp])
Single precision float
Double precision double
Single precision complex complex<float>
Double precision complex complex<double>
CAUTIONS
Transform sizes with a prime factor exceeding 232-1 are not supported for
the 8-byte integer version of the library.
In addition to the work array, the FFT routines also dynamically allocate
scratch space from the stack. The amount of space allocated can be
slightly bigger than the size of the largest processor cache. For single
processor runs, the default stack size is large enough that these
Page 8
SCFFT3D(3S)SCFFT3D(3S)
allocations generally cause no problems. But for parallel runs, you need
to ensure that the stack size of slave threads is big enough to hold this
scratch space. Failure to reserve sufficient stack space will cause
programs to dump core due to stack overflows. The stack size of MP
library slave threads is controlled via the MP_SLAVE_STACKSIZE
environment variable or the mp_set_slave_stacksize() library routine. See
the mp(3C), mp(3F) and pe_environ(5) reference pages for more information
on controlling the slave stack size. For pthreads applications, the
thread's stack size is specified as one of many creation attributes
provided in the pthread_attr_t argument to pthread_create(3P). The
stacksize attribute should be set explicitly to a non-default value using
the pthread_attr_setstacksize(3P) call, described in the
pthread_attr_init(3P) man page.
Care must be exercised if copies of the table array are used: even though
a copy exists, the original must persist. As an example, the following
code will not work:
#include <scsl_fft.h>
float x[129][129][129];
scsl_complex y[129][129][65];
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
float work[128 + 4*128];
int isys[2];
isys[0] = 1;
{
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
scfft3d(0, 128, 128, 128, 1.0f, (float *) x, 129, 129,
(scsl_complex *) y, 65, 129, table_orig, work, isys);
bcopy(table_orig, table,
((128+256)+(2*128+256)+(2*128+256))*sizeof(float));
}
scfft3d(1, 128, 128, 128, 1.0f, (float *) x, 129, 129,
(scsl_complex *) y, 65, 129, table, work, isys);
In this example, because table_orig is a stack variable that does not
persist outside of the code block delimited by the braces, the data in
the copy, table, are not guaranteed to be valid. However, the following
code will work because table_orig is persistent:
#include <scsl_fft.h>
float x[129][129][129];
scsl_complex y[129][129][65];
float table_orig[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
float work[128 + 4*128];
int isys[2];
isys[0] = 1;
scfft3d(0, 128, 128, 128, 1.0f, (float *) x, 129, 129,
Page 9
SCFFT3D(3S)SCFFT3D(3S)
(scsl_complex *) y, 65, 129, table_orig, work, isys);
bcopy(table_orig, table,
((128+256)+(2*128+256)+(2*128+256))*sizeof(float));
scfft3d(1, 128, 128, 128, 1.0f, (float *) x, 129, 129,
(scsl_complex *) y, 65, 129, table, work, isys);
EXAMPLES
The following examples are for Origin series only.
Example 1: Initialize the TABLE array in preparation for doing a three-
dimensional FFT of size 128 by 128 by 128. In this case only the isign,
n1, n2, n3, and table arguments are used; you can use dummy arguments or
zeros for other arguments.
Fortran:
REAL TABLE ((128 + 256) + (2*128 + 256) + (2*128 + 256))
INTEGER ISYS(0:1)
ISYS(0) = 1
CALL SCFFT3D (0, 128, 128, 128, 0.0, DUMMY, 1, 1, DUMMY, 1, 1,
& TABLE, DUMMY, ISYS)
C/C++:
#include <scsl_fft.h>
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
int isys[2];
isys[0] = 1;
scfft3d(0, 128, 128, 128, 0.0f, NULL, 1, 1, NULL, 1, 1,
table, NULL, isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
int isys[2];
isys[0] = 1;
scfft3d(0, 128, 128, 128, 0.0f, NULL, 1, 1, NULL, 1, 1,
table, NULL, isys);
Example 2: X is a real array of dimensioned (0...128, 0...128, 0...128).
The first 128 elements of each dimension contain data; for performance
reasons, the extra element forces the leading dimensions to be odd
numbers. Y is a complex array of dimension (0...64, 0...128, 0...128).
Take the three-dimensional FFT of X and store it in Y. Initialize the
TABLE array, as in example 1.
Page 10
SCFFT3D(3S)SCFFT3D(3S)
Fortran:
REAL X(0:128, 0:128, 0:128)
COMPLEX Y(0:64, 0:128, 0:128)
REAL TABLE ((128 + 256) + (2*128 + 256) + (2*128 + 256))
REAL WORK(128 + 4*128)
INTEGER ISYS(0:1)
ISYS(0) = 1
CALL SCFFT3D(0, 128, 128, 128, 1.0, X, 129, 129,
& Y, 65, 129, TABLE, WORK, ISYS)
CALL SCFFT3D(1, 128, 128, 128, 1.0, X, 129, 129,
& Y, 65, 129, TABLE, WORK, ISYS)
C/C++:
#include <scsl_fft.h>
float x[129][129][129];
scsl_complex y[129][129][65];
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
float work[128 + 4*128];
int isys[2];
isys[0] = 1;
scfft3d(0, 128, 128, 128, 1.0f, (float *) x, 129, 129,
(scsl_complex *) y, 65, 129, table, work, isys);
scfft3d(1, 128, 128, 128, 1.0f, (float *) x, 129, 129,
(scsl_complex *) y, 65, 129, table, work, isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
float x[129][129][129];
complex<float> y[129][129][65];
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
float work[128 + 4*128];
int isys[2];
isys[0] = 1;
scfft3d(0, 128, 128, 128, 1.0f, (float *) x, 129, 129,
(complex<float> *) y, 65, 129, table, work, isys);
scfft3d(1, 128, 128, 128, 1.0f, (float *) x, 129, 129,
(complex<float> *) y, 65, 129, table, work, isys);
Example 3: Take the inverse FFT of Y and store it back in X. Note that
the leading dimension of X must be increased to 2*ldy. The scale factor
1.0/(128.0*128.0*128.0) is used. Assume that the TABLE array is
initialized already.
Page 11
SCFFT3D(3S)SCFFT3D(3S)
Fortran:
REAL X(0:129, 0:128, 0:128)
COMPLEX Y(0:64, 0:128, 0:128)
...
CALL CSFFT3D(-1, 128, 128, 128, 1.0/128.0**3, Y, 65, 129,
& X, 130, 129, TABLE, WORK, ISYS)
C/C++:
float x[129][129][130];
scsl_complex y[129][129][65];
csfft3d(-1, 128, 128, 128, 1.0f/(128.0f*128.0f*128.0f),
(scsl_complex *) y, 65, 129, (float *) x, 130, 129,
table, work, isys);
C++ STL:
float x[129][129][130];
complex<float> y[129][129][65];
csfft3d(-1, 128, 128, 128, 1.0f/(128.0f*128.0f*128.0f),
(complex<float> *) y, 65, 129, (float *) x, 130, 129,
table, work, isys);
Example 4: Perform the same computation as in example 2, but equivalence
the input and output arrays to save storage space. In this case, a row
must be added to X, because it is equivalenced to a complex array. Use
the 8-byte integer version of SCSL.
Fortran:
REAL X(130, 129, 129)
COMPLEX Y(65, 129, 129)
EQUIVALENCE (X(1, 1, 1), Y(1, 1, 1))
REAL TABLE ((128 + 256) + (2*128 + 256) + (2*128 + 256))
REAL WORK(128 + 4*128)
INTEGER*8 ISYS(0:1)
ISYS(0) = 1_8
CALL SCFFT3D(0_8, 128_8, 128_8, 128_8, 1.0, X, 130_8, 129_8,
& Y, 65_8, 129_8, TABLE, WORK, ISYS)
CALL SCFFT3D(1_8, 128_8, 128_8, 128_8, 1.0, X, 130_8, 129_8,
& Y, 65_8, 129_8, TABLE, WORK, ISYS)
C/C++:
#include <scsl_fft_i8.h>
float *x;
scsl_complex y[129][129][65];
Page 12
SCFFT3D(3S)SCFFT3D(3S)
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
float work[128 + 4*128];
long long isys[2];
isys[0] = 1LL;
x = (float *) &y[0][0][0];
scfft3d(0LL, 128LL, 128LL, 128LL, 1.0f, x, 130LL, 129LL,
(scsl_complex *) y, 65LL, 129LL, table, work, isys);
scfft3d(1LL, 128LL, 128LL, 128LL, 1.0f, x, 130LL, 129LL,
(scsl_complex *) y, 65LL, 129LL, table, work, isys);
C++ STL:
#include <complex.h>
#include <scsl_fft_i8.h>
float *x;
complex<float> y[129][129][65];
float table[(128 + 256) + (2*128 + 256) + (2*128 + 256)];
float work[128 + 4*128];
long long isys[2];
isys[0] = 1LL;
x = (float *) &y[0][0][0];
scfft3d(0LL, 128LL, 128LL, 128LL, 1.0f, x, 130LL, 129LL,
(complex<float> *) y, 65LL, 129LL, table, work, isys);
scfft3d(1LL, 128LL, 128LL, 128LL, 1.0f, x, 130LL, 129LL,
(complex<float> *) y, 65LL, 129LL, table, work, isys);
Example 5: Perform the same computation as in example 2, but assume that
the lower bound of each Fortran array is 1, rather than 0. No change is
made in the subroutine calls.
Fortran:
REAL X(129, 129, 129)
COMPLEX Y(65, 129, 129)
...
CALL SCFFT3D(0, 128, 128, 128, 1.0, X, 129, 129,
& Y, 65, 129, TABLE, WORK, ISYS)
CALL SCFFT3D(1, 128, 128, 128, 1.0, X, 129, 129,
& Y, 65, 129, TABLE, WORK, ISYS)
SEE ALSOINTRO_FFT(3S), INTRO_SCSL(3S), CCFFT(3S), CCFFT2D(3S), CCFFT3D(3S),
CCFFTM(3S), SCFFT(3S), SCFFT2D(3S), SCFFTM(3S)
Page 13