SCFFTM(3S)SCFFTM(3S)NAME
SCFFTM, DZFFTM, CSFFTM, ZDFFTM - Applies multiple real-to-complex or
complex-to-real Fast Fourier Transforms (FFTs)
SYNOPSIS
Single precision -> Single precision complex
Fortran:
CALL SCFFTM (isign, n, lot, scale, x, ldx, y, ldy, table, work,
isys)
C/C++:
#include <scsl_fft.h>
int scfftm (int isign, int n, int lot, float scale, float *x,
int ldx, scsl_complex *y, int ldy, float *table, float *work,
int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int scfftm (int isign, int n, int lot, float scale, float *x,
int ldx, complex<float> *y, int ldy, float *table, float *work,
int *isys);
Double precision -> Double precision complex
Fortran:
CALL DZFFTM (isign, n, lot, scale, x, ldx, y, ldy, table, work,
isys)
C/C++:
#include <scsl_fft.h>
int dzfftm (int isign, int n, int lot, double scale, double *x,
int ldx, scsl_zomplex *y, int ldy, double *table, double *work,
int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int dzfftm (int isign, int n, int lot, double scale, double *x,
int ldx, complex<double> *y, int ldy, double *table, double
*work, int *isys);
Single precision complex -> Single precision
Fortran:
CALL CSFFTM (isign, n, lot, scale, x, ldx, y, ldy, table, work,
isys)
C/C++:
#include <scsl_fft.h>
int csfftm (int isign, int n, int lot, float scale,
Page 1
SCFFTM(3S)SCFFTM(3S)
scsl_complex *x, int ldx, float *y, int ldy, float *table,
float *work, int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int csfftm (int isign, int n, int lot, float scale,
complex<float> *x, int ldx, float *y, int ldy, float *table,
float *work, int *isys);
Double precision complex -> Double precision
Fortran:
CALL ZDFFTM (isign, n, lot, scale, x, ldx, y, ldy, table, work,
isys)
C/C++:
#include <scsl_fft.h>
int zdfftm (int isign, int n, int lot, double scale,
scsl_zomplex *x, int ldx, double *y, int ldy, double *table,
double *work, int *isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
int zdfftm (int isign, int n, int lot, double scale,
complex<double> *x, int ldx, double *y, int ldy, double *table,
double *work, int *isys);
IMPLEMENTATION
These routines are part of the SCSL Scientific Library and can be loaded
using either the -lscs or the -lscs_mp option. The -lscs_mp option
directs the linker to use the multi-processor version of the library.
When linking to SCSL with -lscs or -lscs_mp, the default integer size is
4 bytes (32 bits). Another version of SCSL is available in which integers
are 8 bytes (64 bits). This version allows the user access to larger
memory sizes and helps when porting legacy Cray codes. It can be loaded
by using the -lscs_i8 option or the -lscs_i8_mp option. A program may use
only one of the two versions; 4-byte integer and 8-byte integer library
calls cannot be mixed.
The C and C++ prototypes shown above are appropriate for the 4-byte
integer version of SCSL. When using the 8-byte integer version, the
variables of type int become long long and the <scsl_fft_i8.h> header
file should be included.
DESCRIPTION
SCFFTM/DZFFTM computes the FFT of each column of the real matrix X, and
it stores the results in the corresponding column of the complex matrix
Y. CSFFTM/ZDFFTM computes the corresponding inverse transforms.
Page 2
SCFFTM(3S)SCFFTM(3S)
In FFT applications, it is customary to use zero-based subscripts; the
formulas are simpler that way. First, the function of SCFFTM is
described. Suppose that the arrays are declared as follows:
Fortran:
REAL X(0:ldx-1, 0:lot-1)
COMPLEX Y(0:ldy-1, 0:lot-1)
C/C++:
float x[lot][ldx];
scsl_complex y[lot][ldy];
C++ STL:
float x[lot][ldx];
complex<float> y[lot][ldy];
where ldx >= n, ldy >= n/2 + 1.
Then column L of the output array is the FFT of column L of the input
array, using the following formula for the FFT:
n-1
Y(k, L) = scale * Sum [ X(j, L)*w**(isign*j*k) ]
j=0
for k = 0, ..., n/2
L = 0, ..., lot-1
where:
w =exp(2*pi*i/n),
i = + sqrt(-1)
pi = 3.14159...,
isign = +1 or -1,
lot = the number of columns to transform
Different authors use different conventions for which transform
(isign = +1 or isign = -1) is used in the real-to-complex case, and what
the scale factor should be. Some adopt the convention that isign = 1 for
the real-to-complex transform, and isign = -1 for the complex-to-real
inverse. Others use the opposite convention. You can make these
Page 3
SCFFTM(3S)SCFFTM(3S)
routines compute any of the various possible definitions, however, by
choosing the appropriate values for isign and scale.
The relevant fact from FFT theory is this: If you use SCFFTM to take the
real-to-complex FFT, using any particular values of isign and scale, the
mathematical inverse function is computed by using CSFFTM with -isign and
1/(n*scale). In particular, if you call SCFFTM with isign = +1 and scale
= 1.0, you can use CSFFTM to compute the inverse complex-to-real FFT by
using isign = -1 and scale = 1.0/n.
See the NOTES section of this man page for information about the
interpretation of the data types described in the following arguments.
These routines have the following arguments:
isign Integer. (input)
Specifies whether to initialize the table array or do the
forward or inverse Fourier transform, as follows:
If isign = 0, the routine initializes table and returns. In
this case, the only arguments used or checked are isign, n, and
table.
If isign = +1 or -1, the value of isign is the sign of the
exponent used in the FFT formula.
n Integer. (input)
Size of the transforms (the number of elements in each column
of the input matrix to be transformed). If n is not positive,
SCFFTM or CSFFTM returns without computing a transform.
lot Integer. (input)
The number of transforms to be computed (or "lot size"). This
is the number of elements in each row of the input and output
matrix. If lot is not positive, the routine returns without
computing a transform.
scale Scale factor. (input)
SCFFTM: Single precision.
DZFFTM: Double precision.
CSFFTM: Single precision.
ZDFFTM: Double precision.
Each element of the output array is multiplied by scale after
taking the transform, as defined in the preceding formula.
x Array of dimensions (ldx, lot). (input)
SCFFTM: Single precision array.
DZFFTM: Double precision array.
CSFFTM: Single precision complex array.
ZDFFTM: Double precision complex array.
Input array of values to be transformed.
Page 4
SCFFTM(3S)SCFFTM(3S)
ldx Integer. (input)
The number of rows in the x array, as it was declared in the
calling program (the leading dimension of x).
SCFFTM, DZFFTM: ldx >= MAX(n, 1).
CSFFTM, ZDFFTM: ldx >= MAX(n/2 + 1, 1).
y Array of dimensions (ldy, lot). (input or output)
SCFFTM: Single precision complex array.
DZFFTM: Double precision complex array.
CSFFTM: Single precision array.
ZDFFTM: Double precision array.
Output array of transformed values. Each column of the output
array, y, is the FFT of the corresponding column of the input
array, x, computed according to the preceding formula.
The output array may be equivalenced to the input array. In
that case, the transform is done in place and the input array
is overwritten with the transformed values. In this case, the
following conditions on the leading dimensions must hold:
SCFFTM, DZFFTM: ldx = 2ldy.
CSFFTM, ZDFFTM: ldy = 2ldx.
ldy Integer. (input)
Number of rows in the y array, as declared in the calling
program (the leading dimension of y).
SCFFTM, DZFFTM: ldy >= MAX(n/2 + 1, 1).
CSFFTM, ZDFFTM: ldy >= MAX(n, 1).
table Array of dimension (n + NFR) (input or output)
SCFFTM, CSFFTM: Single precision array.
DZFFTM, ZDFFTM: Double precision array.
Table of factors and roots of unity. See the description of
the isys argument for the value of NFR.
If isign = 0, the routine initializes table (table is output
only).
If isign = +1 or -1, the values in table are assumed to be
initialized already by a prior call with isign = 0 (table is
input only).
work Array of dimension n + 2
SCFFTM, CSFFTM: Single precision array.
DZFFTM, ZDFFTM: Double precision array.
Work array used for intermediate calculations. Its address
space must be different from that of the input and output
arrays.
Page 5
SCFFTM(3S)SCFFTM(3S)
isys Integer array dimensioned 0..isys(0).
An array that gives implementation-specific information. All
features and functions of the FFT routines specific to any
particular implementation are confined to this isys array.
In the Origin series implementation, isys(0)=0 and isys(0)=1
are supported. In SCSL versions prior to 1.3, only isys(0)=0
was allowed. For isys(0)=0, NFR=15, and for isys(0)=1, NFR=256.
The NFR words of storage in the table array contain a
factorization of the length of the transform.
The smaller value of NFR for isys(0)=0 is historical. It is too
small to store all the required factors for the highest
performing FFT, so when isys(0)=0, extra space is allocated
when the table array is initialized. To avoid memory leaks,
this extra space must be deallocated when the table array is no
longer needed. The SCFFTMF routine is used to release this
memory. Due to the potential for memory leaks, the use of
isys(0)=0 should be avoided.
For isys(0)=1, the value of NFR is large enough so that no
extra memory needs to be allocated, and there is no need to
call SCFFTMF to release memory. If called, it does nothing.
NOTE: isys(0)=1 means that isys is an integer array with two
elements. The second element, isys(1), will not be accessed.
NOTES
The following data types are described in this documentation:
Term Used Data type
Fortran:
Array dimensioned 0..n-1 x(0:n-1)
Array of dimensions (m,n) x(m,n)
Array of dimensions (m,n,p) x(m,n,p)
Integer INTEGER (INTEGER*8 for -lscs_i8[_mp])
Single precision REAL
Double precision DOUBLE PRECISION
Single precision complex COMPLEX
Double precision complex DOUBLE COMPLEX
Page 6
SCFFTM(3S)SCFFTM(3S)
C/C++:
Array dimensioned 0..n-1 x[n]
Array of dimensions (m,n) x[m*n] or x[n][m]
Array of dimensions (m,n,p) x[m*n*p] or x[p][n][m]
Integer int (long long for -lscs_i8[_mp])
Single precision float
Double precision double
Single precision complex scsl_complex
Double precision complex scsl_zomplex
C++ STL:
Array dimensioned 0..n-1 x[n]
Array of dimensions (m,n) x[m*n] or x[n][m]
Array of dimensions (m,n,p) x[m*n*p] or x[p][n][m]
Integer int (long long for -lscs_i8[_mp])
Single precision float
Double precision double
Single precision complex complex<float>
Double precision complex complex<double>
CAUTIONS
Transform sizes with a prime factor exceeding 232-1 are not supported for
the 8-byte integer version of the library.
In addition to the work array, the FFT routines also dynamically allocate
scratch space from the stack. The amount of space allocated can be
slightly bigger than the size of the largest processor cache. For single
processor runs, the default stack size is large enough that these
allocations generally cause no problems. But for parallel runs, you need
to ensure that the stack size of slave threads is big enough to hold this
scratch space. Failure to reserve sufficient stack space will cause
programs to dump core due to stack overflows. The stack size of MP
library slave threads is controlled via the MP_SLAVE_STACKSIZE
environment variable or the mp_set_slave_stacksize() library routine. See
the mp(3C), mp(3F) and pe_environ(5) reference pages for more information
on controlling the slave stack size. For pthreads applications, the
Page 7
SCFFTM(3S)SCFFTM(3S)
thread's stack size is specified as one of many creation attributes
provided in the pthread_attr_t argument to pthread_create(3P). The
stacksize attribute should be set explicitly to a non-default value using
the pthread_attr_setstacksize(3P) call, described in the
pthread_attr_init(3P) man page.
Care must be exercised if copies of the table array are used: even though
a copy exists, the original must persist. As an example, the following
code will not work:
#include <scsl_fft.h>
float x[56][129];
scsl_complex y[56][65];
float table[128 + 256];
float work[128+2];
int isys[2];
isys[0] = 1;
{
float table_orig[128+256];
scfftm(0, 128, 50, 1.0f, (float *) x, 129,
(scsl_complex *) y, 65, table_orig, work, isys);
bcopy(table_orig, table, (128+256)*sizeof(float));
}
scfftm(1, 128, 50, 1.0f, (float *) x, 129,
(scsl_complex *) y, 65, table, work, isys);
In this example, because table_orig is a stack variable that does not
persist outside of the code block delimited by the braces, the data in
the copy, table, are not guaranteed to be valid. However, the following
code will work because table_orig is persistent:
#include <scsl_fft.h>
float x[56][129];
scsl_complex y[56][65];
float table_orig[128+256];
float table[128 + 256];
float work[128+2];
int isys[2];
isys[0] = 1;
scfftm(0, 128, 50, 1.0f, (float *) x, 129,
(scsl_complex *) y, 65, table_orig, work, isys);
bcopy(table_orig, table, (128+256)*sizeof(float));
scfftm(1, 128, 50, 1.0f, (float *) x, 129,
(scsl_complex *) y, 65, table, work, isys);
EXAMPLES
These examples use the table and workspace sizes appropriate to the
Origin series.
Page 8
SCFFTM(3S)SCFFTM(3S)
Example 1: Initialize the TABLE array in preparation for doing an FFT of
size 128. In this case only the isign, n, and table arguments are used;
you may use dummy arguments or zeros for the other arguments in the
subroutine call.
Fortran:
REAL TABLE(128 + 256)
INTEGER ISYS(0:1)
ISYS(0) = 1
CALL SCFFTM(0, 128, 1, 0.0, DUMMY, 1, DUMMY, 1,
& TABLE, DUMMY, ISYS)
C/C++:
#include <scsl_fft.h>
float table[128 + 256];
int isys[2];
isys[0] = 1;
scfftm(0, 128, 0, 0.0f, NULL, 1, NULL, 1, table, NULL, isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
float table[128 + 256];
int isys[2];
isys[0] = 1;
scfftm(0, 128, 0, 0.0f, NULL, 1, NULL, 1, table, NULL, isys);
Example 2: X is a real array of dimension (0...128, 0...55), and Y is a
complex array of dimension (0...64, 0...55). The first 128 elements in
each column of X contain data; the extra element forces an odd leading
dimension. Take the FFT of the first 50 columns of X and store the
results in the first 50 columns of Y. Before taking the FFT, initialize
the TABLE array, as in example 1.
Fortran:
REAL X(0:128, 0:55)
COMPLEX Y(0:64, 0:55)
REAL TABLE(128 + 256)
REAL WORK(128+2)
INTEGER ISYS(0:1)
ISYS(0) = 1
CALL SCFFTM(0, 128, 50, 1.0, X, 129, Y, 65, TABLE, WORK, ISYS)
CALL SCFFTM(1, 128, 50, 1.0, X, 129, Y, 65, TABLE, WORK, ISYS)
Page 9
SCFFTM(3S)SCFFTM(3S)
C/C++:
#include <scsl_fft.h>
float x[56][129];
scsl_complex y[56][65];
float table[128 + 256];
float work[128+2];
int isys[2];
isys[0] = 1;
scfftm(0, 128, 50, 1.0f, (float *) x, 129,
(scsl_complex *) y, 65, table, work, isys);
scfftm(1, 128, 50, 1.0f, (float *) x, 129,
(scsl_complex *) y, 65, table, work, isys);
C++ STL:
#include <complex.h>
#include <scsl_fft.h>
float x[56][129];
complex<float> y[56][65];
float table[128 + 256
int isys[2];
isys[0] = 1;
scfftm(0, 128, 50, 1.0f, (float *) x, 129,
(complex<float> *) y, 65, table, work, isys);
scfftm(1, 128, 50, 1.0f, (float *) x, 129,
(complex<float> *) y, 65, table, work, isys);
Example 3: With X and Y as in example 2, take the inverse FFT of Y and
store it back in X. Note that the leading dimension of X must be
increased to 2*ldy. The scale factor 1.0/128.0 is used. Assume that the
TABLE array is initialized already.
Fortran:
REAL X(0:129, 0:55)
COMPLEX Y(0:64, 0:55)
...
CALL CSFFTM(-1, 128, 50, 1.0/128.0, Y, 65, X, 130,
& TABLE, WORK, ISYS)
C/C++:
float x[56][130];
scsl_complex y[56][65];
csfftm(-1, 128, 50, 1.0f/128.0f, (scsl_complex *) y, 65,
(float *) x, 130, table, work, isys);
Page 10
SCFFTM(3S)SCFFTM(3S)
C++ STL:
float x[56][130];
complex<float> y[56][65];
csfftm(-1, 128, 50, 1.0f/128.0f, (complex<float> *) y, 65,
(float *) x, 130, table, work, isys);
Example 4: Perform the same computation as in example 2, but equivalence
the input and output arrays to save storage space. In this case, a row
must be added to X, because it is equivalenced to a complex array. Use
the 8-byte integer version of SCSL.
Fortran:
REAL X(0:129, 0:55)
COMPLEX Y(0:64, 0:55)
EQUIVALENCE ( X(0, 0), Y(0, 0) )
REAL TABLE(128 + 256)
REAL WORK(128+2)
INTEGER*8 ISYS(0:1)
ISYS(0) = 1_8
CALL SCFFTM(0_8, 128_8, 50_8, 1.0, X, 130_8, Y, 65_8,
& TABLE, WORK, ISYS)
CALL SCFFTM(1_8, 128_8, 50_8, 1.0, X, 130_8, Y, 65_8,
& TABLE, WORK, ISYS)
C/C++:
#include <scsl_fft_i8.h>
float *x;
scsl_complex y[56][65];
float table[128 + 256];
float work[128+2];
long long isys[2];
isys[0] = 1LL;
x = (float *) &y[0][0];
scfftm(0LL, 128LL, 50LL, 1.0f, x, 130LL, (scsl_complex *) y, 65LL,
table, work, isys);
scfftm(1LL, 128LL, 50LL, 1.0f, x, 130LL, (scsl_complex *) y, 65LL,
table, work, isys);
C++ STL:
#include <complex.h>
#include <scsl_fft_i8.h>
float *x;
complex<float> y[56][65];
float table[128 + 256];
float work[128+2];
Page 11
SCFFTM(3S)SCFFTM(3S)
long long isys[2];
isys[0] = 1LL;
x = (float *) &y[0][0];
scfftm(0LL, 128LL, 50LL, 1.0f, x, 130LL,
(complex<float> *) y, 65LL, table, work, isys);
scfftm(1LL, 128LL, 50LL, 1.0f, x, 130LL,
(complex<float> *) y, 65LL, table, work, isys);
Example 5: Perform the same computation as in example 2, but assume that
the lower bound of each Fortran array is 1, rather than 0. No change is
made in the subroutine calls.
Fortran:
REAL X(129, 56)
COMPLEX Y(65, 56)
CALL SCFFTM(0, 128, 50, 1.0, X, 129, Y, 65, TABLE, WORK,= ISYS)
CALL SCFFTM(1, 128, 50, 1.0, X, 129, Y, 65, TABLE, WORK, ISYS)
SEE ALSOINTRO_FFT(3S), INTRO_SCSL(3S), CCFFT(3S), CCFFTM(3S), SCFFT(3S)
Page 12