INTRO_CBLAS(3S)
NAME
INTRO_CBLAS - Introduction to the C interface to Fortran 77 Basic Linear
Algebra Subprograms (legacy BLAS)
IMPLEMENTATION
See individual man pages for operating system and hardware availability.
DESCRIPTION
The SCSL Scientific Library provides two C/C++ interfaces to the Fortran
77 Basic Linear Algebra Subprograms (legacy BLAS). This man page
describes a C interface proposed by the Basic Linear Algebra Subprograms
Technical (BLAST) Forum as well as several SCSL extensions to that
standard. An alternative C/C++ interface, similar to that implemented
for the SCSL signal processing library, is described in individual BLAS
man pages.
Header Files
To use the CBLAS interface, a program must include the header file
cblas.h:
#include <cblas.h>
For compatibility with SCSL releases prior to version 1.3, the
scsl_cblas.h header file may be used instead of cblas.h.
Naming Conventions
Names of the CBLAS routines are obtained from their legacy BLAS
counterparts by prefixing the name with cblas_ and converting to lower
case. For example, the routine DGEMM becomes cblas_dgemm.
Character Arguments
Arguments which were characters in the Fortran 77 interface are handled
by enumerated types in the CBLAS interface, as shown in the following
table.
Fortran interface            CBLAS interface
Character Argument   Value   Enumerated type   Value
SIDE                 'L'     CBLAS_SIDE        CblasLeft
                     'R'                       CblasRight
UPLO                 'U'     CBLAS_UPLO        CblasUpper
                     'L'                       CblasLower
DIAG                 'N'     CBLAS_DIAG        CblasNonUnit
                     'U'                       CblasUnit
TRANSPOSE            'N'     CBLAS_TRANSPOSE   CblasNoTrans
                     'T'                       CblasTrans
                     'C'                       CblasConjTrans
                             CBLAS_ORDER       CblasRowMajor
                                               CblasColMajor
The last enumerated type listed above, CBLAS_ORDER, has no Fortran
counterpart. It is used as an additional argument to all routines
involving two-dimensional arrays, as discussed in the following section.
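The mapping in the table above can be sketched as a small translation helper, which may be useful when porting code that carries Fortran-style character flags. This is an illustration only: trans_from_char is a hypothetical name, and the enumerated type is redeclared locally so the fragment compiles on its own. Its numeric values follow the netlib reference header and are an assumption as far as SCSL is concerned; a real program would include cblas.h instead.

```c
#include <assert.h>

/* Local stand-in for the type declared in cblas.h. The numeric values
 * follow the netlib reference header (an assumption for SCSL). */
enum CBLAS_TRANSPOSE { CblasNoTrans = 111, CblasTrans = 112, CblasConjTrans = 113 };

/* Hypothetical helper: translate a Fortran TRANS character into the
 * corresponding CBLAS enumerated value. Returns 0 on success and -1 for
 * an unrecognized character. */
int trans_from_char(char c, enum CBLAS_TRANSPOSE *out)
{
    assert(out != 0);
    switch (c) {
    case 'N': case 'n': *out = CblasNoTrans;   return 0;
    case 'T': case 't': *out = CblasTrans;     return 0;
    case 'C': case 'c': *out = CblasConjTrans; return 0;
    default:            return -1;
    }
}
```

The same pattern would apply to SIDE, UPLO, and DIAG with their respective enumerated types.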
Array Arguments
Array elements are required to be contiguous in memory. All legacy BLAS
routines which take one or more two-dimensional arrays as arguments have
an extra argument in the CBLAS interface. First in the argument list,
this parameter is of the enumerated type:
enum CBLAS_ORDER {CblasRowMajor=101, CblasColMajor=102};
CblasRowMajor indicates that elements within a row of the array(s) are
contiguous in memory while elements within array columns are offset by a
constant stride. The stride parameter is equivalent to the leading
dimension (LDA) in the Fortran 77 interface.
Similarly, CblasColMajor indicates that elements within a column of the
array(s) are contiguous in memory while elements within array rows are
offset by a constant stride.
The CBLAS_ORDER parameter applies to all array operands in a routine.
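As a minimal sketch of the two storage orders, the helper below computes where element (i, j) lives in memory for a given stride. The name elem_offset is hypothetical and indices are zero-based; the enumerated type is repeated here only so that the fragment is self-contained.

```c
#include <assert.h>
#include <stddef.h>

enum CBLAS_ORDER {CblasRowMajor=101, CblasColMajor=102};  /* as in cblas.h */

/* Hypothetical helper: offset (in elements) of the zero-based element
 * (i, j) in an array whose stride (leading dimension) is ld.
 * Row-major:    consecutive j are adjacent, rows are ld apart.
 * Column-major: consecutive i are adjacent, columns are ld apart. */
size_t elem_offset(enum CBLAS_ORDER order, size_t i, size_t j, size_t ld)
{
    assert(order == CblasRowMajor || order == CblasColMajor);
    return (order == CblasRowMajor) ? i * ld + j : i + j * ld;
}
```

For a 20 x 30 matrix stored row-major the stride is 30 (or more), while the same matrix stored column-major has a stride of 20 (or more).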
Complex Data Types
The BLAST standard does not define a complex data type for use in
routines having complex arguments. Instead, all complex scalars and
arrays are prototyped as void *. This has the advantage of allowing the
use of any complex data structure without warnings from the compiler,
provided that the structure meets the specifications described below.
The disadvantage, however, is that the compiler will not catch type
mismatches.
Any C/C++ complex data type used in conjunction with the CBLAS interface
must satisfy the following requirements:
1. The real and imaginary components must be contiguous in memory.
2. Sequential array elements must also be contiguous in memory.
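The two requirements above can be checked for a candidate type with a few size and offset comparisons. The sketch below assumes a user-defined single precision type; my_complex, my_complex_layout_ok, and as_floats are illustrative names, not part of SCSL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-defined complex type. */
typedef struct { float re; float im; } my_complex;

/* Returns 1 if my_complex is laid out as two adjacent floats with no
 * padding, so that an array of n elements occupies 2*n contiguous floats. */
int my_complex_layout_ok(void)
{
    return offsetof(my_complex, re) == 0
        && offsetof(my_complex, im) == sizeof(float)
        && sizeof(my_complex) == 2 * sizeof(float);
}

/* View an array of my_complex as interleaved floats: re0, im0, re1, ... */
const float *as_floats(const my_complex *z)
{
    assert(my_complex_layout_ok());
    return (const float *) z;
}
```

A type that fails this check (for example, one with extra members or padding between the components) should not be passed to the CBLAS routines.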
As an extension to the BLAST standard, SCSL provides support for stronger
type checking for complex arguments. To enable this, define
SCSL_NO_VOID_ARGS before including the CBLAS header file (for example, at
compile time with -DSCSL_NO_VOID_ARGS as an argument or with an explicit
#define SCSL_NO_VOID_ARGS in the source code). With this definition, the
default behavior is as follows:
* For C++ code in which the complex standard template library (STL) is
used, single precision complex arguments are prototyped as
complex<float> * and double precision complex arguments are
prototyped as complex<double> *.
* Otherwise, single precision complex arguments are prototyped as
scsl_complex * and double precision complex arguments are prototyped
as scsl_zomplex * for both C and C++. The SCSL complex types are
defined as follows:
typedef struct { float re; float im; } scsl_complex;
typedef struct { double re; double im; } scsl_zomplex;
Strong type checking can also be enabled in programs that employ their own
(non-SCSL, non-C++ STL) complex types. To do this, define
SCSL_USER_COMPLEX_T=my_complex and SCSL_USER_ZOMPLEX_T=my_zomplex, where
my_complex and my_zomplex are the names of the user-defined single and
double precision complex types. These macros and complex types, as well as
SCSL_NO_VOID_ARGS, must be defined before including the CBLAS header file
(see Example 5 later in this man page).
Routines that Return Indices
Following the array indexing convention of Fortran 77, the legacy BLAS
return indices in the range 1 <= i <= n, where n is the number of entries
and i is the index. This allows the returned indices to be used to index
standard Fortran arrays directly. For the same reason, the C interface
follows the zero-based indexing of C and returns indices in the range
0 <= i < n. Functions that return an index are I[SDCZ]AMAX, I[SDCZ]AMIN,
I[SD]MAX and I[SD]MIN, which are declared to be of type CBLAS_INDEX.
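The zero-based convention can be illustrated with a plain C stand-in for the ISAMAX operation. This is a sketch under stated assumptions, not the SCSL routine: ref_isamax and abs_f are hypothetical names, size_t is used in place of CBLAS_INDEX, and n and incx are assumed to be positive.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value of a float, written out to avoid a math library dependency. */
static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical reference routine: zero-based position of the first
 * element of largest absolute value among x[0], x[incx], x[2*incx], ...
 * The Fortran-convention result would be this value plus one. */
size_t ref_isamax(size_t n, const float *x, size_t incx)
{
    size_t i, best = 0;
    assert(n > 0 && incx > 0 && x != NULL);
    for (i = 1; i < n; i++)
        if (abs_f(x[i * incx]) > abs_f(x[best * incx]))
            best = i;
    return best;
}
```

Because the result is zero-based, x[ref_isamax(n, x, 1)] addresses the selected element directly, with no adjustment.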
Routines that Return Complex Values
For each routine returning a complex value ([CZ]DOTC, [CZ]DOTU, [CZ]SUM)
the BLAST standard defines a subroutine that returns a pointer to the
result as the last parameter of the argument list. All other arguments
are otherwise the same. The name of the subroutine is obtained by
appending _sub to the CBLAS name; for example, CDOTC becomes
cblas_cdotc_sub.
In the SCSL implementation, complex functions can also be called directly,
provided that SCSL_NO_VOID_ARGS is defined. In that case the function
returns a structure of the appropriate type, and the function naming and
calling conventions are the same as those for real functions (i.e., _sub
is not appended to the name, and no extra parameter is added).
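To make the _sub calling pattern concrete, the following sketch computes an unconjugated dot product and stores it through the last argument, in the same argument order as cblas_cdotu_sub. It is a hypothetical reference routine (ref_cdotu_sub), not the library implementation; cplx_f is an illustrative type, and positive increments are assumed.

```c
#include <assert.h>

/* Illustrative complex type: two adjacent floats. */
typedef struct { float re; float im; } cplx_f;

/* Hypothetical reference routine with the shape of cblas_cdotu_sub:
 * the complex result is written to *dotu instead of being returned. */
void ref_cdotu_sub(const int n, const void *x, const int incx,
                   const void *y, const int incy, void *dotu)
{
    const float *xf = (const float *) x;
    const float *yf = (const float *) y;
    float *r = (float *) dotu;
    float re = 0.0f, im = 0.0f;
    int i;

    assert(n >= 0 && incx > 0 && incy > 0);
    for (i = 0; i < n; i++) {
        const float *a = xf + 2 * i * incx;   /* a[0] = re, a[1] = im */
        const float *b = yf + 2 * i * incy;
        re += a[0] * b[0] - a[1] * b[1];
        im += a[0] * b[1] + a[1] * b[0];
    }
    r[0] = re;
    r[1] = im;
}
```

A caller declares a result variable of a suitable complex type and passes its address as the final argument, for example ref_cdotu_sub(n, x, 1, y, 1, &result).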
Other Interface Notes
Input-only arguments are declared with the const modifier.
Non-complex scalar input arguments are passed by value. This allows the
user to put in constants when desired.
Array arguments are passed by address.
Output scalar arguments are passed by address.
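These conventions can be seen together in a routine shaped like cblas_saxpy. The sketch below is a hypothetical plain C stand-in (ref_saxpy), written only to show which arguments are passed by value, which by address, and where const appears; positive increments are assumed.

```c
#include <assert.h>

/* Hypothetical reference routine: y := alpha*x + y.
 * n, alpha, incx, incy: input scalars, passed by value, declared const.
 * x: input array, passed by address, declared const.
 * y: input/output array, passed by address, not const. */
void ref_saxpy(const int n, const float alpha,
               const float *x, const int incx,
               float *y, const int incy)
{
    int i;
    assert(n >= 0 && incx > 0 && incy > 0);
    for (i = 0; i < n; i++)
        y[i * incy] += alpha * x[i * incx];
}
```

Because alpha is passed by value, a literal such as 2.0f can be written directly in the call, as in ref_saxpy(3, 2.0f, x, 1, y, 1).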
The CBLAS routines can be loaded at compile time using either the -lscs
or the -lscs_mp option. The -lscs_mp option directs the linker to use
the multi-processor version of SCSL.
When linking to SCSL with -lscs or -lscs_mp, the default integer size is
4 bytes (32 bits). Another version of the library is available in which
integers are 8 bytes (64 bits). This version allows the user access to
larger memory sizes and helps when porting legacy Cray codes. It can be
loaded by using the -lscs_i8 option or the -lscs_i8_mp option. A program
may use only one of the two versions; 4-byte integer and 8-byte integer
library calls cannot be mixed.
When using the 8-byte integer version, variables of type int become long
long and the cblas_i8.h header file should be included.
EXAMPLES
Example 1: Multiply a real 10 x 20 matrix by a real 20 x 30 matrix. Use
the "natural" form for C arrays.
#include <cblas.h>
float a[10][20], b[20][30], c[10][30];
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 10, 30,
20, 1.0f, a, 20, b, 30, 0.0f, c, 30);
Example 2: Multiply a real 10 x 20 matrix by a real 20 x 30 matrix. Use
8-byte integers and column-major array ordering.
#include <cblas_i8.h>
float a[20][10], b[30][20], c[30][10];
cblas_sgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, 10LL, 30LL,
20LL, 1.0f, a, 10LL, b, 20LL, 0.0f, c, 10LL);
Examples 1 and 2 will result in a warning message when compiled as C code
and an error message when compiled as C++ code, because a, b, and c are
declared as two-dimensional arrays while the corresponding parameters are
prototyped as float *. There are several ways to avoid these problems,
perhaps the easiest of which is to make explicit casts when calling the
CBLAS routine:
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 10, 30,
20, 1.0f, (float *) a, 20, (float *) b, 30, 0.0f,
(float *) c, 30);
Another solution is to declare a, b, and c as the following:
float a[10*20], b[20*30], c[10*30];
Of course, in this case two-dimensional indexing is no longer possible.
For example, if we assume that b has 20 rows and 30 columns, then the ith
element of the jth column of b must be referenced as b[i*30+j] rather
than b[i][j].
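The flat indexing described above can be sketched with a naive row-major multiply, which also shows how the dimension and stride arguments of Example 1 relate to the arrays. This is a hypothetical reference routine (ref_matmul_rowmajor) computing C := A*B only; it is not how SCSL implements cblas_sgemm and it ignores alpha, beta, and transposition.

```c
#include <assert.h>

/* Hypothetical reference multiply for flat, row-major arrays.
 * A is m x k with stride lda, B is k x n with stride ldb,
 * C is m x n with stride ldc. Element (i, j) of an array with
 * stride ld is stored at offset i*ld + j. */
void ref_matmul_rowmajor(int m, int n, int k,
                         const float *a, int lda,
                         const float *b, int ldb,
                         float *c, int ldc)
{
    int i, j, p;
    assert(lda >= k && ldb >= n && ldc >= n);
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            float sum = 0.0f;
            for (p = 0; p < k; p++)
                sum += a[i * lda + p] * b[p * ldb + j];
            c[i * ldc + j] = sum;
        }
    }
}
```

With the arrays of Example 1 declared in flat form, the corresponding call would be ref_matmul_rowmajor(10, 30, 20, a, 20, b, 30, c, 30).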
Note that the following is acceptable:
#include <stdlib.h>
float *a, *b, *c;
a = (float *) malloc(10 * 20 * sizeof(float));
b = (float *) malloc(20 * 30 * sizeof(float));
c = (float *) malloc(10 * 30 * sizeof(float));
The following gives unpredictable results: each row is allocated
separately, so the array elements are not contiguous in memory.
#include <stdlib.h>
float *a[10], *b[20], *c[10];
int i;
for (i = 0; i < 10; i++) {
    a[i] = (float *) malloc(20 * sizeof(float));
    b[2*i] = (float *) malloc(30 * sizeof(float));
    b[2*i+1] = (float *) malloc(30 * sizeof(float));
    c[i] = (float *) malloc(30 * sizeof(float));
}
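If a[i][j] style indexing is wanted together with contiguous storage, one possible approach is to allocate a single block and point the row pointers into it. The sketch below assumes row-major storage; alloc_contiguous is a hypothetical helper, not part of SCSL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate rows*cols floats as ONE contiguous block
 * and set rowptr[0..rows-1] so that rowptr[i][j] addresses element (i, j).
 * Returns the block, which is the pointer to pass to a CBLAS routine and
 * later to free(), or NULL if the allocation fails. */
float *alloc_contiguous(size_t rows, size_t cols, float **rowptr)
{
    float *block;
    size_t i;

    assert(rowptr != NULL && rows > 0 && cols > 0);
    block = (float *) malloc(rows * cols * sizeof(float));
    if (block == NULL)
        return NULL;
    for (i = 0; i < rows; i++)
        rowptr[i] = block + i * cols;   /* row i starts i*cols elements in */
    return block;
}
```

The returned block, not the array of row pointers, is what would be passed as the matrix argument, with cols as the stride.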
Example 3: Multiply a complex 10 x 20 matrix by a complex 20 x 30
matrix. Use the C++ STL and row-major ordering.
#include <complex.h>
#include <cblas.h>
complex<float> a[10][20], b[20][30], c[10][30];
complex<float> alpha(1.0,0.0);
complex<float> beta(0.0,0.0);
cblas_cgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 10, 30,
20, &alpha, a, 20, b, 30, &beta, c, 30);
Because complex arguments are prototyped as void * by default, the
multidimensional array declarations in the example above will not result
in any type mismatches at compile time. In the following strong type
checking examples the complex matrices are stored in explicitly one-
dimensional form.
Example 4: Multiply a complex 10 x 20 matrix by a complex 20 x 30
matrix. Use the SCSL complex type and strong type checking.
#define SCSL_NO_VOID_ARGS
#include <cblas.h>
scsl_complex a[10*20], b[20*30], c[10*30];
scsl_complex alpha = {1.0, 0.0};
scsl_complex beta = {0.0, 0.0};
cblas_cgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 10, 30,
20, &alpha, a, 20, b, 30, &beta, c, 30);
Example 5: Multiply a complex 10 x 20 matrix by a complex 20 x 30
matrix. Define your own complex type.
#define SCSL_NO_VOID_ARGS
#define SCSL_USER_COMPLEX_T CBLAS_COMPLEX
#define SCSL_USER_ZOMPLEX_T CBLAS_ZOMPLEX
typedef struct { float real; float imag; } CBLAS_COMPLEX;
typedef struct { double real; double imag; } CBLAS_ZOMPLEX;
#include <cblas.h>
CBLAS_COMPLEX a[10*20], b[20*30], c[10*30];
CBLAS_COMPLEX alpha = {1.0, 0.0};
CBLAS_COMPLEX beta = {0.0, 0.0};
cblas_cgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 10, 30,
20, &alpha, a, 20, b, 30, &beta, c, 30);
SEE ALSO
INTRO_SCSL(3S), INTRO_BLAS1(3S), INTRO_BLAS2(3S), INTRO_BLAS3(3S),
INTRO_LAPACK(3S)
The working document for the Basic Linear Algebra Subprograms (BLAS)
standard from the Basic Linear Algebra Subprograms Technical (BLAST)
Forum is available at
http://www.netlib.org/cgi-bin/checkout/blast/blast.pl.