The BLAS_GEMM procedure updates an existing matrix by adding a multiple of the product of two other matrices, according to the following vector operation:
M = alpha * op(A) * op(B) + beta * M
where alpha and beta are scale factors, A, B, and M are input matrices, and op(X) is one of X, XT, or XH. Here the * operator is the mathematician's matrix multiplication, rows of the first matrix times columns of the second, which is the same as the ## operator in IDL.
BLAS_GEMM is based on the following BLAS routines:
Output Type |
BLAS Routine |
Float |
sgemm |
Double |
dgemm |
Complex |
cgemm |
Double complex |
zgemm |
Note: BLAS_GEMM uses MKL, so it only supports output of type float, double, complex, or dcomplex. The input arguments can be integer types, at which point they will be converted to the type of the output argument C before being multiplied.
BLAS_GEMM can be faster and use less memory than the usual IDL array notation (e.g., M += A ## B) for updating existing arrays.
The # Operator vs. MATRIX_MULTIPLY vs. BLAS_GEMM
The following table illustrates how various operations are performed using the # operator versus the MATRIX_MULTIPLY function and BLAS_GEMM procedure:
# Operator |
MATRIX_MULTIPLY Function
|
BLAS_GEMM Procedure
|
M = A # B |
M = matrix_multiply(A, B)
|
blas_gemm, M, B, A
|
M = transpose(A) # B
|
M = matrix_multiply(A, B, /atranspose)
|
blas_gemm, M, B, A, btranspose=1
|
M = A # transpose(B)
|
M = matrix_multiply(A, B, /btranspose)
|
blas_gemm, M, B, A, atranspose=1
|
M = transpose(A) # transpose(B)
|
M = matrix_multiply(A, B, /atrans, /btrans)
|
blas_gemm, M, B, A, atrans=1, btrans=1
|
M = conj(transpose(A)) # B
|
M = matrix_multiply(conj(A), B, /atranspose)
|
blas_gemm, M, B, A, btranspose=-1
|
M = A # conj(transpose(B))
|
M = matrix_multiply(A, conj(B), /btranspose)
|
blas_gemm, M, B, A, atranspose=-1
|
M = conj(transpose(A)) #transpose(B)
|
M = matrix_multiply(conj(A), B, /atrans, /btrans)
|
blas_gemm, M, B, A, atrans=1, btrans=-1
|
M = transpose(A) # conj(transpose(B))
|
M = matrix_multiply(A, conj(B), /atrans, /btrans)
|
blas_gemm, M, B, A, atrans=-1, btrans=1
|
M = conj(transpose(A)) # conj(transpose(B))
|
M = matrix_multiply(conj(A), conj(B), /atrans, /btrans)
|
blas_gemm, M, B, A, atrans=-1, btrans=-1
|
Note: BLAS_GEMM can also be used in place of the ## operator. For example, M = A ## B is equivalent to BLAS_GEMM, M, A, B, and M = A ## TRANSPOSE(B) is equivalent to BLAS_GEMM, M, A, B, /BTRANSPOSE.
Syntax
BLAS_GEMM, C, A, B [, ALPHA=value] [, BETA=value] [, ATRANSPOSE=value] [, BTRANSPOSE=value]
Arguments
C
The array to be updated. C must be of float, double, complex, or dcomplex type. BLAS_GEMM does not change the size and type of C.
A
The first matrix operand in the matrix product. A may be any array that IDL can convert to the type of C. BLAS_GEMM does not change A, but A will be internally converted to the type of C before multiplication.
B
The second matrix operand in the matrix product. B may be any array that IDL can convert to the type of C. BLAS_GEMM does not change B, but A will be internally converted to the type of C before multiplication.
NOTE: Based on the values of ATRANSPOSE and BTRANSPOSE, the arguments A and B must be compatible with each other to perform matrix multiplication, as well as the output argument C.
Keywords
ALPHA
An optional scaling factor to be multiplied with A and B. ALPHA may be any scalar that IDL can convert to the type of C. BLAS_GEMM does not change ALPHA. If ALPHA is not specified then a default value of 1.0 is used.
ATRANSPOSE
Set this keyword to a positive value to multiply using the transpose of A. Set this keyword to a negative value to multiply using the complex conjugate transpose of A.
Note: If C is of type float or double, then any non-zero value for ATRANSPOSE has the same effect of a normal transpose of A.
BETA
An optional scaling factor to be multiplied with C before adding the matrix product of A and B. BETA may be any scalar that IDL can convert to the type of C. BLAS_GEMM does not change BETA. If BETA is not specified, then a default value of 0.0 is used.
BTRANSPOSE
Set this keyword to a positive value to multiply using the transpose of B. Set this keyword to a negative value to multiply using the complex conjugate transpose of B.
Note: If C is of type float or double, then any non-zero value for BTRANSPOSE has the same effect of a normal transpose of B.
Thread Pool Keywords
This routine is written to make use of IDL’s thread pool, which can increase execution speed on systems with multiple CPUs. The values stored in the !CPU system variable control whether IDL uses the thread pool for a given computation. In addition, you can use the thread pool keywords TPOOL_MAX_ELTS, TPOOL_MIN_ELTS, and TPOOL_NOTHREAD to override the defaults established by !CPU for a single invocation of this routine. See Thread Pool Keywords for details.
Note: The thread pool keywords have no effect on Mac, MKA will always use multiple cores.
Examples
The following examples show how to use the BLAS_GEMM procedure to perform matrix multiplication, with scaling.
Create a multidimensional array:
A = RANDOMU(1, 4, 5)
Print A:
PRINT, A
IDL prints:
0.417022 0.997185 0.720325 0.932557
0.000114381 0.128124 0.302333 0.999041
0.146756 0.236089 0.0923386 0.396581
0.186260 0.387911 0.345561 0.669746
0.396767 0.935539 0.538817 0.846311
Create another multidimensional array:
B = RANDOMU(2, 4, 5)
Print B
PRINT, B
IDL prints:
0.435995 0.185082 0.0259262 0.931541
0.549662 0.947731 0.435322 0.484749
0.420368 0.320536 0.330335 0.154427
0.204649 0.698863 0.619271 0.119951
0.299655 0.485176 0.266827 0.632738
Create an array to store the product of A and B:
C = FLTARR(4, 4)
Print C
PRINT, C
IDL prints:
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
Set C equal to the product of A and B (i.e., C = TRANSPOSE(A) ## B):
BLAS_GEMM, C, A, B, /ATRANSPOSE
Print C:
PRINT, C
IDL prints:
0.400585 0.447005 0.280554 0.684583
0.964161 1.10666 0.649466 1.66597
0.751232 0.952367 0.538557 1.21421
1.51310 2.12521 1.23066 2.03007
Copy C into a temporary variable:
CPrime = C
Add the product of A and B to C (i.e., C += TRANSPOSE(A) ## B):
BLAS_GEMM, C, A, B, /ATRANSPOSE, BETA=1.0
Print C:
PRINT, C
IDL prints:
0.801170 0.894009 0.561108 1.36917
1.92832 2.21332 1.29893 3.33193
1.50246 1.90473 1.07711 2.42841
3.02620 4.25042 2.46132 4.06014
Show that C is now twice CPrime:
PRINT, C / CPrime
IDL prints:
2.00000 2.00000 2.00000 2.00000
2.00000 2.00000 2.00000 2.00000
2.00000 2.00000 2.00000 2.00000
2.00000 2.00000 2.00000 2.00000
Version History
See Also
MATRIX MULTIPLY