The BLAS_GEMM procedure updates an existing matrix by adding a multiple of the product of two other matrices, according to the following vector operation:

M = alpha * op(A) * op(B) + beta * M

where alpha and beta are scale factors, A, B, and M are input matrices, and op(X) is one of X, XT, or XH. Here the * operator is the mathematician's matrix multiplication, rows of the first matrix times columns of the second, which is the same as the ## operator in IDL.

BLAS_GEMM is based on the following BLAS routines:

 Output Type BLAS Routine Float sgemm Double dgemm Complex cgemm Double complex zgemm

Note: BLAS_GEMM uses MKL, so it only supports output of type float, double, complex, or dcomplex. The input arguments can be integer types, at which point they will be converted to the type of the output argument C before being multiplied.

BLAS_GEMM can be faster and use less memory than the usual IDL array notation (e.g., M += A ## B) for updating existing arrays.

### The # Operator vs. MATRIX_MULTIPLY vs. BLAS_GEMM

The following table illustrates how various operations are performed using the # operator versus the MATRIX_MULTIPLY function and BLAS_GEMM procedure:

 # Operator MATRIX_MULTIPLY Function BLAS_GEMM Procedure M = A # B M = matrix_multiply(A, B) blas_gemm, M, B, A M = transpose(A) # B M = matrix_multiply(A, B, /atranspose) blas_gemm, M, B, A, btranspose=1 M = A # transpose(B) M = matrix_multiply(A, B, /btranspose) blas_gemm, M, B, A, atranspose=1 M = transpose(A) # transpose(B) M = matrix_multiply(A, B, /atrans, /btrans) blas_gemm, M, B, A, atrans=1, btrans=1 M = conj(transpose(A)) # B M = matrix_multiply(conj(A), B, /atranspose) blas_gemm, M, B, A, btranspose=-1 M = A # conj(transpose(B)) M = matrix_multiply(A, conj(B), /btranspose) blas_gemm, M, B, A, atranspose=-1 M = conj(transpose(A)) #transpose(B) M = matrix_multiply(conj(A), B, /atrans, /btrans) blas_gemm, M, B, A, atrans=1, btrans=-1 M = transpose(A) # conj(transpose(B)) M = matrix_multiply(A, conj(B), /atrans, /btrans) blas_gemm, M, B, A, atrans=-1, btrans=1 M = conj(transpose(A)) # conj(transpose(B)) M = matrix_multiply(conj(A), conj(B), /atrans, /btrans) blas_gemm, M, B, A, atrans=-1, btrans=-1

Note: BLAS_GEMM can also be used in place of the ## operator. For example, M = A ## B is equivalent to BLAS_GEMM, M, A, B, and M = A ## TRANSPOSE(B) is equivalent to BLAS_GEMM, M, A, B, /BTRANSPOSE.

## Syntax

BLAS_GEMM, C, A, B [, ALPHA=value] [, BETA=value] [, ATRANSPOSE=value] [, BTRANSPOSE=value]

## Arguments

### C

The array to be updated. C must be of float, double, complex, or dcomplex type. BLAS_GEMM does not change the size and type of C.

### A

The first matrix operand in the matrix product. A may be any array that IDL can convert to the type of C. BLAS_GEMM does not change A, but A will be internally converted to the type of C before multiplication.

### B

The second matrix operand in the matrix product. B may be any array that IDL can convert to the type of C. BLAS_GEMM does not change B, but A will be internally converted to the type of C before multiplication.

NOTE: Based on the values of ATRANSPOSE and BTRANSPOSE, the arguments A and B must be compatible with each other to perform matrix multiplication, as well as the output argument C.

## Keywords

### ALPHA

An optional scaling factor to be multiplied with A and B. ALPHA may be any scalar that IDL can convert to the type of C. BLAS_GEMM does not change ALPHA. If ALPHA is not specified then a default value of 1.0 is used.

### ATRANSPOSE

Set this keyword to a positive value to multiply using the transpose of A. Set this keyword to a negative value to multiply using the complex conjugate transpose of A.

Note: If C is of type float or double, then any non-zero value for ATRANSPOSE has the same effect of a normal transpose of A.

### BETA

An optional scaling factor to be multiplied with C before adding the matrix product of A and B. BETA may be any scalar that IDL can convert to the type of C. BLAS_GEMM does not change BETA. If BETA is not specified, then a default value of 0.0 is used.

### BTRANSPOSE

Set this keyword to a positive value to multiply using the transpose of B. Set this keyword to a negative value to multiply using the complex conjugate transpose of B.

Note: If C is of type float or double, then any non-zero value for BTRANSPOSE has the same effect of a normal transpose of B.

This routine is written to make use of IDL’s thread pool, which can increase execution speed on systems with multiple CPUs. The values stored in the !CPU system variable control whether IDL uses the thread pool for a given computation. In addition, you can use the thread pool keywords TPOOL_MAX_ELTS, TPOOL_MIN_ELTS, and TPOOL_NOTHREAD to override the defaults established by !CPU for a single invocation of this routine. See Thread Pool Keywords for details.

Note: The thread pool keywords have no effect on Mac, MKA will always use multiple cores.

## Examples

The following examples show how to use the BLAS_GEMM procedure to perform matrix multiplication, with scaling.

Create a multidimensional array:

`A = RANDOMU(1, 4, 5)`

Print A:

`PRINT, A`

IDL prints:

`     0.417022     0.997185     0.720325     0.932557`
`  0.000114381     0.128124     0.302333     0.999041`
`     0.146756     0.236089    0.0923386     0.396581`
`     0.186260     0.387911     0.345561     0.669746`
`     0.396767     0.935539     0.538817     0.846311`

Create another multidimensional array:

`B = RANDOMU(2, 4, 5)`

Print B

`PRINT, B`

IDL prints:

`     0.435995     0.185082    0.0259262     0.931541`
`     0.549662     0.947731     0.435322     0.484749`
`     0.420368     0.320536     0.330335     0.154427`
`     0.204649     0.698863     0.619271     0.119951`
`     0.299655     0.485176     0.266827     0.632738`

Create an array to store the product of A and B:

`C = FLTARR(4, 4)`

Print C

`PRINT, C`

IDL prints:

`     0.000000     0.000000     0.000000     0.000000`
`     0.000000     0.000000     0.000000     0.000000`
`     0.000000     0.000000     0.000000     0.000000`
`     0.000000     0.000000     0.000000     0.000000`

Set C equal to the product of A and B (i.e., C = TRANSPOSE(A) ## B):

`BLAS_GEMM, C, A, B, /ATRANSPOSE`

Print C:

`PRINT, C`

IDL prints:

`     0.400585     0.447005     0.280554     0.684583`
`     0.964161      1.10666     0.649466      1.66597`
`     0.751232     0.952367     0.538557      1.21421`
`      1.51310      2.12521      1.23066      2.03007`

Copy C into a temporary variable:

`CPrime = C`

Add the product of A and B to C (i.e., C += TRANSPOSE(A) ## B):

`BLAS_GEMM, C, A, B, /ATRANSPOSE, BETA=1.0`

Print C:

`PRINT, C`

IDL prints:

`     0.801170     0.894009     0.561108      1.36917`
`      1.92832      2.21332      1.29893      3.33193`
`      1.50246      1.90473      1.07711      2.42841`
`      3.02620      4.25042      2.46132      4.06014`

Show that C is now twice CPrime:

`PRINT, C / CPrime`

IDL prints:

`      2.00000      2.00000      2.00000      2.00000`
`      2.00000      2.00000      2.00000      2.00000`
`      2.00000      2.00000      2.00000      2.00000`
`      2.00000      2.00000      2.00000      2.00000`

## Version History

 8.8 Introduced