The IMSL_RANKS function computes the ranks, normal scores, or exponential scores for a vector of observations.

This routine requires an IDL Advanced Math and Stats license. For more information, contact your sales or technical support representative.

Ties

If the assignment RANK = IMSL_RANKS(x) is made, then in data without ties, the output values are the ordinary ranks (or a transformation of the ranks) of the data in x. If x(i) has the smallest value among the values in x and there is no other element in x with this value, then RANK(i) = 1. If both x(i) and x(j) have the same smallest value, then the output value depends on the option used to break ties. The table that follows shows the results for some of the keywords.

| Keyword | Result |
| AVERAGE_TIE | Result(i) = Result(j) = 1.5 |
| HIGHEST | Result(i) = Result(j) = 2.0 |
| LOWEST | Result(i) = Result(j) = 1.0 |
| RANDOM_SPLIT | Result(i) = 1.0 and Result(j) = 2.0 or, randomly, Result(i) = 2.0 and Result(j) = 1.0 |

When ties are resolved randomly, IMSL_RANDOM is used to generate the random numbers. Results differ from one execution of the program to the next unless the seed of the random-number generator is set explicitly with IMSL_RANDOMOPT.
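The tie-breaking options can be compared side by side on a small vector. The following is a minimal sketch; the data values are illustrative and not taken from the original example.

```idl
; Illustrative data: elements 1 and 2 share the smallest value
x = [3.0, 1.0, 1.0, 2.0]
PRINT, IMSL_RANKS(x)                 ; AVERAGE_TIE is the default
PRINT, IMSL_RANKS(x, /HIGHEST)       ; tied pair takes the highest rank
PRINT, IMSL_RANKS(x, /LOWEST)        ; tied pair takes the lowest rank
PRINT, IMSL_RANKS(x, /RANDOM_SPLIT)  ; tied pair is split at random
```

According to the table, the tied pair should receive 1.5, 2.0, and 1.0 in the first three calls, and the values 1.0 and 2.0 in random order in the last call. The untied values 2.0 and 3.0 should keep ranks 3 and 4 throughout.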

Scores

Normal scores and other functions of the ranks can optionally be returned. Normal scores can be defined as the expected values, or approximations to the expected values, of order statistics from a normal distribution. The simplest approximations are obtained by evaluating the inverse cumulative normal distribution function, IMSL_NORMALCDF (with the keyword INVERSE), at the ranks scaled into the open interval (0,1).

In the Blom version (Blom 1958), the scaling transformation for the rank $r_i$ ($1 \le r_i \le n$, where n is the sample size) is $(r_i - 3/8) / (n + 1/4)$. The Blom normal score corresponding to the observation with rank $r_i$ is:

$$\Phi^{-1}\left(\frac{r_i - 3/8}{n + 1/4}\right)$$

where $\Phi(\cdot)$ is the normal cumulative distribution function.
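As a rough check of the Blom scaling, the scores can be recomputed by hand from the ordinary ranks and compared with the BLOM_SCORES output. This is a sketch on illustrative data without ties. It calls IMSL_NORMALCDF one element at a time, on the assumption that scalar input is always accepted.

```idl
x = [2.3, 0.7, 1.9, 3.1, 1.2]        ; illustrative data, no ties
n = N_ELEMENTS(x)
r = IMSL_RANKS(x)                    ; ordinary ranks, 1..n
p = (r - 3.0/8.0) / (n + 1.0/4.0)    ; Blom scaling into (0,1)
blom_manual = FLTARR(n)
; Inverse normal CDF evaluated at each scaled rank
FOR i = 0, n - 1 DO blom_manual[i] = IMSL_NORMALCDF(p[i], /INVERSE)
PRINT, blom_manual
PRINT, IMSL_RANKS(x, /BLOM_SCORES)   ; should agree up to rounding
```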

Adjustments for ties are made after the normal score transformation; that is, if x(i) equals x(j) (within FUZZ) and their value is the k-th smallest in the data set, the Blom normal scores are determined for ranks of k and k + 1. Then, these normal scores are averaged or selected in the manner specified. (Whether the transformations are made first or the ties are resolved first is irrelevant, except when AVERAGE_TIE is specified.)
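The order of operations matters for AVERAGE_TIE because the inverse normal CDF is nonlinear. The sketch below uses an illustrative sample size and tie position to compare the average of two Blom scores with the Blom score of the averaged rank.

```idl
n = 5    ; illustrative sample size
k = 2    ; the tied pair occupies ranks k and k + 1
; Transform first: Blom scores for ranks k and k + 1, then average
s_lo = IMSL_NORMALCDF((k - 0.375) / (n + 0.25), /INVERSE)
s_hi = IMSL_NORMALCDF((k + 1 - 0.375) / (n + 0.25), /INVERSE)
transform_first = (s_lo + s_hi) / 2.0
; Resolve ties first: Blom score of the averaged rank k + 0.5
ties_first = IMSL_NORMALCDF((k + 0.5 - 0.375) / (n + 0.25), /INVERSE)
PRINT, transform_first, ties_first
```

The two printed values generally differ. Per the description above, IMSL_RANKS corresponds to the first one. With HIGHEST or LOWEST, both orders select the same score because the transformation is monotone.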

In the Tukey version (Tukey 1962), the scaling transformation for the rank $r_i$ is $(r_i - 1/3) / (n + 1/3)$. The Tukey normal score corresponding to the observation with rank $r_i$ follows:

$$\Phi^{-1}\left(\frac{r_i - 1/3}{n + 1/3}\right)$$

Ties are handled in the same way as for the Blom normal scores.

In the Van der Waerden version (see Lehmann 1975, p. 97), the scaling transformation for the rank $r_i$ is $r_i / (n + 1)$. The Van der Waerden normal score corresponding to the observation with rank $r_i$ is as follows:

$$\Phi^{-1}\left(\frac{r_i}{n + 1}\right)$$

Ties are handled in the same way as for the Blom normal scores.
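The Tukey and Van der Waerden scalings can be checked the same way as the Blom scaling. The following sketch uses illustrative data without ties and applies IMSL_NORMALCDF to one scaled rank at a time.

```idl
x = [2.3, 0.7, 1.9, 3.1, 1.2]        ; illustrative data, no ties
n = N_ELEMENTS(x)
r = IMSL_RANKS(x)                    ; ordinary ranks, 1..n
tukey_manual = FLTARR(n)
vdw_manual = FLTARR(n)
; Tukey scaling: (r - 1/3) / (n + 1/3)
FOR i = 0, n - 1 DO tukey_manual[i] = IMSL_NORMALCDF((r[i] - 1.0/3.0) / (n + 1.0/3.0), /INVERSE)
; Van der Waerden scaling: r / (n + 1)
FOR i = 0, n - 1 DO vdw_manual[i] = IMSL_NORMALCDF(r[i] / (n + 1.0), /INVERSE)
PRINT, tukey_manual
PRINT, IMSL_RANKS(x, /TUKEY_SCORES)  ; should agree up to rounding
PRINT, vdw_manual
PRINT, IMSL_RANKS(x, /VDW_SCORES)    ; should agree up to rounding
```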

When the keyword EXP_NORM_SCORES is nonzero, the output values are the expected values of the normal order statistics from a sample of size n = N_ELEMENTS(x). If the value in x(i) is the k-th smallest, then the value output in RANK(i) is $E(z_k)$, where $E(\cdot)$ is the expectation operator and $z_k$ is the k-th order statistic in a sample of size n from a standard normal distribution. Ties are handled in the same way as for the Blom normal scores.
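Because the Blom scores approximate the expected normal order statistics, printing both in sorted order gives a quick sanity check. This is a sketch on illustrative data, not part of the original example.

```idl
x = [2.3, 0.7, 1.9, 3.1, 1.2]            ; illustrative data, no ties
e = IMSL_RANKS(x, /EXP_NORM_SCORES)      ; expected normal order statistics
b = IMSL_RANKS(x, /BLOM_SCORES)          ; Blom approximation
order = SORT(x)                          ; indices from smallest to largest
PRINT, e[order]
PRINT, b[order]
```

The two rows should be close to each other. By symmetry of the normal distribution, the score of the middle observation (k = 3 of n = 5) should be zero up to rounding in both.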

Savage scores are the expected values of the exponential order statistics from a sample of size n. These values are called Savage scores because of their use in a test discussed by Savage (1956) and Lehmann (1975). If the value in x(i) is the k-th smallest, then the value output in RANK(i) is $E(y_k)$, where $y_k$ is the k-th order statistic in a sample of size n from a standard exponential distribution. The expected value of the k-th order statistic from an exponential sample of size n follows:

$$E(y_k) = \sum_{i=0}^{k-1} \frac{1}{n - i}$$

Ties are handled in the same way as for the Blom normal scores.
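The expected exponential order statistics are simple partial sums, so they can be recomputed directly from the ranks. The sketch below uses illustrative data without ties and the standard formula, the sum of 1/(n - i) for i = 0, ..., k - 1.

```idl
x = [2.3, 0.7, 1.9, 3.1, 1.2]        ; illustrative data, no ties
n = N_ELEMENTS(x)
r = IMSL_RANKS(x)                    ; ordinary ranks, 1..n
savage_manual = FLTARR(n)
; For rank k, sum 1/(n - i) over i = 0, ..., k - 1
FOR j = 0, n - 1 DO savage_manual[j] = TOTAL(1.0 / (n - FINDGEN(FIX(r[j]))))
PRINT, savage_manual
PRINT, IMSL_RANKS(x, /SAVAGE_SCORES) ; should agree up to rounding
```

For n = 5, the partial sums for k = 1, ..., 5 are approximately 0.2, 0.45, 0.7833, 1.2833, and 2.2833.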

Example


The data for this example, from Hinkley (1977), contain 30 observations. The fourth and sixth observations are tied, as are the third and twentieth, so each tied pair receives an average rank in the output.

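; Thirty observations from Hinkley (1977).
x = [0.77, 1.74, 0.81, 1.20, 1.95, 1.20, 0.47, $
     1.43, 3.37, 2.20, 3.00, 3.09, 1.51, 2.10, $
     0.52, 1.62, 1.31, 0.32, 0.59, 0.81, 2.81, $
     1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.90, 2.05]
; Call IMSL_RANKS; tied observations receive the average rank by default.
r = IMSL_RANKS(x)
; Print each observation number with its rank.
FOR i = 0, 29 DO PM, i + 1, r(i), FORMAT = '(i5, f7.1)'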
1	5.0
2	18.0
3	6.5
4	11.5
5	21.0
6	11.5
7	2.0
8	15.0
9	29.0
10	24.0
11	27.0
12	28.0
13	16.0
14	23.0
15	3.0
16	17.0
17	13.0
18	1.0
19	4.0
20	6.5
21	26.0
22	19.0
23	10.0
24	14.0
25	30.0
26	25.0
27	9.0
28	20.0
29	8.0
30	22.0

Syntax


Result = IMSL_RANKS(X [, AVERAGE_TIE=value] [, BLOM_SCORES=value] [, /DOUBLE] [, EXP_NORM_SCORES=value] [, FUZZ=value] [, HIGHEST=value] [, LOWEST=value] [, RANDOM_SPLIT=value] [, RANKS=value] [, SAVAGE_SCORES=value] [, TUKEY_SCORES=value] [, VDW_SCORES=value])

Return Value


A one-dimensional array containing the rank (or optionally, a transformation of the rank) of each observation.

Arguments


X

One-dimensional array containing the observations to be ranked.

Keywords


AVERAGE_TIE (optional)

If present and nonzero, tied observations are assigned the average of their scores. This is the default. At most one of the following keywords can be set to a nonzero value to select the method used to assign a score to tied observations: AVERAGE_TIE, HIGHEST, LOWEST, RANDOM_SPLIT.

BLOM_SCORES (optional)

If present and nonzero, the Blom version of normal scores is returned. At most one of the following keywords can be set to a nonzero value to specify the type of values returned: RANKS, BLOM_SCORES, TUKEY_SCORES, VDW_SCORES, EXP_NORM_SCORES, SAVAGE_SCORES.

DOUBLE (optional)

If present and nonzero, double precision is used.

EXP_NORM_SCORES (optional)

If present and nonzero, the expected values of normal order statistics are returned (for tied observations, the average of the expected normal scores). At most one of the following keywords can be set to a nonzero value to specify the type of values returned: RANKS, BLOM_SCORES, TUKEY_SCORES, VDW_SCORES, EXP_NORM_SCORES, SAVAGE_SCORES.

FUZZ (optional)

Value used to determine when two items are tied. If ABS(x(i) - x(j)) is less than or equal to FUZZ, then x(i) and x(j) are considered tied. Default: 0.0
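A small sketch of how FUZZ changes what counts as a tie. The data and the tolerance are illustrative.

```idl
x = [1.00, 1.05, 2.00]               ; illustrative data
PRINT, IMSL_RANKS(x)                 ; default FUZZ = 0.0, no ties
PRINT, IMSL_RANKS(x, FUZZ = 0.1)     ; 1.00 and 1.05 differ by less than FUZZ
```

Following the description above, the first call should return ranks 1, 2, and 3. The second should treat the first two values as tied and, under the default AVERAGE_TIE, return 1.5, 1.5, and 3.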

HIGHEST (optional)

If present and nonzero, tied observations are assigned the highest score in the group of ties. At most one of the following keywords can be set to a nonzero value to select the method used to assign a score to tied observations: AVERAGE_TIE, HIGHEST, LOWEST, RANDOM_SPLIT.

LOWEST (optional)

If present and nonzero, tied observations are assigned the lowest score in the group of ties. At most one of the following keywords can be set to a nonzero value to select the method used to assign a score to tied observations: AVERAGE_TIE, HIGHEST, LOWEST, RANDOM_SPLIT.

RANDOM_SPLIT (optional)

If present and nonzero, tied observations are randomly split using a random-number generator. At most one of the following keywords can be set to a nonzero value to select the method used to assign a score to tied observations: AVERAGE_TIE, HIGHEST, LOWEST, RANDOM_SPLIT.

RANKS (optional)

If present and nonzero, the ordinary ranks are returned. This is the default. At most one of the following keywords can be set to a nonzero value to specify the type of values returned: RANKS, BLOM_SCORES, TUKEY_SCORES, VDW_SCORES, EXP_NORM_SCORES, SAVAGE_SCORES.

SAVAGE_SCORES (optional)

If present and nonzero, Savage scores (the expected values of exponential order statistics) are returned. At most one of the following keywords can be set to a nonzero value to specify the type of values returned: RANKS, BLOM_SCORES, TUKEY_SCORES, VDW_SCORES, EXP_NORM_SCORES, SAVAGE_SCORES.

TUKEY_SCORES (optional)

If present and nonzero, the Tukey version of normal scores is returned. At most one of the following keywords can be set to a nonzero value to specify the type of values returned: RANKS, BLOM_SCORES, TUKEY_SCORES, VDW_SCORES, EXP_NORM_SCORES, SAVAGE_SCORES.

VDW_SCORES (optional)

If present and nonzero, the Van der Waerden version of normal scores is returned. At most one of the following keywords can be set to a nonzero value to specify the type of values returned: RANKS, BLOM_SCORES, TUKEY_SCORES, VDW_SCORES, EXP_NORM_SCORES, SAVAGE_SCORES.
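One score-type keyword and one tie-method keyword can be combined in a single call. The following sketch uses the first seven observations of the example data, in which the fourth and sixth values are tied. How the DOUBLE keyword interacts with single-precision input is not shown in this topic, so treat the result type as an assumption to verify.

```idl
; First seven observations of the example data; 1.20 appears twice
x = [0.77, 1.74, 0.81, 1.20, 1.95, 1.20, 0.47]
; Tukey normal scores in double precision; ties take the lowest score
t = IMSL_RANKS(x, /TUKEY_SCORES, /LOWEST, /DOUBLE)
PRINT, t
```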

Version History


6.4    Introduced

See Also


IMSL_NORMALCDF, IMSL_RANDOMOPT