The IDLmlPartition function partitions data into two or more named groups, such as a training set and a test set.
Examples
Example 1 splits two arrays into two groups, one with 80% of the elements, the other with 20%:
; 1000 examples with 3 attributes each, plus one value per example
features = randomu(seed, 3, 1000)
values = randomu(seed, 1000)
; Relative sizes 80 and 20 give an 80%/20% split
part = IDLmlPartition({train:80, test:20}, features, values)
print, n_elements(part.train.features)
print, n_elements(part.train.values)
print, n_elements(part.test.features)
print, n_elements(part.test.values)
Example 2 splits two arrays into three groups, one with 60% of the elements, one with 30%, and one with 10%:
; 1000 examples with 3 attributes each, plus one value per example
features = randomu(seed, 3, 1000)
values = randomu(seed, 1000)
; Relative sizes given as fractions: 60%, 30%, and 10%
part = IDLmlPartition({a:0.6, b:0.3, c:0.1}, features, values)
print, n_elements(part.a.features)
print, n_elements(part.b.features)
print, n_elements(part.c.features)
print, n_elements(part.a.values)
print, n_elements(part.b.values)
print, n_elements(part.c.values)
Example 3 splits two arrays into three groups of equal size:
; 1000 examples with 3 attributes each, plus one value per example
attributes = randomu(seed, 3, 1000)
values = randomu(seed, 1000)
; Equal relative sizes give three groups of equal size
part = IDLmlPartition({group1:1, group2:1, group3:1}, attributes, values)
print, n_elements(part.group1.attributes)
print, n_elements(part.group2.attributes)
print, n_elements(part.group3.attributes)
print, n_elements(part.group1.values)
print, n_elements(part.group2.values)
print, n_elements(part.group3.values)
Syntax
Result = IDLmlPartition(Partitions, Features, Values [, PARTITION_OFFSET=value])
Return Value
This function returns a dictionary of dictionaries, where the first level of keys is defined by the keys of the Partitions argument, and the second level of keys is defined by the names of the variables passed as arguments. For example, partition = IDLmlPartition({a:60, b:30, c:10}, feats, vals) will return a nested dictionary with the following keys: partition.a.feats, partition.a.vals, partition.b.feats, partition.b.vals, partition.c.feats, and partition.c.vals.
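One way to confirm the nesting is to list the keys at each level. The sketch below assumes the returned objects support the Keys() method, as IDL's Dictionary and Hash classes do. The variable names feats and vals follow the example in the paragraph above.

```idl
; Sample data: 3 attributes x 1000 examples, plus 1000 values
feats = randomu(seed, 3, 1000)
vals = randomu(seed, 1000)
partition = IDLmlPartition({a:60, b:30, c:10}, feats, vals)

; First-level keys: the partition names taken from the Partitions argument
print, partition.Keys()

; Second-level keys: the names of the variables passed as arguments
; (assumes the nested object also supports Keys())
print, (partition.a).Keys()
```

Because the second-level keys come from the variable names, renaming feats or vals in the call changes the keys you use to read the result.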
Arguments
Features
Specify an array of features of size n x m, where n is the number of attributes and m is the number of examples.
If you pass in a scalar number for this argument, the function will return the actual indices so you can do the partition yourself.
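As a hedged sketch of the scalar form, the call below passes a single number instead of a feature array and inspects the result with HELP rather than assuming its layout. Treating the scalar as the number of examples is an assumption here, not something stated above.

```idl
; Assumption: the scalar 1000 stands for the number of examples to partition
idx = IDLmlPartition({train:80, test:20}, 1000)

; Inspect what came back before indexing any data with it
help, idx
```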
Partitions
Specify how to partition both features and values. You can use a structure, a dictionary, or an array of numbers. The number of keys in the definition will determine the number of partitions. The keys of the definition will determine the names of the partitions. The values of the definition will determine the relative sizes of the partitions. For example, {train:0.8, test:0.2} will split the dataset into two groups, one named ‘train’, the other named ‘test’, with a relative size of 80% and 20%, respectively.
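Because the sizes are relative, the definitions {train:80, test:20} and {train:0.8, test:0.2} describe the same split. The sketch below shows the arithmetic only. It assumes each value is divided by the sum of all values, and it does not reproduce how IDLmlPartition rounds the resulting counts.

```idl
; The same split expressed two ways
pct = [80.0, 20.0]
frac = [0.8, 0.2]

; Dividing by the total gives each partition's share of the examples
print, pct / total(pct)
print, frac / total(frac)

; Approximate number of examples per partition for m = 1000
; (the rounding rule is an assumption)
m = 1000
print, round(m * pct / total(pct))
```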
Values (optional)
Specify an array of values of size m, where m is the number of examples.
This argument is required unless the Features argument is a scalar number, in which case it is optional.
Keywords
PARTITION_OFFSET (optional)
Set this keyword to return an array of the indices at which the partitions are made.
Version History
See Also
IDLmlShuffle