See also: ENVI Machine Learning Anomaly Detection Tutorial, ENVI Machine Learning Supervised Classification Tutorial

Use the Machine Learning Labeling Tool to create labeled data using ROIs, then train a model that can be used with input rasters for anomaly detection or supervised classification. It simplifies the process of marking the background for anomaly detection, or features of interest for supervised classification.

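With the Machine Learning Labeling Tool, you will create a project, add training rasters, label data, and train a model. You can also import ROIs or vectors, create training data, and view project and labeling statistics.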

Create a Project


In the steps that follow, you will create a new project for labeling data. Each project has its own folder that contains the data and files created during the labeling process. Labeling project files are named machine_learning_labeling.json. To restore a previously created project, select File > Open Project in the Labeling Tool menu bar, navigate to the project folder, and select the file machine_learning_labeling.json.

To create a project:

  1. In the ENVI Toolbox, select Machine Learning > Machine Learning Labeling Tool.

  2. Select File > New Project from the Labeling Tool menu bar. The Create New Labeling Project dialog appears.

  3. From the Project Type drop-down list, select one of the following:

    • Supervised Classification

    • Anomaly Detection

  4. Enter a Project Name.
  5. Enter a Project Folder to save the project files in, or click the Browse button to select a folder you have created for this project. The folder must be empty.
  6. Click OK, and click Yes if you are prompted to create the folder.

The project folder is created, and ENVI creates a machine_learning_labeling.json file in the folder. You can restore the project from this JSON file in the Machine Learning Labeling Tool using File > Open Project.
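The folder requirements above can be checked before you start. The following Python sketch is not part of ENVI, and the helper names is_valid_new_project_folder and find_project_file are hypothetical. It assumes only what this topic states: a new project folder must be empty, and an existing project folder contains a file named machine_learning_labeling.json.

```python
from pathlib import Path

# File name stated in this topic; everything else here is illustrative.
PROJECT_FILENAME = "machine_learning_labeling.json"


def is_valid_new_project_folder(folder):
    """Return True if the folder does not exist yet, or exists and is empty."""
    path = Path(folder)
    if not path.exists():
        return True  # the Labeling Tool prompts to create a missing folder
    return path.is_dir() and not any(path.iterdir())


def find_project_file(folder):
    """Return the path to the project JSON file, or None if it is absent."""
    candidate = Path(folder) / PROJECT_FILENAME
    return candidate if candidate.is_file() else None
```

A folder that already holds a project fails the first check and passes the second, which is a quick way to tell whether to use File > New Project or File > Open Project.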

Next, add rasters to the project that will be used for labeling.

Add Training Rasters


Add rasters to the project that will be used for labeling. Select one or more rasters as follows:

  1. Click the Add button below the Rasters section of the Labeling Tool. The Data Selection dialog appears.
  2. Select one or more training rasters and click OK. All training rasters must have the same number of bands. If one raster has a different number of bands than the others, select that raster in the Data Selection dialog and click the Spectral Subset button, then select the appropriate number of bands.
    • The training rasters are listed in the Rasters section. The Classes column shows "0/1," where 0 means no class labels have been drawn and 1 represents the total number of classes defined. The Label Raster column shows "No" to indicate label rasters have not been created yet.
    • You can optionally rename the training raster from its original filename. Select the row of the raster to rename, click the Rename Selected button, and enter a new name.
    • To remove all training rasters, click the Select All button, then the Remove Selected button.
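The band-count requirement in step 2 can be checked outside ENVI before you add rasters. The sketch below is a minimal illustration, not an ENVI API. It assumes the training rasters are ENVI-format files whose text header (.hdr) contains a "bands = N" field; the function names read_band_count and find_band_mismatches are hypothetical.

```python
import re
from pathlib import Path

# Matches a header line such as "bands = 4" (assumes an ENVI-format .hdr file).
BANDS_PATTERN = re.compile(r"^\s*bands\s*=\s*(\d+)", re.IGNORECASE | re.MULTILINE)


def read_band_count(header_path):
    """Read the 'bands' field from an ENVI header (.hdr) text file."""
    text = Path(header_path).read_text(errors="replace")
    match = BANDS_PATTERN.search(text)
    if match is None:
        raise ValueError(f"No 'bands' field found in {header_path}")
    return int(match.group(1))


def find_band_mismatches(header_paths):
    """Return {path: band_count} for rasters that differ from the first one."""
    counts = {str(p): read_band_count(p) for p in header_paths}
    if not counts:
        return {}
    expected = next(iter(counts.values()))
    return {p: n for p, n in counts.items() if n != expected}
```

Any raster reported by find_band_mismatches is a candidate for the Spectral Subset button described in step 2.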

ENVI creates a subfolder for each training raster that you select as input. Each subfolder will contain the ROIs used to label data in that raster.

ENVI also creates a file named source_raster.json in each subfolder. This is a simplified version (also called a dehydrated form) of the training raster, where all of its information is condensed down to JSON code. If you move the project folder to a different location, the source_raster.json file in each project subfolder tells ENVI where to find the training raster. This way, you do not have to keep track of file locations yourself.
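To see which training rasters a project references, you can collect the source_raster.json files from the project subfolders. The sketch below assumes only the layout described above (one subfolder per training raster, each containing source_raster.json). The contents of the JSON are not documented in this topic, so the code loads each file without interpreting any fields; the function name list_source_raster_files is hypothetical.

```python
import json
from pathlib import Path


def list_source_raster_files(project_folder):
    """Map each training-raster subfolder name to its parsed source_raster.json."""
    found = {}
    # Look exactly one level below the project folder, as described in this topic.
    for json_path in sorted(Path(project_folder).glob("*/source_raster.json")):
        with json_path.open(encoding="utf-8") as f:
            found[json_path.parent.name] = json.load(f)
    return found
```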

Next, label data in the training rasters.

Label Data


The way you label data in the training raster depends on the project type selected when you created the project:

  • For Supervised Classification projects, label features in the training raster(s) by creating class definitions and adding ROIs to the classes. See Label Features for Supervised Classification for steps.

  • For Anomaly Detection projects, the only class to label in the training raster(s) is Background. Add ROIs to identify the "known" feature. Every pixel that does not match the pixels marked as Background is considered an anomaly. See Label Background for Anomaly Detection for steps.

You can also optionally import ROI or vector files into your project for one or more training rasters; for example, if you have a shapefile of features that you want to use for labeling. For steps, see the Import ROIs or Vectors section in this topic.

Label Features for Supervised Classification

Define the features you want to label under the Class Definitions section of the Labeling Tool. The feature names you specify here determine the class names in the output classification raster.

By default, the Background class definition is added when you create the project. You can rename this class as needed.

  1. Click the Add button in the lower-left corner of the Class Definitions section.
  2. Enter a Name for the class, then click the button. The feature is added to the Class Definitions list.
  3. To optionally change the color of the class, click the Color column and choose a different color.
  4. Repeat the steps above for each class definition you want to create.
  5. Select a training raster from the Rasters list and click the Draw button. The selected image appears in the view. ENVI creates a new set of ROIs with the same names and colors as the classes that you defined. The ROIs are listed in the Layer Manager and Data Manager. The Region of Interest (ROI) Tool appears, with the first ROI selected.

  6. Draw ROIs on as many examples of the feature as possible within the image.

  7. To label features for a different class, select the class name from the Labeling Tool drop-down list; the class name will update in the ROI Name field in the ROI Tool.
  8. To use a different training raster for drawing ROIs, select the raster name in the Labeling Tool from the Rasters list, then click the Draw button.
  9. Continue labeling the raster(s) in this manner for each feature/class. Your ROIs are automatically saved to the current project, even if you close the Labeling Tool. You can view the statistics for your labeling project at any point. See View Project and Labeling Statistics for details.

    The Labeling Tool synchronizes the names and colors of classes. Do not change the names and colors of the classes outside of the Labeling Tool.

  10. When you are finished labeling, close the ROI Tool.

Next, train the model using the labels you created.

Label Background for Anomaly Detection

The only class definition to use for an Anomaly Detection project is Background. This class will define the "known" feature in the raster. Everything else that does not match Background is considered an anomaly.

Be very specific when labeling data for anomaly detection, and emphasize quality over quantity. Labeling too few pixels might not provide enough information to find all anomalous targets. Labeling too many pixels can result in longer classification run times. If pixels that belong to an anomalous target are labeled as Background, the model learns them as "known," which causes confusion and information loss.
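The idea of training on Background only can be illustrated with a deliberately simple rule. The sketch below is not the algorithm ENVI uses; it is a per-band z-score test chosen only to show how a model built from Background pixels treats everything else as anomalous. The function names fit_background and is_anomaly, the threshold of 3.0, and the two-band pixel values in the usage note are all hypothetical.

```python
from statistics import mean, pstdev


def fit_background(background_pixels):
    """Compute per-band mean and standard deviation from Background pixels."""
    bands = list(zip(*background_pixels))  # transpose to one tuple per band
    return [(mean(b), pstdev(b)) for b in bands]


def is_anomaly(pixel, band_stats, threshold=3.0):
    """Flag a pixel if any band lies more than `threshold` deviations from the mean."""
    for value, (mu, sigma) in zip(pixel, band_stats):
        if sigma == 0:
            if value != mu:
                return True  # no spread in Background, so any difference stands out
            continue
        if abs(value - mu) / sigma > threshold:
            return True
    return False
```

With clean Background samples such as (10, 20), (11, 19), (9, 21), and (10, 20), the pixel (50, 20) is flagged. If (50, 20) is mistakenly included in the Background samples, the spread of the first band widens enough that the same pixel is no longer flagged, which is the kind of information loss described above.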

  1. Select a training raster from the Rasters list and click the Draw button. The selected image appears in the view. ENVI creates a new ROI named Background and it is listed in the Layer Manager and Data Manager. The Region of Interest (ROI) Tool appears, with the Background ROI selected.

  2. Use the ROI Tool to label areas that belong in the Background class.

  3. To use a different training raster for drawing ROIs, select the raster name in the Labeling Tool from the Rasters list, then click the Draw button.
  4. Continue labeling the Background class in each training raster in this manner. Your ROIs are automatically saved to the current project, even if you close the Labeling Tool. You can view the statistics for your labeling project at any point. See View Project and Labeling Statistics for details.

    The Labeling Tool synchronizes the names and colors of classes. Do not change the name and color of the Background class outside of the Labeling Tool.

  5. When you are finished labeling, close the ROI Tool.

Next, train the model using the labels you created.

Train the Model


Once the training rasters are labeled, the next step is to train the machine learning model. The Labeling Tool provides a simplified way to train models for anomaly detection and supervised classification.

You can optionally skip these steps and instead select the Generate Training Data option, which creates output that can be used with the ENVI Modeler. See Create Training Data for details.

Follow these steps:

  1. Click the Train button in the Labeling Tool. The Train Machine Learning Model dialog appears.
  2. From the Method drop-down list, select the method you want to train on your ROIs.
  3. Set the parameters in the Train Machine Learning Model dialog. The parameters that are available depend on the project type and the method you select. See the sections that follow these steps for descriptions of the training parameters.

  4. Click OK to train the model.

Parameters for Train Supervised Classification

Some parameters listed below may not be available, depending on the training method you select.

  • Model Name: Change the model name, if desired. The default name that appears is the name of the Method you selected.

  • Description: Add a description of the model's purpose, if desired.

  • Balance Classes: Select Yes or No to specify whether all classes should be considered equal during training. Selecting Yes helps to account for classes with few samples compared to classes with many examples.

  • Estimators: Enter the number of decision trees to use. The estimators are the predictors of the algorithm. The default is 100.

  • Max Depth: Enter the maximum depth of the tree. If not specified, then nodes are expanded until all leaves are pure.

  • Output Model: Enter the filename and location of the output model. The model will be saved as a .json file, which can be used to classify other rasters as well as the raster it was initially trained on.
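To show why balancing matters when one class has far fewer samples than another, the sketch below computes inverse-frequency class weights. This topic does not state how ENVI implements Balance Classes, so treat this as one common weighting heuristic rather than a description of ENVI's behavior. The function name balanced_class_weights and the class names in the usage note are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so each class contributes
    equally in total when the weights are applied during training.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

For example, with 90 pixels labeled Background and 10 labeled Road, the Road weight is 5.0 and the Background weight is about 0.56, so both classes carry the same total weight of 50.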

Parameters for Train Anomaly Detection

Some parameters listed below may not be available, depending on the training method you select.

  • Model Name: Change the model name, if desired. The default name that appears is the name of the Method you selected.

  • Description: Add a description of the model's purpose, if desired.

  • Estimators: Enter the number of decision trees to use. The estimators are the predictors of the algorithm. The default is 100.

  • Leaf Size: Change the leaf size, if needed. Changing the leaf size can affect the speed of construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem.

  • Output Model: Enter the filename and location of the output model. The model will be saved as a .json file, which can be used to classify other rasters as well as the raster it was initially trained on.

Import ROIs or Vectors


You can optionally import ROIs or vectors into your project to use for labels with one or more of the training rasters; for example, if you want to use a shapefile of features that you already created.

Follow these steps:

  1. Define classes, as described previously.
  2. Add training rasters, as described previously.
  3. In the Labeling Tool, select the training raster for which you want to import a vector or ROI file and click the Draw button to display the training raster.
  4. In the Labeling Tool, click the Options button.
  5. Select one of the following:

    • Import ROIs

      Select an ROI filename in the dialog that appears, and click Open.

      In the Match Input ROIs to Class Definitions dialog that appears, click the buttons to map the ROIs from the input file (left side) to the Class Definitions to use in the Labeling Tool (right side). A line is drawn to show each match.

    • Import Vectors

      Select an Input Vector filename in the dialog that appears. See the Supported Data Types topic in ENVI Help for a list of supported vector formats.

      Select an Attribute Name from the drop-down list. Attribute names come from the attribute columns in the vector file. If it does not have any attributes, the option is <none>.

      In the Record Selection dialog that appears, select a value from the drop-down list, if available for that attribute. Then, click the buttons to map the Attributes from the input file (left side) to the Class Definitions to use in the Labeling Tool (right side). A line is drawn to show each match. If you used <none> for the attribute, all Input Records will be mapped to the Class Definition you select.

  6. Click OK. The ROI records or vector records are imported into the current project for the selected training raster.
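The attribute matching described for Import Vectors amounts to a lookup from attribute values to class definitions. The sketch below illustrates that logic with plain Python dictionaries. It is not how ENVI stores vector records; the function name map_records_to_classes, the record layout, and the attribute and class names in the example are hypothetical.

```python
def map_records_to_classes(records, attribute, value_to_class=None, default_class=None):
    """Group vector records by the class definition they are matched to.

    records        -- list of dicts, one per vector record
    attribute      -- attribute name to match on, or None for the <none> case
    value_to_class -- dict mapping attribute values to class names
    default_class  -- class that receives every record when attribute is None
    """
    grouped = {}
    for record in records:
        if attribute is None:
            target = default_class  # <none>: every record goes to one class
        else:
            target = (value_to_class or {}).get(record.get(attribute))
        if target is not None:  # records with no match are left out
            grouped.setdefault(target, []).append(record)
    return grouped


# Example with made-up records and class names:
records = [{"landuse": "road"}, {"landuse": "water"}, {"landuse": "park"}]
matched = map_records_to_classes(records, "landuse", {"road": "Roads", "water": "Water"})
```

In this example, the record with the value "park" has no matching class definition and is not imported into any class.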

Create Training Data


After data is labeled, you can optionally create classification rasters without proceeding with the Train the Model steps. The output can be used in an ENVI Modeler workflow.

Click the Options drop-down list and select Generate Training Data. ENVI automatically creates classification rasters for training and populates the Label Raster column with OK for each training raster. The files *.dat and *.hdr for the project are added to the project folder.
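After generating training data, you can confirm that the output files are complete. The sketch below searches a project folder for *.dat files and reports any that lack a header. It assumes each header shares the base name of its data file (for example, labels.dat with labels.hdr), which this topic does not state explicitly; the function name find_dat_files_missing_headers is hypothetical.

```python
from pathlib import Path


def find_dat_files_missing_headers(project_folder):
    """Return .dat files under the project folder that lack a matching .hdr file."""
    missing = []
    for dat_path in sorted(Path(project_folder).rglob("*.dat")):
        # Assumption: the header uses the same base name with a .hdr extension.
        if not dat_path.with_suffix(".hdr").is_file():
            missing.append(dat_path)
    return missing
```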

View Project and Labeling Statistics


To view statistics for your current labeling project, click the Options drop-down list and select Show Labeling Statistics. The Project Statistics dialog appears.

The top of the dialog provides general information about the project, including:

  • Project name and location
  • Project type
  • Number of classes and their names
  • Number of bands in each training raster (this should be the same for all training rasters)
  • Number of training rasters
  • Number of ROIs drawn for each class

To save the project statistics to a text file, click the Save button in the Project Statistics dialog. Then select a location to save the text file. The default filename is report.txt.

To copy the project statistics to your system's clipboard, click the Copy button in the Project Statistics dialog.