Object Detection Tutorial

This topic describes how to use the Train TensorFlow Object Model tool to train models. This is an alternative to using the Deep Learning Labeling Tool to create labeled data and to train object detection models.

Once you have labeled your training rasters, you can train an object detection model so that it will learn what the features look like. Training a deep learning model involves several stochastic processes, so training with the same parameters will yield different models. This variation comes from the way the algorithm converges toward a solution, and it is a fundamental part of the training process.

You can also write a script to train an object detection model using the TrainTensorFlowObjectModel task.

Select Training and Validation Rasters


  1. Choose one of the following options to start the Train TensorFlow Object Model tool:
    • In the ENVI Toolbox, select Deep Learning > Object Detection > Train TensorFlow Object Model.
    • In the Deep Learning Guide Map, click this sequence of buttons: Object Detection > Train a New Object Model > Train Object Detection Model.

    The Train TensorFlow Object Model dialog appears with the Main tab selected. Use this tab to specify the training rasters, validation rasters, and output models.

  2. Click the Add File(s) button next to the Training Rasters field. Select one or more object detection rasters that you want to use for training, and click Open. They are added to the Training Rasters field. If you have not created object detection rasters yet, use the Build Object Detection Raster from Annotation or Build Object Detection Raster from ROI tool to create them.

    To use pre-trained model weights for object detection, click the Spectral Subset button and select three bands for input. The pre-trained weights come from ImageNet, an open-source image database used for deep learning research. Pre-trained weights shorten training time and provide a rigorously tested set of starting parameters that are well suited to object detection. If you do not spectrally subset the training rasters, all bands will be used, and the model will use randomly initialized weights.

  3. Click the Add File(s) button next to the Validation Rasters field. These are separate object detection rasters that can be used to evaluate the accuracy of a TensorFlow model. Although classification accuracy is typically better if you use different rasters for training and validation, you can still use the same raster for both.

    Tip: For Training Rasters and Validation Rasters, you can click the button to open an Object Detection Raster Label Statistics dialog that shows the total label count for each object detection raster, and the number of labels per class in each raster. You can also click the Add from Data Manager button to load one or more label rasters that are already open in the Data Manager.

  4. Specify a filename (.h5) and location for the Output Model. This will be the "best" trained model, which is the model from the epoch with the lowest validation loss. By default, the tool saves both the best model and the last model. The best model usually outperforms the last model, but not always. Having both outputs lets you choose which model works best for your scenario.
  5. Specify a filename (.h5) and location for the Output Last Model. This will be the trained model from the last epoch.
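
The rule in step 2 for how weights are initialized can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not ENVI code: the function name and the returned labels are hypothetical, and the handling of a subset that does not contain exactly three bands is an assumption, since this topic does not describe that case.

```python
from typing import Optional, Sequence


def weight_initialization(spectral_subset: Optional[Sequence[int]] = None) -> str:
    """Return which starting weights apply for a given spectral subset.

    Hypothetical helper. A three-band spectral subset selects the
    ImageNet pre-trained weights; no subset (all bands) means randomly
    initialized weights. Treating any other subset size as "random" is
    an assumption made for this sketch.
    """
    if spectral_subset is not None and len(spectral_subset) == 3:
        return "imagenet"
    return "random"


if __name__ == "__main__":
    print(weight_initialization([0, 1, 2]))  # three bands selected
    print(weight_initialization())           # no spectral subset
```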

Set Model Parameters


Select the Model tab. These parameters are optional.

  1. Enter a Model Name. If you do not provide one, "ENVI Deep Learning OD" will be used as the model name.

  2. Enter a Model Description in the field provided.

Set Training Parameters


Select the Training tab. The parameters on the Training tab are designed for users who want more control over how a TensorFlow model learns to recognize features during training. They are all optional.

If you are not sure how to set the values for these fields, you can use the ENVI Modeler to create random values. See Train Deep Learning Models Using the ENVI Modeler for details.

  1. Enable the Pad Small Features option when features are small, such as vehicles, utilities, or road markings. Features with bounding boxes drawn around them must be at least 25 pixels in both the X and Y directions. If the labeled features are smaller than this, the Pad Small Features option will pad them with extra pixels so they are at least 25 pixels in both directions.

  2. Set the Augment Scale option to Yes to augment the training data with resized (scaled) versions of the data. See Data Augmentation.

  3. Set the Augment Rotation option to Yes to augment the training data with rotated versions of the data. See Data Augmentation.

  4. In the Number of Epochs field, enter the number of epochs to run. An epoch is a full pass of the entire training dataset through the algorithm's learning process. Training parameters are adjusted at the end of each epoch. The default value is 25. If a feature of interest is relatively sparse (fewer than 100 instances) in your object detection rasters, a higher number of epochs, such as 50 to 75, helps ensure adequate training.

  5. In the Patches per Batch field, enter the number of patches to run per batch. The default value is 1. A patch is a small image subset passed to the trainer to help it learn what a feature looks like. The default patch size used for object detection is 640 x 640 pixels. A batch comprises one iteration of training; model parameters are adjusted at the end of each iteration.

    The Patches per Batch parameter controls how much data you send to the trainer in each batch. This is directly tied to how much GPU memory you have available. With higher amounts of GPU memory, you can increase the Patches per Batch. The following table shows the amount of GPU memory successfully tested with different values:

    GPU memory (MB)    Patches per Batch
    5099               1
    5611               2
    9707               3-4
    10731              5-8
    11711              9-10

  6. In the Feature Patch Percentage field, specify the percentage of patches that contain labeled features to use during training. Values should range from 0 to 1. This applies to both the training and validation datasets. The default value is 1, which means that 100% of the patches that contain features will be used for training. The resulting patches are then used as input to the Background Patch Ratio, described in the next step.

    Example: Suppose that an object detection raster has 50 patches that contain labeled features. A Feature Patch Percentage value of 0.4 means that 20 of those patches will be used for training (20/50 = 0.4, or 40%).

    The default value of 1 ensures that you are training on all of the features that you labeled. In general, if you have a large training dataset (hundreds of images), lowering the Feature Patch Percentage will decrease training time.

  7. In the Background Patch Ratio field, enter the ratio of background patches (those that contain no labeled features) to patches with features. For example, a ratio of 1.0 for 100 patches with features would provide 100 patches without features. The default value is 0.15.

    When features are sparse in a training raster, training can be biased by the many empty patches. The Background Patch Ratio parameter lets you limit the number of empty patches relative to those that contain features. Increasing the value tends to reduce false positives, particularly when features are sparse. In one example, increasing the Background Patch Ratio to 0.25 and training longer (by increasing the Number of Epochs to 60) resulted in fewer false positives when identifying vessels, which were sparse compared to the rest of the image.

  8. To reuse these task settings in future ENVI sessions, save them to a file. Click the down arrow next to the OK button and select Save Parameter Values, then specify the location and filename to save to. Note that some parameter types, such as rasters, vectors, and ROIs, will not be saved with the file. To apply the saved task settings, click the down arrow and select Restore Parameter Values, then select the file where you previously stored your settings.

  9. To run the process in the background, click the down arrow and select Run Task in the Background. If an ENVI Server has been set up on the network, the Run Task on remote ENVI Server name option is also available. The ENVI Server Job Console will show the progress of the job and will provide a link to display the result when processing is complete. See the ENVI Servers topic in ENVI Help for more information.

  10. Click OK. ENVI automatically creates label rasters for training and populates the "Label Raster" column with "OK" for each training raster.
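
The arithmetic behind steps 1, 6, and 7 can be sketched in Python as follows. This is a rough illustration, not ENVI's implementation: the function names are hypothetical, and the symmetric padding, the box convention (extent measured as max minus min), and the rounding of patch counts are all assumptions. Only the 25-pixel minimum, the default values, and the meaning of the two ratios come from the steps above.

```python
MIN_BOX_SIZE = 25  # minimum bounding-box extent in pixels (step 1)


def pad_box(x_min, y_min, x_max, y_max, min_size=MIN_BOX_SIZE):
    """Grow a box until it spans at least min_size pixels on each axis.

    Assumptions: extent is max - min, and padding is split as evenly as
    possible between the two sides. Clamping to the raster bounds is
    not handled here.
    """
    def pad_axis(lo, hi):
        shortfall = min_size - (hi - lo)
        if shortfall <= 0:
            return lo, hi  # already large enough
        before = shortfall // 2
        return lo - before, hi + (shortfall - before)

    x_min, x_max = pad_axis(x_min, x_max)
    y_min, y_max = pad_axis(y_min, y_max)
    return x_min, y_min, x_max, y_max


def count_patches(feature_patches, feature_patch_percentage=1.0,
                  background_patch_ratio=0.15):
    """Return (feature patches used, background patches added).

    The Feature Patch Percentage is applied first, and its result feeds
    the Background Patch Ratio. Rounding to whole patches is an
    assumption of this sketch.
    """
    if not 0.0 <= feature_patch_percentage <= 1.0:
        raise ValueError("Feature Patch Percentage must be between 0 and 1")
    if background_patch_ratio < 0:
        raise ValueError("Background Patch Ratio must not be negative")
    used = round(feature_patches * feature_patch_percentage)
    background = round(used * background_patch_ratio)
    return used, background


if __name__ == "__main__":
    # 50 feature patches at a Feature Patch Percentage of 0.4 (step 6 example)
    print(count_patches(50, feature_patch_percentage=0.4)[0])  # 20
    # A ratio of 1.0 for 100 feature patches (step 7 example)
    print(count_patches(100, background_patch_ratio=1.0))      # (100, 100)
```

Keeping the two counts separate makes it easier to see how lowering the Feature Patch Percentage also lowers the number of background patches, because the ratio is applied to the patches that remain.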

Training a model takes a significant amount of time due to the computations involved. Depending on your system and graphics hardware, processing can take several minutes to several hours. While training is in progress, a dialog shows a progress bar, the current Epoch and Step, and the updated validation loss value.
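
To get a feel for how the Patches per Batch setting relates to the Step counter and to GPU memory, consider the following Python sketch. It is not part of ENVI. The function names are hypothetical, the lookup treats each memory figure in the step 5 table as the minimum needed for that row (an assumption; the table only reports tested configurations), and counting a partial final batch as one step is also an assumption.

```python
import math

# (GPU memory in MB, largest Patches per Batch value tested at that size),
# taken from the table in step 5.
TESTED_GPU_MEMORY = [(5099, 1), (5611, 2), (9707, 4), (10731, 8), (11711, 10)]


def suggest_patches_per_batch(gpu_memory_mb):
    """Return the largest tested Patches per Batch for the given memory.

    Returns None when the GPU has less memory than the smallest tested
    configuration. Treating the table values as minimum requirements is
    an assumption of this sketch.
    """
    best = None
    for memory_mb, patches in TESTED_GPU_MEMORY:
        if gpu_memory_mb >= memory_mb:
            best = patches
    return best


def steps_per_epoch(total_patches, patches_per_batch):
    """Number of batches (steps) needed to pass every patch once.

    Assumes a partial final batch still counts as one step.
    """
    if patches_per_batch < 1:
        raise ValueError("Patches per Batch must be at least 1")
    return math.ceil(total_patches / patches_per_batch)


if __name__ == "__main__":
    batch = suggest_patches_per_batch(8192)
    print(batch)                       # 2
    print(steps_per_epoch(23, batch))  # 12
```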

While training runs, a TensorBoard page also opens in a new web browser window. TensorBoard is a visualization toolkit included with TensorFlow. It reports real-time metrics such as Loss, Accuracy, Precision, and Recall during training. See View Training Metrics for details.

With each epoch, the label rasters are exposed to the model again and the weights of the model are adjusted to reduce its error. The weights from the epoch that produces the lowest validation loss will be used in the final, trained model. For example, if you set the model to run for 25 epochs and the lowest validation loss was achieved during Epoch #20, ENVI will retain and use the weights from that epoch in the trained model.
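
The selection of the best epoch described above amounts to finding the minimum validation loss. The following minimal Python sketch illustrates the idea with made-up loss values; the function name is hypothetical, and choosing the earliest epoch when two epochs tie is an assumption, not documented ENVI behavior.

```python
def best_epoch(validation_losses):
    """Return (epoch number, loss) for the lowest validation loss.

    Epoch numbers start at 1. If several epochs share the lowest loss,
    the earliest one is returned (an assumption of this sketch).
    """
    if not validation_losses:
        raise ValueError("at least one epoch is required")
    index = min(range(len(validation_losses)),
                key=validation_losses.__getitem__)
    return index + 1, validation_losses[index]


if __name__ == "__main__":
    # Illustrative values only, one validation loss per epoch.
    losses = [0.92, 0.71, 0.58, 0.61, 0.66]
    epoch, loss = best_epoch(losses)
    print(epoch, loss)  # 3 0.58
```

In this example the last epoch (Epoch 5) has a higher validation loss than Epoch 3, which is why the best model and the last model can differ.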

When training is complete, you can pass the trained model to the TensorFlow Object Classification tool.

See Also


Label Features, TensorFlow Object Classification, Train TensorFlow Grid Models, Train TensorFlow Pixel Models, Train Deep Learning Models Using the ENVI Modeler