You can provide a deep learning model with samples of features that you want it to find—a process called labeling. Rasters that have been labeled with features of interest are used as input to train a deep learning model.
With object detection, you can label images using annotations or regions of interest (ROIs). The result is an object detection raster. This is the same as a deep learning raster, except that it contains additional information about bounding boxes, stored in the raster metadata.
With pixel segmentation, you can label images using ROIs. The result is a label raster.
Features can be two-dimensional; for example, outlines of buildings and parking lots. One-dimensional features include roads, paths, and railroad tracks. Point features include the locations of trees or stop signs.
You can train a deep learning model to look for one feature, which results in a binary classification image. Or you can train it to look for multiple features, which results in a classification image with multiple classes. This is the most common scenario.
To create label rasters, you need at least one input image from which to collect samples of the features you are interested in. These are called training rasters.
Prepare Training Rasters
Training rasters are images that you label with rectangle annotations (object detection) or ROIs (pixel segmentation). They can originate from any sensor or data type that ENVI supports, as long as you convert them to ENVI raster format (.dat) first. See the Supported Data Types topic in ENVI Help for a list of supported data types.
If you are working with one large training raster, it may take too much time to label all of the features in the entire image. Consider creating a spatial subset of the image to use for training purposes, then draw annotations or ROIs on the features in the subsetted image. To create a spatial subset, select File > Save As from the ENVI menu bar, select the file to subset, then click the Spatial Subset button.
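Conceptually, a spatial subset is a rectangular window cut out of the image's pixel grid. The following is a minimal sketch of that idea in Python with NumPy, not the ENVI workflow or API. The function name `spatial_subset`, the array shape, and the window coordinates are all hypothetical and chosen only for illustration.

```python
import numpy as np


def spatial_subset(image, row_start, row_stop, col_start, col_stop):
    """Return a copy of the window image[row_start:row_stop, col_start:col_stop].

    The image is assumed to be laid out as (rows, columns) or
    (rows, columns, bands). Stop indices are exclusive.
    """
    rows, cols = image.shape[:2]
    if not (0 <= row_start < row_stop <= rows and 0 <= col_start < col_stop <= cols):
        raise ValueError("subset window falls outside the image")
    # Copy so that later edits to the subset do not modify the full image.
    return image[row_start:row_stop, col_start:col_stop].copy()


# Hypothetical 3-band image: 400 rows by 600 columns.
image = np.zeros((400, 600, 3), dtype=np.uint8)

# Keep rows 100-199 and columns 150-299 for labeling.
subset = spatial_subset(image, 100, 200, 150, 300)
print(subset.shape)
```

Labeling only the window keeps the work manageable while the subset retains every band of the original image.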
You can also label multiple images for training. The images can be different sizes. For example, you may have dozens of UAV images that each cover a small area. Defining annotations or ROIs from multiple images can provide more general results in a deep learning model, compared to using a single image.
Obtaining quality results from ENVI Deep Learning depends in part on the quality and integrity of the spectral data in your images. Some preprocessing steps may be necessary to normalize the different images used for training and classification.
- The image that you use for classification should have the same spectral characteristics as the image(s) that you label for training. All images used for training and classification should have the same data type (byte) and range of data values. For better results, run the Build Deep Learning Raster tool with the same minimum and maximum pixel values for all images before using the Deep Learning Labeling Tool. The Deep Learning Labeling Tool converts images that are not of byte data type for you, but it scales each image with its own minimum and maximum, which is less effective than using a single minimum and maximum for all images.
- Images from different dates and viewing angles are acceptable, and often preferred, for training.
- Images can come from different sensors and resolutions. They will all be converted to ENVI raster format for use with Deep Learning.
- If the images have the appropriate metadata, you can use the ENVI Radiometric Calibration tool to calibrate the training and classification images to radiance or top-of-atmosphere reflectance. This is a common method for ensuring all images have consistent data units.
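The recommendation to scale all images with one shared minimum and maximum can be illustrated outside ENVI. The sketch below uses Python with NumPy and assumes a simple linear stretch to the byte range. It is not the Build Deep Learning Raster tool or its algorithm. The helper `scale_to_byte` and the sample pixel values are hypothetical.

```python
import numpy as np


def scale_to_byte(image, lo, hi):
    """Linearly map the range [lo, hi] to [0, 255] and return a uint8 array.

    Values outside [lo, hi] are clipped. This is an assumed linear stretch,
    used here only to illustrate shared versus per-image scaling.
    """
    lo, hi = float(lo), float(hi)
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scaled = (image.astype(np.float64) - lo) / (hi - lo) * 255.0
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


# Two hypothetical 16-bit images whose value ranges differ.
image_a = np.array([[0, 500], [1000, 2000]], dtype=np.uint16)
image_b = np.array([[0, 250], [500, 1000]], dtype=np.uint16)
images = [image_a, image_b]

# Shared scaling: one minimum and maximum computed across all images.
shared_lo = min(int(img.min()) for img in images)
shared_hi = max(int(img.max()) for img in images)
shared = [scale_to_byte(img, shared_lo, shared_hi) for img in images]

# Per-image scaling: each image is stretched with its own minimum and maximum.
per_image = [scale_to_byte(img, img.min(), img.max()) for img in images]

# The input value 1000 appears in both images.
print(shared[0][1, 0], shared[1][1, 1])        # identical byte values
print(per_image[0][1, 0], per_image[1][1, 1])  # different byte values
```

With shared scaling, the same input value maps to the same byte value in every image, so a feature looks the same to the model regardless of which image it came from. With per-image scaling, the same input value can map to different byte values.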
After preparing the training rasters, use the Deep Learning Labeling Tool or standalone tools to create labeled data for training.
Use Standalone Tools to Create Labeled Data
You can also create labeled data by using the following tools:
See Also
Build Label Rasters from Classification Images, Tips for Labeling Features