(Internal) ENVI EVF Format Description

Anonym Thursday, September 18, 2003

The following topic is for INTERNAL use only.

ENVI EVF files comprise two parts: the header section and the data section. First, the header section currently has a static size of 812 bytes:

Byte Location	Size	Description
0-3	bytarr(4)	Magic cookie - byte("Palm")
4	byte	Byte Order (result of "byte(256,0)")
5-8	long	Number of vectors contained in the file
9-12	long	Number of records contained in the file (each record will contain 1 or more vectors)
13-44	dblarr(4)	Corners - [xmin,xmax,ymin,ymax] in projection units of extent of the full file
45-172	bytarr(128)	String name of Layer (if strlen(layer_name) is greater than 128, it will be clipped to 128 characters)
173	byte	Data Type of vectors. Should always be 5 (double precision), but can be other values

Note: The original EVF magic cookie was "JIMY". That version of the file is incompatible with the current version and must be converted on open. The second magic cookie was "Dhou"; that EVF file differed from the current EVF file only in how and what projection information was stored. It is essentially compatible with the most recent version, although the byte locations in its header are different from the ones listed here.

The following items contain the projection information associated with the evf file and are all stored in the {envi_proj_struct} structure.

174-175	integer	Projection Type
176-295	dblarr(15)	Projection Params
296-423	bytarr(128)	Projection Name
424-551	bytarr(128)	Projection Datum Name
552-679	bytarr(128)	Projection Units Name
680-807	bytarr(128)	Extra; currently unused bytes reserved for potential future changes/additions
808-811	long	Index Pointer; byte location in the file where the indexing information resides beyond the current data stack (see below for explanation)

Notes about the header:

The record/vector counts as well as data pointers are all LONG integers, and as such, limit the maximum size of an EVF file to 2 Gigabytes.
All data types are stored in native machine byte order, so if "byte(256,0)" on your machine differs from the value at byte location 4, you will need to byte swap every value in the file that is larger than one byte.
The Data Type of the file should always be 5 (double precision). If it is a different data type, note that the newer types (UINT, ULONG, LONG64, ULONG64) are not currently supported.
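
The header layout above can be sketched outside IDL as well. The following is a minimal Python sketch, not ENVI code: the function name parse_evf_header and the dictionary keys are hypothetical, and it assumes that "byte(256,0)" yields 0 on little-endian machines and 1 on big-endian machines (the first byte of a 16-bit integer holding 256). It also assumes the 128-byte string fields are padded with NULs or spaces.

```python
import struct

HEADER_SIZE = 812  # static header size stated above


def parse_evf_header(buf):
    """Decode the 812-byte EVF header laid out in the tables above."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("EVF header needs at least 812 bytes")
    # Assumption: byte 4 is 0 for little-endian files and 1 for big-endian.
    endian = "<" if buf[4] == 0 else ">"

    def text(offset):
        # 128-byte string field; padding convention is an assumption
        raw = bytes(buf[offset:offset + 128]).split(b"\0", 1)[0]
        return raw.decode("latin-1").rstrip()

    num_vectors, num_records = struct.unpack_from(endian + "ii", buf, 5)
    return {
        "magic": bytes(buf[0:4]),
        "endian": endian,
        "num_vectors": num_vectors,
        "num_records": num_records,
        "corners": struct.unpack_from(endian + "4d", buf, 13),
        "layer_name": text(45),
        "data_type": buf[173],
        "proj_type": struct.unpack_from(endian + "h", buf, 174)[0],
        "proj_params": struct.unpack_from(endian + "15d", buf, 176),
        "proj_name": text(296),
        "datum_name": text(424),
        "units_name": text(552),
        "index_ptr": struct.unpack_from(endian + "i", buf, 808)[0],
    }
```

Because the format strings carry an explicit "<" or ">" prefix, struct applies no alignment padding, which is what the unaligned offsets (5, 13, 45, ...) require.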

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

After the header, the EVF file contains the "data". The "data" is comprised of a DATA STACK followed by an INDEX array, a BOX array, and NUM_PARTS array followed by the PARTS locations, if any.

The DATA STACK array is a make_array(2, num_vectors, type=Data_Type) and contains the (x,y) vertices of all the records in the file. The vertices are stored in projection coordinates, so they are usually stored in double precision.

The INDEX array is a lonarr(2, num_records+1).
INDEX[0,record] contains the starting location of the points for that record within the DATA STACK array.
INDEX[1,record] contains the TYPE of the record.
The TYPE is the same convention that ShapeFiles use:
      0 = Previously Deleted Record (no type)
      1 = Point
      3 = Polyline
      5 = Polygon
      8 = Multi Point

Note: Unlike ShapeFiles, EVF files can contain a mixture of different types in the same file.

If stack then represents the DATA STACK, a given record within the DATA STACK can be referenced via:

xpts = reform(stack[0,index[0,record]:index[0,record+1]-1])
ypts = reform(stack[1,index[0,record]:index[0,record+1]-1])

This works because the INDEX array always contains num_records+1 entries and the final entry points to the end of the DATA STACK.
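
The same lookup can be sketched in Python for readers working outside IDL. This is an illustration only: record_points and RECORD_TYPES are hypothetical names, the stack is modelled as a list of (x, y) pairs, and the type value stored in the final sentinel entry of INDEX is a placeholder, since the text above does not say what it holds.

```python
RECORD_TYPES = {0: "deleted", 1: "point", 3: "polyline",
                5: "polygon", 8: "multipoint"}


def record_points(stack, index, record):
    """Return (type_name, xs, ys) for one record.

    stack: list of (x, y) vertices.
    index: list of (start, type) pairs with num_records + 1 entries;
           the last entry points to the end of the stack.
    """
    start = index[record][0]
    stop = index[record + 1][0]  # exclusive end, as index[0,record+1]-1 is the last point
    pts = stack[start:stop]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return RECORD_TYPES.get(index[record][1], "unknown"), xs, ys


# Two records: a 3-point polyline and a single point, plus the end sentinel.
stack = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (5.0, 5.0)]
index = [(0, 3), (3, 1), (4, 0)]  # sentinel type 0 is a placeholder
print(record_points(stack, index, 0))
print(record_points(stack, index, 1))
```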

The BOX array is a make_array(4, num_records, type=Data_Type). BOX[*,record] contains the bounding box of a given record in projection coordinates ([xmin,xmax,ymin,ymax]).

Note: Having the bounding box of each record is extremely useful when plotting the records because you can very easily determine which records fall within the boundary of your current plot window and skip plotting those records which do not.
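
The skip test described in the note amounts to a rectangle overlap check. Here is a minimal Python sketch under the assumption that the plot window is given in the same [xmin,xmax,ymin,ymax] order as BOX; visible_records is a hypothetical helper, not an ENVI routine.

```python
def visible_records(boxes, window):
    """Return indices of records whose bounding box overlaps the window.

    boxes:  one [xmin, xmax, ymin, ymax] entry per record (the BOX array).
    window: [xmin, xmax, ymin, ymax] of the current plot window.
    """
    wxmin, wxmax, wymin, wymax = window
    keep = []
    for rec, (xmin, xmax, ymin, ymax) in enumerate(boxes):
        # A record is skipped when it lies entirely to one side of the window.
        if xmax < wxmin or xmin > wxmax or ymax < wymin or ymin > wymax:
            continue
        keep.append(rec)
    return keep
```

Only the records returned here need their vertices pulled from the DATA STACK, so records outside the window cost four comparisons each.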

The NUM_PARTS array is a lonarr(num_records). NUM_PARTS[record] contains the number of parts+1 of the record (this is only relevant for records of TYPE=3 or TYPE=5 where the record is a multi-part polygon/polyline; otherwise NUM_PARTS[record] = 0).

There is one PARTS array, a lonarr(NUM_PARTS[record]), for each record whose NUM_PARTS[record] is greater than 0. Conversely, if a record has NUM_PARTS[record] = 0, then there will be no PARTS information for that record. PARTS, like INDEX[0,record], points into the DATA STACK array to the start of each part of a multi-part polygon/polyline. It contains one more entry than the record has parts (NUM_PARTS[record] entries in total) so that the following standard logic works:

; num_parts here is the number of parts in the record, i.e. NUM_PARTS[record]-1
for i=0L, num_parts-1 do begin
   x = reform(stack[0,abs(parts[i]):abs(parts[i+1])-1])
   y = reform(stack[1,abs(parts[i]):abs(parts[i+1])-1])
endfor

PARTS also contains the topology information for the multi-part polygon. PARTS[i+1] will be positive for "exterior" (or "normal") polygon parts and PARTS[i+1] will be negative for an "interior" polygon part, which represents a hole inside a "normal" polygon part. (All interior polygons must be wholly contained within an "exterior" polygon for the multi-part polygon to be topologically correct.) In the example above, parts[i] and parts[i+1] are referenced with abs(parts[i]) and abs(parts[i+1]) because they may contain negative values.
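
The part-splitting logic can be sketched in Python as follows. This is an illustration under stated assumptions: split_parts is a hypothetical helper, the stack is modelled as a list of (x, y) pairs, and the sign of parts[i+1] is reported as a raw flag because the text above does not spell out exactly which part a negative entry marks.

```python
def split_parts(stack, parts):
    """Split one multi-part record into its parts.

    stack: list of (x, y) vertices (the DATA STACK).
    parts: PARTS entries for the record; one more entry than there are parts.
    Returns a list of (xs, ys, next_entry_negative) tuples.
    """
    out = []
    for i in range(len(parts) - 1):
        # abs() strips the sign that carries the interior/exterior flag
        start, stop = abs(parts[i]), abs(parts[i + 1])
        pts = stack[start:stop]
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        out.append((xs, ys, parts[i + 1] < 0))
    return out


# Hypothetical record: two parts of 3 and 2 vertices, second boundary negative.
stack = [(0, 0), (4, 0), (4, 4), (1, 1), (2, 2)]
for xs, ys, flag in split_parts(stack, [0, -3, 5]):
    print(xs, ys, flag)
```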
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Here is an example of hand parsing an existing EVF file in IDL. Note: I am assuming for this example that "test.evf" contains more than 0 records (an .evf file CAN contain zero records) and that the byte order of the file is the same as that of the machine you are on, so there is no byte swapping logic. If you needed to byte swap, then any data read from the file with a data_type greater than 1 would need to be byte swapped appropriately. I am also assuming that the number of points in this file is not HUGE and that all of them will fit into a dblarr(2, num_points). ENVI does not make this assumption; it displays LARGE .evf files in chunks of 128 records so that the stack array doesn't get TOO large.

IDL> openr, 1, 'test.evf'
IDL> num_vecs = (num_recs = 0L)
IDL> point_lun, 1, 5
IDL> readu, 1, num_vecs, num_recs
IDL> index_ptr = 0L
IDL> point_lun, 1, 808
IDL> readu, 1, index_ptr
; note, the file pointer is now at the end of the header info (byte 812)
; and right at the start of the DATA STACK
IDL> stack = dblarr(2, num_vecs, /nozero)
IDL> readu, 1, stack
IDL> index = lonarr(2,num_recs+1, /nozero)
IDL> box = dblarr(4, num_recs, /nozero)
IDL> num_parts = lonarr(num_recs, /nozero)
IDL> point_lun, 1, index_ptr
IDL> readu, 1, index, box, num_parts
IDL> p_parts = ptrarr(num_recs, /allocate)
IDL> for i=0L, num_recs-1 do $
IDL> if (num_parts[i] gt 0) then begin & $
IDL> temp = lonarr(num_parts[i], /nozero) & $
IDL> readu, 1, temp & $
IDL> *p_parts[i] = temporary(temp) & $
IDL> endif
IDL> close, 1
; you got all you could ever want now...
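
For comparison, the same read sequence can be sketched in Python with the struct module. This is not ENVI code: read_evf_data is a hypothetical name, and the sketch assumes the Data Type is 5 (double precision), that byte 4 is 0 for little-endian files and 1 for big-endian files, and that the whole file fits in memory.

```python
import struct


def read_evf_data(buf):
    """Read DATA STACK, INDEX, BOX, NUM_PARTS and PARTS from EVF file bytes."""
    endian = "<" if buf[4] == 0 else ">"  # assumption, see lead-in
    num_vecs, num_recs = struct.unpack_from(endian + "ii", buf, 5)
    index_ptr = struct.unpack_from(endian + "i", buf, 808)[0]

    # DATA STACK starts right after the 812-byte header. IDL stores
    # dblarr(2, n) with x and y of each vertex next to each other.
    flat = struct.unpack_from(f"{endian}{2 * num_vecs}d", buf, 812)
    stack = list(zip(flat[0::2], flat[1::2]))

    # INDEX: lonarr(2, num_recs + 1) as (start, type) pairs
    pos = index_ptr
    flat = struct.unpack_from(f"{endian}{2 * (num_recs + 1)}i", buf, pos)
    index = list(zip(flat[0::2], flat[1::2]))
    pos += 8 * (num_recs + 1)

    # BOX: four doubles per record
    flat = struct.unpack_from(f"{endian}{4 * num_recs}d", buf, pos)
    box = [flat[i:i + 4] for i in range(0, len(flat), 4)]
    pos += 32 * num_recs

    # NUM_PARTS: one long per record
    num_parts = struct.unpack_from(f"{endian}{num_recs}i", buf, pos)
    pos += 4 * num_recs

    # PARTS: present only for records with NUM_PARTS greater than 0
    parts = []
    for n in num_parts:
        if n > 0:
            parts.append(list(struct.unpack_from(f"{endian}{n}i", buf, pos)))
            pos += 4 * n
        else:
            parts.append(None)
    return stack, index, box, num_parts, parts
```

A call such as read_evf_data(open("test.evf", "rb").read()) would mirror the IDL session above, with None standing in for records that have no PARTS information.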

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

So it is relatively easy to reference an existing .EVF file, but trickier to add to or change one. Any change to a record which uses the same number of points as the original record, or fewer, can be made to the DATA STACK without any other changes to the file (with the possible exception of the file CORNERS contained in the header). However, if points are added to an existing record, or a new record is added, then the DATA STACK will increase in size and all the index/box/parts information stored beyond the DATA STACK in the file will have to be "moved" as well. This is why it is best to use existing EVF procedures/functions and, possibly, the new EVF object to make changes to existing EVF files, rather than doing it by hand, byte for byte.