It seems to me that if there are "characters not used in representing numbers" then the column can be called a string; likewise, if there are no numerals at all, then it should also automatically be a string.
If one defines:
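like_not_num='[^0-9.+deDE-]' ; MJM [^....] means not this list; "-" goes last so it is literal (as "+-0" it formed a range that also let "," and "/" through); D and E added to agree with [eEdD] below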
like_int='(^[+-]?[0123456789]+$)' ; "^" means different here than above; here it anchors start of expression
like_real='^([+-]?[0123456789]*)(\.)([0123456789]*)([eEdD]?)([+-]?[0-9]*)$'
like_real1='^([+-]?[0123456789]*)(\.)([0123456789]*)$' ; non-exponents
like_real2='^([+-]?[0123456789]+)(\.)([0123456789]*)([eEdD]{1})([+-]?[0-9]+)$' ; exponents num before dec 5.e5 and 5.5e5
like_real3='^([+-]?[0123456789]+)([eEdD]{1})([+-]?[0-9]+)$' ; exponents, no decimal, i.e. 5e-5
like_real4='^([+-]?[0123456789]*)(\.)([0123456789]+)([eEdD]{1})([+-]?[0-9]+)$' ; exponents num after dec .5e5 and 5.5e5
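A quick way to see what each pattern accepts is to run them over a few sample strings. The sketch below is only an illustration: the procedure name check_num_patterns and the sample values are made up, and the patterns are restated with the [0-9] shorthand and without the capture groups. One thing it makes visible is that like_real1 on its own accepts a bare "." because both digit groups are optional, so a separate "at least one digit" check is still worth having.

```idl
; Illustrative sketch only: check_num_patterns and the sample strings are
; hypothetical and not part of read_csv.pro.
pro check_num_patterns
  compile_opt idl2
  samples  = ['42', '-5.5', '5.e5', '5e-5', '.5e5', '.', '1.2.3', 'abc']
  names    = ['like_int  ', 'like_real1', 'like_real2', 'like_real3', 'like_real4']
  patterns = ['^[+-]?[0-9]+$', $
              '^[+-]?[0-9]*\.[0-9]*$', $
              '^[+-]?[0-9]+\.[0-9]*[eEdD][+-]?[0-9]+$', $
              '^[+-]?[0-9]+[eEdD][+-]?[0-9]+$', $
              '^[+-]?[0-9]*\.[0-9]+[eEdD][+-]?[0-9]+$']
  print, 'samples   ', samples
  for j = 0, n_elements(patterns) - 1 do begin
    ; /BOOLEAN makes STREGEX return 1 for a match and 0 otherwise
    hits = stregex(samples, patterns[j], /boolean)
    print, names[j], fix(hits)
  endfor
end
```

Each printed row is one pattern, and each column is one sample, so a sample that shows 0 in every row would force its column to be typed as string.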
And then in the section of read_csv.pro where the type determination is done insert this:
; MJM test to see if any non-numerical chars, indicating must be string
test_m=stregex(strtrim(subdata,2),like_not_num)
if (max(test_m) ne -1) then continue ; this col must be string since at least one has non num chars
test_n=stregex(strtrim(subdata,2),'[0-9]+') ; require at least one numeral in non-string
; hmm.. max(test_n) eq -1 means ALL rows in col have no digits
; min(test_n) eq -1 means at least one row in col has no digits
if (min(test_n) eq -1) then continue ; no numbers, must be string (guards against combos of +-.de)
; end test
; OK, at this point there is at least one numeral in each row in sub-column under consideration
test_i=stregex(strtrim(subdata,2),like_int) ; if all >=0, consistent with integers
test_r1=stregex(strtrim(subdata,2),like_real1) ; 5.5
test_r2=stregex(strtrim(subdata,2),like_real2) ; 5.e5 and 5.5e5
test_r3=stregex(strtrim(subdata,2),like_real3) ; 5e5
test_r4=stregex(strtrim(subdata,2),like_real4) ; .5e5 and 5.5e5
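; each anchored pattern returns 0 on a match and -1 otherwise, so a sum of -5 means none of the five matched
test_num=test_r1+test_r2+test_r3+test_r4+test_i
if (min(test_num) eq -5) then continue ; at least one row in this col cannot be a valid number so col is string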
;;;;; END MJM tests
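For trying the logic outside read_csv.pro, the same three checks can be wrapped in a small standalone function. This is a minimal sketch under my own assumptions: the name col_looks_numeric is hypothetical, it returns 1/0 instead of using continue, and the "not a number character" bracket is written with the hyphen last and with uppercase D and E included.

```idl
; Hypothetical helper, not part of read_csv.pro: returns 1 if every element
; of subdata looks like an integer or real number, 0 otherwise.
function col_looks_numeric, subdata
  compile_opt idl2
  s = strtrim(subdata, 2)
  ; any character that can never appear in a number -> string column
  if (max(stregex(s, '[^0-9.+deDE-]')) ne -1) then return, 0
  ; any row without a single digit -> string column (guards against "+-.de")
  if (min(stregex(s, '[0-9]')) eq -1) then return, 0
  patterns = ['^[+-]?[0-9]+$', $
              '^[+-]?[0-9]*\.[0-9]*$', $
              '^[+-]?[0-9]+\.[0-9]*[eEdD][+-]?[0-9]+$', $
              '^[+-]?[0-9]+[eEdD][+-]?[0-9]+$', $
              '^[+-]?[0-9]*\.[0-9]+[eEdD][+-]?[0-9]+$']
  ; ok[i] becomes 1 once row i matches at least one numeric pattern
  ok = bytarr(n_elements(s))
  for j = 0, n_elements(patterns) - 1 do $
    ok = ok or stregex(s, patterns[j], /boolean)
  ; numeric only if every row matched something
  return, min(ok)
end
```

Calling it as print, col_looks_numeric(['1', '2.5', '3e4']) and then with a value such as '1.2.3' added to the array is an easy way to compare its verdict with what read_csv does for the same column.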
-M