Satellite Data Preprocessing with Python/PCI II – Cloud Masking, Haze Removal, Terrain Derivatives – ATCOR
GEO214 /409 /402 Python Data Preprocessing Code – Section II
22.03.2018: Section II: MASKING/HAZEREM/TERSETUP/ATCOR/RESAMP processing
Batch multispectral data preprocessing for RapidEye and Sentinel-2 (L1C) data in PCI Geomatica 2017. A reader for two of the sessions in GEOG214 and GEO402/GEOG409 at FSU Jena (BSc Geography / MSc Geoinformatics).
Section I: NITF file import and reprojection in a given directory:
- imports NITF RE L1B data to PIX files, including all available metadata
- reprojects the files to UTM WGS84 CC,
- resamples the data files to low-resolution copies at 30 m and 100 m, and
- writes header information and georeferencing to a text report
- writes simple band statistics to a text report
Section II: Cloud masking/haze removal/terrain analysis – full processing with the atmospheric correction package ATCOR.
- Calculate a cloud, haze and water mask
- Remove the haze from the satellite data
- Calculate DEM derivatives: slope/aspect/shadow-cast layers
- Optional: lowpass filter the DEM
- Atmospheric correction on the haze-corrected input, calculating the local incidence angle and outputting 16-bit scaled, topographically normalized reflectance.
Section III: R statistics integration using rpy2 and numpy
Section IV: Integration of GRASS binaries into workflows
Section V: Integration of AGISOFT
1. Loading libraries and defining locale environment
Here we load the PCI libraries, set the locale environment and define what information the user must provide on the CLI (data path and DEM file).
```python
print "******************* ATCOR Processing with RE Data **************************"
# by SHese 22.3.2018 RE ATCOR processing
# Script to loop through files in a folder and apply the full atmospheric
# correction workflow:
# - cloud masking
# - HAZEREM uses the cloud/haze masks for haze removal
# - TERSETUP creates the illumination files
# - ATCOR with the illumination/haze-removed files
# ---------------------------------------------------------------------------
# Import the required algorithms
# ---------------------------------------------------------------------------
# usage: py-atcor.py /full/directory/to/datafiles /fulldir/path/andfilename-DEM.pix

print "1. Importing PCI Libraries ... "
from pci.fun import *
from pci.lut import *
from pci.pcimod import *
from pci.exceptions import *
from pci.ortho import *
from pci.fimport import fimport
from pci.resamp import *
from pci.masking import *
from pci.hazerem import *
from pci.tersetup import *  # does not support multithreading unfortunately
from pci.atcor import *     # uses big tmp files in the tmp directory - can potentially crash

from sys import argv
import sys

# These two libraries are required to extract file names from folders
import os
import fnmatch

# The following locale settings are used to ensure that Python is configured
# the same way as PCI's C/C++ code.
import locale
locale.setlocale(locale.LC_ALL, "")
locale.setlocale(locale.LC_NUMERIC, "C")

script, InFolder, DEMFile = argv  # the input is defined by the user on the CLI
# usage: py-atcor.py /full/directory/to/datafiles /fulldir/path/andfilename-DEM.pix

# ------------------------------------------------------------------------
# Set file variables
# ------------------------------------------------------------------------
print "2. Processing in folder path: " + InFolder
# InFolder = raw_input("Please enter full input folder path: ")
# The user would enter the pathname interactively - we don't want this here
```
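The two setlocale() calls deserve a note: with a German system locale, Python would otherwise format numbers with a decimal comma, which PCI's C/C++ code cannot parse. A minimal standalone sketch of the effect (plain Python, no PCI required):

```python
import locale

# Mirror the script's setup: user locale for everything except numerics,
# which stay in the portable "C" locale so numbers are always formatted
# and parsed with a "." decimal point, regardless of the system language.
locale.setlocale(locale.LC_ALL, "")
locale.setlocale(locale.LC_NUMERIC, "C")

# locale.str() formats a float using the active LC_NUMERIC setting;
# under "C" this is always a dot, never a comma.
print(locale.str(3.14))  # -> 3.14
```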
2. Listing the input files to be processed
This section searches for the file names that we want in our input list. It is important to get the file_filter right here so that the for-loop later is populated with the correct input files.
```python
# ----------------------------------------------------
# Create list of files to loop through from InFolder
# ----------------------------------------------------
print "3. Creating list of valid files ..."

# This line is a wildcard filter. In this case only ORTHO.pix files will be used
file_filter = "*ORTHO.pix"

# This list will be populated with the full pathnames of each matching file in InFolder
input_files_list = []

# os.walk searches a directory tree and produces the file names of any files it encounters
# r is the current directory
# d is any subdirectory within r
# f is the list of file names within r
for r, d, f in os.walk(InFolder):
    for file in fnmatch.filter(f, file_filter):  # fnmatch.filter drops non-matching files
        print "4. Found valid input file: ", file
        input_files_list.append(os.path.join(r, file))  # the full path is added to input_files_list
```
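The walk-and-filter pattern above can be tried without any imagery. Here it is exercised against a throwaway directory tree (the file names are made up for the demonstration):

```python
import fnmatch
import os
import tempfile

def find_files(folder, file_filter):
    """Return the full paths of all files under folder matching the wildcard."""
    matches = []
    for r, d, f in os.walk(folder):  # r = current dir, d = subdirs, f = file names
        for name in fnmatch.filter(f, file_filter):
            matches.append(os.path.join(r, name))
    return matches

# Demo with dummy files: only the *ORTHO.pix name should survive the filter.
tmp = tempfile.mkdtemp()
for name in ("scene1-ORTHO.pix", "scene1-ORTHO.txt", "readme.txt"):
    open(os.path.join(tmp, name), "w").close()

print(find_files(tmp, "*ORTHO.pix"))  # one entry, ending in scene1-ORTHO.pix
```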
3. Cloud Masking using MASKING
Now we start the cloud masking algorithm from PCI, MASKING. The basic parameters are derived from expected TOA values that we can estimate using the c0/c1 calibration coefficients. This isn't extremely accurate, but it allows masking in order to avoid overcorrection at a later stage.
```python
for ORTHO in input_files_list:
    print "5. Beginning Cloudmasking ..."
    fili = ORTHO
    filo = ORTHO + "MASKING.pix"
    visirchn = [1, 3, 5, 0]  # RE channels blue, red, NIR; SWIR missing
    asensor = "RapidEye"     # the sensor name - change as appropriate for your data
    cfile = "/opt/geomatica_2017/atcor/cal/rapideye_m1/rapideye_mode1.cal"
    # The path and file name of the text file that contains, for each band,
    # the calibration coefficients used to transform the values from the
    # image to absolute radiance values.
    znangle = []    # ingest that from the metadata
    hazecov = [15]  # default is 50% - in my experience this is usually too much!
    clthresh = []
    wuthresh = []
    srcbgd = ""
    masking(fili, srcbgd, asensor, visirchn, cfile, znangle, hazecov,
            clthresh, wuthresh, filo)
```
cfile = "/opt/geomatica_2017/atcor/cal/rapideye_m1/rapideye_mode1.cal"
The path and file name of the text file that contains, for each band, the calibration coefficients used to transform the values from the image to absolute radiance values.
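The .cal file supplies per-band coefficients c0 (offset) and c1 (gain); the usual linear model is radiance = c0 + c1 · DN. A hedged sketch of this calibration step (the simple "c0 c1 per line" file layout is an assumption for illustration, not the documented PCI format):

```python
def read_cal(lines):
    """Parse one (c0, c1) pair per band from simple 'c0 c1' text lines."""
    coeffs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        c0, c1 = (float(v) for v in line.split()[:2])
        coeffs.append((c0, c1))
    return coeffs

def dn_to_radiance(dn, c0, c1):
    """Standard linear calibration: absolute radiance from a raw DN."""
    return c0 + c1 * dn

# Dummy coefficients for two bands (illustrative values only).
cal = read_cal(["# band: c0 c1", "0.0 0.01", "0.0 0.012"])
print(dn_to_radiance(1000, *cal[0]))  # -> 10.0
```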
4. Haze Removal using HAZEREM
With the cloud and haze masks – stored as bitmap layers – we now apply the haze correction using HAZEREM.
```python
    # ------------------------------------------------------------------------
    # Use the HAZEREM algorithm with the cloud mask from MASKING to avoid
    # overcorrection (still inside the for-ORTHO loop)
    # ------------------------------------------------------------------------
    print "6. Beginning HAZEREM ..."
    fili = ORTHO    # we use the original ORTHO output as input here, not maskfili
    fili_pan = ""   # we don't have a pan channel for RapidEye
    srcbgd = ""
    asensor = "Rapideye"
    visirchn = []   # this is taken from ASENSOR automatically
    chanopt = "p,p,p,c,c"  # apply haze removal to channels 1-3 and copy the NIR channels unchanged
    maskfili = ORTHO + "MASKING.pix"
    maskseg = [2, 3, 4]  # hazerem expects the haze mask in segment 2, the cloud mask in 3 and the water mask in 4
    hazecov = [15]  # we go as low as possible here - overcorrection should be avoided
    hazeflsz = []
    filo = ORTHO + "MASKING-HAZEREM.pix"
    filo_pan = ""
    ftype = "PIX"
    foptions = ""
    hazerem(fili, fili_pan, srcbgd, asensor, visirchn, chanopt, maskfili,
            maskseg, hazecov, hazeflsz, filo, filo_pan, ftype, foptions)
```
5. TERSETUP creates the DEM derivatives
To perform topographic normalization we need a DEM, its slope/aspect layers and the illumination angles to calculate the local incidence angle. ATCOR does this on the fly, provided you supply slope/aspect – and that is what TERSETUP produces. ILLUMCAST can also calculate the local incidence angle, but it is not needed when we use TERSETUP. Note, however, that you might need an additional lowpass filtering run to reduce artefacts that often originate from edges in your DEM. Some repeated AVERAGE lowpass filter runs are helpful then:
From the PCI online help: http://www.pcigeomatics.com/geomatica-help/references/pciFunction_r/python/P_fme.html

```python
from pci.fme import *

file = 'input.pix'
dbic = [10]    # elevation data channel
dboc = [11]    # output channel in the same file
flsz = [5, 5]  # 5x5 filter
mask = []      # process entire image
bgrange = []
failvalu = []
bgzero = ''

fme(file, dbic, dboc, flsz, mask, bgrange, failvalu, bgzero)
```
or use an average filter:
http://www.pcigeomatics.com/geomatica-help/references/pciFunction_r/python/P_fav.html
```python
from pci.fav import fav

file = 'input.pix'
dbic = [1]     # elevation data channel
dboc = [6]     # output channel
flsz = [5, 5]  # use a 5x5 filter
mask = []      # bitmap mask
bgrange = [0]  # remove background values
failvalu = []
bgzero = ''    # default, set background to 0

fav(file, dbic, dboc, flsz, mask, bgrange, failvalu, bgzero)
```
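The smoothing effect of such an AVERAGE filter can be reproduced without PCI. A minimal pure-Python moving-average sketch on a toy elevation grid (no masks or background handling, unlike FAV/FME):

```python
def mean_filter(grid, size=3):
    """Naive size x size moving-average (lowpass) filter on a 2-D list,
    with the window clamped at the grid edges."""
    h = size // 2
    rows, cols = len(grid), len(grid[0])
    out = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Collect all neighbours inside the window and inside the grid.
            vals = [grid[j][i]
                    for j in range(max(0, y - h), min(rows, y + h + 1))
                    for i in range(max(0, x - h), min(cols, x + h + 1))]
            row.append(sum(vals) / float(len(vals)))
        out.append(row)
    return out

# A single elevation spike (a typical DEM edge artefact) gets smeared out.
dem = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
print(mean_filter(dem)[1][1])  # -> 10.0
```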
```python
    # ------------------------------------------------------------------------
    # Use the TERSETUP algorithm to prepare terrain derivatives: slope,
    # aspect, and skyview rasters from an input DEM.
    # ------------------------------------------------------------------------
    print "7. Beginning TERSETUP ..."
    filedem = DEMFile        # input DEM file, e.g. ASTER GDEM (15 m) or SRTM (30 m)
    dbec = [1]               # input DEM channel, expected in channel 1 here
    terfile = "tersetup.pix" # output terrain image file - we use a temp file here
    backelev = []            # search for the NoData value in the metadata
    elevref = "ELLIPS"       # elevation reference: ellipsoid (the default is Mean Sea Level)
    elevunit = "METER"       # default, METER
    elfactor = []            # default, [0.0, 1.0]

    tersetup(filedem, dbec, terfile, backelev, elevref, elevunit, elfactor)
```
ILLUMCAST is not needed by ATCOR when TERSETUP is run – ATCOR calculates local incidence angle on the fly from tersetup output.
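The local incidence angle that ATCOR derives from slope and aspect follows the standard spherical-trigonometry expression cos(i) = cos(sz)·cos(slope) + sin(sz)·sin(slope)·cos(saz − aspect). A small standalone sketch of that formula (not PCI code):

```python
import math

def local_incidence_angle(slope, aspect, sun_zenith, sun_azimuth):
    """All angles in degrees; returns the local solar incidence angle i.
    cos(i) = cos(sz)*cos(slope) + sin(sz)*sin(slope)*cos(saz - aspect)
    """
    s, a = math.radians(slope), math.radians(aspect)
    sz, saz = math.radians(sun_zenith), math.radians(sun_azimuth)
    cos_i = (math.cos(sz) * math.cos(s) +
             math.sin(sz) * math.sin(s) * math.cos(saz - a))
    # Clamp against floating-point drift before taking the arccosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_i))))

# On flat terrain the incidence angle equals the solar zenith angle.
print(round(local_incidence_angle(0.0, 0.0, 35.0, 180.0), 1))  # -> 35.0
```

A sun-facing slope reduces the incidence angle: with a 20-degree slope facing the sun (aspect = solar azimuth) at 35 degrees solar zenith, i comes out at 15 degrees.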
6. ATCOR – Atmospheric Correction – calculating scaled reflectance
Now we have all the input layers for ATCOR: MASKING and HAZEREM provided the cloud/haze/water masks and generated the haze-corrected dataset, and TERSETUP created the DEM derivatives needed for the topographic normalization.
```python
    # ------------------------------------------------------------------------
    # ATCOR algorithm (still inside the for-ORTHO loop)
    # ------------------------------------------------------------------------
    print "8. Beginning ATCOR ..."
    fili = ORTHO + "MASKING-HAZEREM.pix"
    dbic = [1, 2, 3, 4, 5]
    srcbgd = ""
    asensor = "Rapideye Mode1"
    cfile = "/opt/geomatica_2017/atcor/cal/rapideye_m1/rapideye_mode1.cal"
    # this path is Linux specific - change to your setup
    maskfili = ORTHO + "MASKING.pix"
    terfile = "tersetup.pix"
    illufile = ""
    meanelev = [10]
    vistype = "constant,25.0"
    visfilo = ""
    atmdef = "Urban"    # needs modification for every project
    atmcond = "summer"  # needs modification for every project
    satilaz = []        # taken from the metadata by ATCOR - should be left open
    sazangl = []        # also taken from the metadata by ATCOR
                        # note: solar zenith = 90 degrees - solar elevation
    adjacen = "ON,5"
    brdffun = []
    terrefl = []        # optionally the number of iterations used to apply
                        # adjacency effect corrections; use 0 to switch off
    outunits = "16bit_Reflectance"  # we want 16-bit reflectance to keep it simple
    filo = ORTHO + "MASKING-HAZEREM-ATCOR.pix"
    ftype = "PIX"
    foptions = ""

    try:
        atcor(fili, dbic, srcbgd, asensor, cfile, maskfili, terfile, illufile,
              meanelev, vistype, visfilo, atmdef, atmcond, satilaz, sazangl,
              adjacen, brdffun, terrefl, outunits, filo, ftype, foptions)
    except PCIException, e:
        print e
    except Exception, e:
        print e

    print " Beginning RESAMP 100m ..."
    fili = ORTHO + "MASKING-HAZEREM-ATCOR.pix"
    filo = ORTHO + "MASKING-HAZEREM-ATCOR-RESAMP100m.pix"
    dbic = [1, 2, 3, 4, 5]
    dbsl = []
    sltype = "ALL"
    ftype = "PIX"
    foptions = ' '
    pxszout = [100, 100]  # output pixel size in metres
    resample = "CUBIC"    # cubic convolution method

    try:
        resamp(fili, dbic, dbsl, sltype, filo, ftype, foptions, pxszout, resample)
    except PCIException, e:
        print e
    except Exception, e:
        print e

    print "END Atcor finished", file
```
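With pxszout = [100, 100] the 5 m RapidEye grid is resampled to 100 m pixels; the resulting raster size follows directly from the extent. An illustrative helper for that arithmetic (not a PCI call; the 5000 x 5000 tile size is a made-up example):

```python
import math

def output_size(cols, rows, px_in, px_out):
    """Raster dimensions after resampling, with the ground extent held
    constant; partial pixels at the edge are rounded up."""
    width_m, height_m = cols * px_in, rows * px_in
    return (int(math.ceil(width_m / px_out)),
            int(math.ceil(height_m / px_out)))

# A 5000 x 5000 pixel tile at 5 m becomes 250 x 250 pixels at 100 m.
print(output_size(5000, 5000, 5.0, 100.0))  # -> (250, 250)
```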
Some comments that I removed from the script for better readability (partly taken from the PCI Online Help):
ATCOR receives as input a haze-free image; the cloud, cloud shadow, water, ice/snow and saturated-pixel masks; and the terrain derivatives and illumination map for each scene to process. From these it computes the visibility map and applies the atmospheric LUT to the scene, so that the atmospheric contribution to the DN of each pixel can be removed. ATCOR thus prepares the data for analysis such as GCP collection, segmentation, classification, or extraction of vegetation indices. The result depends heavily on the quality of the visibility map and the accuracy of the atmospheric tables (based on the MODTRAN code), as well as on the quality of the haze-free scene and terrain derivatives.
maskfili = ORTHO + "MASKING.pix"
Specifies the name of the input file that contains the haze and cloud masks produced by MASKING for the reference image. The geocoding of the specified file must match that of the input image exactly. ATCOR assumes the first bitmap found to be the haze mask, and the second to be the cloud mask.
illufile = ""
Optionally specifies the name of the input illumination file that contains the illumination map and topographic cast shadow bitmap produced by ILLUMCAST for the input image. If the illumination file is not provided, the information is computed on-the-fly when a Terrain Derivatives file (TERFILE) is provided.
adjacen = "ON,5"
Optionally specifies whether a correction for the adjacency effect is required. This parameter is composed of a keyword and an optional filter size. Available options are:
OFF: no adjacency effect correction is applied,
ON,n: the correction for adjacency is applied using a nxn kernel size, where n is an odd number between 3 and 39
ON: the correction for adjacency is applied, using a 3×3 kernel size
Adjacency is radiation reflected from the neighborhood and scattered into the viewing direction. The effect results from atmospheric scattering; it depends on the reflectance contrast between a target pixel and its large-scale neighborhood, and decreases with wavelength. It reduces the apparent surface contrast by decreasing the top-of-atmosphere radiance over bright pixels and increasing the brightness of dark pixels. The adjacency effect causes a certain amount of blurring, known as crosstalk. This effect is often noticeable at the boundary of adjacent features; for example, roads passing through a forest may appear blurred. The adjacency effect is especially important for sensors of high spatial resolution, such as RapidEye, KazEOSat-2 and SPOT, and is usually negligible for low spatial resolution sensors such as AVHRR and Landsat.
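The contrast-reducing effect of adjacency can be mimicked with a simple mixing model: each observed value is the target reflectance blended with the neighborhood mean. The numbers and the mixing weight below are hypothetical, for intuition only:

```python
def adjacency_mix(targets, weight=0.2):
    """Blend each pixel with the scene mean to mimic atmospheric crosstalk.
    weight is a made-up fraction of neighborhood radiance scattered into
    the view path; it is not a physical ATCOR parameter."""
    mean = sum(targets) / float(len(targets))
    return [(1.0 - weight) * t + weight * mean for t in targets]

# Bright pixel darkens, dark pixel brightens: apparent contrast shrinks.
mixed = adjacency_mix([100.0, 10.0])
print(mixed)  # bright value drops below 100, dark value rises above 10
```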
brdffun = []
[FUNCTION, THRSANG, LBOUND], where:
FUNCTION: a number between 0 and 4, each representing a different correction function
THRSANG: the threshold illumination angle (by default, set as the solar zenith angle)
LBOUND: indicates the lowest correction applied. By default, set to 0.25, meaning that the pixel values will be changed at most by 75%, in order to avoid overcorrection.
This parameter can receive up to five different values, which set the following correction functions:
0: no BRDF correction is applied,
1: sets the correction factor as a linear function based on the illumination angle (i), defined as cos(i)/cos(THRSANG),
2: sets the correction factor as an exponential function, defined as sqrt [cos(i)/cos(THRSANG)],
3: sets the correction factor as a linear function based on the illumination and exitance angles, defined as cos(i)*cos(e)/cos(THRSANG),
4: sets the correction factor as an exponential function based on the illumination angle and exitance angles, defined as sqrt[cos(i)*cos(e)/cos(THRSANG)]. This function is relevant only for tilt sensors.
When a Terrain Derivatives (TERFILE) file is specified, this parameter is set to 2 by default. The effect of the threshold illumination angle is scene-dependent and varies with the local topography. If the topographic correction must be increased on the steepest slopes, the following rules are recommended to define the threshold angle:
if the solar zenith angle is less than 20 degrees: THRSANG = solar zenith + 20 degrees
if the solar zenith angle is between 20 and 45 degrees: THRSANG = solar zenith + 15 degrees
if the solar zenith angle is higher than 45 degrees: THRSANG = solar zenith + 10 degrees.
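The threshold-angle rules and the default function-2 correction factor can be written out directly. A sketch following the rules above, with the LBOUND clamp at its 0.25 default (the function names are mine, for illustration):

```python
import math

def recommended_thrsang(solar_zenith):
    """THRSANG rule of thumb from the text above (degrees in, degrees out)."""
    if solar_zenith < 20.0:
        return solar_zenith + 20.0
    if solar_zenith <= 45.0:
        return solar_zenith + 15.0
    return solar_zenith + 10.0

def brdf_factor(illum_angle, thrsang, lbound=0.25):
    """BRDF function 2: sqrt(cos(i)/cos(THRSANG)), floored at LBOUND so
    that pixel values are changed at most by 75%, avoiding overcorrection."""
    g = math.sqrt(max(0.0, math.cos(math.radians(illum_angle))) /
                  math.cos(math.radians(thrsang)))
    return max(g, lbound)

print(recommended_thrsang(30.0))  # -> 45.0
print(brdf_factor(45.0, 45.0))    # -> 1.0 (no correction at the threshold)
print(brdf_factor(89.0, 45.0))    # clamped to the 0.25 floor
```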
tbc