I want to calculate the clipped raster layer percentage for a shapefile consist of overlapping polygons, since ArcGIS doesn't support tabulate area for overlapping features, so I used a loop to process each of these overlapping feature one by one:
1: Create an independent shapefile for each polygon from the attribute table
2: Clip the raster layer using the single one polygon created from step 1.
3: Calculate the raster grid percentage by counting the numbers of raster grids in each categories.
I was able to code this up, and the script run well but slowly for a big dataset. I run this loop for 2000+ big overlapping polygons and it took hours. I tried to optimize this computation by:
1: Use arcpy.da.SearchCursor instead of arcpy.SearchCursor 2: Using in-memory workspace to storage the temperate files. 3: Use Parallel Processing: arcpy.env.parallelProcessingFactor = "100%"
These are all helpful but I still think can improve more. I am thinking about using python parallel processing tool such as map function. Could anyone suggest on how to apply the map function on this loop process?
The following is the skeleton of my script:
featureClass = "E:/ArcGIS/access_to_recreation/new.gdb/Portland_ME_service_2hr" rows = arcpy.da.SearchCursor(featureClass, ['FacilityID']) # Loop through attribute table for row in rows: id = row # Extract each row as a independent polygon layer throughout the entire table inputfc = "E:/ArcGIS/access_to_recreation/new.gdb/Portland_ME_service_2hr" outputfc = "in_memoryfeature" # edit here! fieldname = "FacilityID" fieldvalue = id # construct the where clause by calling the helper function where_clause = buildWhereClause(inputfc, fieldname, fieldvalue) # execute Select analysis, create separate layers arcpy.Select_analysis(inputfc, outputfc, where_clause) # # Clip the raster data using each polygon inRaster = "E:/ArcGIS/PAD/For_Recreational.gdb/PAD_Parks_continental_final" inMaskData = "in_memoryfeature" # edit here! # Execute ExtractByMask outExtractByMask = ExtractByMask(inRaster, inMaskData) outExtractByMask.save("in_memoryextract_mask.img") # build raster attribute table try: arcpy.BuildRasterAttributeTable_management("in_memoryextract_mask.img", "Overwrite") # Count the 0, 1 pixels in the mask output raster_rows = arcpy.da.SearchCursor("in_memoryextract_mask.img", ['OID', 'Value', 'Count']) try: for ra_row in raster_rows: if ra_row == 0: blank = ra_row else: value = ra_row percentage = 100 * float(value)/(float(blank)+ float(value)) except: percentage = 0 Percentage.append(percentage) ID_list.append(id) # Delete intermediate files arcpy.Delete_management("in_memory")
You should not need to write the temporary rasters on disks. arcpy optimization is quite obscure to me, but not writing the feature class should help. In fact, you should be able to use geometry object directly for the clip
for row in rows: # Execute ExtractByMask outExtractByMask = ExtractByMask(inRaster, row.getValue(shapefieldname)) outExtractByMask.save("in_memoryextract_mask.img")
If I understand your question correctly, the following workflow could be an alternative.
Create sets of non-overlapping polygon either by code (see this discussion) or manually (run 'zonal statistics as table', identify the zones that have not been calculated and separate them into a new layer. Repeat with the separated layer to find more potential overlapping zones until none exist.)
Then reclassify your parcs raster to a binary grid with zeros (what you refer to as 'blank') and everything else with ones (your field 'value').
Create a script and run the "zonal statistics as table" function with each of your polygon sets and concatenate the tables. The resulting tables will have the total number of cells within each zone and the sum of the values which will be the number of cells with parks. From there it should be easy to calculate percentage.
Zonal stats is lightning fast in current versions of ArcGIS.