API

ep_class

class ecopop.ep_class.epu(path_pop, path_hthi, path_water_raster=None, bounding=None, waterbody_thresh=None)

The methods of this class generate ecopop Units and provide functions for plotting and exporting them.

compute_ep_classes_kmeans(n_groups)

Creates an image where each pixel value is the EP group to which the pixel belongs. Pixels are grouped via k-means clustering on the (population, hab index) values of each pixel. The number of groups must be specified and can be thought of as the number of regions into which the (population, hab index) space is divided.

n_groups is NOT the total number of ecopop units, but the number of EP unit types.

self.centroids contains the centroid of the (population, hab index) “coordinates” of each group; there is no spatial information here.
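To illustrate the grouping, here is a minimal NumPy sketch of k-means in (population, hab index) space. It is not the package's implementation; the function name kmeans_classes, the fixed random seed, and the use of 0 for pixels with missing data are assumptions made for the example.

```python
import numpy as np

def kmeans_classes(pop, hthi, n_groups, n_iter=50, seed=0):
    """Group pixels by k-means in (population, hab index) space.

    Returns (class_image, centroids). Pixels with NaN in either layer get
    class 0 (an assumption); valid pixels get labels 1..n_groups.
    """
    valid = ~(np.isnan(pop) | np.isnan(hthi))
    X = np.column_stack([pop[valid], hthi[valid]])
    rng = np.random.default_rng(seed)
    # Start from n_groups distinct pixels chosen at random.
    centroids = X[rng.choice(len(X), size=n_groups, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every pixel to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_groups)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    classes = np.zeros(pop.shape, dtype=int)
    classes[valid] = labels + 1
    return classes, centroids

pop = np.array([[0.0, 0.1, 5.0], [0.2, 5.1, np.nan]])
hthi = np.array([[0.1, 0.1, 0.9], [0.2, 0.8, 0.5]])
classes, centroids = kmeans_classes(pop, hthi, n_groups=2)
```

The centroids array has one (population, hab index) row per group and carries no pixel positions, which matches the note about self.centroids.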

compute_ep_classes_ranges(breaks={'hthi': [-0.1, 0.3, 0.5, 0.6, 0.7, 0.8, 1.1], 'pop': [-100, -4, -2, -1, 0, 1, 100]})

Divides the pop vs mhi space into ecopop units based on a supplied breaks dictionary that defines the boundaries along each axis.

Populations of 0 were set to a very low number so as not to error in log-transformation. This should be accounted for when supplying breaks; i.e. make sure there’s an interval that captures only this value.
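A rough sketch of how a breaks dictionary can partition the space, using the default breaks from the signature. The left-closed intervals, the class numbering, and the value -50 standing in for the "very low number" assigned to zero population are all assumptions for illustration.

```python
from bisect import bisect_right

breaks = {
    "hthi": [-0.1, 0.3, 0.5, 0.6, 0.7, 0.8, 1.1],
    "pop": [-100, -4, -2, -1, 0, 1, 100],
}

def bin_index(value, edges):
    """Return the 0-based interval index of value, or None if outside edges."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1

def range_class(pop_value, hthi_value, breaks):
    """Map a (pop, hthi) pair to a class id; 0 means no class (hypothetical scheme)."""
    i = bin_index(pop_value, breaks["pop"])
    j = bin_index(hthi_value, breaks["hthi"])
    if i is None or j is None:
        return 0
    n_hthi = len(breaks["hthi"]) - 1
    return i * n_hthi + j + 1

# A zero-population placeholder of -50 falls in the first pop interval [-100, -4),
# so that interval captures only the zero-population pixels.
zero_pop_class = range_class(-50, 0.2, breaks)
typical_class = range_class(0.5, 0.65, breaks)
outside_class = range_class(200, 0.5, breaks)
```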

simplify_epu_classes(min_class_size=4, nodata=0, maxiter=10, unique_neighbor=False)

Merges smaller epu classes into their neighbors. Uses an iterative approach because class regions change if a neighboring class region is absorbed into it. The unique_neighbor option controls which patches are merged: if True, a patch is merged only when its neighboring pixels all share the same label; if False, all patches smaller than min_class_size are filled with the mode of the neighboring pixel labels.

Parameters

min_class_size (integer, optional): Minimum area, in pixels, that a class region can have. The default is 4.
nodata (integer, optional): Class type 0 corresponds to nodata in the epu class code. Specifying this is necessary to avoid setting valid classes to nodata types. The default is 0.
maxiter (integer, optional): Maximum number of iterations to attempt. The default is 10.
unique_neighbor (boolean, optional): If True, only merges a class region if its neighboring pixels all share the same label (more conservative). This option will allow patches smaller than min_class_size to persist. If False, will merge ALL patches smaller than min_class_size using the mode of the neighboring pixel classes. This option will ensure that no class regions will be smaller than min_class_size EXCEPT in cases of class regions surrounded by nodata. The default is False.

Returns

Adds an ‘epu_class_simplified’ layer to the epu.I dictionary.

compute_epus(target_epu_size, min_epu_size, nodata=0)

Divides the computed classes into regions of (approximately) target_epu_size pixels each. The resulting epu raster is then polygonized.

target_size_pixels : target epu size, in pixels
minpatch : the smallest patch size allowed, in pixels
unique_neighbor = True : only fills a patch if its neighboring pixels all share the same label (more conservative); this option will allow patches smaller than minpatch to persist
unique_neighbor = False : fills all patches smaller than minpatchsize with the mode of the neighboring pixel labels

Parameters

target_epu_size (int): The desired size of each epu, in pixels.
min_epu_size (int): The desired minimum size of each epu.
nodata (int, optional): Specify nodata class value. The epu class is designed to set these to 0. The default is 0.

Returns

Adds ‘epu’ and ‘epu_simplified’ layers to the epu.I dictionary. Adds a new attribute (‘epus’) to the class; this attribute is a polygonized version of self.I[‘epu_simplified’].

compute_epu_stats(do_stats)

do_stats (dict): Keys are names of layers to compute stats for; values are two-element lists of [path_to_raster, [stats to compute]].
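As an illustration of the expected structure, a do_stats dictionary might look like the following. The layer names and file paths are hypothetical, and the stat names assume rasterstats-style identifiers.

```python
# Hypothetical layer names and paths; each value is [path_to_raster, [stats to compute]].
do_stats = {
    "pop": ["/path/to/pop.tif", ["mean", "sum"]],
    "hthi": ["/path/to/hthi.tif", ["mean", "std"]],
}

for layer, (path_to_raster, stats) in do_stats.items():
    print(f"{layer}: compute {stats} from {path_to_raster}")
```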

smooth_layers(layers, sigmas, write=False): By default will smooth layers from the layers_norm dict. Call eut.smooth_layer() directly if unnormalized layer smoothing is desired. sigma is the smoothing parameter. Higher sigma -> more smoothing.

watersheds(path, path_out=None)

Computes the fraction of watershed within each epu.

path (str): The path to the geopandas-readable watershed geometry file.
path_out (str): The path to write the watershed/epu dataframe. If None, nothing will be written but the dataframe will be stored as an object in the epu class.

compute_adjacency(layer='epu_simplified')

Computes the adjacency of a raster layer, typically ‘epu_simplified’. Must be run after computing epus if layer is not specified.

Parameters

layer (str, optional): The layer within self.I to compute adjacency on. The default is ‘epu_simplified’.

Returns

adj_df (pandas.DataFrame): The adjacency dataframe.
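The idea of raster adjacency can be sketched in plain Python as below. This is an illustration only: it assumes 4-connectivity (shared edges) and returns a dictionary rather than the DataFrame the method produces, and label_adjacency is a hypothetical name.

```python
def label_adjacency(image, nodata=0):
    """Return {label: sorted list of labels sharing a pixel edge with it}."""
    n_rows, n_cols = len(image), len(image[0])
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            a = image[r][c]
            if a == nodata:
                continue
            neighbors.setdefault(a, set())
            # Looking only right and down visits each pixel pair once.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < n_rows and cc < n_cols:
                    b = image[rr][cc]
                    if b != nodata and b != a:
                        neighbors[a].add(b)
                        neighbors.setdefault(b, set()).add(a)
    return {k: sorted(v) for k, v in neighbors.items()}

epu_image = [
    [1, 1, 2],
    [1, 3, 2],
    [0, 3, 3],
]
adjacency = label_adjacency(epu_image)
```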

export_raster(whichraster, path)

Exports the geotiff and polygon versions of the ecopop Units. Check the paths dictionary for where these are exported.

whichraster (str): The key within self.I to export.
path (str): The path to export to.

ep_utils

ecopop.ep_utils.load_layers(layer_names, paths): Loads all the specified layers into a dictionary

ecopop.ep_utils.smooth_layer(layer, sigma)

Replaces the astropy smoothing method with a much more efficient one. Smooths a layer containing nan values by considering only weights from non-nan values. sigma is the size of the Gaussian smoothing kernel.

https://stackoverflow.com/questions/18697532/gaussian-filtering-a-image-with-nan-in-python/36307291
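The weighting idea can be sketched with NumPy alone: smooth the values with NaNs zeroed, smooth a mask of valid pixels, and divide. This is a sketch of the technique, not the package's code. The helper names, the zero-padded separable convolution, and the choice to leave NaN pixels as NaN in the output are assumptions.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=4.0):
    """1-D Gaussian kernel, normalized to sum to 1."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(image, kernel):
    """Separable 2-D convolution with zero padding at the borders."""
    pad = len(kernel) // 2

    def conv_rows(a):
        padded = np.pad(a, ((0, 0), (pad, pad)))
        return np.stack([np.convolve(row, kernel, mode="valid") for row in padded])

    # Convolve along rows, then along columns via a transpose.
    return conv_rows(conv_rows(image).T).T

def smooth_with_nans(layer, sigma):
    """Gaussian-smooth a layer using only weights from non-NaN pixels."""
    valid = ~np.isnan(layer)
    values = np.where(valid, layer, 0.0)
    kernel = gaussian_kernel(sigma)
    num = blur(values, kernel)               # smoothed values, NaNs counted as 0
    den = blur(valid.astype(float), kernel)  # total weight contributed by valid pixels
    out = np.full(layer.shape, np.nan)
    ok = valid & (den > 0)
    out[ok] = num[ok] / den[ok]
    return out

layer = np.full((5, 5), 2.0)
layer[2, 2] = np.nan
smoothed = smooth_with_nans(layer, sigma=1.0)
```

Because the numerator and denominator are blurred with the same kernel, a constant layer stays constant regardless of where the NaNs or the image borders are.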

ecopop.ep_utils.nan_waterbodies(layers, paths): Sets all persistent waterbodies to np.nan in all layers. Layers may also be an image.

ecopop.ep_utils.layer_means(layers): Gets the mean of each layer. Population must be treated separately because we want the non-zero mean.
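A small sketch of the distinction, assuming NaN marks missing pixels; the function name and the nonzero_only flag are hypothetical.

```python
import math

def layer_mean(values, nonzero_only=False):
    """Mean of the non-NaN values, optionally ignoring zeros as well."""
    kept = [v for v in values
            if not math.isnan(v) and not (nonzero_only and v == 0)]
    return sum(kept) / len(kept) if kept else float("nan")

pop = [0.0, 0.0, 10.0, 30.0, float("nan")]
plain_mean = layer_mean(pop)
nonzero_mean = layer_mean(pop, nonzero_only=True)
```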

ecopop.ep_utils.call_gdal(callstr): Executes a command-line gdal string with subprocess.

ecopop.ep_utils.fit_geotiff_into_another(ref, tofit, outpath, dtype='Byte', matchres=True, src_nodata=None, dst_nodata=None, resampling='bilinear'): Clips a geotiff (tofit) by a reference geotiff (ref), then matches the extents of the clipped raster to those of the reference.

ecopop.ep_utils.add_raster_stats(path_raster)

Adds raster statistics to a raster’s metadata using GDAL. Can take a while for large rasters, as the statistics are not approximated but computed on all the available values. Only needs to be run once for a given raster or virtual raster, as the stats are stored in the raster’s metadata.

Parameters

path_raster (str): Path to the raster to add statistics to.

Returns

None.

ecopop.ep_utils.get_raster_stats(path_raster)

Retrieves raster statistics from metadata of a raster. If none are available, they will automatically be computed. Currently designed for a single-band raster.

Parameters

path_raster (str): Path to the raster to fetch statistics from.

Returns

minval (float): Minimum value of the raster.
maxval (float): Maximum value of the raster.
meanval (float): Mean value of the raster.
stdval (float): Standard deviation of the raster.

ecopop.ep_utils.normalize_layers(layers, layerlist)

Normalizes layers appropriately between 0 and 1, where 0 and 1 correspond to the layer’s contribution to the MHI. E.g. for HAND, higher values correspond to lower MHI, so this layer will be inverted when normalizing.

Returns a dictionary of normalized layers.
gdp - normalized on 0,1 with higher values corresponding to lower gdp
pop - normalized on 0,1 with higher values corresponding to higher gdp

In order to have epus be consistent across all spatial domains, normalization parameters are hard-coded based on physical reasoning or global statistics of the layer.
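A minimal sketch of fixed-bound normalization with optional inversion. The bounds used for HAND here (0 to 20) are hypothetical placeholders, not the package's hard-coded values, and clipping out-of-range values to the bounds is an assumption.

```python
def normalize(values, lower, upper, invert=False):
    """Clip to [lower, upper], rescale to [0, 1], and optionally flip."""
    out = []
    for v in values:
        scaled = (min(max(v, lower), upper) - lower) / (upper - lower)
        out.append(1.0 - scaled if invert else scaled)
    return out

# Hypothetical fixed bounds; the same bounds are reused for every spatial domain
# so that normalized values are comparable between domains.
hand_norm = normalize([0.0, 5.0, 10.0, 40.0], lower=0.0, upper=20.0, invert=True)
```

With fixed bounds, a given raw value maps to the same normalized value everywhere, which is what keeps epus consistent across domains.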

ecopop.ep_utils.simplify_classes(Ilabeled, minpatchsize, nodata=0, unique_neighbor=True, maxiter=10): Given an image whose pixels are all integer labels, this will fill any patches of the same label equal to or smaller than minpatchsize, according to unique_neighbor:
unique_neighbor = True : only fills a patch if its neighboring pixels all share the same label
unique_neighbor = False : fills all patches smaller than minpatchsize with the mode of the neighboring pixel labels
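The patch-filling logic might be sketched in plain Python as follows. This is an illustration under assumptions, not the library's implementation: patches are taken to be 4-connected, the image is a list of lists, and the helper names find_patches and simplify are hypothetical.

```python
from collections import Counter, deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def find_patches(img, nodata):
    """Yield (pixels, border_labels) for each 4-connected same-label patch."""
    n_rows, n_cols = len(img), len(img[0])
    seen = set()
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if (r0, c0) in seen or img[r0][c0] == nodata:
                continue
            label, pixels, border = img[r0][c0], [], []
            queue = deque([(r0, c0)])
            seen.add((r0, c0))
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr, dc in STEPS:
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                        continue
                    if img[rr][cc] == label:
                        if (rr, cc) not in seen:
                            seen.add((rr, cc))
                            queue.append((rr, cc))
                    elif img[rr][cc] != nodata:
                        border.append(img[rr][cc])
            yield pixels, border

def simplify(img, minpatchsize, nodata=0, unique_neighbor=True, maxiter=10):
    """Fill small patches with a neighboring label; returns a new image."""
    img = [row[:] for row in img]
    for _ in range(maxiter):
        changed = False
        # Patches are discovered lazily, so a fill made here is visible to
        # patches found later in the same pass.
        for pixels, border in find_patches(img, nodata):
            if len(pixels) > minpatchsize or not border:
                continue
            if unique_neighbor and len(set(border)) > 1:
                continue
            fill = Counter(border).most_common(1)[0][0]  # mode of neighbor labels
            for r, c in pixels:
                img[r][c] = fill
            changed = True
        if not changed:
            break
    return img

labeled = [
    [1, 1, 1, 1],
    [1, 2, 1, 1],
    [1, 1, 3, 3],
    [1, 1, 3, 3],
]
simplified = simplify(labeled, minpatchsize=1)
```

Repeating the pass until nothing changes reflects the iterative approach: filling one patch alters the sizes and neighbors of the patches around it.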

ecopop.ep_utils.simplify_epus(Iepu, Iclasses, target_epu_size, min_epu_size, nodata): Given an image where pixel values correspond to the epu to which the pixel belongs, this attempts to merge smaller epus with neighboring ones of the same class such that no epu’s area is smaller than min_epu_size.

ecopop.ep_utils.polygonize_epu(I, geotransform, proj_wkt, Imask=None)

Polygonizes epus using an in-memory process (no need to write a geotiff to disk). The resulting GeoDataFrame has an ‘epu_id’ column that represents the value of the pixels comprising each polygon.

Parameters

I (np.array): Raster to polygonize.
geotransform (tuple): 6-element GDAL GeoTransform.
proj_wkt (str): Well-known-text representation of the CRS.
Imask (np.array, optional): Binary array where 1s are valid. Must be same shape as I.

Returns

gdf (geopandas.GeoDataFrame): Polygons of the rasterized image I.

ecopop.ep_utils.epu_stats(do_stats, poly_gdf): Computes all desired stats for each epu.
which_stats: dictionary whose keys correspond to those in paths and whose values are the desired stats for each variable (look at rasterstats for stat choices, but they’re pretty intuitive).
paths: dictionary containing paths to the various rasters to be analyzed.

ecopop.ep_utils.get_stats(rastpath, poly_path, nodata=-999, stats='mean', prefix='')

Given the path to the rasterized epu polygons and a path to a raster for which we want to compute statistics, this computes the stats in ‘stats’ and returns a DataFrame containing all the stats for each epu.

Note that epu polygons must be in the same coordinate reference system as the provided raster. In the case of epus, the rasters are all in EPSG:4326 and the epus are derived from these rasters, so they are also in EPSG:4326.
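The per-epu statistic can be illustrated on a grid where the epu ids are already rasterized onto the same pixels as the value raster. This is a simplification of the polygon-based computation; the function name, the treatment of epu id 0 as background, and the way prefix is applied to the output key are assumptions.

```python
from collections import defaultdict

def zonal_mean(values, epu_ids, nodata=-999, prefix=""):
    """Mean of values within each epu, skipping nodata pixels and epu id 0."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value_row, id_row in zip(values, epu_ids):
        for v, epu in zip(value_row, id_row):
            if epu == 0 or v == nodata:
                continue
            sums[epu] += v
            counts[epu] += 1
    return {epu: {f"{prefix}mean": sums[epu] / counts[epu]} for epu in sorted(sums)}

values = [[1.0, 3.0, -999], [10.0, 20.0, 30.0]]
epu_ids = [[1, 1, 1], [2, 2, 0]]
stats_by_epu = zonal_mean(values, epu_ids, prefix="pop_")
```

Both grids must cover the same pixels, which is the raster analogue of the shared-CRS requirement noted above.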

ecopop.ep_utils.get_nodata_value(tifpath): Reads a geotiff’s metadata to return the nodata value. This is converted to an int if the value is whole.

ecopop.ep_utils.areagrid(georaster_path): The georaster must be provided in the EPSG:4326 CRS.

ecopop.ep_utils.build_vrt(tilespath, clipper=None, extents=None, outputfile=None, nodataval=None, res=None, sampling='nearest', ftype='tif')

Creates a text file for input to gdalbuildvrt, then builds vrt file with same name. If output path is not specified, vrt is given the name of the final folder in the path.

INPUTS:
tilespath - str: the path to the file (or folder of files) to be clipped. If tilespath contains an extension (e.g. .tif, .vrt), then that file is used. Otherwise, a virtual raster will be built of all the files in the provided folder.
extents - list: (optional) the extents by which to crop the vrt. Extents should be a 4-element list: [left, right, top, bottom] in the same projection coordinates as the file(s) to be clipped.
clipper - str: path to a georeferenced image, vrt, or shapefile that will be used to clip.
outputfile - str: path (including filename w/ext) to output the vrt. If none is provided, the vrt will be saved in the ‘tilespath’ path.
res - flt: resolution of the output vrt (applied to both x and y directions).
sampling - str: resampling scheme (nearest, bilinear, cubic, cubicspline, lanczos, average, mode).
nodataval - int: (optional) value to be masked as nodata.
ftype - str: ‘tif’ if building from a list of tiffs, or ‘vrt’ if building from a vrt.

OUTPUTS:
vrtname - str: path + filename of the built virtual raster.
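As a rough sketch of the text-file step, the snippet below writes a file list and assembles a gdalbuildvrt argument list without running it. The function vrt_command and its list_path argument are hypothetical; the flags used (-input_file_list, -te, -tr, -r, -srcnodata, -vrtnodata) are standard gdalbuildvrt options. Applying nodataval to both -srcnodata and -vrtnodata is an assumption.

```python
import os
import tempfile

def vrt_command(tif_paths, outputfile, extents=None, res=None,
                sampling="nearest", nodataval=None, list_path=None):
    """Write a file list and return the gdalbuildvrt argument list."""
    if list_path is None:
        list_path = os.path.splitext(outputfile)[0] + ".txt"
    with open(list_path, "w") as f:
        f.write("\n".join(tif_paths) + "\n")
    cmd = ["gdalbuildvrt", "-r", sampling]
    if extents is not None:
        left, right, top, bottom = extents
        # gdalbuildvrt expects -te xmin ymin xmax ymax, so reorder the extents.
        cmd += ["-te", str(left), str(bottom), str(right), str(top)]
    if res is not None:
        cmd += ["-tr", str(res), str(res)]
    if nodataval is not None:
        cmd += ["-srcnodata", str(nodataval), "-vrtnodata", str(nodataval)]
    cmd += ["-input_file_list", list_path, outputfile]
    return cmd

workdir = tempfile.mkdtemp()
vrt_path = os.path.join(workdir, "tiles.vrt")
cmd = vrt_command(["a.tif", "b.tif"], vrt_path, extents=[-100, -90, 40, 30], res=0.01)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run on a machine with GDAL installed.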

ecopop.ep_utils.get_raster_clipping_coords(bounds, gdobj)

Given a gdobj pointing to a raster and a list-like bounds (minx, miny, maxx, maxy), returns the row and col of the upper-leftmost pixel and the number of rows and columns to fetch. Also returns the GeoTransform of the clipped raster.

Bounds will be clipped to the extents of the raster if they’re beyond its limits.
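The arithmetic behind such a window can be sketched as follows. To keep the example free of GDAL, it takes the GeoTransform and raster shape directly rather than a gdobj. It assumes a north-up raster with no rotation, and the name clipping_window is hypothetical.

```python
import math

def clipping_window(bounds, geotransform, n_rows, n_cols):
    """Return (row, col, n_rows_out, n_cols_out, new_geotransform)."""
    x0, dx, _, y0, _, dy = geotransform  # dy is negative for north-up rasters
    minx, miny, maxx, maxy = bounds
    # Clip the requested bounds to the raster's extent.
    minx, maxx = max(minx, x0), min(maxx, x0 + n_cols * dx)
    maxy, miny = min(maxy, y0), max(miny, y0 + n_rows * dy)
    # Upper-left pixel of the window, then one past its lower-right pixel.
    col = int(math.floor((minx - x0) / dx))
    row = int(math.floor((maxy - y0) / dy))
    col_end = int(math.ceil((maxx - x0) / dx))
    row_end = int(math.ceil((miny - y0) / dy))
    new_gt = (x0 + col * dx, dx, 0.0, y0 + row * dy, 0.0, dy)
    return row, col, row_end - row, col_end - col, new_gt

# A 20-row by 40-column raster covering x in [10, 30] and y in [40, 50].
gt = (10.0, 0.5, 0.0, 50.0, 0.0, -0.5)
window = clipping_window((12.0, 44.0, 100.0, 48.0), gt, n_rows=20, n_cols=40)
```

Here the requested maxx of 100 lies beyond the raster, so it is clipped to 30 before the column count is computed.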

ecopop.ep_utils.parse_path(path): Parses a file or folder path into: base, folder (where folder is the outermost subdirectory), filename, and extension. Filename and extension are empty if a directory is passed.
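One possible reading of this split, sketched with the standard library. It assumes that "folder" means the last directory in the path, that a path is a file only if it has an extension, and that the extension is returned without its dot; split_path is a hypothetical name, and the real function may differ on each point.

```python
import posixpath

def split_path(path):
    """Split a POSIX-style path into (base, folder, filename, extension)."""
    path = path.rstrip("/")
    stem, ext = posixpath.splitext(posixpath.basename(path))
    if ext:  # treat anything with an extension as a file
        directory, filename = posixpath.dirname(path), stem
    else:
        directory, filename = path, ""
    base, folder = posixpath.split(directory)
    return base, folder, filename, ext.lstrip(".")

file_parts = split_path("/data/tiles/dem.tif")
dir_parts = split_path("/data/tiles/")
```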

ecopop.ep_utils.overlay_watersheds(epus, basins, check_coverage=False)

Overlays epus on a GeoDataFrame of watersheds/basins and returns a dataframe grouped by watersheds that contains the epus and respective areas for each within each watershed.

Parameters

epus (geopandas.GeoDataFrame): Computed by the epu class.
basins (geopandas.GeoDataFrame): At a minimum, needs two columns: the watershed geometries and an id column called ‘id_gage’.

Returns

regrouped (pandas.DataFrame): Contains three columns: id_gage, epu_id (array), area_km2 (array). The ordering of the epu_id and area_km2 arrays correspond.

ecopop.ep_utils.segment_binary_im(all_coords, imshape, target_n_pix, initial_label=1)

Takes a binary image of imshape, with “on” pixel coordinates defined by all_coords, and attempts to divide the binary image into regions of size target_n_pix, giving each region a unique label starting with initial_label.

Uses a breadth-first traversal algorithm to “grow” from initial points. An initial point is the pixel that is farthest from the “centroid” of all pixels. This is not the true centroid, but simply the mean of all (row, column) coordinates.

Parameters

all_coords (set of tuples): One entry per “on” pixel of the binary image.
imshape (tuple OR list-like): (number of rows, number of columns).
target_n_pix (integer): Desired size of regions to divide the binary image into. This algorithm does not guarantee these sizes exactly.
initial_label (integer, optional): The value to start with to apply labels to regions. The default is 1.

Returns

Iparent (np.array): Array of imshape size where each pixel value is the region it belongs to. “Background” pixels (i.e. those that are “off” in the initial binary image) are labeled 0.
label_id (int): The highest label assigned to a region in Iparent (i.e. Iparent.flatten().max()).
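The breadth-first growing step can be sketched in plain Python as below. This is a sketch under assumptions, not the package's algorithm: it returns a {(row, col): label} dictionary instead of an image array, uses 4-connectivity, and recomputes the mean coordinate from the pixels still unlabeled before choosing each new seed. The name segment_pixels is hypothetical.

```python
from collections import deque

def segment_pixels(all_coords, target_n_pix, initial_label=1):
    """Return ({(row, col): label}, highest_label) for a set of 'on' pixels."""
    remaining = set(all_coords)
    labels = {}
    label = initial_label - 1
    while remaining:
        label += 1
        # Seed: the unlabeled pixel farthest from the mean (row, col).
        mean_r = sum(r for r, _ in remaining) / len(remaining)
        mean_c = sum(c for _, c in remaining) / len(remaining)
        seed = max(sorted(remaining),
                   key=lambda p: (p[0] - mean_r) ** 2 + (p[1] - mean_c) ** 2)
        queue = deque([seed])
        remaining.discard(seed)
        size = 0
        # Grow breadth-first until the region reaches the target size.
        while queue and size < target_n_pix:
            r, c = queue.popleft()
            labels[(r, c)] = label
            size += 1
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.discard(nb)
                    queue.append(nb)
        # Pixels queued but not labeled go back into the pool.
        remaining.update(queue)
    return labels, label

coords = {(r, c) for r in range(2) for c in range(4)}
labels, highest = segment_pixels(coords, target_n_pix=4)
```

Regions can come out smaller than target_n_pix when the remaining pixels are disconnected, consistent with the note that exact sizes are not guaranteed.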

ecopop.ep_utils.create_epus_from_classes(Iclasses, target_n_pix)

Given an initial image of EP classes (Iclasses), this will divide those classes into epus of approximately target_n_pix areas.

Parameters

Iclasses (numpy.array): Image of epu class labels for each pixel in the domain.
target_n_pix (integer): Target size for each epu.

Returns

Iregions (numpy.array): Same shape as Iclasses; each epu is uniquely labeled.