natcap.invest.utils
InVEST specific code utils.
-
class natcap.invest.utils.ThreadFilter(thread_name)
Bases: logging.Filter
Filters out log messages issued by the given thread.
Any log messages generated by a thread whose name matches the thread_name provided to the constructor will be excluded.
-
filter(record)
Filter the given log record.
- Parameters
record (log record) – The log record to filter.
- Returns
True if the record should be included, false if not.
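The filtering rule described above can be sketched with the standard library alone. This is a minimal illustration, not the InVEST implementation: the class name ThreadNameFilter and the helper make_record are hypothetical, and the sketch assumes the filter compares against the threadName attribute that logging attaches to every record.

```python
import logging


class ThreadNameFilter(logging.Filter):
    """Hypothetical stand-in that excludes records from one named thread."""

    def __init__(self, thread_name):
        super().__init__()
        self.thread_name = thread_name

    def filter(self, record):
        # LogRecord.threadName holds the name of the thread that emitted
        # the record; returning False drops the record.
        return record.threadName != self.thread_name


def make_record(thread_name):
    """Build a log record that claims to come from the given thread."""
    record = logging.LogRecord(
        'demo', logging.INFO, 'demo.py', 1, 'hello', None, None)
    record.threadName = thread_name
    return record


thread_filter = ThreadNameFilter('worker-1')
kept = thread_filter.filter(make_record('MainThread'))
dropped = thread_filter.filter(make_record('worker-1'))
```

A filter like this would typically be attached to a handler with handler.addFilter(...), so that only that handler ignores the named thread.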
-
-
natcap.invest.utils.array_equals_nodata(array, nodata)
Check for the presence of nodata values in array.
The comparison supports numpy.nan nodata values.
- Parameters
array (numpy array) – the array to mask for nodata values.
nodata (number) – the nodata value to check for. Supports numpy.nan.
- Returns
A boolean numpy array with values of 1 where array is equal to nodata and 0 otherwise.
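NaN support matters because NaN never compares equal to itself, so a plain equality test would miss every NaN nodata pixel. The following plain-Python analogue shows the idea on a list rather than a numpy array; the function name equals_nodata is hypothetical and this is not how the library itself is written.

```python
import math


def equals_nodata(values, nodata):
    """Return a list of booleans marking entries equal to nodata."""
    # NaN != NaN, so a NaN nodata value needs an explicit isnan() check
    # instead of the == operator.
    nodata_is_nan = math.isnan(nodata)
    return [
        math.isnan(value) if nodata_is_nan else value == nodata
        for value in values
    ]


pixels = [1.0, float('nan'), -9999.0]
nan_mask = equals_nodata(pixels, float('nan'))
sentinel_mask = equals_nodata(pixels, -9999.0)
```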
-
natcap.invest.utils.build_file_registry(base_file_path_list, file_suffix)
Combine file suffixes with key names, base filenames, and directories.
- Parameters
base_file_path_list (list) – a list of (dict, path) tuples where the dictionaries have a ‘file_key’: ‘basefilename’ pair, or ‘file_key’: list of ‘basefilename’s. ‘path’ indicates the file directory path to prepend to the basefile name.
file_suffix (string) – a string to append to every filename; can be an empty string.
- Returns
dictionary of ‘file_keys’ from the dictionaries in base_file_path_list mapping to full file paths with suffixes or lists of file paths with suffixes depending on the original type of the ‘basefilename’ pair.
- Raises
ValueError – if there are duplicate file keys or duplicate file paths.
ValueError – if a path is not a string or a list of strings.
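The registry logic described above can be sketched as follows. This is an illustration under stated assumptions, not the library source: the name build_registry_sketch is hypothetical, and placing the suffix between the file stem and its extension is an assumption, since the text only says the suffix is appended to every filename.

```python
import os


def build_registry_sketch(base_file_path_list, file_suffix):
    """Map file keys to suffixed paths, rejecting duplicates."""
    registry = {}
    seen_paths = set()

    def full_path(directory, base_filename):
        if not isinstance(base_filename, str):
            raise ValueError(
                f'Expected a string filename, got {base_filename!r}')
        # Assumption: the suffix is inserted before the file extension.
        stem, extension = os.path.splitext(base_filename)
        path = os.path.join(directory, stem + file_suffix + extension)
        if path in seen_paths:
            raise ValueError(f'Duplicate file path: {path}')
        seen_paths.add(path)
        return path

    for file_dict, directory in base_file_path_list:
        for file_key, payload in file_dict.items():
            if file_key in registry:
                raise ValueError(f'Duplicate file key: {file_key}')
            if isinstance(payload, list):
                # A list of base filenames maps to a list of full paths.
                registry[file_key] = [
                    full_path(directory, name) for name in payload]
            else:
                registry[file_key] = full_path(directory, payload)
    return registry


registry = build_registry_sketch(
    [({'dem': 'dem.tif', 'tiles': ['a.tif', 'b.tif']}, 'out')], '_v1')
```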
-
natcap.invest.utils.build_lookup_from_csv(table_path, key_field, column_list=None, to_lower=True)
Read a CSV table into a dictionary indexed by key_field.
Creates a dictionary from a CSV whose keys are the unique entries in the column named by key_field and whose values are dictionaries, indexed by the columns in table_path (including key_field), that hold the values on that row of the CSV table.
If an entire row is NA/NaN (including key_field), it is dropped from the table and a warning reports the dropped rows.
- Parameters
table_path (string) – path to a CSV file containing at least the header key_field
key_field (string) – a column in the CSV file at table_path that can uniquely identify each row in the table and sets the row index.
column_list (list) – a list of column names to subset from the CSV file, default=None
to_lower (bool) – if True, converts all unicode in the CSV, including headers and values to lowercase, otherwise uses raw string values. default=True.
- Returns
a dictionary of the form {key_field_0: {csv_header_0: value0, csv_header_1: value1…}, key_field_1: {csv_header_0: valuea, csv_header_1: valueb…}}
If to_lower is True, all strings, including key_fields and values, are converted to lowercase unicode.
- Return type
lookup_dict (dict)
- Raises
ValueError – If ValueError occurs during conversion to dictionary.
KeyError – If key_field is not present during the set_index call.
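The shape of the returned dictionary is easier to see with a small example. The sketch below uses only the standard csv module and reads from a string instead of a path; the name lookup_from_csv_sketch is hypothetical, column_list is omitted, and values stay as strings here, whereas the real function's type handling is not described above and is not reproduced.

```python
import csv
import io


def lookup_from_csv_sketch(csv_text, key_field, to_lower=True):
    """Index CSV rows by key_field, skipping rows that are entirely empty."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if to_lower:
            row = {name.lower(): value.lower() for name, value in row.items()}
        # Rough analogue of dropping rows where every field is NA.
        if all(value == '' for value in row.values()):
            continue
        # A missing key_field raises KeyError here.
        lookup[row[key_field]] = row
    return lookup


table_text = 'LUCODE,Name\n1,Forest\n,\n2,Urban\n'
lookup = lookup_from_csv_sketch(table_text, 'lucode')
```

Each inner dictionary still contains the key column itself, which matches the "including key_field" wording in the description.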
-
natcap.invest.utils.capture_gdal_logging()
Context manager for logging GDAL errors with python logging.
GDAL error messages are logged via python’s logging system, at a severity that corresponds to a log level in logging. Error messages are logged with the osgeo.gdal logger.
- Parameters
None
- Returns
None
-
natcap.invest.utils.create_coordinate_transformer(base_ref, target_ref, osr_axis_mapping_strategy=0)
Create a spatial reference coordinate transformation function.
- Parameters
base_ref (osr spatial reference) – A defined spatial reference to transform FROM
target_ref (osr spatial reference) – A defined spatial reference to transform TO
osr_axis_mapping_strategy (int) – OSR axis mapping strategy for SpatialReference objects. Defaults to utils.DEFAULT_OSR_AXIS_MAPPING_STRATEGY. This parameter should not be changed unless you know what you are doing.
- Returns
An OSR Coordinate Transformation object
-
natcap.invest.utils.exponential_decay_kernel_raster(expected_distance, kernel_filepath)
Create a raster-based exponential decay kernel.
The raster created will be a tiled GeoTiff, with 256x256 memory blocks.
- Parameters
expected_distance (int or float) – The distance (in pixels) of the kernel’s radius, the distance at which the value of the decay function is equal to 1/e.
kernel_filepath (string) – The path to the file on disk where this kernel should be stored. If this file exists, it will be overwritten.
- Returns
None
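The role of expected_distance can be checked numerically. The sketch below assumes the decay function has the form exp(-d / expected_distance), which is the simplest form consistent with the statement that the value is 1/e at d = expected_distance. It builds a small in-memory grid only: no raster is written, and any truncation or normalization the real function may apply is not modelled. The names decay_value and decay_kernel are hypothetical.

```python
import math


def decay_value(distance, expected_distance):
    """Assumed decay form: equals 1/e when distance == expected_distance."""
    return math.exp(-distance / expected_distance)


def decay_kernel(expected_distance, radius):
    """Build a (2*radius + 1)-square grid of decay values around the center."""
    size = 2 * radius + 1
    return [
        [
            decay_value(
                math.hypot(row - radius, col - radius), expected_distance)
            for col in range(size)
        ]
        for row in range(size)
    ]


kernel = decay_kernel(expected_distance=2, radius=2)
center_value = kernel[2][2]   # distance 0 from the center
edge_value = kernel[2][4]     # distance 2 pixels from the center
```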
-
natcap.invest.utils.has_utf8_bom(textfile_path)
Determine if the text file has a UTF-8 byte-order marker.
- Parameters
textfile_path (str) – The path to a file on disk.
- Returns
A bool indicating whether the textfile has a BOM. If True, a BOM is present.
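A BOM check of this kind amounts to comparing the first bytes of the file against codecs.BOM_UTF8. The sketch below is one way to write it with the standard library; file_has_utf8_bom is a hypothetical name and the temporary files exist only for the demonstration.

```python
import codecs
import os
import tempfile


def file_has_utf8_bom(textfile_path):
    """Return True if the file starts with the UTF-8 byte-order marker."""
    with open(textfile_path, 'rb') as textfile:
        first_bytes = textfile.read(len(codecs.BOM_UTF8))
    return first_bytes == codecs.BOM_UTF8


with tempfile.TemporaryDirectory() as workspace:
    bom_path = os.path.join(workspace, 'with_bom.csv')
    plain_path = os.path.join(workspace, 'plain.csv')
    # The 'utf-8-sig' codec writes a BOM at the start of the file.
    with open(bom_path, 'w', encoding='utf-8-sig') as table:
        table.write('lucode,name\n')
    with open(plain_path, 'w', encoding='utf-8') as table:
        table.write('lucode,name\n')
    with_bom = file_has_utf8_bom(bom_path)
    without_bom = file_has_utf8_bom(plain_path)
```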
-
natcap.invest.utils.log_to_file(logfile, exclude_threads=None, logging_level=0, log_fmt='%(asctime)s (%(name)s) %(module)s.%(funcName)s(%(lineno)d) %(levelname)s %(message)s', date_fmt=None)
Log all messages within this context to a file.
- Parameters
logfile (string) – The path to where the logfile will be written. If there is already a file at this location, it will be overwritten.
exclude_threads=None (list) – If None, logging from all threads will be included in the log. If a list, it must be a list of string thread names that should be excluded from logging in this file.
logging_level=logging.NOTSET (int) – The logging threshold. Log messages with a level less than this will be automatically excluded from the logfile. The default value (logging.NOTSET) will cause all logging to be captured.
log_fmt=LOG_FMT (string) – The logging format string to use. If not provided, utils.LOG_FMT will be used.
date_fmt (string) – The logging date format string to use. If not provided, ISO8601 format will be used.
- Yields
handler – An instance of logging.FileHandler that represents the file that is being written to.
- Returns
None
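The context-manager pattern described here can be sketched with logging.FileHandler. This is a simplified illustration, not the library code: log_to_file_sketch is a hypothetical name, thread exclusion and the date format are left out, the format string is shortened, and attaching the handler to the root logger is an assumption.

```python
import contextlib
import logging
import os
import tempfile


@contextlib.contextmanager
def log_to_file_sketch(logfile, logging_level=logging.NOTSET):
    """Attach a file handler for the duration of the context."""
    # mode='w' overwrites any existing file at this location.
    handler = logging.FileHandler(logfile, mode='w', encoding='utf-8')
    handler.setLevel(logging_level)
    handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))
    root_logger = logging.getLogger()
    root_logger.addHandler(handler)
    try:
        yield handler
    finally:
        # Detach and close so later messages no longer reach the file.
        root_logger.removeHandler(handler)
        handler.close()


with tempfile.TemporaryDirectory() as workspace:
    logfile_path = os.path.join(workspace, 'run.log')
    demo_logger = logging.getLogger('demo')
    demo_logger.setLevel(logging.INFO)
    with log_to_file_sketch(logfile_path):
        demo_logger.info('inside the context')
    demo_logger.info('outside the context')
    with open(logfile_path, encoding='utf-8') as logfile:
        captured = logfile.read()
```

Only the message logged inside the with block ends up in the file, because the handler is removed when the context exits.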
-
natcap.invest.utils.make_directories(directory_list)
Create directories in directory_list if they do not already exist.
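One way to get this "create if missing" behavior with the standard library is os.makedirs with exist_ok=True. The sketch below is illustrative only; make_directories_sketch is a hypothetical name and the real function may differ, for example in how it validates its argument.

```python
import os
import tempfile


def make_directories_sketch(directory_list):
    """Create each directory, leaving existing ones untouched."""
    for directory in directory_list:
        # exist_ok=True makes repeated calls harmless.
        os.makedirs(directory, exist_ok=True)


with tempfile.TemporaryDirectory() as workspace:
    targets = [
        os.path.join(workspace, 'intermediate'),
        os.path.join(workspace, 'output', 'rasters'),
    ]
    make_directories_sketch(targets)
    make_directories_sketch(targets)  # second call is a no-op
    all_created = all(os.path.isdir(path) for path in targets)
```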
-
natcap.invest.utils.make_suffix_string(args, suffix_key)
Make an InVEST appropriate suffix string.
Creates an InVEST appropriate suffix string given the args dictionary and suffix key. In general, prepends an ‘_’ when necessary and generates an empty string when necessary.
- Parameters
args (dict) – the classic InVEST model parameter dictionary that is passed to execute.
suffix_key (string) – the key used to index the base suffix.
- Returns
- If suffix_key is not in args, or args[‘suffix_key’] is “”, return “”.
- If args[‘suffix_key’] starts with ‘_’, return args[‘suffix_key’].
- Otherwise, return ‘_’ + args[‘suffix_key’].
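The three return rules above translate directly into code. The following sketch follows those rules as written; make_suffix_sketch and the key name results_suffix are hypothetical, and inputs other than strings are not handled.

```python
def make_suffix_sketch(args, suffix_key):
    """Return '' or a suffix that starts with exactly one added underscore."""
    if suffix_key not in args or args[suffix_key] == '':
        return ''
    suffix = args[suffix_key]
    if suffix.startswith('_'):
        return suffix
    return '_' + suffix


no_key = make_suffix_sketch({}, 'results_suffix')
empty = make_suffix_sketch({'results_suffix': ''}, 'results_suffix')
already_prefixed = make_suffix_sketch(
    {'results_suffix': '_v2'}, 'results_suffix')
needs_prefix = make_suffix_sketch({'results_suffix': 'v2'}, 'results_suffix')
```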
-
natcap.invest.utils.matches_format_string(test_string, format_string)
Assert that a given string matches a given format string.
This means that the given test string could be derived from the given format string by replacing replacement fields with any text. For example, the string ‘Value “foo” is invalid.’ matches the format string ‘Value “{value}” is invalid.’
- Parameters
test_string (str) – string to test.
format_string (str) – format string, which may contain curly-brace delimited replacement fields
- Returns
True if test_string matches format_string, False if not.
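One way to implement such a check is to turn the format string into a regular expression in which every replacement field matches any text. The sketch below takes that approach with string.Formatter and re; it is not necessarily how the library does it, and matches_format_sketch is a hypothetical name.

```python
import re
import string


def matches_format_sketch(test_string, format_string):
    """Return True if test_string could be produced from format_string."""
    pattern_parts = []
    parsed = string.Formatter().parse(format_string)
    for literal_text, field_name, _format_spec, _conversion in parsed:
        # Literal text must match exactly, so escape regex metacharacters.
        pattern_parts.append(re.escape(literal_text))
        if field_name is not None:
            # A replacement field may expand to any text, including none.
            pattern_parts.append('.*')
    pattern = ''.join(pattern_parts)
    return re.fullmatch(pattern, test_string, flags=re.DOTALL) is not None


template = 'Value "{value}" is invalid.'
good = matches_format_sketch('Value "foo" is invalid.', template)
bad = matches_format_sketch('Value foo is invalid.', template)
```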
-
natcap.invest.utils.mean_pixel_size_and_area(pixel_size_tuple)
Convert to mean and raise Exception if they are not close.
- Parameters
pixel_size_tuple (tuple) – a 2-tuple indicating the x/y size of a pixel.
- Returns
tuple of (mean absolute average of pixel_size, area of pixel size)
- Raises
ValueError – if the dimensions of pixel_size_tuple are not almost square.
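The computation can be sketched as follows. The tolerance used to decide "almost square" is not stated above, so the rel_tol value here is an assumption, as is the name mean_pixel_size_and_area_sketch.

```python
import math


def mean_pixel_size_and_area_sketch(pixel_size_tuple, rel_tol=1e-5):
    """Return (mean absolute pixel size, pixel area) for near-square pixels."""
    # Pixel heights are often negative in geotransforms, so use magnitudes.
    x_size, y_size = (abs(size) for size in pixel_size_tuple)
    # Assumption: "almost square" means equal within a relative tolerance.
    if not math.isclose(x_size, y_size, rel_tol=rel_tol):
        raise ValueError(
            f'Pixel dimensions are not almost square: {pixel_size_tuple}')
    return ((x_size + y_size) / 2, x_size * y_size)


mean_size, area = mean_pixel_size_and_area_sketch((30.0, -30.0))
```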
-
natcap.invest.utils.prepare_workspace(workspace, name, logging_level=0, exclude_threads=None)
Prepare the workspace.
-
natcap.invest.utils.read_csv_to_dataframe(path, to_lower=False, sep=None, encoding=None, engine='python', **kwargs)
Return a dataframe representation of the CSV.
Wrapper around pandas.read_csv that standardizes the column names by stripping leading/trailing whitespace and optionally making all lowercase. This helps avoid common errors caused by user-supplied CSV files with column names that don’t exactly match the specification.
- Parameters
path (string) – path to a CSV file
to_lower (bool) – if True, convert all column names to lowercase
sep – separator to pass to pandas.read_csv. Defaults to None, which lets the Python engine infer the separator (if engine=’python’).
encoding (string) – name of encoding codec to pass to pandas.read_csv. Defaults to None. Setting engine=’python’ when encoding=None allows a lot of non-UTF8 encodings to be read without raising an error. Any special characters in other encodings may get replaced with the replacement character. If encoding=None, and the file begins with a BOM, the encoding gets set to ‘utf-8-sig’; otherwise the BOM causes an error.
engine (string) – kwarg for pandas.read_csv: ‘c’, ‘python’, or None. Defaults to ‘python’ (see note about encoding).
**kwargs – any kwargs that are valid for pandas.read_csv
- Returns
pandas.DataFrame with the contents of the given CSV
-
natcap.invest.utils.reclassify_raster(raster_path_band, value_map, target_raster_path, target_datatype, target_nodata, error_details)
A wrapper function for calling pygeoprocessing.reclassify_raster.
This wrapper function is helpful when added as a TaskGraph.task so a better error message can be provided to the users if a pygeoprocessing.ReclassificationMissingValuesError is raised.
- Parameters
raster_path_band (tuple) – a tuple including file path to a raster and the band index to operate over. ex: (path, band_index)
value_map (dictionary) – a dictionary of values of {source_value: dest_value, …} where source_value’s type is the same as the values in base_raster_path at band band_index. Must contain at least one value.
target_raster_path (string) – target raster output path; overwritten if it exists
target_datatype (gdal type) – the numerical type for the target raster
target_nodata (numerical type) – the nodata value for the target raster. Must be the same type as target_datatype.
error_details (dict) – a dictionary with key-value pairs that provide more context for a raised pygeoprocessing.ReclassificationMissingValuesError. Keys must be {‘raster_name’, ‘column_name’, ‘table_name’}. The value for each key represents:
- ‘raster_name’ – string for the raster name being reclassified
- ‘column_name’ – name of the table column that the value_map dictionary keys came from
- ‘table_name’ – name of the table that value_map came from
- Returns
None
- Raises
ValueError – if values_required is True and a pixel value from raster_path_band is not a key in value_map.
-
natcap.invest.utils.sandbox_tempdir(suffix='', prefix='tmp', dir=None)
Create a temporary directory for this context and clean it up on exit.
Parameters are identical to those for tempfile.mkdtemp().
When the context manager exits, the created temporary directory is recursively removed.
- Parameters
suffix='' (string) – a suffix for the name of the directory.
prefix='tmp' (string) – the prefix to use for the directory name.
dir=None (string or None) – If a string, a directory that should be the parent directory of the new temporary directory. If None, tempfile will determine the appropriate tempdir to use as the parent folder.
- Yields
sandbox (string) – The path to the new folder on disk.
- Returns
None
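The create-then-clean-up behavior described above maps onto tempfile.mkdtemp() and shutil.rmtree(). The sketch below is a minimal version of that pattern, not the library source; sandbox_tempdir_sketch is a hypothetical name, and ignoring removal errors is an assumption.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def sandbox_tempdir_sketch(suffix='', prefix='tmp', dir=None):
    """Yield a new temporary directory and remove it when the context exits."""
    sandbox = tempfile.mkdtemp(suffix=suffix, prefix=prefix, dir=dir)
    try:
        yield sandbox
    finally:
        # Recursively remove the directory, even if the body raised.
        # Assumption: removal errors are ignored rather than re-raised.
        shutil.rmtree(sandbox, ignore_errors=True)


with sandbox_tempdir_sketch(prefix='demo_') as sandbox_path:
    existed_inside = os.path.isdir(sandbox_path)
    with open(os.path.join(sandbox_path, 'scratch.txt'), 'w') as scratch:
        scratch.write('temporary')
exists_after = os.path.exists(sandbox_path)
```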