landmapy

Land Mapping Python Package

The landmapy package is being built as a complement to the 2024-25 Earth Data Analytics course taught through the Earth Lab. Special thanks to Elsa Culler as well as Nate Quarderer, Lilly Jones-Sanovia, and Alison Post.

Interestingly, Earth Lab members developed a Python package a few years ago, earthpy (GitHub repo). It seems fairly self-contained, but may have some dated features. For instance, is uses rasterio, which seems to now be superceded by rioxarray. Still there are some interesting and subtle ideas here that are worth exploring.

This is somewhat a companion to my R package landmapr. They are being developed in parallel, with somewhat different goals. Right now, focus is on the python package to keep up with the Earth Data Analytics course.

Install and Import

From within python, you can install this package directly from GitHub:

pip install git+https://github.com/byandell-envsys/landmapy.git

Then you would use import landmapy, or more likely from landmapy.<module> import <function> to import desired functions.

I for now use my local machine cloned copy of the package in ~/Documents/GitHub/landmapy and the python command

pip install ~/Documents/GitHub/landmapy

Collaboration

I am happy to collaborate on development of this package. Please contact me and/or create issues. If you want to become more involved, contact me, fork the repo, modify (in a tame way, please) and submit pull requests.

Use Cases

This python package was begun in nov-dec 2024 as I found the project tools growing. I got some initial advice from EDA staff, then learned by doing and looking at other tools. To date, this package has been used in the following projects:

These are all craft pieces, with increasing use of functions. More recent projects shifted from a Jupyter notebook (project.ipynb) to a Quarto document project.qmd that is rendered as Markdown file project.md with accompanying *.png figures in project_files/figure-markdown/ using the shell command

$ quarto render project.qmd -t markdown

With care (see Plot Data section of Package Modules and Functions below), the resulting markdown project.md and *.png figures are compact (Kb, not Mb) and can be pushed to GitHub for ready viewing and sharing. Note that I set up the .gitignore file to ignore *_files/ folders; commenting this line out briefly to enable commiting the png files (followed by uncommenting *_files/) is a handy way to incorporate figures into the project.md once committed and pushed to GitHub.

Example use with Habitat Project

In a sense, this package enables me to off-load pages of code, replacing them by one-line commands. These basically look like pseudocode, but are actually functional. For instance, for the Habitat Suitability project last December (and now being revisited), here is the beginning.

First I visited USFS Geospatial Data Discovery: National Grassland Units (Feature Layer) and manually downloaded the GeoJSON file from DataSet into directory ~/earth-analytics/data/habitat. Then I did the following steps, shown below in code:

# Install `landmapy` package.
pip install --quiet git+https://github.com/byandell-envsys/landmapy.git

# Import needed libraries.
import geopandas as gpd # read geojson file into gdf
from landmapy.initial import create_data_dir # create (or retrieve) data directory
from landmapy.plot import plot_gdf_state # plot gdf with state overlay

data_dir = create_data_dir('habitat')
# Read all grasslands GeoJSON into `grassland_gdf`.
grassland_url = f"{data_dir}/National_Grassland_Units_(Feature_Layer).geojson"
grassland_gdf = gpd.read_file(grassland_url)
# Subset to desired locations.
buffalo_gdf = grassland_gdf.loc[grassland_gdf['GRASSLANDNAME'].isin(
        ["Buffalo Gap National Grassland", "Oglala National Grassland"])]
plot_gdf_state(buffalo_gdf)

Goals

Goal of EDA project

Organize tools by topic (module) & function
Build Quarto & Markdown environs
Viz data patterns with ggplot (plotnine)
Explore stats to prioritize interesting patterns, not to test
Collaborate with others to improve & share
Develop Shiny modular interactive apps (see my examples in Shiny Apps)

Broader goal

Collaborate widely
Share via self-documented training examples
Viz data patterns to improve insight
Explore AI tool environment
Evolve data as a verb

Technical goal

Rationalize plots more
- Fewer routines that are more flexible
- plot, hvplot/gvplot analogs
- ggplot widgets to visualize relationships
- overlays, side-by-side, over time movies/sliders
Better grasp of moving between da, df, gdf, other
- Should lead to simpler plot options
Algebra on images
- visualize in lat/lon/elev space
- explore multiple measurements

Package Modules and Functions

Plot Data

Several HoloViews and GeoViews functions have arisen and are included. These are cool functions and easy to manipulate or render interactively, but they generate massive image objects--Mb vs Kb for matplotlib.pyplot similar image objects. In some cases, I have created simpler plot functions to generate simpler qmd and md pages. For instance

#| label: fig-resid
from landmapy.plot import plot_gdfs_map
plot_gdfs_map(logndvi_cdc_gdf, column=['asthma','resid','edge_density'], color=['Blues','RdBu','Greens'])

generates a small (168Kb) named figure, big-data_files/figure-markdown/fig-resid-output-1.png with optional accompanying figure caption (via a line #| fig-cap: "Blah Blah"). An alternative is the fancier GeoViews/HoloViews, which generates a larger (Mb) object that is embedded in the Markdown, making it too big to render on GitHub. Here is that code:

import holoviews as hv
from landmapy.gvplot import gvplot_ndvi_index, gvplot_resid

model_fit = gvplot_ndvi_index(ndvi_cdc_gdf)
resid = gvplot_resid(logndvi_cdc_gdf, reg, yvar='asthma')
models_gv = (model_fit[0] + resid + model_fit[1])
hv.save(models_gv, 'bigdata_model.html')

Below are current plot functions:

module	function	return	effect	project	description
ggplot	coming...
gvplot	gvplot_gdf	gvplot	plot	plot	Plot asthma data as chloropleth
gvplot	gvplot_chloropleth	gvplot	plot	plot	Generate a chloropleth with the given color column
gvplot	gvplot_ndvi_index	gvplot	plot	plot	Plot NDVI and CDC data
gvplot	gvplot_resid	gvplot	plot	plot	Plot model residual
hvplot	hvplot_cluster		hvplot	plot	Plot of RGB and Clusters
hvplot	hvplot_delta_gdf	hvplot	plot	plot	HV Plot Delta GDF
hvplot	hvplot_matrix	hvplot	plot	plot	Plot of model matrix
hvplot	hvplot_tract_gdf	hvplot	plot	plot	Plot census tracts with satellite imagery background
hvplot	hvplot_train_test	hvplot	plot	plot	Plot test fit
hvplot	hvplot_index_grade	hvplot	plot	plot	Plots for index and grade
hvplot	hvplot_index_pred	hvplot	plot	plot	Plot the model results
plot	plot_cluster		plot	plot	Plot of RGB and Clusters
plot	plot_delta_gdf	plot	plot	plot	HV Plot Delta GDF
plot	plot_gdf_da		plot	plot	Overlay gdf on da map
plot	plot_gdf_state		plot	plot	Plot overlay of gdf with state boundaries
plot	plot_gdfs_map		plot	plot	Create Row of Plots
plot	plot_index		plot	plot	Show plot of index
plot	plot_matrix		plot	plot	Plot of model matrix
plot	plot_train_test		plot	plot	Plot test fit

Access Data with APIs

module	function	return	effect	project	description
cdcplaces	download_cdc_disease	df	download	CDC Places	Download CDC Disease data
cdcplaces	download_census_tract	gdf	download	CDC Places	Download the census tracts
cdcplaces	join_tract_cdc	gdf	merge	CDC Places	Join Census Tract and CDC Disease Data
cdcplaces	shp_tract_path	str		CDC Places	Set tract path
polaris	soil_url_dict	dict	read	POLARIS	Set up soil URLs based on place
polaris	merge_soil	da	read	POLARIS	Merge soil data
redline	redline_gdf	gdf	read	redline	Read redlining GeoDataFrame from Mapping Inequality
redline	redline_mask	gdf		redline	Create new gdf for redlining using regionmask
redline	redline_index_gdf	gdf		redline	Merge index stats with redlining gdf into one gdf
reflect	compute_reflectance_da	function		reflect	Connect to files over VSI, crop, cloud mask, and wrangle
reflect	merge_and_composite_arrays	function		reflect	Merge and Composite Arrays
reflect	read_delta_gdf	gdf	read	delta	Read Delta WBD using cache decorator
reflect	read_wbd_file	gdf	read	eelta	Read WBD File using cache key
reflect	reflectance_kmeans	df		reflect	KMeans Clusters for Reflectance Bands
reflect	reflectance_range	df		reflect	Check ranges of bands
reflect	reflectance_rgb	da		reflect	RGB saturation of reflectance
srtm	srtm_download	da	download	SRTM	Download SRTM data and create da
srtm	srtm_slope	da		SRTM	Calculate slope from SRTM data
thredds	process_maca	df	read	THREDDS	Process MACA Monthly Data
thredds	maca_year	da		THREDDS	Extract and print year data

Explore Data

module	function	return	project	description
explore	index_tree	decision_tree	explore	Convert categories to numbers
explore	ramp_logic	da	explore	Fuzzy ramp logic
explore	train_test	nparray	explore	Model fit using train and test sets
explore	var_trans	df	explore	Variable Selection and Transformation

Set up Data Mechanics

Initial module is useful for beginning of project. Process module has various mechanics that might belong elsewhere but seem broad in scope. Cached module is a decorator used in reflect.py to simplify caching of time-expensive objects (see EDA Reference Python Coding: Decorators for references). Check module is for checking parts of objects, at this point CSVs.

module	function	return	effect	project	description
initial	creata_data_dir	char	mkdir		Create Data Directory if it does not exist
initial	robust_code		setup		Make code robust to interruptions
cached	cached	function	decorator	reflect	A decorator to cache function results
check	header_csv	str			Header of CSV file
check	get_last_row_csv	str			Check Last Row of CSV File
check	check_element_in_csv	bool			Check value of element in CSV file
check	check_naip_tracts	df		NAIP	Check if NAIP tracts stored
process	da2gdf	gdf			Convert a DataArray to a GeoDataFrame using rioxarray and geopandas
process	gdf_da_bounds	da			Clip bounds from place_gdf on da extended by buffer
process	process_bands	da		process	Process bands from gdf with df metadata
process	process_cloud_mask	array		process	Load an 8-bit Fmask file and create a boolean mask
process	process_image	da		process	Load, crop, and scale a raster image from earthaccess
process	process_metadata	df		process	Create df of raster data URIs from earthaccess metadata

Name		Name	Last commit message	Last commit date
Latest commit History 70 Commits
landmapy		landmapy
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
earthpy.ipynb		earthpy.ipynb
earthpy.qmd		earthpy.qmd
package.ipynb		package.ipynb
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

landmapy

Land Mapping Python Package

Install and Import

Collaboration

Use Cases

Example use with Habitat Project

Goals

Goal of EDA project

Broader goal

Technical goal

Package Modules and Functions

About

Releases

Packages

Languages

License

byandell-envsys/landmapy

Folders and files

Latest commit

History

Repository files navigation

landmapy

Land Mapping Python Package

Install and Import

Collaboration

Use Cases

Example use with Habitat Project

Goals

Goal of EDA project

Broader goal

Technical goal

Package Modules and Functions

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages