API reference

IO Module

Accessing Datasets

The DataStore class provides access to a fast and efficient database of neighborhood indicators for the United States. The DataStore can read information directly over the web, or it can cache the datasets locally for (shared) repeated use. Large datasets are available quickly with no configuration by accessing methods on the class.

DataStore([data_dir])

Storage for geosnap data.

DataStore.acs([year, level, states])

American Community Survey Data.

DataStore.blocks_2000([states, fips])

Census blocks for 2000.

DataStore.blocks_2010([states, fips])

Census blocks for 2010.

DataStore.codebook()

Codebook.

DataStore.counties()

Nationwide counties as drawn in 2010.

DataStore.ejscreen([year, states])

EPA EJScreen Data <https://www.epa.gov/ejscreen>.

DataStore.ltdb()

Longitudinal Tract Database (LTDB).

DataStore.msa_definitions()

2010 Metropolitan Statistical Area definitions.

DataStore.msas()

Metropolitan Statistical Areas as drawn in 2020.

DataStore.ncdb()

Geolytics Neighborhood Change Database (NCDB).

DataStore.nces([year, dataset])

National Center for Education Statistics (NCES) Data.

DataStore.show_data_dir([verbose])

Print the location of the local geosnap data storage directory.

DataStore.states()

States.

DataStore.tracts_1990([states])

Nationwide Census Tracts as drawn in 1990 (cartographic 500k).

DataStore.tracts_2000([states])

Nationwide Census Tracts as drawn in 2000 (cartographic 500k).

DataStore.tracts_2010([states])

Nationwide Census Tracts as drawn in 2010 (cartographic 500k).

Storing data

The io module includes functions for caching data on your local machine, either to store the built-in datasets for repeated use or to register an external dataset with geosnap, such as the Longitudinal Tract Database (LTDB) or the Neighborhood Change Database (NCDB). Once data are stored, a newly instantiated DataStore will use the local files instead of streaming over the web.

io.store_acs([years, level, data_dir])

Save census American Community Survey 5-year data to the local geosnap storage.

io.store_census([data_dir, verbose])

Save census data to the local quilt package storage.

io.store_blocks_2000([data_dir])

Save 2000 census block data to the local quilt package storage.

io.store_blocks_2010([data_dir])

Save 2010 census block data to the local quilt package storage.

io.store_ejscreen([years, data_dir])

Save EPA EJScreen data to the local geosnap storage.

io.store_ltdb(sample, fullcount[, data_dir])

Read & store data from Brown's Longitudinal Tract Database (LTDB).

io.store_ncdb(filepath[, data_dir])

Read & store data from Geolytics's Neighborhood Change Database.

io.store_nces([years, dataset, data_dir])

Save NCES data to the local geosnap storage.
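
The local-versus-web behaviour described in this section can be pictured with a small sketch. This is not geosnap's implementation: the function name resolve_source, the base URL, and the .parquet file extension are all hypothetical, chosen only to illustrate the "use the cached copy if it exists, otherwise stream" pattern.

```python
from pathlib import Path
import tempfile

# Hypothetical remote location, used only for illustration.
REMOTE_BASE = "https://example.org/geosnap-data"


def resolve_source(name, data_dir):
    """Return the local file path if it is cached, otherwise a remote URL."""
    local = Path(data_dir) / f"{name}.parquet"  # file format is an assumption
    if local.exists():
        return str(local)
    return f"{REMOTE_BASE}/{name}.parquet"


# Demonstrate both branches with a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    before = resolve_source("tracts_2010", tmp)  # nothing cached yet
    (Path(tmp) / "tracts_2010.parquet").write_bytes(b"")  # simulate caching
    after = resolve_source("tracts_2010", tmp)  # now resolved locally
    cached_locally = after.startswith(tmp)
```

In this sketch the caller never needs to know where the data came from, which mirrors the way a DataStore exposes the same methods whether or not data have been stored locally.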

Querying datasets

io.get_acs(datastore[, level, state_fips, ...])

Extract a subset of data from the American Community Survey (ACS).

io.get_census(datastore[, state_fips, ...])

Extract a subset of data from the decennial U.S. Census.

io.get_ejscreen(datastore[, state_fips, ...])

Extract a subset of data from the EPA EJSCREEN as a long-form geodataframe.

io.get_gadm(code[, level, use_fsspec, gpkg, ...])

Collect data from GADM as a geodataframe.

io.get_lodes(datastore[, state_fips, ...])

Extract a subset of data from Census LEHD/LODES.

io.get_ltdb(datastore[, state_fips, ...])

Extract a subset of data from the Longitudinal Tract Database (LTDB) as a long-form geodataframe.

io.get_nces(datastore[, years, dataset])

Extract a subset of data from the National Center for Education Statistics (NCES) as a long-form geodataframe.

io.get_ncdb(datastore[, state_fips, ...])

Extract a subset of data from the Neighborhood Change Database (NCDB).
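
Several of the query functions above accept a state_fips argument. The sketch below shows one way such a subset could work on long-form records, using the fact that a census tract GEOID begins with the two-digit state code followed by the three-digit county code. The helper subset_by_fips, the county_fips parameter, and the sample tract codes are illustrative assumptions, not part of the geosnap API shown here.

```python
def subset_by_fips(records, state_fips=None, county_fips=None):
    """Keep records whose 'geoid' starts with any requested FIPS prefix."""
    prefixes = tuple(list(state_fips or []) + list(county_fips or []))
    if not prefixes:
        # No filter requested: return everything.
        return list(records)
    return [r for r in records if r["geoid"].startswith(prefixes)]


# Illustrative long-form records (one row per unit per year); tract codes are made up.
tracts = [
    {"geoid": "11001000100", "year": 2010},  # District of Columbia (state 11)
    {"geoid": "24031700101", "year": 2010},  # Montgomery County, Maryland (24031)
    {"geoid": "24033800102", "year": 2010},  # Prince George's County, Maryland (24033)
]

dc_only = subset_by_fips(tracts, state_fips=["11"])
dc_and_montgomery = subset_by_fips(tracts, state_fips=["11"], county_fips=["24031"])
```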

Analyze Module

Neighborhood Clustering Methods

Model neighborhood differentiation using multivariate clustering algorithms

analyze.cluster(gdf[, n_clusters, method, ...])

Create a geodemographic typology by running a cluster analysis on the study area's neighborhood attributes.

analyze.regionalize(gdf[, n_clusters, ...])

Create a spatial geodemographic typology by running a cluster analysis on the metro area's neighborhood attributes and including a contiguity constraint.

Neighborhood Dynamics Methods

Model neighborhood change using optimal-matching algorithms or spatial discrete Markov chains

analyze.linc(labels_sequence)

Local Indicator of Neighborhood Change

analyze.lincs_from_gdf(gdf, unit_index, ...)

Generate local indicators of neighborhood change from a long-form geodataframe.

analyze.predict_markov_labels(gdf[, ...])

Predict neighborhood labels based on spatial Markov transition model

analyze.sequence(gdf, cluster_col[, ...])

Pairwise sequence analysis and sequence clustering.

analyze.transition(gdf, cluster_col[, ...])

(Spatial) Markov approach to transitional dynamics of neighborhoods.
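
As a rough illustration of the Markov approach named above, the sketch below estimates a global (non-spatial) transition probability matrix from sequences of neighborhood labels. It is a minimal standalone example, not the code behind analyze.transition; a spatial Markov model would additionally condition each transition on the state of neighboring units. The function name and sample sequences are hypothetical.

```python
from collections import Counter


def transition_matrix(sequences, states):
    """Estimate P[i][j], the probability of moving from states[i] to states[j]."""
    counts = Counter()
    for seq in sequences:
        # Count each consecutive (from, to) pair within a neighborhood's history.
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    matrix = []
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        # Rows with no observed departures are left as all zeros.
        matrix.append(
            [counts[(a, b)] / row_total if row_total else 0.0 for b in states]
        )
    return matrix


# Three neighborhoods observed over three time periods, with cluster labels 0 and 1.
sequences = [[0, 0, 1], [0, 1, 1], [1, 1, 1]]
P = transition_matrix(sequences, states=[0, 1])
```

Here one of the three observed departures from label 0 stays in label 0 and two move to label 1, while label 1 never transitions away.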

Segregation Dynamics Methods

Rapidly compute and compare changes in segregation measures over time and across space

analyze.segdyn.singlegroup_tempdyn(gdf[, ...])

Batch compute singlegroup segregation indices for each time period in parallel.

analyze.segdyn.multigroup_tempdyn(gdf[, ...])

Batch compute multigroup segregation indices for each time period.

analyze.segdyn.spacetime_dyn(gdf[, ...])

Batch compute multiscalar segregation profiles for each time period in parallel.
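
The idea of computing a segregation measure once per time period can be sketched with the two-group dissimilarity index, D = 0.5 * sum(|a_i/A - b_i/B|). The index is used here only as a familiar example; this section does not state which indices the functions above compute, and the data and names below are hypothetical.

```python
def dissimilarity(group_a, group_b):
    """Two-group dissimilarity index over per-unit population counts."""
    total_a, total_b = sum(group_a), sum(group_b)
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Hypothetical counts for two groups across two neighborhoods in each year.
counts_by_year = {
    2000: {"a": [100, 0], "b": [0, 100]},   # groups fully separated
    2010: {"a": [50, 50], "b": [50, 50]},   # groups evenly mixed
}

# One index value per time period.
index_by_year = {
    year: dissimilarity(c["a"], c["b"]) for year, c in counts_by_year.items()
}
```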

The ModelResults Class

Many of geosnap’s analytics methods can return a ModelResults object that stores additional statistics, diagnostics, and plotting methods for inspection.

ModelResults.boundary_silhouette

Calculate boundary silhouette scores for each unit.

ModelResults.lincs

Calculate Local Indicators of Neighborhood Change (LINC) scores for each unit.

ModelResults.path_silhouette

Calculate path silhouette scores for each unit.

ModelResults.silhouette_scores

Calculate silhouette scores for each unit.

ModelResults.plot_boundary_silhouette([...])

Plot the boundary silhouette scores for each unit as a choropleth map.

ModelResults.plot_next_best_label([...])

Plot the next-best cluster label for each unit as a choropleth map.

ModelResults.plot_silhouette([metric, title])

Create a diagnostic plot of silhouette scores using scikit-plot.

ModelResults.plot_silhouette_map([...])

Plot the silhouette scores for each unit as a [series of] choropleth map(s).

ModelResults.plot_path_silhouette([...])

Plot the path silhouette scores for each unit as a choropleth map.

ModelResults.predict_markov_labels([w_type, ...])

Predict neighborhood labels from the model in future time periods using a spatial Markov transition model
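
For readers unfamiliar with the silhouette diagnostics listed above, the sketch below computes the standard silhouette score s = (b - a) / max(a, b) for each unit, where a is the mean distance to other members of the unit's own cluster and b is the smallest mean distance to any other cluster. It is a plain-Python illustration under simple assumptions (Euclidean distance, at least two clusters), not the ModelResults implementation.

```python
from math import dist


def silhouette_scores(points, labels):
    """Per-unit silhouette scores; assumes at least two distinct labels."""
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Distances to the other members of this unit's own cluster.
        same = [
            dist(p, q)
            for j, (q, lab) in enumerate(zip(points, labels))
            if lab == own and j != i
        ]
        if not same:
            # By convention a singleton cluster scores 0.
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        # Mean distance to the nearest other cluster.
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels)
            if other != own
        )
        scores.append((b - a) / max(a, b))
    return scores


# Two tight, well-separated clusters of hypothetical attribute vectors.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
scores = silhouette_scores(points, labels)
```

Scores near 1 indicate units that sit comfortably inside their assigned cluster, while scores near 0 or below suggest a unit lies close to another cluster.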

Harmonize Module

harmonize.harmonize(gdf[, target_year, ...])

Use spatial interpolation to standardize neighborhood boundaries over time.
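
One simple form of spatial interpolation is area weighting, in which a count from a source unit is split among target units in proportion to the share of the source area that each target covers. The sketch below assumes the intersection areas are already known and that the variable is a count (such as population); it is not necessarily the method harmonize.harmonize uses, and all names and numbers are hypothetical.

```python
def area_weighted_sum(source_values, source_areas, overlaps):
    """Reallocate source counts to target units by share of source area.

    overlaps maps (source_id, target_id) to the area of their intersection.
    """
    targets = {}
    for (src, tgt), area in overlaps.items():
        share = area / source_areas[src]  # fraction of the source unit inside tgt
        targets[tgt] = targets.get(tgt, 0.0) + source_values[src] * share
    return targets


# Two old tracts redrawn into two new tracts (hypothetical numbers).
population = {"old_A": 1000, "old_B": 400}
areas = {"old_A": 4.0, "old_B": 2.0}
overlaps = {
    ("old_A", "new_1"): 3.0,
    ("old_A", "new_2"): 1.0,
    ("old_B", "new_2"): 2.0,
}

harmonized = area_weighted_sum(population, areas, overlaps)
```

Because every part of each source unit is assigned to some target, the total population is preserved after reallocation.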

Visualize Module

visualize.animate_timeseries(gdf[, column, ...])

Create an animated gif from a Community timeseries.

visualize.explore([data])

Launch an interactive visualization portal.

visualize.gif_from_path([path, figsize, ...])

Create an animated gif from a directory of image files.

visualize.indexplot_seq(df_traj, clustering)

Function for index plot of neighborhood sequences within each cluster.

visualize.plot_timeseries(gdf, column[, ...])

Plot an attribute from a geodataframe arranged as a timeseries with consistent colorscaling.

visualize.plot_transition_matrix(gdf[, ...])

Plot global and spatially-conditioned transition matrices as heatmaps.

visualize.plot_transition_graphs(gdf[, ...])

Plot a network graph representation of global and spatially-conditioned transition matrices.

Util Module

util.fetch_acs([state, level, year, ...])

Collect the variables defined in geosnap.datasets.codebook from the Census API.

util.process_acs(df)

Calculate variables from the geosnap codebook

The Community Class

The Community class is an alternative object-oriented interface for interacting with geosnap. Rather than operating on geodataframes, the Community class manages data internally and exposes methods that operate on these data.

Community Constructors

Community.from_census([datastore, ...])

Create a new Community from original vintage US Census data.

Community.from_geodataframes([gdfs])

Create a new Community from a list of geodataframes.

Community.from_lodes([datastore, ...])

Create a new Community from Census LEHD/LODES data.

Community.from_ltdb([datastore, state_fips, ...])

Create a new Community from LTDB data.

Community.from_ncdb([datastore, state_fips, ...])

Create a new Community from NCDB data.

Community Analytics

Community.cluster([n_clusters, method, ...])

Create a geodemographic typology by running a cluster analysis on the study area's neighborhood attributes.

Community.harmonize([target_year, ...])

Standardize inconsistent boundaries into time-static ones.

Community.regionalize([n_clusters, ...])

Create a spatial geodemographic typology by running a cluster analysis on the metro area's neighborhood attributes and including a contiguity constraint.

Community.sequence(cluster_col[, ...])

Pairwise sequence analysis to evaluate the distance/dissimilarity between every two neighborhood sequences.

Community.simulate([model_name, unit_index, ...])

Simulate community dynamics using spatial Markov transition rules.

Community.transition(cluster_col[, ...])

(Spatial) Markov approach to transitional dynamics of neighborhoods.

Community Visualization

Community.animate_timeseries([column, ...])

Create an animated gif from a Community timeseries.

Community.plot_boundary_silhouette(model_name)

Community.plot_next_best_label(model_name[, ...])

Community.plot_silhouette_map(model_name[, ...])

Community.plot_path_silhouette(model_name[, ...])

Community.plot_timeseries(column[, title, ...])

Plot an attribute from a Community arranged as a timeseries.

Community.plot_transition_matrix([...])

Plot global and spatially-conditioned transition matrices as heatmaps.

Community.plot_transition_graphs([...])

Plot a network graph representation of global and spatially-conditioned transition matrices.