library(httr)
library(jsonlite)
library(stringr)
library(data.table)
library(withr)
library(sf)
library(kableExtra)
library(ggplot2)
library(RplotterPkg)
library(RcensusPkg)
Introduction
The goal of RcensusPkg is to provide easy access to the US Census Bureau’s datasets and its collection of TIGER/Line Shapefiles, which supply plot geometries for states, counties, roads, landmarks, water, and enumeration tracts/blocks for the entire United States. The only requirement is that the user apply for and obtain a free access key issued by the Bureau. See Guidance for Developers for additional information.
The example below illustrates a simple workflow for downloading a dataset, merging the data with shapefile geometries, and plotting the merge to create a choropleth map.
Installation
You can install the development version of RcensusPkg from GitHub with:
# install.packages("pak")
pak::pak("deandevl/RcensusPkg")
Using devtools::install_github():
devtools::install_github("deandevl/RcensusPkg")
This example also needs RplotterPkg, which can be installed with devtools::install_github("deandevl/RplotterPkg")
Example
US Census Bureau API key
All Census Bureau API requests require an access key. Signing up for a key is free, and one can be obtained here. The functions check for a global setting of the key via Sys.getenv("CENSUS_KEY"). Run usethis::edit_r_environ() and add the line CENSUS_KEY=your key to your .Renviron file to create the global association.
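Before calling any download function, it can help to confirm that the key is visible to the current R session. The following is a minimal sketch using only base R; the helper name has_census_key is hypothetical and is not part of RcensusPkg.

```r
# Hypothetical helper (not part of RcensusPkg): TRUE when the
# CENSUS_KEY environment variable is set to a non-empty string.
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_KEY"))
}

if (!has_census_key()) {
  message("CENSUS_KEY is not set; add it to .Renviron and restart R.")
}
```

Changes to .Renviron only take effect in a new R session, so a restart is needed after editing the file.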
Setup
We will be using the packages loaded with the library() calls at the top of this page.
A look at the US Census Bureau’s community resilience estimates (CRE) database.
Among the available APIs is the Community Resilience Estimates, which is based on such factors as:
- Income-to-Poverty Ratio (IPR) < 130 percent
- Single or zero caregiver household
- Aged 65 years or older
- No health insurance coverage
- No vehicle access (Household)
- Disability, at least one serious constraint to significant life activity
- No one in the household is employed full-time, year-round
- Households without broadband internet access
- Unit-level crowding with >= 0.75 persons per room
- No one in the household has received a high school diploma
- No one in the household speaks English "very well"
The factors are used to estimate the number of people with:
- 0 risk factors (Low risk)
- 1-2 risk factors (Medium risk)
- 3 or more risk factors (High risk)
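The grouping above can be sketched in a few lines of base R. This is only an illustration of the three categories, not the Bureau's estimation method; the function name risk_group is hypothetical.

```r
# Hypothetical helper: map a count of risk factors to the three
# risk groups (0 = Low, 1-2 = Medium, 3 or more = High).
risk_group <- function(n_factors) {
  cut(
    n_factors,
    breaks = c(-Inf, 0, 2, Inf),   # intervals (-Inf,0], (0,2], (2,Inf)
    labels = c("Low", "Medium", "High")
  )
}

risk_group(c(0, 1, 2, 3, 5))
```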
The workflow for using RcensusPkg is the following:
- Get a database name recognized by the API. Use RcensusPkg::get_dataset_names(), filtering for a "resilience" title and a vintage of 2022.
datasets_dt <- RcensusPkg::get_dataset_names(
  vintage = 2022,
  filter_title_str = "resilience"
)
name | vintage | title |
---|---|---|
cre | 2022 | Community Resilience Estimates |
crepuertorico | 2022 | Community Resilience Estimates for Puerto Rico |
The returned dataframe shows a dataset name for CRE 2022 as (surprise) cre.
- Get the variable names available for the cre dataset.
cre_var_names_dt <- RcensusPkg::get_variable_names(
  dataset = "cre",
  vintage = 2022
) |>
  _[, .(name, label)]
name | label |
---|---|
COUNTY | Geography |
GEOCOMP | GEO_ID Component |
GEO_ID | Geographic Identifier |
NATION | Geography |
POPUNI | Population Universe |
PRED0_E | Estimated number of individuals with zero components of social vulnerability |
PRED0_PE | Rate of individuals with zero components of social vulnerability |
PRED12_E | Estimated number of individuals with one-two components of social vulnerability |
PRED12_PE | Rate of individuals with one-two components of social vulnerability |
PRED3_E | Estimated number of individuals with three or more components of social vulnerability |
PRED3_PE | Rate of individuals with three or more components of social vulnerability |
STATE | Geography |
SUMLEVEL | Summary Level code |
TRACT | Geography |
for | Census API FIPS 'for' clause |
in | Census API FIPS 'in' clause |
ucgid | Uniform Census Geography Identifier clause |
We are interested in the percentage of individuals with three or more vulnerabilities (PRED3_PE).
- Get the regions available for the CRE dataset.
cre_regions_dt <- RcensusPkg::get_geography(
  dataset = "cre",
  vintage = 2022
)
name | geoLevelDisplay |
---|---|
us | 010 |
state | 040 |
county | 050 |
tract | 140 |
So we can get CRE estimates at the US, state, county, and tract levels. We are interested in the counties of the state of Florida.
- Download the data:
florida_fips <- usmap::fips("FL")

florida_cre_dt <- RcensusPkg::get_vintage_data(
  dataset = "cre",
  vintage = 2022,
  vars = "PRED3_PE",
  region = "county",
  regionin = paste0("state:", florida_fips)
) |>
  _[, PRED3_PE := as.numeric(PRED3_PE)] |>
  data.table::setnames(old = "PRED3_PE", new = "CRE_GE_3")
NAME | CRE_GE_3 | state | county | GEOID |
---|---|---|---|---|
Alachua County, Florida | 19.29 | 12 | 001 | 12001 |
Baker County, Florida | 22.93 | 12 | 003 | 12003 |
Bay County, Florida | 20.05 | 12 | 005 | 12005 |
Bradford County, Florida | 24.77 | 12 | 007 | 12007 |
Brevard County, Florida | 20.67 | 12 | 009 | 12009 |
Broward County, Florida | 22.44 | 12 | 011 | 12011 |
Calhoun County, Florida | 31.52 | 12 | 013 | 12013 |
Charlotte County, Florida | 27.53 | 12 | 015 | 12015 |
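In the rows shown, GEOID is the state code followed by the three-digit county code. As a small sketch, the snippet below rebuilds that identifier for two rows copied from the table and sorts them by rate. The object name dt is only for illustration, and nothing here calls the Census API.

```r
library(data.table)

# Two rows copied from the table above.
dt <- data.table(
  NAME = c("Alachua County, Florida", "Calhoun County, Florida"),
  CRE_GE_3 = c(19.29, 31.52),
  state = c("12", "12"),
  county = c("001", "013")
)

# Codes are kept as character strings so leading zeros survive.
dt[, GEOID := paste0(state, county)]

# Order counties from the highest to the lowest rate.
dt[order(-CRE_GE_3)]
```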
- Merge the CRE county data with the Florida county shapefile geographies from the Bureau. Note that we need to establish a folder for receiving the shapefiles.
output_dir <- withr::local_tempdir()
if(!dir.exists(output_dir)){
  dir.create(output_dir)
}
express <- parse(text = paste0("STATEFP == ", '"', florida_fips, '"'))
cre_florida_sf <- RcensusPkg::tiger_counties_sf(
  output_dir = output_dir,
  vintage = 2022,
  general = TRUE,
  express = express,
  datafile = florida_cre_dt,
  datafile_key = "county"
)
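The express argument is an unevaluated R expression built with parse(). How tiger_counties_sf() applies it internally is not shown here. As an assumption, the sketch below evaluates the same kind of expression against a plain data frame to keep only the matching rows. The counties data frame and its values are made up for illustration.

```r
# Illustrative data only; STATEFP mimics a state FIPS column.
counties <- data.frame(
  STATEFP = c("12", "12", "13"),
  NAME = c("Alachua", "Baker", "Appling")
)

florida_fips <- "12"
express <- parse(text = paste0("STATEFP == ", '"', florida_fips, '"'))

# Evaluate the expression with the data frame's columns in scope,
# then use the resulting logical vector to subset the rows.
keep <- eval(express, envir = counties)
florida_only <- counties[keep, ]
```

Building the filter as text and parsing it lets the state code be supplied at run time rather than hard-coded.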
- Plot the choropleth map of the percent of individuals with 3 or more risk factors.
RplotterPkg::create_sf_plot(
  sf = cre_florida_sf,
  aes_fill = "CRE_GE_3",
  hide_x_tics = TRUE,
  hide_y_tics = TRUE,
  panel_color = "white",
  panel_border_color = "white"
)
A more detailed example of the RcensusPkg workflow is available here.