The goal of getACS is to make it easier to work with American Community Survey data from the tidycensus package by Kyle Walker and others.
This package includes:
- Functions that extend tidycensus::get_acs() to support multiple tables, geographies, or years
- Functions for creating formatted tables from ACS data using the gt package
As of April 2024, this package uses a development version of {tigris}, available at https://github.com/elipousson/tigris.
You can install the development version of getACS from GitHub with:
# install.packages("pak")
pak::pkg_install("elipousson/getACS")
library(getACS)
library(gt)
library(ggplot2)
The main feature of {getACS} is support for returning multiple tables,
geographies, and years.
acs_data <- get_acs_geographies(
  geography = c("county", "state"),
  county = "Baltimore city",
  state = "MD",
  table = "B08134",
  year = 2022,
  quiet = TRUE
)
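To make the idea of "one call, several geographies" concrete, here is a minimal offline sketch of the general pattern: request each combination separately, then stack the results. It is not the package's implementation. `fetch_one()` and `fetch_many()` are hypothetical names, and `fetch_one()` returns a placeholder row instead of calling the Census API.

```r
# Hypothetical stand-in for a single-geography request (for example, one call
# to tidycensus::get_acs()). It returns a placeholder row, not real data.
fetch_one <- function(geography, table, year) {
  data.frame(
    geography = geography,
    table_id = table,
    year = year,
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: call fetch_one() once per combination of geography,
# table, and year, then bind the results into a single data frame.
fetch_many <- function(geography, table, year) {
  combos <- expand.grid(
    geography = geography,
    table = table,
    year = year,
    stringsAsFactors = FALSE
  )
  results <- Map(fetch_one, combos$geography, combos$table, combos$year)
  do.call(rbind, unname(results))
}

combined <- fetch_many(
  geography = c("county", "state"),
  table = "B08134",
  year = 2022
)
```

The result has one block of rows per geography, so later filtering and table functions can work from a single data frame.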
The package also includes utility functions for filtering data and
selecting columns to support the creation of tables using the {gt}
package:
tbl_data <- filter_acs(acs_data, indent == 1, line_number <= 10)
tbl_data <- select_acs(tbl_data)
commute_tbl <- gt_acs(
  tbl_data,
  groupname_col = "NAME",
  column_title_label = "Commute time",
  table = "B08134"
)
as_raw_html(commute_tbl)
Commute time | Est. | % share |
---|---|---|
Baltimore city, Maryland | ||
Less than 10 minutes | 16,140 ± 1,096 | 7% ± 0% |
10 to 14 minutes | 20,798 ± 1,312 | 9% ± 1% |
15 to 19 minutes | 36,667 ± 1,740 | 16% ± 1% |
20 to 24 minutes | 39,803 ± 1,834 | 17% ± 1% |
25 to 29 minutes | 16,404 ± 970 | 7% ± 0% |
30 to 34 minutes | 40,744 ± 1,695 | 17% ± 1% |
35 to 44 minutes | 16,880 ± 1,160 | 7% ± 0% |
45 to 59 minutes | 20,318 ± 1,296 | 9% ± 1% |
60 or more minutes | 26,662 ± 1,207 | 11% ± 0% |
Maryland | ||
Less than 10 minutes | 203,738 ± 3,511 | 8% ± 0% |
10 to 14 minutes | 255,052 ± 4,240 | 10% ± 0% |
15 to 19 minutes | 333,717 ± 5,269 | 13% ± 0% |
20 to 24 minutes | 342,189 ± 4,777 | 13% ± 0% |
25 to 29 minutes | 177,597 ± 3,129 | 7% ± 0% |
30 to 34 minutes | 400,919 ± 5,980 | 15% ± 0% |
35 to 44 minutes | 249,413 ± 4,443 | 9% ± 0% |
45 to 59 minutes | 312,390 ± 4,394 | 12% ± 0% |
60 or more minutes | 371,252 ± 4,828 | 14% ± 0% |
Source: 2018-2022 ACS 5-year Estimates, Table B08134. |
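As a rough check on the "% share" column, the sketch below recomputes the Baltimore city shares in base R from the estimates shown in the table. It assumes the denominator is the sum of the nine commute-time rows (234,416). The package may instead use a total row from the ACS table, and the margins of error for the shares are not reproduced here.

```r
# Estimates for Baltimore city copied from the table above (Table B08134).
commute <- data.frame(
  column_title = c(
    "Less than 10 minutes", "10 to 14 minutes", "15 to 19 minutes",
    "20 to 24 minutes", "25 to 29 minutes", "30 to 34 minutes",
    "35 to 44 minutes", "45 to 59 minutes", "60 or more minutes"
  ),
  estimate = c(16140, 20798, 36667, 39803, 16404, 40744, 16880, 20318, 26662)
)

# Assumption: the denominator is the sum of the rows shown.
total <- sum(commute$estimate)

# Share of each commute-time category, rounded to a whole percent.
commute$perc_share <- round(100 * commute$estimate / total)
```

Under that assumption, `commute$perc_share` matches the rounded shares in the table (7, 9, 16, 17, 7, 17, 7, 9, 11).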
The gt_acs_compare() function also allows side-by-side comparison of
geographies:
commute_tbl_compare <- gt_acs_compare(
  data = tbl_data,
  id_cols = "column_title",
  column_title_label = "Commute time",
  table = "B08134"
)
as_raw_html(commute_tbl_compare)
Commute time | Baltimore city, Maryland: Est. | Baltimore city, Maryland: % share | Maryland: Est. | Maryland: % share |
---|---|---|---|---|
Less than 10 minutes | 16,140 ± 1,096 | 7% ± 0% | 203,738 ± 3,511 | 8% ± 0% |
10 to 14 minutes | 20,798 ± 1,312 | 9% ± 1% | 255,052 ± 4,240 | 10% ± 0% |
15 to 19 minutes | 36,667 ± 1,740 | 16% ± 1% | 333,717 ± 5,269 | 13% ± 0% |
20 to 24 minutes | 39,803 ± 1,834 | 17% ± 1% | 342,189 ± 4,777 | 13% ± 0% |
25 to 29 minutes | 16,404 ± 970 | 7% ± 0% | 177,597 ± 3,129 | 7% ± 0% |
30 to 34 minutes | 40,744 ± 1,695 | 17% ± 1% | 400,919 ± 5,980 | 15% ± 0% |
35 to 44 minutes | 16,880 ± 1,160 | 7% ± 0% | 249,413 ± 4,443 | 9% ± 0% |
45 to 59 minutes | 20,318 ± 1,296 | 9% ± 1% | 312,390 ± 4,394 | 12% ± 0% |
60 or more minutes | 26,662 ± 1,207 | 11% ± 0% | 371,252 ± 4,828 | 14% ± 0% |
Source: 2018-2022 ACS 5-year Estimates, Table B08134. |
gt_acs_compare_vars() is a variant of gt_acs_compare() whose default
values support comparisons with values in columns and geographic areas
in rows:
commute_tbl_compare_vars <- acs_data |>
  filter_acs(indent == 1, line_number > 10) |>
  gt_acs_compare_vars(
    table = acs_data$table_id
  )
as_raw_html(commute_tbl_compare_vars)
NAME | Car, truck, or van | Public transportation (excluding taxicab) | Walked | Taxicab, motorcycle, bicycle, or other means |
---|---|---|---|---|
Baltimore city, Maryland | 176,543 ± 2,817 | 34,640 ± 1,637 | 14,954 ± 1,007 | 8,279 ± 901 |
Maryland | 2,357,924 ± 11,085 | 171,785 ± 3,655 | 59,507 ± 1,858 | 57,051 ± 2,213 |
Source: 2018-2022 ACS 5-year Estimates, Table B08134. |
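The layout above (geographies in rows, variables in columns) is a long-to-wide reshape. The base R sketch below shows the reshape on the estimates from this table. It only illustrates the layout and is not how gt_acs_compare_vars() is implemented. The column names `NAME`, `column_title`, and `estimate` are assumed here for illustration.

```r
# Transportation modes in the order shown in the table above.
modes <- c(
  "Car, truck, or van",
  "Public transportation (excluding taxicab)",
  "Walked",
  "Taxicab, motorcycle, bicycle, or other means"
)

# Long format: one row per geography and mode (estimates copied from the table).
long <- data.frame(
  NAME = rep(c("Baltimore city, Maryland", "Maryland"), each = 4),
  column_title = factor(rep(modes, times = 2), levels = modes),
  estimate = c(176543, 34640, 14954, 8279, 2357924, 171785, 59507, 57051)
)

# Wide format: geographies in rows, modes in columns.
wide <- tapply(long$estimate, list(long$NAME, long$column_title), sum)
```

Storing `column_title` as a factor with explicit levels keeps the columns in table order instead of alphabetical order.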
The package also includes several functions to support creating plots
with the {ggplot2} package, including geom_acs_col() and
labs_acs_survey():
plot_data <- acs_data |>
  filter_acs(indent == 1, line_number > 10) |>
  select_acs() |>
  fmt_acs_county(state = "Maryland")
plot_data |>
  ggplot() +
  geom_acs_col(
    fill = "NAME",
    position = "dodge",
    color = NA,
    alpha = 0.75,
    perc = TRUE,
    errorbar_params = list(position = "dodge", linewidth = 0.25)
  ) +
  scale_y_discrete("Means of transportation to work") +
  scale_fill_viridis_d("Geography") +
  labs_acs_survey(
    .data = acs_data
  ) +
  theme_minimal()
The geom_acs_col() function calls geom_acs_errorbar() (passing the
errorbar_params argument as additional parameters) and scale_x_acs()
or scale_y_acs() (depending on whether orientation is "y" or the
default value of NA).
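For readers who want to see the underlying pattern, here is a rough plain-{ggplot2} equivalent of a column layer plus an error-bar layer. It uses the Baltimore city estimates and margins of error shown in the gt_acs_compare_vars() table earlier in this README. It is a sketch of the general technique only. The column names `mode`, `estimate`, and `moe` are assumed for illustration, and the internals of geom_acs_col() may differ.

```r
library(ggplot2)

# Estimates and margins of error for Baltimore city (Table B08134,
# 2018-2022 ACS), copied from the comparison table earlier in this README.
commute <- data.frame(
  mode = c(
    "Car, truck, or van",
    "Public transportation (excluding taxicab)",
    "Walked",
    "Taxicab, motorcycle, bicycle, or other means"
  ),
  estimate = c(176543, 34640, 14954, 8279),
  moe = c(2817, 1637, 1007, 901)
)

# Horizontal columns with error bars spanning estimate +/- margin of error.
p <- ggplot(commute, aes(y = mode)) +
  geom_col(aes(x = estimate), alpha = 0.75, orientation = "y") +
  geom_errorbar(
    aes(xmin = estimate - moe, xmax = estimate + moe),
    orientation = "y",
    width = 0.25,
    linewidth = 0.25
  ) +
  scale_x_continuous("Estimate") +
  scale_y_discrete("Means of transportation to work") +
  theme_minimal()
```

Printing `p` draws the plot. Setting `orientation = "y"` on both layers keeps the categories on the y-axis, which corresponds to the `orientation = "y"` case described above.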
For more information on working with Census data in R, see the book Analyzing US Census Data: Methods, Maps, and Models in R (February 2023).
The following related packages and projects may also be useful:
- {easycensus}: Quickly Extract and Marginalize U.S. Census Tables
- {cwi}: Functions to speed up and standardize Census ACS data analysis for multiple staff people at DataHaven, preview trends and patterns, and get data in more layperson-friendly
- {camiller}: A set of convenience functions, functions for working with ACS data via {tidycensus}
- {psrccensus}: A set of tools developed for PSRC (Puget Sound Regional Council) staff to pull, process, and visualize Census Data for geographies in the Central Puget Sound Region.
- {CTPPr}: An R package for loading and working with the US Census CTPP survey data.
- {lehdr}: a package to grab LEHD data in support of city and regional planning economic and transportation analysis
- {mapreliability}: An R package for calculating map classification reliability
- Studying Neighborhoods With Uncertain Census Data: Code to create and visualize demographic clusters for the US with data from the American Community Survey
- census-data-aggregator: A Python library from the L.A. Times data desk to help “combine U.S. census data responsibly”
- census-table-metadata: Tools for generating metadata about tables and fields in a Census release based on sequence lookup and table shell files. (Note: the pre-computed data from this repository is used to label ACS data by label_acs_metadata().)