Crosswalk a Dataset

Crosswalks an entire dataset using the US Department of Housing and Urban Development (HUD) USPS Crosswalk files. Currently supported crosswalks:

zip-tract
zip-county
zip-cbsa
zip-cbsadiv (Available 4th Quarter 2017 onwards)
zip-cd
tract-zip
county-zip
cbsa-zip
cbsadiv-zip (Available 4th Quarter 2017 onwards)
cd-zip
zip-countysub (Available 2nd Quarter 2018 onwards)
countysub-zip (Available 2nd Quarter 2018 onwards)

Usage

crosswalk(
  data,
  geoid,
  geoid_col,
  cw_geoid,
  cw_geoid_col = NA,
  method = NA,
  year = format(Sys.Date() - 365, "%Y"),
  quarter = 1,
  key = Sys.getenv("HUD_KEY"),
  to_tibble = getOption("rhud_use_tibble", FALSE)
)

Arguments

data

A data frame or tibble with rows describing measurements at a zip, county, county subdivision (countysub), congressional district (cd), census tract, core based statistical area (cbsa), or core based statistical area division (cbsadiv) geographic identifier.

zip
tract
county
countysub
cbsa
cbsadiv
cd

geoid

A character vector naming the geoid that the dataset is currently described in; must be zip, county, countysub, cd, tract, cbsa, or cbsadiv.

zip
tract
county
countysub
cbsa
cbsadiv
cd

geoid_col

A character or numeric vector of length one: the column containing the geographic identifier, which must be a zip, county, county subdivision (countysub), congressional district (cd), census tract, core based statistical area (cbsa), or core based statistical area division (cbsadiv) identifier. Supply either the name of the column or its index. Every element in this column must contain digits only and have the proper length for its geoid. For example, zip codes must be 5-digit numbers.
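The length requirement is easy to violate when identifiers are read in as numbers, because leading zeros are dropped. The following is a minimal sketch of one way to restore 5-digit zip codes before calling crosswalk(). The vector `raw_zip` is made up for illustration, and whether crosswalk() accepts the padded character form or expects another representation is an assumption to check against your installed version.

```r
# Hypothetical identifiers: two of these lost leading zeros when parsed
# as numbers.
raw_zip <- c(8544, 21201, 501)

# Left-pad with zeros so every element is five digits wide.
zip <- sprintf("%05d", as.integer(raw_zip))

# Confirm that each element is exactly five digits and nothing else.
all(grepl("^[0-9]{5}$", zip))
```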

cw_geoid

A character vector of length one: the geoid to crosswalk the dataset to; must be zip, county, county subdivision (countysub), congressional district (cd), census tract, core based statistical area (cbsa), or core based statistical area division (cbsadiv).

zip
tract
county
countysub
cbsa
cbsadiv
cd

cw_geoid_col

A character or numeric vector: the columns in the dataset to distribute according to the ratio for the chosen method. If method is empty, no allocation is applied and the crosswalk file is simply merged onto the dataset. All elements in these columns must be numeric.

method

A character vector: the allocation method to use, one of residential, business, other, or total. The examples below pass the abbreviated forms "res" and "bus". If method is empty, no allocation is applied and the crosswalk file is simply merged onto the dataset.
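To make the idea of ratio-based allocation concrete, here is a minimal local sketch that needs no API key. It is not the package's internal implementation. The zip codes, the ratios, and the column names `res_ratio` and `population_alloc` are assumptions chosen for illustration.

```r
# Dataset measured at the county level.
counts <- data.frame(county = c("24047", "24045"),
                     population = c(42134, 12413))

# Hypothetical crosswalk rows: each county maps to one or more zip codes,
# and res_ratio is the share of the county's residential addresses that
# fall in each zip. These values are invented for this sketch.
cw <- data.frame(county = c("24047", "24047", "24045"),
                 zip = c("21811", "21842", "21801"),
                 res_ratio = c(0.25, 0.75, 1.00))

# Merge the crosswalk onto the dataset, then scale the measured column
# by the ratio to distribute it across zip codes.
merged <- merge(counts, cw, by = "county")
merged$population_alloc <- merged$population * merged$res_ratio

# Because the ratios within each county sum to 1, the allocated values
# add back up to the original county totals.
aggregate(population_alloc ~ county, data = merged, FUN = sum)
```

Without a method, only the merge step would apply: each county row is repeated once per matching zip and the measured column is left unscaled.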

year

A character or numeric vector: the year or years in which the data was recorded. Defaults to the previous year.
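The default shown in Usage is computed by subtracting 365 days from the current date and taking the year. A small sketch with fixed dates (chosen only for illustration) shows what that expression yields, including one edge case at the end of a leap year:

```r
# Same expression as the default, applied to a fixed date instead of
# Sys.Date() so the result is reproducible.
default_year <- function(today) format(today - 365, "%Y")

# A mid-year date resolves to the previous calendar year.
typical <- default_year(as.Date("2024-03-15"))

# On December 31 of a leap year, 365 days earlier is January 1 of the
# same year, so the expression returns the current year.
leap_edge <- default_year(as.Date("2024-12-31"))
```

Passing year explicitly avoids depending on the date the code is run.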

quarter

A character or numeric vector: the quarter of the year in which the data was recorded. Defaults to the first quarter.

key

A character vector of length one: the key obtained from the HUD (US Department of Housing and Urban Development) USER website.
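Because the default is Sys.getenv("HUD_KEY"), one way to avoid passing key on every call is to set that environment variable once per session. The value below is a placeholder, not a real key.

```r
# Placeholder only: substitute the key issued to your HUD USER account.
Sys.setenv(HUD_KEY = "your-hud-user-key")

# This is the value crosswalk() picks up when key is not supplied.
Sys.getenv("HUD_KEY")
```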

to_tibble

A logical: if TRUE, return the data in a tibble format rather than a data frame.
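The default in Usage reads the option rhud_use_tibble and falls back to FALSE when it is unset. A minimal sketch of switching the default for a session, assuming the tibble package is installed if tibble output is wanted:

```r
# Set the session-wide option that the to_tibble default looks up.
options(rhud_use_tibble = TRUE)

# Same lookup as the default argument; FALSE is the fallback when unset.
use_tibble <- getOption("rhud_use_tibble", FALSE)
```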

Value

A dataframe or tibble containing the crosswalked dataset.

Examples

if (FALSE) {
sample <- data.frame(population = c(42134, 12413, 13132),
                     county = c(24047, 24045, 24043))

crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip")

crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip", cw_geoid_col = "population", method = "res")

crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip", cw_geoid_col = "population", method = "bus")

crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip", cw_geoid_col = "population", method = "bus",
          year = 2018, quarter = 1)

}
