
Crosswalk an entire dataset from one geography to another using the US Department of Housing and Urban Development (HUD) USPS Crosswalk files. Currently supported crosswalks:

  1. zip-tract

  2. zip-county

  3. zip-cbsa

  4. zip-cbsadiv (Available 4th Quarter 2017 onwards)

  5. zip-cd

  6. tract-zip

  7. county-zip

  8. cbsa-zip

  9. cbsadiv-zip (Available 4th Quarter 2017 onwards)

  10. cd-zip

  11. zip-countysub (Available 2nd Quarter 2018 onwards)

  12. countysub-zip (Available 2nd Quarter 2018 onwards)
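Every supported crosswalk pairs zip with one other geography, so a geoid/cw_geoid combination can be checked before making a request. A minimal sketch in base R follows; the helper name is_supported_crosswalk is hypothetical and not part of the package.

```r
# Supported crosswalk types, copied from the list above.
supported <- c("zip-tract", "zip-county", "zip-cbsa", "zip-cbsadiv", "zip-cd",
               "tract-zip", "county-zip", "cbsa-zip", "cbsadiv-zip", "cd-zip",
               "zip-countysub", "countysub-zip")

# Hypothetical helper: TRUE when the from/to pair appears in the list.
is_supported_crosswalk <- function(geoid, cw_geoid) {
  paste(geoid, cw_geoid, sep = "-") %in% supported
}

is_supported_crosswalk("county", "zip")    # a listed pair
is_supported_crosswalk("county", "tract")  # not listed: neither side is zip
```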

Usage

crosswalk(
  data,
  geoid,
  geoid_col,
  cw_geoid,
  cw_geoid_col = NA,
  method = NA,
  year = format(Sys.Date() - 365, "%Y"),
  quarter = 1,
  key = Sys.getenv("HUD_KEY"),
  to_tibble = getOption("rhud_use_tibble", FALSE)
)

Arguments

data

A dataframe or tibble whose rows describe measurements at one of the following geographic levels: zip, county, county subdivision (countysub), congressional district (cd), census tract, core based statistical area (cbsa), or core based statistical area division (cbsadiv).

  1. zip

  2. tract

  3. county

  4. countysub

  5. cbsa

  6. cbsadiv

  7. cd

geoid

A character vector naming the type of geographic identifier the dataset currently uses; must be zip, county, countysub, cd, tract, cbsa, or cbsadiv.

  1. zip

  2. tract

  3. county

  4. countysub

  5. cbsa

  6. cbsadiv

  7. cd

geoid_col

A character or numeric vector of length one: the column containing the geographic identifier, which must be a zip, county, county subdivision (countysub), congressional district (cd), census tract, core based statistical area (cbsa), or core based statistical area division (cbsadiv) identifier. Supply either the name of the column or its index. All elements in this column must contain digits only and have the proper length. For example, zip codes must be 5-digit numbers.
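Zip codes read from a spreadsheet or CSV often lose their leading zeros. The base R sketch below shows one way to restore them and check the format beforehand. The data frame and its column names are made up for illustration, and it assumes the identifier column may be supplied as strings of digits.

```r
# Hypothetical input: zip codes stored as numbers, so "08901" became 8901.
df <- data.frame(zip = c(8901, 21201, 501), value = c(10, 20, 30))

# Left-pad each code with zeros to a width of 5.
df$zip <- sprintf("%05d", as.integer(df$zip))

# Confirm every identifier is exactly 5 digits.
all(grepl("^[0-9]{5}$", df$zip))
```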

cw_geoid

A character vector of length one: the geoid to crosswalk the dataset to; must be zip, county, county subdivision (countysub), congressional district (cd), census tract, core based statistical area (cbsa), or core based statistical area division (cbsadiv).

  1. zip

  2. tract

  3. county

  4. countysub

  5. cbsa

  6. cbsadiv

  7. cd

cw_geoid_col

A character or numeric vector: the columns in the dataset to distribute according to the ratio selected by method. If method is empty, no allocation is applied and the crosswalk file is simply merged to the dataset. All elements in these columns must be numeric.

method

A character vector: the allocation method to use, one of residential (res), business (bus), total (tot), or other (oth). If method is empty, no allocation is applied and the crosswalk file is simply merged to the dataset.

  1. res

  2. bus

  3. tot

  4. oth
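To make the allocation step concrete, here is a rough base R sketch of what distributing a column by the residential ratio could look like. The crosswalk table and its ratios are invented, and the column name res_ratio is an assumption modeled on the ratio fields in HUD's crosswalk files. It illustrates the idea and is not the package's internal code.

```r
# Invented county-level measurements.
counties <- data.frame(county = c("24047", "24045"),
                       population = c(1000, 500))

# Invented county-to-zip crosswalk; ratios within each county sum to 1.
cw <- data.frame(county = c("24047", "24047", "24045"),
                 zip = c("21811", "21842", "21801"),
                 res_ratio = c(0.75, 0.25, 1))

# Merge the crosswalk onto the data, then scale the measurement by
# each zip's share of the county's residential addresses.
merged <- merge(counties, cw, by = "county")
merged$population <- merged$population * merged$res_ratio

merged[, c("zip", "population")]
```

With method left empty, only the merge would apply, so each county's population would be repeated unchanged on every matching zip row.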

year

A character or numeric vector: the year the data was recorded. Multiple years can be specified. Defaults to the previous year.
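The default shown in Usage is the year of the date 365 days before today, which usually but not always matches the previous calendar year. A short base R illustration follows, with fixed dates standing in for Sys.Date(); the helper name default_year is hypothetical.

```r
# Same expression as the default, with the current date passed in explicitly.
default_year <- function(today) format(as.Date(today) - 365, "%Y")

default_year("2023-06-15")  # mid-year: 365 days back falls in 2022
default_year("2024-12-31")  # last day of a leap year: 365 days back is 2024-01-01
```

Passing year explicitly avoids this edge case.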

quarter

A character or numeric vector: the quarter of the year the data was recorded. Defaults to the first quarter of the year.

key

A character vector of length one with the key obtained from HUD (US Department of Housing and Urban Development) USER website.
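Because the default reads the HUD_KEY environment variable, the key can be set once per session rather than passed on every call. A minimal base R sketch, with a placeholder value in place of a real key:

```r
# Placeholder value; substitute the key issued by HUD USER.
Sys.setenv(HUD_KEY = "your-hud-user-key")

# This is the expression used as the default for `key`.
Sys.getenv("HUD_KEY")
```

To keep the key across sessions, the same HUD_KEY entry can instead go in an .Renviron file.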

to_tibble

A logical: if TRUE, return the data in a tibble format rather than a data frame.
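The default is read from the rhud_use_tibble option, as shown in Usage, so the preference can be set once with base R's options() instead of on each call. A small sketch:

```r
# Opt in to tibble output for the rest of the session.
options(rhud_use_tibble = TRUE)

# This is the expression used as the default for `to_tibble`.
getOption("rhud_use_tibble", FALSE)
```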

Value

A dataframe or tibble containing the crosswalked dataset.

Examples

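if (FALSE) {
# Sample county-level data; county holds 5-digit county FIPS codes.
sample <- data.frame(population = c(42134, 12413, 13132),
                     county = c(24047, 24045, 24043))

# No method: the county-zip crosswalk file is only merged to the data.
crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip")

# Distribute population across zip codes using the residential ratio.
crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip", cw_geoid_col = "population", method = "res")

# Distribute population using the business ratio.
crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip", cw_geoid_col = "population", method = "bus")

# Same as above, with the year and quarter given explicitly.
crosswalk(data = sample, geoid = "county", geoid_col = "county",
          cw_geoid = "zip", cw_geoid_col = "population", method = "bus",
          year = 2018, quarter = 1)
}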