Skip to contents

You can install the development version from GitHub with:

# install.packages("devtools")
library(devtools)
devtools::install_github("etam4260/rhud")

The US Department of Housing and Urban Development HUD USER requires users to gain an access key before querying their systems. You must go to https://www.huduser.gov/hudapi/public/register?comingfrom=1 and follow the instructions for making an account.

From the website, you need to make a new token. Make sure to save the token somewhere as you will only be able to view it once. However, you can use it as many times as you want. You can now supply the ‘key’ argument to the function calls. However, it is NOT RECOMMENDED to do this. To reduce the need to supply the key argument in every function call you can use the Sys.setenv() function to save the key to the user environment.

Interactive Key Setup

There are various method of setting up a key. However, I recommend using the Sys.setenv() function.

Using Sys.setenv() during an interactive session

One method of setting the API key is to use environment variables. This library searches for HUD_KEY in the user environment to check if a key has been defined. If no key has been defined, an empty string is outputted.

library(rhud)

Sys.getenv("HUD_KEY")

To set the key…

You need to go to https://www.huduser.gov/hudapi/public/register?comingfrom=1 and register for a key and replace it here.

Sys.setenv("HUD_KEY" = "YOUR KEY HERE")
Sys.getenv("HUD_KEY")

You can also use the hud_set_key() function – this is quicker to type and more intuitive. You can also tell RStudio to remember the key by setting in_wkdir and in_home parameter to TRUE which will write Sys.setenv(“your-key”) to your .Rprofile.

hud_set_key("YOUR KEY HERE", in_home = TRUE)

To check whether rhud can gain access to this environment variable…

It is now set up for the rest of the R session.

Setting a user agent

It is recommended to set a user agent before querying. This tells the HUD servers a bit more about the querying application or user.

hud_set_user_agent("This Application")

If no user agent is set, it will return an empty string.

Increasing Performance with Caching

By default, rhud uses caching by default. Calls to the HUD User API are sent to the RStudio temporary session directory.

If no caching directory is set (located in temp directory), it will return an empty string.

To check the current caching directory use:

To set a new caching directory use:

hud_set_cache_dir(".//a//sample//path", in_home = TRUE)

To clear all items in the cache use:

Tibbles vs Dataframes

To get tibbles instead of data frames, use options(rhud_use_tibble = TRUE) or set it explicitly using the to_tibble argument.

options(rhud_use_tibble = TRUE)

hud_cw_zip_tract(zip = '35213', year = c('2010'), quarter = c('1'),
                 to_tibble = TRUE)

Understanding the Syntax

A noticeable issue with querying geographic data is determining what resolution the data is returned in. Therefore, understanding the syntax of these function calls should easily help you determine what you are querying and what data is returned.

The general syntax for these functions within the package follow this pattern where {dataset},{geography}, and {resolution} are placeholders:

hud_{dataset}_{geography}_{resolution}

{dataset}: This symbol represents the dataset to query for. So for example if you want fair markets rent data, this will be ‘fmr’. If you want data from the crosswalk files, this will be ‘cw’.

{geography}: This symbol represents the geographic resolution to query for. For example, if the function requires that you input state(s), then this will be named as ‘state.’ If the function requires that you input county(s) then this will be ‘county’.

{resolution}: This symbol represented the output geographic resolution. So for example, if we want data at a zip code level, then this will have ‘zip’.

For those that do not have the {geography} and {resolution}, these are omni functions which are capable of performing all queries under {dataset}, but these functions are harder to understand.

For those that do not have the {resolution}, it inherits the {geography}. This means that if {geography} is a state input, then the data described by the output data is also a state.

A quick example:

hud_cw_county_zip(county = 22031, year = c('2017'), quarter = c('1'))

The first part of the function begins with ‘hud’.

The second part is ‘cw’ meaning we are querying the crosswalk files.

The third part is ‘county’ meaning we need to input county(s)

The fourth part is the ‘zip’ meaning the data will be returned at a zip code resolution.

Additional Help

If you ever get confused or need help you can easily revisit this website where you can check the all the function definitions in the Reference tab.

This will quickly open up this website on your web browser as well as the github repository where you can submit issues.