U.S. Census Bureau Data#

American Community Survey: 5-Year Estimates: Subject Tables 5-Year#

2017-2021 ACS 5-Year Subject Tables

Description:

“The American Community Survey (ACS) is an ongoing survey that provides data every year – giving communities the current information they need to plan investments and services. The ACS covers a broad range of topics about social, economic, demographic, and housing characteristics of the U.S. population. The subject tables include the following geographies: nation, all states (including DC and Puerto Rico), all metropolitan areas, all congressional districts, all counties, all places and all tracts. Subject tables provide an overview of the estimates available in a particular topic. The data are presented as both counts and percentages. There are over 66,000 variables in this dataset.”

Vintage:

2021

Dataset Name:

acs› acs5› subject

Dataset Type:

Aggregate

Geographies

Variables

Label prefix = Estimate!!Median income (dollars)!!HOUSEHOLD INCOME BY RACE AND HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER!!Households

Concept = MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2021 INFLATION-ADJUSTED DOLLARS)

Required = not required

Limit = 0

Predicate Type = int

Group = S1903

Name	Label	Attributes EA	Attributes M	Attributes MA
S1903_C03_001E	-	S1903_C03_001EA	S1903_C03_001M	S1903_C03_001MA
S1903_C03_002E	One race–!!White	S1903_C03_002EA	S1903_C03_002M	S1903_C03_002MA
S1903_C03_003E	One race–!!Black or African American	S1903_C03_003EA	S1903_C03_003M	S1903_C03_003MA
S1903_C03_004E	One race–!!American Indian and Alaska Native	S1903_C03_004EA	S1903_C03_004M	S1903_C03_004MA
S1903_C03_005E	One race–!!Asian	S1903_C03_005EA	S1903_C03_005M	S1903_C03_005MA
S1903_C03_006E	One race–!!Native Hawaiian and Other Pacific Islander	S1903_C03_006EA	S1903_C03_006M	S1903_C03_006MA
S1903_C03_007E	One race–!!Some other race	S1903_C03_007EA	S1903_C03_007M	S1903_C03_007MA
S1903_C03_008E	Two or more races	S1903_C03_008EA	S1903_C03_008M	S1903_C03_008MA
S1903_C03_009E	Hispanic or Latino origin (of any race)	S1903_C03_009EA	S1903_C03_009M	S1903_C03_009MA
S1903_C03_010E	White alone, not Hispanic or Latino	S1903_C03_010EA	S1903_C03_010M	S1903_C03_010MA

Estimate and Annotation Values

EA - Annotation of Estimate
M - Margin of Error
MA - Annotation of Margin of Error
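The table above suggests a naming pattern: each estimate name ends in `E`, and its companion attributes swap that suffix for `EA`, `M`, or `MA`. A minimal sketch of that pattern follows. The helper `attribute_names` is hypothetical and not part of the Census API or cenpy.

```python
def attribute_names(estimate_name: str) -> dict:
    """Derive the companion attribute names for an estimate variable.

    Assumes the naming pattern shown in the table: the estimate ends in 'E'
    and its attributes replace that suffix with 'EA', 'M', or 'MA'.
    """
    if not estimate_name.endswith("E"):
        raise ValueError(f"expected an estimate name ending in 'E': {estimate_name}")
    stem = estimate_name[:-1]  # e.g. 'S1903_C03_001'
    return {
        "estimate": estimate_name,
        "estimate_annotation": stem + "EA",
        "margin_of_error": stem + "M",
        "margin_of_error_annotation": stem + "MA",
    }


names = attribute_names("S1903_C03_001E")
print(names["margin_of_error"])
```

This can help when a margin of error should be requested alongside each estimate.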

Groups

Sorts

Examples

Contains Geography Hierarchy

Developer Documentation

API Base URL

[Bureau, 2021]

Getting Started with American Community Survey Data in R and Python | U.S. Census Bureau | YouTube

Import Libraries#

Standard Libraries#

import os
from dotenv import load_dotenv

External Libraries#

from cenpy import remote
import matplotlib.pyplot as plt
import pandas as pd

Configure Libraries#

[Choudhari, 2022]

%matplotlib inline

load_dotenv()

True

Define Variables#

Inputs#

census_api = os.environ.get('CENSUS_API')
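`os.environ.get` returns `None` when the variable is missing, so a forgotten `.env` entry would only surface later as a failed request. One way to fail earlier is sketched below. The helper `read_api_key` is hypothetical, and the `demo-key` value is a placeholder rather than a real key.

```python
import os


def read_api_key(env=os.environ, name="CENSUS_API"):
    """Return the API key stored under `name`, or raise if it is missing or empty."""
    key = env.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment")
    return key


# A plain dict stands in for os.environ so the example runs without a real key.
demo_key = read_api_key({"CENSUS_API": "demo-key"})
```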

Outputs#

census_folder_output = 'data/census'
census_file_output = 'census-income-race-2021-5yr.csv'
census_path_output = f'{census_folder_output}/{census_file_output}'

if not os.path.exists(census_folder_output):
    print(f'Creating new folder for ACS dataset: "{census_folder_output}"')
    os.makedirs(census_folder_output)
else:
    print(f'"{census_folder_output}" folder already exists')

"data/census" folder already exists

Get Data From Census Data API#

Dataset: `ACS: 5-Year Estimates: Subject Tables 5-Year`#

subject_table_conn = remote.APIConnection(
    api_name = 'ACSST5Y2021',
    apikey = census_api
)

Confirm Connection#

subject_table_conn

Connection to American Community Survey: 5-Year Estimates: Subject Tables 5-Year (ID: https://api.census.gov/data/id/ACSST5Y2021)

View Variable Data#

ACSST5Y2021_var_df = subject_table_conn.variables

ACSST5Y2021_var_df.head()

	label	concept	predicateType	group	predicateOnly	hasGeoCollectionSupport	attributes	required
for	Census API FIPS 'for' clause	Census API Geography Specification	fips-for	N/A	True	NaN	NaN	NaN
in	Census API FIPS 'in' clause	Census API Geography Specification	fips-in	N/A	True	NaN	NaN	NaN
ucgid	Uniform Census Geography Identifier clause	Census API Geography Specification	ucgid	N/A	True	True	NaN	NaN
S0804_C04_068E	Estimate!!Public transportation (excluding tax...	MEANS OF TRANSPORTATION TO WORK BY SELECTED CH...	float	S0804	NaN	NaN	S0804_C04_068EA,S0804_C04_068M,S0804_C04_068MA	NaN
S0503_C02_078E	Estimate!!Foreign born; Born in Europe!!Civili...	SELECTED CHARACTERISTICS OF THE FOREIGN-BORN P...	float	S0503	NaN	NaN	S0503_C02_078EA,S0503_C02_078M,S0503_C02_078MA	NaN

Specify Search Criteria for `label` column#

`income`#

income_search = 'income'

`race`#

race_search = 'race'

Filter Variable DataFrame by Search Criteria#

print(f'number of labels: {len(ACSST5Y2021_var_df)}')

number of labels: 18827

income_race_filter = ACSST5Y2021_var_df['label'].str.contains(income_search) & ACSST5Y2021_var_df['label'].str.contains(race_search)
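`str.contains` is case-sensitive by default. That is probably why this filter selects `S1903_C03_002E` through `S1903_C03_009E` but not `S1903_C03_001E` or `S1903_C03_010E`: those two labels contain `RACE` only in the upper-case label prefix. The sketch below illustrates the difference on a toy DataFrame rebuilt from the label prefix and labels listed earlier on this page. It is not the full variable table, where a case-insensitive search would likely match many more labels.

```python
import pandas as pd

# Label prefix copied from the variable listing for group S1903.
prefix = (
    "Estimate!!Median income (dollars)!!HOUSEHOLD INCOME BY RACE AND "
    "HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER!!Households"
)

# Toy stand-in for the variables DataFrame (four rows only).
toy_var_df = pd.DataFrame(
    {
        "label": [
            prefix,
            prefix + "!!Two or more races",
            prefix + "!!Hispanic or Latino origin (of any race)",
            prefix + "!!White alone, not Hispanic or Latino",
        ]
    },
    index=["S1903_C03_001E", "S1903_C03_008E", "S1903_C03_009E", "S1903_C03_010E"],
)

labels = toy_var_df["label"]

# Default behaviour: lower-case 'race' must appear literally in the label.
case_sensitive = (
    labels.str.contains("income", regex=False)
    & labels.str.contains("race", regex=False)
)

# case=False also matches the upper-case 'RACE' in the prefix.
case_insensitive = (
    labels.str.contains("income", case=False, regex=False)
    & labels.str.contains("race", case=False, regex=False)
)

print(toy_var_df[case_sensitive].index.tolist())
print(toy_var_df[case_insensitive].index.tolist())
```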

Preview Selected Labels#

ACSST5Y2021_var_df[income_race_filter][['label', 'attributes']].sort_index()

	label	attributes
S1902_C03_020E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_020EA,S1902_C03_020M,S1902_C03_020MA
S1902_C03_021E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_021EA,S1902_C03_021M,S1902_C03_021MA
S1902_C03_022E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_022EA,S1902_C03_022M,S1902_C03_022MA
S1902_C03_023E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_023EA,S1902_C03_023M,S1902_C03_023MA
S1902_C03_024E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_024EA,S1902_C03_024M,S1902_C03_024MA
S1902_C03_025E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_025EA,S1902_C03_025M,S1902_C03_025MA
S1902_C03_026E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_026EA,S1902_C03_026M,S1902_C03_026MA
S1902_C03_027E	Estimate!!Mean income (dollars)!!PER CAPITA IN...	S1902_C03_027EA,S1902_C03_027M,S1902_C03_027MA
S1903_C03_002E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_002EA,S1903_C03_002M,S1903_C03_002MA
S1903_C03_003E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_003EA,S1903_C03_003M,S1903_C03_003MA
S1903_C03_004E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_004EA,S1903_C03_004M,S1903_C03_004MA
S1903_C03_005E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_005EA,S1903_C03_005M,S1903_C03_005MA
S1903_C03_006E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_006EA,S1903_C03_006M,S1903_C03_006MA
S1903_C03_007E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_007EA,S1903_C03_007M,S1903_C03_007MA
S1903_C03_008E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_008EA,S1903_C03_008M,S1903_C03_008MA
S1903_C03_009E	Estimate!!Median income (dollars)!!HOUSEHOLD I...	S1903_C03_009EA,S1903_C03_009M,S1903_C03_009MA

Store Selected `names` and `labels` in a dictionary#

Compare results with Census API documentation for variables

ACSST5Y2021-variables-screenshots

names_dict = {
    'S1903_C03_001E': 'Households', 
    'S1903_C03_002E': 'White', 
    'S1903_C03_003E': 'Black or African American',
    'S1903_C03_004E': 'American Indian and Alaska Native',
    'S1903_C03_005E': 'Asian',
    'S1903_C03_006E': 'Native Hawaiian and Other Pacific Islander',
    'S1903_C03_007E': 'One race_Some other race',
    'S1903_C03_008E': 'Two or more races',
    'S1903_C03_009E': 'Hispanic or Latino origin (of any race)',
    'S1903_C03_010E': 'White alone, not Hispanic or Latino'
}

names_list = list(names_dict.keys())

View Geography Options#

subject_table_conn.geographies

{'fips':                                                  name geoLevelDisplay  \
                                                us             010   
                                            region             020   
                                          division             030   
                                             state             040   
                                            county             050   
                                county subdivision             060   
                           subminor civil division             067   
                                             tract             140   
                                             place             160   
                                 consolidated city             170   
               alaska native regional corporation             230   
american indian area/alaska native area/hawaii...             250   
                     tribal subdivision/remainder             251   
american indian area/alaska native area (reser...             252   
american indian area (off-reservation trust la...             254   
                              tribal census tract             256   
                                  state (or part)             260   
       metropolitan/micropolitan statistical area             310   
                         principal city (or part)             312   
                            metropolitan division             314   
                        combined statistical area             330   
          combined new england city and town area             335   
                   new england city and town area             350   
                                   principal city             352   
                                   necta division             355   
                                       urban area             400   
                           congressional district             500   
       state legislative district (upper chamber)             610   
       state legislative district (lower chamber)             620   
                        public use microdata area             795   
                         zip code tabulation area             860   
                     school district (elementary)             950   
                      school district (secondary)             960   
                        school district (unified)             970   
 
    referenceDate                                           requires  \
   2021-01-01                                                NaN   
   2021-01-01                                                NaN   
   2021-01-01                                                NaN   
   2021-01-01                                                NaN   
   2021-01-01                                            [state]   
   2021-01-01                                    [state, county]   
   2021-01-01                [state, county, county subdivision]   
   2021-01-01                                    [state, county]   
   2021-01-01                                            [state]   
   2021-01-01                                            [state]   
  2021-01-01                                            [state]   
  2021-01-01                                                NaN   
  2021-01-01  [american indian area/alaska native area/hawai...   
  2021-01-01                                                NaN   
  2021-01-01                                                NaN   
  2021-01-01  [american indian area/alaska native area/hawai...   
  2021-01-01  [american indian area/alaska native area/hawai...   
  2021-01-01                                                NaN   
  2021-01-01  [metropolitan/micropolitan statistical area, s...   
  2021-01-01       [metropolitan/micropolitan statistical area]   
  2021-01-01                                                NaN   
  2021-01-01                                                NaN   
  2021-01-01                                                NaN   
  2021-01-01  [new england city and town area, state (or part)]   
  2021-01-01                   [new england city and town area]   
  2021-01-01                                                NaN   
  2021-01-01                                            [state]   
  2021-01-01                                            [state]   
  2021-01-01                                            [state]   
  2021-01-01                                            [state]   
  2021-01-01                                                NaN   
  2021-01-01                                            [state]   
  2021-01-01                                            [state]   
  2021-01-01                                            [state]   
 
                                              wildcard  \
                                               NaN   
                                               NaN   
                                               NaN   
                                               NaN   
                                           [state]   
                                          [county]   
                                               NaN   
                                          [county]   
                                           [state]   
                                           [state]   
                                          [state]   
                                              NaN   
[american indian area/alaska native area/hawai...   
                                              NaN   
                                              NaN   
                                              NaN   
                                              NaN   
                                              NaN   
                                              NaN   
                                              NaN   
                                              NaN   
                                              NaN   
                                              NaN   
                                [state (or part)]   
                                              NaN   
                                              NaN   
                                          [state]   
                                              NaN   
                                              NaN   
                                          [state]   
                                              NaN   
                                          [state]   
                                          [state]   
                                          [state]   
 
                                     optionalWithWCFor  
                                               NaN  
                                               NaN  
                                               NaN  
                                               NaN  
                                             state  
                                            county  
                                               NaN  
                                            county  
                                             state  
                                             state  
                                            state  
                                              NaN  
american indian area/alaska native area/hawaii...  
                                              NaN  
                                              NaN  
                                              NaN  
                                              NaN  
                                              NaN  
                                              NaN  
                                              NaN  
                                              NaN  
                                              NaN  
                                              NaN  
                                  state (or part)  
                                              NaN  
                                              NaN  
                                            state  
                                              NaN  
                                              NaN  
                                            state  
                                              NaN  
                                            state  
                                            state  
                                            state  }
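In the output above, `zip code tabulation area` lists no `requires` entry, while `county` requires `state`. As a rough sketch, these geography rules map onto the `for` and `in` predicates of a raw Census Data API request. The base URL below is an assumption pieced together from the vintage (2021) and dataset name (acs/acs5/subject) listed on this page. The helper `build_query_url` is hypothetical, and `state:36` (the FIPS code for New York) is only an illustrative choice. No request is sent.

```python
from urllib.parse import quote, urlencode

# Assumed base URL; confirm it against the "API Base URL" link for this dataset.
BASE_URL = "https://api.census.gov/data/2021/acs/acs5/subject"


def build_query_url(variables, for_clause, in_clause=None, key=None):
    """Build a request URL from variable names and geography predicates."""
    params = {"get": ",".join(variables), "for": for_clause}
    if in_clause is not None:
        params["in"] = in_clause
    if key is not None:
        params["key"] = key
    # Keep ':', '*' and ',' readable; spaces become %20.
    return BASE_URL + "?" + urlencode(params, quote_via=quote, safe=":*,")


# No parent geography is needed for ZCTAs.
zcta_url = build_query_url(
    ["S1903_C03_001E", "S1903_C03_001M"], "zip code tabulation area:*"
)

# Counties require a state in the 'in' clause.
county_url = build_query_url(["S1903_C03_001E"], "county:*", in_clause="state:36")

print(zcta_url)
print(county_url)
```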

Get Census Data: `Median Income in the Past 12 Months`#

Note: 2021 inflation-adjusted dollars

if os.path.isfile(census_path_output):
    income_race_rename_df = pd.read_csv(census_path_output, index_col = 0)
else:
    income_race_df = subject_table_conn.query(names_list, geo_unit = 'zip code tabulation area')
    income_race_rename_df = income_race_df.copy()
    income_race_rename_df.rename(columns = names_dict, inplace = True)
    income_race_rename_df.to_csv(census_path_output)
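The cell above caches the query result as a CSV so later runs skip the API call. A side effect worth knowing is that a CSV round trip can change column types. The sketch below uses a hypothetical `load_or_fetch` helper and a fake fetch function that returns ZCTA codes as zero-padded strings. That is an assumption about what a fresh response might look like, not confirmed cenpy behaviour.

```python
import os
import tempfile

import pandas as pd


def load_or_fetch(path, fetch):
    """Return the cached CSV at `path` if it exists; otherwise fetch, cache, and return."""
    if os.path.isfile(path):
        return pd.read_csv(path, index_col=0)
    df = fetch()
    df.to_csv(path)
    return df


calls = []


def fake_fetch():
    """Stand-in for the API query; records each call so caching can be checked."""
    calls.append(1)
    return pd.DataFrame(
        {
            "Households": [15292, 18716],
            "zip code tabulation area": ["00601", "00602"],
        }
    )


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "toy.csv")
    first = load_or_fetch(cache_path, fake_fetch)   # fetches and writes the cache
    second = load_or_fetch(cache_path, fake_fetch)  # reads the cache instead

print(first["zip code tabulation area"].tolist())
print(second["zip code tabulation area"].tolist())
```

If the codes need to stay as text, passing `dtype={'zip code tabulation area': str}` to `pd.read_csv` keeps the leading zeros on reload.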

Preview Census Data#

Notes on ACS Estimate and Annotation Values

Estimate Value	Annotation Value	Meaning
-666666666	-	“The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution. For a 5-year median estimate, the margin of error associated with a median was larger than the median itself.”
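Because `-666666666` is a placeholder rather than a dollar amount, any statistic computed over it is distorted. One possible approach is to replace the placeholder with `NaN` before analysis. The sketch below uses a toy DataFrame with values copied from the preview on this page, and the constant name `NOT_COMPUTABLE` is an assumption.

```python
import numpy as np
import pandas as pd

# Placeholder value described in the annotation table.
NOT_COMPUTABLE = -666666666

# Toy rows using values from ZCTAs 601, 606 and 99923 in the preview.
toy_df = pd.DataFrame(
    {
        "Households": [15292, 18835, -666666666],
        "Asian": [-666666666, -666666666, -666666666],
        "zip code tabulation area": [601, 606, 99923],
    }
)

income_columns = ["Households", "Asian"]

# Replace the placeholder in the income columns only; leave the geography column alone.
clean_df = toy_df.copy()
clean_df[income_columns] = clean_df[income_columns].replace(NOT_COMPUTABLE, np.nan)

naive_median = toy_df["Households"].median()    # placeholder counted as a value
clean_median = clean_df["Households"].median()  # NaN is skipped by default

print(naive_median, clean_median)
```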

income_race_rename_df

	Households	White	Black or African American	American Indian and Alaska Native	Asian	Native Hawaiian and Other Pacific Islander	One race_Some other race	Two or more races	Hispanic or Latino origin (of any race)	White alone, not Hispanic or Latino	zip code tabulation area
0	15292	15913	22222	-666666666	-666666666	-666666666	10351	17958	15309	-666666666	601
1	18716	17795	23424	-666666666	-666666666	-666666666	14139	20888	18407	21593	602
2	16789	17434	15545	-666666666	-666666666	-666666666	15026	16000	16580	87768	603
3	18835	20279	-666666666	-666666666	-666666666	-666666666	16656	-666666666	18762	-666666666	606
4	21239	22746	21667	-666666666	-666666666	-666666666	19298	19498	20765	-666666666	610
...	...	...	...	...	...	...	...	...	...	...	...
33769	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	99923
33770	70625	84583	-666666666	43750	-666666666	-666666666	-666666666	59375	89583	83750	99925
33771	58229	168750	-666666666	51667	-666666666	-666666666	-666666666	53393	-666666666	168750	99926
33772	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	-666666666	99927
33773	54946	59750	-666666666	49375	-666666666	-666666666	-666666666	-666666666	29154	60208	99929

33774 rows × 11 columns

zip_codes_in_nyc = [
    11101, # Long Island City
    10001, # Hudson Yards
    10458, # Fordham
    10304, # Stapleton
    11209 # Bay Ridge
]

zip_code_filter = income_race_rename_df['zip code tabulation area'].isin(zip_codes_in_nyc)

income_race_rename_df[zip_code_filter].head()

	Households	White	Black or African American	American Indian and Alaska Native	Asian	Native Hawaiian and Other Pacific Islander	One race_Some other race	Two or more races	Hispanic or Latino origin (of any race)	White alone, not Hispanic or Latino	zip code tabulation area
2577	101409	125693	35722	-666666666	88882	-666666666	61103	83182	62735	132053	10001
2649	64539	75472	48695	-666666666	69277	-666666666	36250	-666666666	46179	85683	10304
2666	38768	34387	35604	25809	57188	-666666666	41297	34770	39360	31875	10458
2839	98920	122539	33442	47574	116857	-666666666	49401	-666666666	49958	126141	11101
2853	84145	88309	73349	118491	77943	-666666666	75847	78409	77067	89576	11209
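The filter above matches because both the column and the list hold integers. ZCTA codes are five-digit identifiers, though, and an integer column drops leading zeros (`601` rather than `00601`). If the codes are needed as text, for example to join against other ZIP-based data, one option is to zero-pad them. The sketch below uses a toy DataFrame with a few rows from the output on this page.

```python
import pandas as pd

# Toy rows copied from the previews (ZCTAs 601, 10001 and 11209).
toy_df = pd.DataFrame(
    {
        "Households": [15292, 101409, 84145],
        "zip code tabulation area": [601, 10001, 11209],
    }
)

# Convert to text and restore leading zeros to get five-character codes.
zcta_text = toy_df["zip code tabulation area"].astype(str).str.zfill(5)

# The same NYC selection, expressed as strings.
zip_codes_in_nyc = ["11101", "10001", "10458", "10304", "11209"]
nyc_df = toy_df[zcta_text.isin(zip_codes_in_nyc)]

print(zcta_text.tolist())
print(nyc_df["Households"].tolist())
```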