Description & Citation--Study No. 13285

Bibliographic Description

ICPSR Study No.:13285
 
Persistent URL: http://dx.doi.org/10.3886/ICPSR13285
 
Title:Census of Population and Housing, 2000 [United States]: Selected Subsets From Summary File 1, Advance National
 
Principal Investigator(s):United States Department of Commerce. Bureau of the Census
 
  Inter-university Consortium for Political and Social Research
 
Series:Census of Population and Housing, 2000 [United States] Series
 
Funding Agency:National Science Foundation.
 
Grant Number:SES 0137019
 
Bibliographic Citation:United States Department of Commerce, Bureau of the Census, and Inter-university Consortium for Political and Social Research. CENSUS OF POPULATION AND HOUSING, 2000 [UNITED STATES]: SELECTED SUBSETS FROM SUMMARY FILE 1 [Computer file]. ICPSR ed. Washington, DC: U.S. Dept. of Commerce, Bureau of the Census, and Ann Arbor, MI: Inter-university Consortium for Political and Social Research [producers], 2002. Ann Arbor, MI: Inter-university Consortium for Political and Social Research, [distributor], 2002. doi:10.3886/ICPSR13285
 

Scope of Study

Summary:Prepared by the Inter-university Consortium for Political and Social Research, this data collection consists of selected subsets extracted from the Census of Population and Housing, 2000 [United States]: Summary File 1, Advance National (ICPSR 3325). Summary File 1 data contain information compiled from the questions asked of all people and of every housing unit enumerated in Census 2000: questions covering sex, age, race, Hispanic or Latino origin, type of living quarters (household/group quarters), household relationship, housing unit vacancy status, and housing unit tenure (owner/renter). The information is presented in 286 tables, which are tabulated for every case, i.e., every geographic unit represented in the data. There is one variable per table cell, plus additional variables with geographic information. All cases in the summary file data are classified by levels of observation, known as "summary levels" in the Census Bureau's nomenclature. These levels of observation served as the selection criteria for the subsets. Each subset comprises all of the cases in one of five summary levels: the nation (summary level 010), states (summary level 040), counties (summary level 050), places (summary level 160), and five-digit ZIP code tabulation areas (summary level 860). Three files are supplied for each subset except the last: a single, relatively large file that contains all of the tables in the data, plus two smaller files, each of which contains approximately one half of the tables. For the five-digit ZIP code tabulation areas, there is only one file, which contains all of the tables.
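The file arrangement described in the summary can be sketched in a few lines of Python. This is only an illustration: the function names and the SPLIT_EXEMPT set are hypothetical, while the summary level codes, subset names, and table ranges are taken from this description.

```python
# Summary levels used as selection criteria for the subsets (from the summary).
SUMMARY_LEVELS = {
    "010": "The Nation",
    "040": "States",
    "050": "Counties",
    "160": "Places",
    "860": "5-Digit ZIP Code Tabulation Areas",
}

# Hypothetical helper set: subsets supplied as a single complete file only.
SPLIT_EXEMPT = {"860"}


def files_for_subset(sumlev):
    """Return the labels of the files supplied for one summary level."""
    if sumlev in SPLIT_EXEMPT:
        return ["All Tables"]
    return ["All Tables", "Tables P1-PCT12E Only", "Tables PCT12F-H16I Only"]


def enumerate_files():
    """List (summary level, subset name, file label) for every data file."""
    return [
        (sumlev, name, label)
        for sumlev, name in SUMMARY_LEVELS.items()
        for label in files_for_subset(sumlev)
    ]


if __name__ == "__main__":
    for number, (sumlev, name, label) in enumerate(enumerate_files(), start=1):
        print(f"DS{number}: {name}, {label} (summary level {sumlev})")
```

Four subsets with three files each, plus one single-file subset, give the 13 data files of the collection.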
 
Subject Term(s):census data, ethnicity, household composition, housing, housing conditions, population
 
Geographic Coverage:United States
 
Time Period:2000
 
Date(s) of Collection:2000
 
Universe:All persons and housing units in the United States.
 
Data Type:census/enumeration data
 
Data Collection Notes:(1) The original Summary File 1, Advance National data comprise 40 files. There is one column-delimited file that contains geographic identifiers (the geographic header record file or "Geo" file), plus 39 comma-delimited table files, each with a subset of tables in the data. Initial steps in the production of the subsets for this collection involved sorting the Geo file and the 39 table files in ascending order of the common identification variable LOGRECNO, reformatting the Geo file as a comma-delimited file, and stripping the first five identification variables from each of the 39 table files (FILEID, STUSAB, CHARITER, CIFSN, and LOGRECNO). Next, the reformatted Geo file was merged with the stripped table files, end to end, so that corresponding records in the Geo and table files were joined as a single record in the merged file. Finally, each subset was generated by extracting from the merged file all cases with a given value for SUMLEV, the variable that identifies the summary level. Separate subsets were generated for summary levels 010, 040, 050, 160, and 860.
(2) To allow for compatibility with SPSS (as of August 2002), subsets with a record length greater than the SPSS limit of 32,767 were "split" into two files, each with a record length less than the limit. Three files are supplied for each of these subsets: a "first half" file containing the Geo variables and tables P1-PCT12E, a "second half" file containing the Geo variables and tables PCT12F-H16I, and a complete file (for non-SPSS use) that contains the Geo variables and all of the tables, P1-H16I.
(3) Each subset contains all of the geographic component iterations in its summary level, if any.
(4) The implied decimal places of variables INTPTLAT (latitude) and INTPTLON (longitude) were made explicit in the subsets. In addition, the values of all Geo variables were enclosed in quotes, except for variables AREALAND, AREAWATR, POP100, HU100, INTPTLAT, and INTPTLON.
(5) The data definition statements were tested with SAS 8, SPSS 10, and Stata/SE 7.0.
(6) The codebook is provided by the principal investigator as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.
(7) The codebook documents data collection procedures, concepts, and individual variables in the original Summary File data as well as the ICPSR-produced subsets, but not the layout and structure of the subsets. That information is contained in the data dictionary files provided with this collection. In particular, the "Data Structure and Segmentation" section in chapter 2 of the codebook and the variable locations shown in chapter 7 do not apply to the subsets. Every subset file record begins with the Geo variables in their original order. In a complete subset file, the Geo variables are followed by the 6th to last variables in table file 1, then the 6th to last variables in table file 2, and so on up to the 6th to last variables in table file 39. Each "first half" file is a subset of a complete file: it begins with the first variable in the Geo file and ends with the last variable in table file 20. In a "second half" file, the Geo variables are followed by the 6th to last variables in table file 21, then the 6th to last variables in table file 22, and so on up to the 6th to last variables in table file 39.
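The sort, strip, merge, and extract steps described in note (1) can be illustrated with a minimal Python sketch. It is not the procedure ICPSR actually ran: the function names are hypothetical, the Geo records are reduced to a made-up five-field layout (the real Geo file has many more variables), only two toy table files stand in for the 39, and all cell values are invented.

```python
ID_FIELDS = 5  # FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO lead every table record


def sort_by_logrecno(rows, logrecno_index):
    """Sort records in ascending order of LOGRECNO."""
    return sorted(rows, key=lambda row: int(row[logrecno_index]))


def merge_records(geo_rows, table_files, geo_logrecno_index):
    """Join each Geo record, end to end, with its record from every table file.

    geo_rows: Geo records already reformatted as lists of fields.
    table_files: one list of records per table file, in file order; the
        first five fields of each record are the identification variables.
    """
    geo_sorted = sort_by_logrecno(geo_rows, geo_logrecno_index)
    tables_sorted = [sort_by_logrecno(rows, ID_FIELDS - 1) for rows in table_files]
    merged = []
    for position, geo in enumerate(geo_sorted):
        record = list(geo)
        for rows in tables_sorted:
            table_row = rows[position]
            # Corresponding records must share the same LOGRECNO.
            if int(table_row[ID_FIELDS - 1]) != int(geo[geo_logrecno_index]):
                raise ValueError("LOGRECNO mismatch between Geo and table file")
            # Strip the five identification variables, keep the 6th to last.
            record.extend(table_row[ID_FIELDS:])
        merged.append(record)
    return merged


def extract_summary_level(merged, sumlev_index, sumlev):
    """Keep only the merged records with the requested SUMLEV value."""
    return [record for record in merged if record[sumlev_index] == sumlev]


# Toy Geo layout (assumption): FILEID, STUSAB, SUMLEV, LOGRECNO, NAME.
geo = [
    ["uSF1", "US", "040", "0000002", "Alabama"],
    ["uSF1", "US", "010", "0000001", "United States"],
]
table_file_1 = [
    ["uSF1", "US", "000", "01", "0000001", "100"],
    ["uSF1", "US", "000", "01", "0000002", "40"],
]
table_file_2 = [
    ["uSF1", "US", "000", "02", "0000002", "7", "8"],
    ["uSF1", "US", "000", "02", "0000001", "20", "30"],
]

merged = merge_records(geo, [table_file_1, table_file_2], geo_logrecno_index=3)
states = extract_summary_level(merged, sumlev_index=2, sumlev="040")
print(states)
```

Under the same assumptions, a "first half" record would be built by passing only table files 1 through 20 to merge_records, and a "second half" record by passing table files 21 through 39, so that both halves start with the Geo variables.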
 

Methodology

Data Source:self-enumerated questionnaires
 
Extent of Processing:REFORM.DATA/ DDEF.ICPSR
 

Access and Availability

Note:A list of the data formats available for this study can be found in the summary of holdings. Detailed file-level information (such as record length, case count, and variable count) is listed in the file manifest.
 
Original ICPSR Release:2002-10-02
 
Version History:The last update of this study occurred on 2006-01-18.
 
  2006-01-18 - File CB13285.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
 
Dataset(s):
  • DS1: The Nation, All Tables
  • DS2: The Nation, Tables P1-PCT12E Only
  • DS3: The Nation, Tables PCT12F-H16I Only
  • DS4: States, All Tables
  • DS5: States, Tables P1-PCT12E Only
  • DS6: States, Tables PCT12F-H16I Only
  • DS7: Counties, All Tables
  • DS8: Counties, Tables P1-PCT12E Only
  • DS9: Counties, Tables PCT12F-H16I Only
  • DS10: Places, All Tables
  • DS11: Places, Tables P1-PCT12E Only
  • DS12: Places, Tables PCT12F-H16I Only
  • DS13: 5-Digit ZIP Code Tabulation Areas, All Tables
  • DS100: Data Dictionary for Complete Files
  • DS101: Data Dictionary for First Half Files
  • DS102: Data Dictionary for Second Half Files
  • DS103: SAS Data Definition Statements for Complete Files
  • DS104: SAS Data Definition Statements for First Half Files
  • DS105: SAS Data Definition Statements for Second Half Files
  • DS106: SPSS Data Definition Statements That Merge the Half Files
  • DS107: SPSS Data Definition Statements for First Half Files
  • DS108: SPSS Data Definition Statements for Second Half Files
  • DS109: SPSS Data Definition Statements for 5-Digit ZIP Code Tabulation Areas
  • DS110: Stata Data Definition Statements for Complete Files
  • DS111: Stata Data Definition Statements for First Half Files
  • DS112: Stata Data Definition Statements for Second Half Files