Processing and Quality Review
DSDR, as part of ICPSR, strives to release data in as timely a manner as possible. Processing and quality review sometimes take longer than anticipated, but we believe these important archival operations should be performed carefully, with attention to detail. For additional information on the ICPSR data archiving process, consult "Archiving Social Science Data: A Collaborative Process" (ICPSR Bulletin, May 1997).
The amount of processing work performed on submitted data collections varies greatly and depends on the condition of the files upon arrival and the resources available for processing. All data collections undergo a series of mandatory reviews prior to release. Staff at DSDR:
- Evaluate the data collection for completeness, suitability for public release, and readiness for use
- Back up and store both the original data and the processed data at an external site
- Determine whether any problems of respondent confidentiality exist, checking for problems arising from either direct or indirect identification
- Test the technical characteristics of the documentation against the data and make sure the data and documentation match
- Prepare finding aids, including searchable study descriptions and bibliographic citations, to assist in locating the collection within the ICPSR archive
- Consult with the PI/data producer, when necessary, to remedy any problems uncovered during the review of the data
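As an illustration of the indirect-identification check in the list above, the following minimal Python sketch flags combinations of characteristics shared by very few records. It is not DSDR's actual procedure: the function name `rare_combinations`, the variable names, the sample records, and the threshold of 3 are all assumptions made for the example.

```python
from collections import Counter

def rare_combinations(records, quasi_identifiers, threshold=3):
    """Return {combination: count} for combinations of quasi-identifier
    values shared by fewer than `threshold` records (hypothetical cutoff)."""
    counts = Counter(
        tuple(rec[q] for q in quasi_identifiers) for rec in records
    )
    return {combo: n for combo, n in counts.items() if n < threshold}

# Hypothetical sample data, not drawn from any real collection.
records = [
    {"age_group": "30-39", "county": "A", "occupation": "teacher"},
    {"age_group": "30-39", "county": "A", "occupation": "teacher"},
    {"age_group": "30-39", "county": "A", "occupation": "teacher"},
    {"age_group": "70-79", "county": "B", "occupation": "surgeon"},
]

flagged = rare_combinations(records, ["age_group", "county", "occupation"])
print(flagged)
```

Here only the single record with the unusual combination is flagged. In practice, deciding which variables count as quasi-identifiers and what cell size is acceptable is a judgment made by reviewers, not by a fixed rule.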
After consultation with the data depositor, and if resources permit, the staff may undertake further processing steps and data enhancements:
- Convert paper documentation into electronic documentation
- Create SAS and/or SPSS setup files
- Standardize missing data codes
- Reformat data to achieve more efficient transmission and storage
- Standardize coding schemes across elements of a study where they are different
- Check observed frequencies against reported frequencies
- Check for consistency of survey responses and skip patterns
- Check data against documentation for completeness, wild codes, and missing codes
- Correct data in consultation with the PI/data producer when errors are found
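Two of the checks listed above, looking for wild codes and comparing observed frequencies with reported ones, can be sketched in a few lines of Python. This is only a hedged illustration: the codebook, the variable names, the sample records, and both function names are hypothetical and do not describe the tools DSDR staff actually use.

```python
from collections import Counter

# Hypothetical codebook: variable name -> set of documented codes
# (valid values plus missing-data codes).
CODEBOOK = {
    "sex": {1, 2, 9},
    "marital": {1, 2, 3, 4, 9},
}

def find_wild_codes(records, codebook):
    """Return {variable: sorted values that the codebook does not document}."""
    wild = {}
    for var, valid in codebook.items():
        observed = {rec[var] for rec in records if var in rec}
        unexpected = sorted(observed - valid)
        if unexpected:
            wild[var] = unexpected
    return wild

def compare_frequencies(records, var, reported):
    """Return {code: (observed, reported)} for codes whose counts differ."""
    observed = Counter(rec[var] for rec in records)
    codes = sorted(set(observed) | set(reported))
    return {
        code: (observed.get(code, 0), reported.get(code, 0))
        for code in codes
        if observed.get(code, 0) != reported.get(code, 0)
    }

# Hypothetical sample data.
records = [
    {"sex": 1, "marital": 2},
    {"sex": 2, "marital": 7},  # 7 is not documented for "marital"
    {"sex": 2, "marital": 9},
]

wild = find_wild_codes(records, CODEBOOK)
mismatch = compare_frequencies(records, "sex", {1: 1, 2: 1, 9: 1})
print(wild)
print(mismatch)
```

Any discrepancy surfaced this way would be a prompt for follow-up with the PI/data producer rather than an automatic correction.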
