Publication Date

4-17-2017

Abstract

We consider the problem of determining the optimal accuracy of public statistics when increased accuracy requires a loss of privacy. To formalize this allocation problem, we use tools from statistics and computer science to model the publication technology used by a public statistical agency. We derive the demand for accurate statistics from first principles to generate interdependent preferences that account for the public-good nature of both data accuracy and privacy loss. We first show data accuracy is inefficiently under-supplied by a private provider. Solving the appropriate social planner’s problem produces an implementable publication strategy. We implement the socially optimal publication plan for statistics on income and health status using data from the American Community Survey, National Health Interview Survey, Federal Statistical System Public Opinion Survey and Cornell National Social Survey. Our analysis indicates that welfare losses from providing too much privacy protection and, therefore, too little accuracy can be substantial.

Comments

Abowd and Schmutte acknowledge the support of Alfred P. Sloan Foundation Grant G-2015-13903 and NSF Grant SES-1131848. Abowd acknowledges direct support from the U.S. Census Bureau (before and during his appointment as Associate Director) and from NSF Grants BCS-0941226, TC-1012593. Some of the research for this paper was conducted using the resources of the Social Science Gateway, which was partially supported by NSF grant SES-0922005. Any opinions and conclusions expressed herein are those of the authors and do not necessarily represent the views of the Census Bureau, NSF, or the Sloan Foundation. We also thank the Isaac Newton Institute for Mathematical Sciences, Cambridge, for support and hospitality during the Programme on Data Linkage and Anonymisation, supported by EPSRC grant no. EP/K032208/1. Abowd also acknowledges the Center for Labor Economics at UC Berkeley, where he was a visiting scholar when this work was initiated. We are grateful for helpful comments from Larry Blume, David Card, Michael Castro, Cynthia Dwork, John Eltinge, Stephen Fienberg, Mark Kutzbach, Ron Jarmin, Dan Kifer, Ashwin Machanavajjhala, Frank McSherry, Gerome Miklau, Kobbi Nissim, Mallesh Pai, Jerry Reiter, Eric Slud, Adam Smith, Bruce Spencer, Sara Sullivan, Lars Vilhuber and Nellie Zhao along with seminar and conference participants at the U.S. Census Bureau, Cornell University, CREST, George Mason University, Georgetown University, University of Washington Evans School of Public Policy, and the Society of Labor Economists. We thank Jennifer Childs and Casey Eggleston for providing data from the Federal Statistical System Public Opinion Survey conducted by the Census Bureau’s Center for Survey Methodology. William Sexton provided excellent research assistance. No confidential data were used in this paper.

A complete archive of the data and programs used in this paper is available via http://doi.org/10.5281/zenodo.345385.

A previous version of this paper is http://digitalcommons.ilr.cornell.edu/ldi/22/. This version supersedes Document 22.
