Statements in which the resource exists as a subject.
Predicate Object
rdf:type
lifeskim:mentions
pubmed:issue
46
pubmed:dateCreated
2008-11-21
pubmed:abstractText
Datasets describing the health status of individuals are important for medical research but must be used cautiously to protect patient privacy. For patient data containing geographical identifiers, the conventional solution is to aggregate the data by large areas. This method often preserves privacy but suffers from substantial information loss, which degrades the quality of subsequent disease mapping or cluster detection studies. Other heuristic methods for de-identifying spatial patient information do not quantify the risk to individual privacy. We develop an optimal method based on linear programming to add noise to individual locations that preserves the distribution of a disease. The method ensures a small, quantitative risk of individual re-identification. Because the amount of noise added is minimal for the desired degree of privacy protection, the de-identified set is ideal for spatial epidemiological studies. We apply the method to patients in New York County, New York, showing that privacy is guaranteed while moving patients 25-150 times less than aggregation by zip code.
pubmed:grant
pubmed:commentsCorrections
pubmed:language
eng
pubmed:journal
pubmed:citationSubset
IM
pubmed:status
MEDLINE
pubmed:month
Nov
pubmed:issn
1091-6490
pubmed:author
pubmed:issnType
Electronic
pubmed:day
18
pubmed:volume
105
pubmed:owner
NLM
pubmed:authorsComplete
Y
pubmed:pagination
17608-13
pubmed:dateRevised
2009-11-18
pubmed:meshHeading
pubmed:year
2008
pubmed:articleTitle
Revealing the spatial distribution of a disease while preserving privacy.
pubmed:affiliation
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139-4307, USA.
pubmed:publicationType
Journal Article; Research Support, N.I.H., Extramural