Is it really possible to anonymize data?
De-Identification is a Process, and one that can be done right or WRONG!
The argument 'for' or 'against' the use of de-identification is a Red Herring. What the arguments are actually about is that a Process can be done badly. There should be no doubt that any Process can be done badly. Even a simple process like filling a glass of water can be done badly, even resulting in human harm.
The big misunderstanding is that De-Identification is an absolute. It is not; it is a Process used to lower the 'risk' of re-identification. As a process it can be done badly. As a domain of 'risk' it can't achieve zero risk, except by ending up at the null-set, that is, by releasing no data at all.
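To make the 'risk' framing concrete, here is a minimal sketch in Python. It assumes one common way of estimating re-identification risk: group the records by their quasi-identifiers and take 1 divided by the size of the smallest group (the k-anonymity view). The field names, the sample records, and the `generalize` rule are all hypothetical, chosen only for illustration; a real assessment would also consider the release context and what other data an attacker could link against.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Worst-case risk: 1 / size of the smallest group of records
    that share the same quasi-identifier values."""
    if not records:
        return 0.0  # the null-set: nothing released, nothing to re-identify
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return 1.0 / min(groups.values())


def generalize(record):
    """Hypothetical generalization rule (an assumption for this sketch):
    age becomes a decade band, ZIP code is cut to its first three digits."""
    out = dict(record)
    out["age"] = (record["age"] // 10) * 10
    out["zip"] = record["zip"][:3]
    return out


# Made-up sample data, not taken from any real data-set.
records = [
    {"age": 34, "zip": "53202", "diagnosis": "A"},
    {"age": 36, "zip": "53203", "diagnosis": "B"},
    {"age": 38, "zip": "53204", "diagnosis": "A"},
    {"age": 61, "zip": "53211", "diagnosis": "C"},
    {"age": 67, "zip": "53212", "diagnosis": "B"},
]
quasi_identifiers = ["age", "zip"]

raw_risk = max_reidentification_risk(records, quasi_identifiers)
generalized_risk = max_reidentification_risk(
    [generalize(r) for r in records], quasi_identifiers
)
empty_risk = max_reidentification_risk([], quasi_identifiers)

print(raw_risk, generalized_risk, empty_risk)
```

In this toy data every raw record is unique on age and ZIP, so the worst-case risk is 1.0. Generalizing lowers it to 0.5, and only the empty data-set reaches 0.0. The process lowers the risk; it does not eliminate it.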
The standards in this space are clear about this risk factor. It is the absolutists, who insist on viewing de-identification as an absolute, that are causing the argument. Their oversimplification is just as alarmist.
As Yogi Berra is said to have put it: "In theory, there is no difference between theory and practice. But, in practice, there is." The Practice of applying de-identification has occasional failures, like all 'risk' domains. No one hears about the times when de-identification is done successfully. All the failures are held up to the light and used to show that the solution fails.
This doesn't mean I am an absolutist who holds that De-Identification is the solution. My perspective is that it is a "Tool". As with all tools and processes, it must be used properly.
UPDATED
-----------------
It was pointed out to me, by the awesome Gila Pyke, that I failed to remind the reader that De-Identification is just ONE tool in a mature risk management process. As a risk management tool, and as stated above, it will not bring the risk to zero; as such the resulting data-set might still require protection. Too often one presumes that a data-set that has been de-identified can be globally published. That is true only if global publication was the target of the risk management, and the risk of re-identification has truly been reduced to the level necessary for global publication. This is one of the misunderstandings that also results in the outlined failures. It is also a fundamental misunderstanding (or failing) of the HIPAA de-identification clause.
De-Identification, Anonymization, Pseudonymization
- PCAST - Big Data: A Technological Perspective
- De-Identifying free-text
- De-Identification: process reduce risk of identification of entries in a data-set
- Fake it properly
- De-Identification - Data Chemistry
- Guidance Regarding Methods for De-identification of Health Information
- The Emperor has no clothes - De-Identification and User Provisioning
- De-Identification is highly contextual
- Redaction and Clinical Documentation