Privacy was defined as a fundamental human right in the Universal Declaration of Human Rights at the 1948 United Nations General Assembly. However, there is still no consensus on what constitutes privacy. In this review, we look at the evolution of privacy as a concept from the era of Hippocrates to the era of social media and big data. To appreciate the modern measures of patient privacy protection and correctly interpret the current regulatory framework in the United States, we need to analyze and understand the concepts of individually identifiable information, individually identifiable health information, protected health information, and de-identification. The Privacy Rule of the Health Insurance Portability and Accountability Act defines the regulatory framework and casts a balance between protective measures and access to health information for secondary (scientific) use. The rule defines the conditions when health information is protected by law and how protected health information can be de-identified for secondary use. With the advents of artificial intelligence and computational linguistics, computational text de-identification algorithms produce de-identified results nearly as well as those produced by human experts, but much faster, more consistently and basically for free. Modern clinical text de-identification systems now pave the road to big data and enable scientists to access de-identified clinical information while firmly protecting patient privacy. However, clinical text de-identification is not a perfect process. In order to maximize the protection of patient privacy and to free clinical and scientific information from the confines of electronic healthcare systems, all stakeholders, including patients, health institutions and institutional review boards, scientists and the scientific communities, as well as regulatory and law enforcement agencies must collaborate closely. On the one hand, public health laws and privacy regulations define rules and responsibilities such as requesting and granting only the amount of health information that is necessary for the scientific study. On the other hand, developers of de-identification systems provide guidelines to use different modes of operations to maximize the effectiveness of their tools and the success of de-identification. Institutions with clinical repositories need to follow these rules and guidelines closely to successfully protect patient privacy. To open the gates of big data to scientific communities, healthcare institutions need to be supported in their de-identification and data sharing efforts by the public, scientific communities, and local, state, and federal legislators and government agencies.
Keywords: Health Insurance Portability and Accountability Act; confidentiality; data anonymization data sharing; medical informatics; personally identifiable information privacy..