Trends in collection, use and disclosure of personal information in contemporary health research: challenges for research governance.

Author: Willison, Donald J.
Position: Canada - Special Issue: Canadian Governance for Ethical Research Involving Humans

Background: Changes in the Nature and Use of Personal Information for Health Research

Health research encompasses a heterogeneous set of research activities. This paper focuses on challenges that arise in the governance of observational research, which is usually carried out without any direct contact between the researcher and the individuals being studied. Two broad areas of health research are heavily dependent on access to a wide range of existing person-level health information:

  1. Public health, occupational health and safety, and the non-medical determinants of health and disease. The latter examines the relationship between health and lifestyle, environmental, and socioeconomic factors including income and education. Epidemiology is the foundation of much of this type of research. Research in this domain links both health and non-health information, such as occupation, education, and lifestyle information.

  2. Health policy, health services research, and program evaluation examine the health care system and the effects of different policies and methods of health care delivery on the quality and efficiency of care provided. This type of research is informed by a wide variety of disciplines, including economics, health policy, political science, sociology, anthropology, medicine, and epidemiology.

Most health research requires person-level data, chiefly to increase precision in analysis. For example, when trying to determine the effect of exposure to an environmental toxin in a neighbourhood, person-level data allow one to better examine the causal relationship by "controlling for", or holding constant, known personal factors that relate to the outcome of interest, such as the age and sex of the individual. Similarly, when evaluating a policy to increase co-payments for prescription drugs, it is prudent to examine the impact of that policy on the tendency to discontinue medications across different income brackets. In some cases, using aggregate rather than individual-level data can lead to spurious conclusions about the effect of exposure (whether to a policy or an environmental toxin) on health outcomes. (1) Individual-level data are also required to link information from disparate databases. This linkage allows researchers to answer a much broader set of questions about the determinants of health, but it also raises major privacy concerns when these activities are conducted without individual consent. Although data may be stripped of direct personal identifiers, the resultant records are often so rich in information that the residual risk of disclosure of identity through indirect means is high enough that the data must be treated as if they were identifiable. In fact, with as little information as date of birth, sex, and full postal code, the majority of individuals in a particular region may be re-identified by linking with census tract information. (2)
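The residual re-identification risk described above can be illustrated with a short sketch. It counts how many records in a data set stripped of direct identifiers share the same combination of date of birth, sex, and postal code; a record whose combination is unique is the most exposed to re-identification through linkage. The records, field names, and function names below are hypothetical and chosen only for illustration. This is not the method used in the study cited at (2).

```python
from collections import Counter

# Hypothetical records with direct identifiers (name, health number) removed.
# Three indirect identifiers remain: date of birth, sex, and postal code.
records = [
    {"dob": "1950-03-14", "sex": "F", "postal_code": "A1A 1A1", "diagnosis": "asthma"},
    {"dob": "1950-03-14", "sex": "F", "postal_code": "A1A 1A1", "diagnosis": "diabetes"},
    {"dob": "1962-11-02", "sex": "M", "postal_code": "B2B 2B2", "diagnosis": "hypertension"},
    {"dob": "1978-07-21", "sex": "F", "postal_code": "C3C 3C3", "diagnosis": "asthma"},
]

# Assumed set of indirect ("quasi") identifiers to test.
QUASI_IDENTIFIERS = ("dob", "sex", "postal_code")


def equivalence_class_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of the given fields."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Return the share of rows whose combination of fields is unique."""
    if not rows:
        return 0.0
    sizes = equivalence_class_sizes(rows, keys)
    unique = sum(1 for row in rows if sizes[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


if __name__ == "__main__":
    # Share of records unique on all three fields.
    print(unique_fraction(records))
    # Share of records unique on sex alone, for comparison.
    print(unique_fraction(records, keys=("sex",)))
```

In this toy data set, half of the records are unique on the three fields combined, whereas only one in four is unique on sex alone. The comparison shows why combinations of otherwise innocuous fields, rather than any single field, drive the risk of indirect identification.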

Trends in Data Collection, Use, Storage, and Disclosure

Twenty years ago, only a handful of research centres across North America had the capacity to manipulate and link large data sets, and most government and other data repositories were used only for claims adjudication. Medical records were all paper-based. Advances in the capacity of computers and the internet to store, manipulate and disseminate large amounts of data have dramatically changed the nature of collection, use and disclosure of personal information in contemporary health research. These advances have spawned two parallel developments: the planning and development of large disseminated health information networks that will serve multiple purposes beyond direct clinical care; and the proliferation of decentralized holdings of personal data.

In Canada, the United States, much of the European Union, Australia, and New Zealand, major efforts are underway to computerize patient records across health care settings, with the ability to share and link information from the records of physicians, diagnostic facilities, and health care institutions. In addition to their primary use for direct patient care and claims adjudication, it is intended that these records will be used for quality and risk management, disease surveillance, research, and education of students in the health care professions. In Canada, health infohighway plans include common information architecture across all provinces and territories, to facilitate information sharing across jurisdictions. (3) While the mechanisms by which health researchers will gain access to these infostructures have not yet been determined, considerable attention has been given to developing a consolidated pan-Canadian framework for managing privacy and confidentiality of health information. (4)

The same technological advances that have spurred ambitious plans for massive health infohighway projects have also advanced personal computers to the point where they now have greater computing power than the mainframe computers of two decades ago. As a result, most universities no longer maintain centralized facilities to manage the computing needs of researchers. This shifts responsibility for the management of large amounts of personal data onto individual researchers and their staff, who are often ill-trained in privacy, confidentiality, and data security.

Over the past decade, a third development in data use for research has emerged, involving the prospective collection of large amounts of data to serve as broad platforms for health research, including future yet-to-be-defined research questions. These data repositories--registries and biobanks--have been developed either as by-products of existing health information collection (e.g. separate data holdings on demographics and concurrent medical conditions of patients visiting speciality clinics) or as separate collections intended specifically for research (e.g. the development of multi-site disease-specific or treatment-specific registries and biobanks). While some have been developed specifically in recognition of the limitations of existing clinical and administrative data records for research, many collections have evolved over time from modest beginnings (e.g. lists of names of patients with specific conditions or leftover laboratory samples).

The introduction of new data protection laws in Canada and elsewhere is causing the research community to take a closer look at its data collection and management practices. While, in general, the laws provide considerable scope for self-governance on the part of...
