Research Data: Ethics of Human Research Subject Data

UMBC Policies and Procedures

All human subject research at UMBC requires the approval of the Institutional Review Board (IRB), whose responsibility it is to advocate for ethical standards, safeguards, and protection of human research participants. For information on the IRB process, see The Office of Research & Creative Achievement's "Overview of the IRB Process" webpage.

For some types of human subject research, or other confidential research that uses data from an outside agency, UMBC requires a Data Use Agreement (DUA). Information on the DUA process is available on the Office of Research & Creative Achievement's "Data Use Agreements" webpage.

Principles of Human Subject Research

The Belmont Report outlines three basic ethical principles to guide the protection of human subjects in research:

Principle	Description	Related Topics
Respect for persons	The autonomy of all participants in human subjects research must be respected.	Informed consent (UMBC Consent Guidelines and Templates), Anonymity and Confidentiality
Beneficence	Research should maximize benefits to human subjects and minimize harms.	Debriefing, Right to withdraw
Justice	Research should be well considered, non-exploitive, and administered fairly.	Inclusion/exclusion

UMBC Consent Guidelines and Templates

Data Anonymization

FDP Tool for Classifying Human Subjects Data

18 HIPAA Identifiers that comprise Personally Identifiable Information (PII)

PII may be used alone or with other sources to identify an individual. PII in conjunction with medical records (including payments for medical care) becomes Protected Health Information (PHI).

Name (including initials)
Address (all geographic subdivisions smaller than state: street address, city, county, zip code)
All elements (except years) of dates related to an individual (including birthdate, admission date, discharge date, date of death, and exact age if over 89)
Telephone numbers
Fax number
Email address
Social Security Number
Medical record number
Health plan beneficiary number
Account number
Certificate or license number
Any vehicle identifiers, including license plate
Device identifiers and serial numbers
Web URL
Internet Protocol (IP) Address
Finger or voice print
Photographic images (not limited to images of the face)
Any other characteristic that could uniquely identify the individual

A data set containing any of these identifiers, or any part of an identifier, is considered “identified.”
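As a rough illustration of how a first-pass check for some of these identifiers might look, the sketch below scans free text for a few pattern-based identifier types (email address, Social Security Number, telephone number, IP address, web URL). The pattern set and the function names `find_identifiers` and `is_identified` are hypothetical and deliberately simple. They cover only a handful of the 18 identifiers, will miss many real-world formats, and cannot detect names or "any other characteristic," so a clean result does not mean the data is de-identified.

```python
import re

# Hypothetical, intentionally narrow patterns for a few identifier types.
# Real data needs broader patterns plus human review.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url": re.compile(r"https?://\S+"),
}


def find_identifiers(text):
    """Return {identifier_type: [matches]} for every pattern that matches."""
    hits = {}
    for name, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits


def is_identified(text):
    """True if at least one of the patterns above matches the text."""
    return bool(find_identifiers(text))


if __name__ == "__main__":
    sample = "Contact jane.doe@example.org or 410-555-0100."
    print(find_identifiers(sample))
```

A check like this is best treated as a way to flag records for closer review, not as evidence that a data set is free of identifiers.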

HIPAA – Limited Data Set

A Limited Data Set must omit all of the HIPAA identifiers listed above except for the following:

City, state, zip code
Dates of admission, discharge, service, date of birth, date of death
Ages in years, months, days, or hours

To reiterate: initials are always considered PHI/PII.

HIPAA – De-identified Data

All 18 HIPAA identifiers listed above must be removed for a data set to be considered de-identified, with the following caveats:

All geographic subdivisions smaller than a state must be removed, except for the initial three digits of the ZIP code, which may be kept if: (1) the geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; and (2) the initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people are changed to 000.

Ages may be reported in years, but all ages over 89 must be aggregated into a single category of 90 or older.
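The two caveats above can be sketched as small helper functions. This is a minimal illustration, not a compliance tool: the function names are hypothetical, and the set of low-population three-digit ZIP prefixes is an assumption that the caller must supply from current population data. The values used in the example are placeholders for illustration only.

```python
def generalize_zip(zip_code, low_population_prefixes):
    """Keep only the first three ZIP digits, or return 000 for sparse areas.

    low_population_prefixes: set of three-digit prefixes whose combined
    area contains 20,000 or fewer people (supplied by the caller).
    """
    prefix = str(zip_code).strip()[:3]
    if len(prefix) < 3 or not prefix.isdigit():
        raise ValueError(f"not a usable ZIP code: {zip_code!r}")
    return "000" if prefix in low_population_prefixes else prefix


def generalize_age(age_years):
    """Report age in whole years, grouping every age over 89 as '90+'."""
    years = int(age_years)
    return "90+" if years > 89 else str(years)


if __name__ == "__main__":
    # Placeholder prefix set for illustration only.
    sparse = {"036", "059"}
    print(generalize_zip("21250", sparse))
    print(generalize_zip("03601", sparse))
    print(generalize_age(93))
```

Passing the prefix set in as a parameter keeps the population data separate from the logic, so the list can be refreshed without changing the code.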

FERPA – Personally Identifiable Information

In the context of FERPA, PII includes, but is not limited to:

Student’s name
The name of the student’s parent(s) or other family members
Address of the student or student’s family
Student’s personal identifiers, such as:
1. Social Security Number;
2. Student number; or
3. Biometric record (e.g., finger or voice print)
Student’s other indirect identifiers, such as:
1. Birthdate;
2. Place of birth; or
3. Mother’s maiden name
Other information that, alone or in combination, is linked or linkable to a specific student that would allow a reasonable person in the school community, who does not have personal knowledge of the relevant circumstances, to identify the student with reasonable certainty
Information requested by a person who the educational agency or institution reasonably believes knows the identity of the student to whom the education record relates

Free Data De-Identification Tools

Note that a human must oversee these tools to ensure that all of the data is properly de-identified.

NLM Scrubber

Files must be plain text. It works on free text such as medical histories and lab reports.

NLM Scrubber Website

NLM Scrubber User Manual

NLM Scrubber Product Guide

CliniDeID

Files must be plain text or SQL. It works on free text such as discharge summaries. Java is required.

CliniDeID Website

The MITRE Identification Scrubber Toolkit

Files must be plain text. It works on free text such as lab reports and orders. It requires Java and Python.

MITRE Identification Scrubber Toolkit Website

ARX Anonymization Tool

It works on tabular data in SQL, CSV, or Excel files.

ARX Anonymization Tool Website

Indigenous Data

The International Indigenous Data Sovereignty Interest Group (within the Research Data Alliance) is a network of nation-state based Indigenous data sovereignty networks and individuals that developed the ‘CARE Principles for Indigenous Data Governance’ (Collective Benefit, Authority to Control, Responsibility, and Ethics) in consultation with Indigenous Peoples, scholars, non-profit organizations, and governments:

Principle	Description
Collective Benefit	Data ecosystems shall be designed and function in ways that enable Indigenous Peoples to derive benefit from the data.
Authority to Control	Indigenous Peoples' rights and interests in Indigenous data must be recognized and their authority to control such data be empowered. Indigenous data governance enables Indigenous Peoples and governing bodies to determine how Indigenous Peoples, as well as Indigenous lands, territories, resources, knowledge, and geographical indicators, are represented and identified within data.
Responsibility	Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous Peoples' self-determination and collective benefit. Accountability requires meaningful and openly available evidence of these efforts and the benefits accruing to Indigenous Peoples.
Ethics	Indigenous Peoples' rights and wellbeing should be the primary concern at all stages of the data life cycle and across data ecosystems.
