Research Data

What is research data management? Why manage research data?

What is research data management?

Research data management is the process of ensuring that your data is organized, accessible, clearly understood, and preserved for future access. This guide also covers finding and citing data.

Why manage and share your data?

Increase your research impact
Making your data available to other researchers can increase the discoverability and relevance of your research.

Save time
Planning ahead for your data management needs will save you time and resources.

Preserve your data
Depositing your data in a repository safeguards your investment of time and resources while preserving your research contribution for you and others to use.

Maintain data integrity
Managing and documenting your data throughout its life cycle will allow you and others to understand and use your data in the future.

Meet grant requirements
Many funding agencies now require that researchers deposit data collected as part of a research project.

Promote new discoveries
Sharing your data with other researchers can lead to new and unanticipated discoveries and provide research material for those with little or no funding.

Support open access
Be a catalyst for research and discovery. Show your support for open access by sharing your data.

Research Data Lifecycle

The research data management lifecycle can be defined in different ways. Research projects do not necessarily progress through its steps in a linear manner, and many projects will use only part of the lifecycle. The research data lifecycle best serves as a means of organizing information on research data management and as a framework for data management planning.

The DataONE data life cycle:

  1. Plan
  2. Collect
  3. Assure
  4. Describe
  5. Preserve
  6. Discover
  7. Integrate
  8. Analyze

Funder Requirements

Federal Funder Data Management Requirements

See Data Management Planning and Plans.


Data Archiving Requirements

Sherpa Juliet - a searchable database of funders' policies and their requirements on open access, publication, and data archiving.


Data Sharing Requirements

Data Sharing Requirements by Federal Agency - This community resource for tracking, comparing, and understanding both current and future U.S. federal funder research data sharing policies is a joint project of SPARC & Johns Hopkins University Libraries.



FAIR data = Findable, Accessible, Interoperable, and Reusable data

The FAIR Principles:

The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.

F1. (Meta)data are assigned a globally unique and persistent identifier

F2. Data are described with rich metadata (defined by R1 below)

F3. Metadata clearly and explicitly include the identifier of the data they describe

F4. (Meta)data are registered or indexed in a searchable resource
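
As a rough illustration of F1 to F3, the sketch below checks a metadata record for a DOI-style identifier and a few descriptive fields. The field names (data_identifier, title, creators, keywords), the findability_issues helper, and the example DOI are assumptions made for illustration; they are not part of the FAIR principles or of any repository's schema. F4 (registration in a searchable resource) depends on the repository and is not checked here.

```python
import re

# A DOI expressed as an HTTPS URL, one common form of persistent identifier.
DOI_URL = re.compile(r"^https://doi\.org/10\.\d{4,9}/\S+$")

# Hypothetical descriptive fields; real repositories define their own schemas.
DESCRIPTIVE_FIELDS = ("title", "creators", "keywords")


def findability_issues(record):
    """Return a list of F1-F3 problems found in a metadata record (a dict)."""
    issues = []
    identifier = record.get("data_identifier")
    if not identifier:
        issues.append("F3: metadata do not name the identifier of the data")
    elif not DOI_URL.match(identifier):
        issues.append("F1: identifier is not a DOI-style persistent identifier")
    missing = [field for field in DESCRIPTIVE_FIELDS if not record.get(field)]
    if missing:
        issues.append("F2: missing descriptive metadata: " + ", ".join(missing))
    return issues


# Placeholder record; the DOI below is made up for illustration.
example = {
    "data_identifier": "https://doi.org/10.5555/example-dataset",
    "title": "Stream temperature measurements",
    "creators": ["A. Researcher"],
    "keywords": ["hydrology", "temperature"],
}
print(findability_issues(example))
```

An empty list means none of the three checks failed; it does not mean the record is FAIR, only that these minimal conditions hold.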

Once users find the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol

A1.1 The protocol is open, free, and universally implementable

A1.2 The protocol allows for an authentication and authorisation procedure, where necessary

A2. Metadata are accessible, even when the data are no longer available
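
To make A1 concrete: a DOI can be retrieved over HTTPS, an open and widely implemented protocol. The sketch below builds, but does not send, a request that asks the DOI resolver for machine-readable metadata through content negotiation. The build_metadata_request helper and the example DOI are hypothetical, and the Accept media type shown is one commonly offered for DOIs; check your registration agency's documentation for the formats it actually supports.

```python
from urllib.request import Request

# Assumed media type for citation metadata in JSON form.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi):
    """Build an HTTPS request that asks the DOI resolver for metadata."""
    url = "https://doi.org/" + doi
    return Request(url, headers={"Accept": CSL_JSON})


# Placeholder DOI, made up for illustration; it will not resolve to a dataset.
request = build_metadata_request("10.5555/example-dataset")
print(request.full_url)
print(request.get_header("Accept"))

# Sending the request needs network access and a real DOI:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       metadata = response.read()
```

Where a repository keeps the metadata record available after the data themselves are withdrawn, a request like this can still return a description of the dataset, which is the situation A2 describes.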

The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation

I2. (Meta)data use vocabularies that follow FAIR principles

I3. (Meta)data include qualified references to other (meta)data
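
One way to picture I1 to I3 is a dataset description written as JSON-LD with the schema.org vocabulary. The sketch below is an assumption-laden example, not a required format: the dataset name, both DOIs, and the choice of schema.org are illustrative, and whether a given vocabulary satisfies I2 is for the relevant community to judge. JSON-LD stands in for the formal, shared language (I1), the schema.org terms for the shared vocabulary (I2), and the isBasedOn property for a qualified reference that says how this dataset relates to another (I3).

```python
import json

# Hypothetical dataset description; all values are placeholders.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Stream temperature measurements",
    "identifier": "https://doi.org/10.5555/example-dataset",
    "keywords": ["hydrology", "temperature"],
    # Qualified reference: states that this dataset is derived from another one.
    "isBasedOn": "https://doi.org/10.5555/example-source-data",
}

# Serialize to JSON so other tools and workflows can read the description.
serialized = json.dumps(dataset, indent=2)
print(serialized)
```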

The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes

R1.1. (Meta)data are released with a clear and accessible data usage license

R1.2. (Meta)data are associated with detailed provenance

R1.3. (Meta)data meet domain-relevant community standards

The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure. For instance, principle F4 defines that both metadata and data are registered or indexed in a searchable resource (the infrastructure component).