Research Data

Metadata

Various metadata standards are available for particular file formats and disciplines. General guidelines are provided below:

Things to document about your data: 

Title
Name of the dataset or research project that produced it

Creator
Names and addresses of the organization or people who created the data

Identifier
Number used to identify the data, even if it is just an internal project reference number

Dates
Key dates associated with the data, including project start and end dates, data modification date, data release date, and time period covered by the data

Subject
Keywords or phrases describing the subject or content of the data

Funders
Organizations or agencies who funded the research

Rights
Any known intellectual property rights held for the data

Language
Language(s) of the intellectual content of the resource, when applicable

Location
Where the data relates to a physical location, record information about its spatial coverage

Methodology
How the data was generated, including equipment or software used, experimental protocol, and other things you might include in a lab notebook
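
As an illustration only, the elements above could be captured in a simple machine-readable record. The sketch below assumes a hypothetical `DatasetMetadata` class and made-up example values; the field names mirror the list above rather than any particular metadata standard, so check what your discipline or repository actually requires.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class DatasetMetadata:
    """Hypothetical record holding the elements listed in this guide."""
    title: str = ""
    creator: List[str] = field(default_factory=list)
    identifier: str = ""
    dates: Dict[str, str] = field(default_factory=dict)
    subject: List[str] = field(default_factory=list)
    funders: List[str] = field(default_factory=list)
    rights: str = ""
    language: List[str] = field(default_factory=list)
    location: str = ""
    methodology: str = ""


def missing_elements(record: DatasetMetadata) -> List[str]:
    """Return the names of elements that have not been filled in yet."""
    return [name for name, value in asdict(record).items() if not value]


# Illustrative values only; none of these describe a real dataset.
record = DatasetMetadata(
    title="Example stream temperature survey",
    creator=["A. Researcher, Example University"],
    identifier="PROJ-0001",
    dates={"project_start": "2023-01-01", "project_end": "2023-12-31"},
    subject=["hydrology", "water temperature"],
    language=["en"],
)

# Write the record out as JSON so it can be stored next to the data files.
print(json.dumps(asdict(record), indent=2))

# List what still needs documenting before the data are deposited.
print(missing_elements(record))
```

Keeping a record like this alongside the data files makes it easier to spot undocumented elements before deposit.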

 

Sharing Data

Why Share Research Data?

Researchers devote a large amount of physical and intellectual effort to collect, manage, collate, and analyse their data before publishing their results. Many of these datasets have significant value beyond the usage for the original research, and sharing the data can be seen as beneficial in a number of ways:

  • Research integrity and reproducibility: Publishing research data and citing its location in published research papers allows others to replicate, validate, or build upon your results, thus improving the scientific record by encouraging scientific enquiry and debate. Openly sharing research data also encourages the improvement and validation of research methods and minimises the need for data re-collection.
  • Preservation of Research Data: Some research data will be unique and cannot be replaced if destroyed or lost. Sharing via a repository will mean that the repository will look after and preserve your data into the future, even after technology becomes obsolete.
  • Innovation: Data created for one research purpose may be re-used or re-interpreted in future, unrelated research and in contexts not currently envisaged. Data sharing and re-use across borders and disciplines can also promote innovation by potential new data users.
  • Impact: Others who re-use your data and cite it in their own research help to raise interest in your research and increase your impact within your field and beyond. “Open” data leads to increased citations of the data itself, and of associated papers.
  • Funder requirements: A growing number of funding bodies and research councils have adopted research data sharing policies and mandate or encourage researchers to share data and outputs to avoid duplication of effort and reduce data collection costs.
  • Journal publisher requirements: A growing number of journal publishers require data that underpin research findings to be published in open access repositories when manuscripts are submitted.

There may be reasons for not sharing your data, for example privacy and confidentiality issues or the commercial value of the data. Horizon 2020 has coined the phrase: “As open as possible, as closed as necessary.”

If you are unable to share your data publicly, consider making them available internally to future researchers to facilitate follow-on research, and/or creating a metadata record in your chosen archive or repository. A metadata record describes your data and helps others find out about them. To make sure this can happen, you will need to manage your data.

 

Reasons for not sharing

There are legitimate reasons for not sharing some or all research data generated by a project. Funders who require data sharing will generally ask that researchers justify this in their Data Management Plan (DMP).

It is generally acceptable to choose not to share research data on the following grounds:

  • data are commercially sensitive
  • data are confidential (in connection with security issues)
  • sharing would break data protection regulations (though data which have been properly anonymised can be shared without breaching data protection regulations)
  • sharing would mean that the project's main aim might not be achieved
  • the project will not generate / collect any research data

This list has been adapted from the Horizon 2020 recommendations.

 

Access control

Sensitive and confidential data can be safeguarded by regulating or restricting access to and use of the data. Access controls should always be proportionate to the kind of data and level of confidentiality involved. The access controls you can put in place will be guided by those available from your chosen Archive or Repository so it's important to talk to them about your options.

Below we describe different levels of access for data:

  • Open data

Data that can be accessed by any user for any reason, including commercial. Data in this category should not contain personal information unless consent is given.

  • Safeguarded data

Data that are available only under certain conditions. This is for data that contain no personal information, but the data owner considers there to be a risk of disclosure resulting from linkage to other data.

ISSDA provides access to safeguarded quantitative data in the Social Sciences under certain conditions. For example, the user must be using the data for research or teaching purposes and must sign a legally binding End User License, which sets out additional terms and conditions.

  • Controlled data

This level of access control is suitable for data that may be disclosive, that is, data that could reveal information about identifiable individuals. Access is generally approved by a Data Access Committee, which may require that certain training has taken place or that the data are only available from certain computers in a controlled 'data room'.

  • Embargo

Most data repositories allow you to place a temporary embargo on your data. During the embargo period, only the description of the dataset is published. The data themselves will become available in open access after a certain period of time.
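
To make the levels above concrete, here is a minimal sketch of how they might be expressed as a decision rule. The `AccessLevel`, `Dataset`, and `data_available` names are hypothetical, the conditions are simplified to one flag per level, and the sketch assumes an embargo can be combined with any level; real repositories define their own rules, so treat this as an illustration rather than any repository's actual policy.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    OPEN = "open"
    SAFEGUARDED = "safeguarded"
    CONTROLLED = "controlled"


@dataclass
class Dataset:
    level: AccessLevel
    embargo_until: Optional[date] = None  # None means no embargo


def data_available(
    dataset: Dataset,
    today: date,
    licence_signed: bool = False,
    committee_approved: bool = False,
) -> bool:
    """Return True if the data files (not just the description) can be released."""
    # During an embargo only the description of the dataset is published.
    if dataset.embargo_until is not None and today < dataset.embargo_until:
        return False
    if dataset.level is AccessLevel.OPEN:
        return True
    if dataset.level is AccessLevel.SAFEGUARDED:
        # Simplified: access depends on a signed end user licence.
        return licence_signed
    # Controlled data: simplified to approval by a Data Access Committee.
    return committee_approved


embargoed = Dataset(AccessLevel.OPEN, embargo_until=date(2030, 1, 1))
print(data_available(embargoed, date(2029, 12, 31)))  # still under embargo
print(data_available(embargoed, date(2030, 1, 1)))    # embargo has ended
```

In this sketch the embargo check comes first, so even open data stay unavailable until the embargo date has passed.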

 

Publishing and Sharing Sensitive Data

If you are conducting a study involving human participants and wish to make the data available at the end of the study, you need to consider this from the very beginning, when designing the study. Enabling others to re-use your data means planning for it from the start of your research project. You will need to think critically about how research data can be shared, what might limit or prohibit data sharing (e.g. consent forms, confidentiality concerns), and whether any steps can be taken to remove such limitations. In particular, you will need to ensure you are asking for informed consent to share the data.

Key messages from the ANDS Publishing and sharing sensitive data guide:

  • The advantages of publishing your sensitive data will probably far outweigh any potential disadvantages when simple and appropriate steps are taken
  • Publishing your data, or just a description of your data (that is the metadata), means that others can discover and cite it
  • You can publish a description of your data without making the data itself openly accessible
  • You can place conditions around access to published data
  • Sensitive data that has been de-identified can be shared

Repositories for Sharing Data


 

Data can be shared in ScholarWorks@UMBC. See the ScholarWorks Libguide for more information or contact scholarworks-group@umbc.edu for help.

For discipline-specific data repositories, or to search for specific types of data repositories, see Discover Data, above.