Research Data

Active Storage Site Selection

"Active" or "working" storage refers to where you store your data while you're collecting and accessing it during the course of a project. Some storage options will meet your project's needs better than others.

UMBC has a campus-wide subscription to Lab Archives for data collection and research documentation. It allows all research team members to work together and communicate, with granular access control, change history, offsite disaster-recovery backups, and more.

Box (UMBC FAQ), Google Drive (UMBC FAQ), and Microsoft OneDrive are available to UMBC faculty, staff, and students. DoIT has a chart comparing Box and Google Drive: https://wiki.umbc.edu/pages/viewpage.action?pageId=31916775.

Factors to consider when choosing where to store your data:

  • Anticipated size of dataset--Will you exceed space quotas? Will a cloud service readily upload and download files of that size?
  • Computational requirements--Do you need high-performance processors for large-scale analysis? If so, consider using the UMBC High Performance Computing Facility (HPCF), which requires working in the Linux operating system. The HPCF isn't appropriate for storing data that doesn't need its processors. It also doesn't provide backups or version control the way the cloud storage options do, so you'll need to manage those yourself.
  • Sharing capabilities and permission settings--Do you have a project team that will need to access the data? Do you want to limit what student assistants or other project participants can do?
  • Version control--Will it be helpful to have a history of the changes made to your data? Will your storage do this automatically for you? Or do you need to design and use a version control table?
  • Backup--Will it back up your data automatically? Or do you need to set up a backup yourself?
  • Security--Will it meet IRB standards for storage of human subjects data? Is the data encrypted? Does it use secure transmission channels? Does it require strong passwords?
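
If your storage doesn't keep a change history for you, a version control table can be as simple as a CSV log with one row per change. The sketch below is only an illustration: the column names, the `record_version` helper, and the file names in the usage note are assumptions, not a UMBC standard. It records a timestamp, a checksum of the file, who changed it, and a short note.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column layout for the version control table.
LOG_FIELDS = ["timestamp_utc", "file", "sha256", "changed_by", "note"]


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_version(data_file: Path, log_file: Path, changed_by: str, note: str) -> dict:
    """Append one row describing the current state of data_file to log_file."""
    row = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "file": data_file.name,
        "sha256": sha256_of(data_file),
        "changed_by": changed_by,
        "note": note,
    }
    write_header = not log_file.exists()  # new log: start with a header row
    with log_file.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

Calling `record_version(Path("survey.csv"), Path("version_log.csv"), "jdoe", "fixed typo in row 12")` after each edit builds up the table. The checksum lets you confirm later that a given copy of the file matches a logged version.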

Backup

3-2-1 RULE

To keep your data safe, we recommend following the 3-2-1 Rule, which says to keep 3 copies of your data on 2 different storage types, with 1 of those copies offsite:

[Image: the 3-2-1 rule as described above]
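
The rule can also be written as a quick check over a list of copies. This is a minimal sketch, not an official tool: the `Copy` class, the `meets_3_2_1` function, and the labels are made up for illustration, and it treats the numbers in the rule as minimums.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a dataset (hypothetical structure for illustration)."""
    label: str
    storage_type: str
    offsite: bool


def meets_3_2_1(copies: list[Copy]) -> bool:
    """Check for at least 3 copies, at least 2 storage types, at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_types = len({c.storage_type for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_types and has_offsite


copies = [
    Copy("laptop", "internal disk", offsite=False),
    Copy("cloud storage", "cloud", offsite=True),
]
print(meets_3_2_1(copies))  # False: only 2 copies so far

copies.append(Copy("external hard drive", "external disk", offsite=False))
print(meets_3_2_1(copies))  # True: 3 copies, 3 storage types, 1 offsite
```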

 

3-2-1 WITH UMBC RESOURCES

Both Google Drive and Box have desktop applications (Google Filestream and Box Drive) that let you mount and access your files quickly. Once downloaded and installed, the application creates a folder that looks just like a My Documents folder, only it’s connected to your account on that service (so it shows up as Google Drive or Box in your file explorer). It then works like a two-way door: changes sync in both directions between your local computer and the service in the cloud.

This helps us stick to the 3-2-1 rule pretty nicely as well:

  1. Sync data between local copies (on all my computers) and the Google Drive server located elsewhere.
    1. So this is 2 copies on 2 different storage media, with 1 copy offsite.
  2. Back up the Google Drive folder on my laptop to an external hard drive whenever there are changes.
    1. This brings us to 3 copies on 2 media with 1 offsite copy!
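
The second step, backing up the synced folder to an external drive, can be done with any backup tool. As a rough sketch of the idea, the Python function below copies files that are new or have changed since the last run. The function name `back_up_changes` and the paths in the comments are hypothetical, so substitute your own locations. Note that it does not remove files from the backup when they are deleted from the source.

```python
import shutil
from pathlib import Path


def back_up_changes(source: Path, destination: Path) -> list[Path]:
    """Copy new or modified files from source to destination.

    Returns the list of backup files that were written.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        dest_file = destination / src_file.relative_to(source)
        # Copy when the backup is missing or older than the synced file.
        if not dest_file.exists() or src_file.stat().st_mtime > dest_file.stat().st_mtime:
            dest_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dest_file)  # copy2 also copies the modification time
            copied.append(dest_file)
    return copied


# Example call (hypothetical paths; adjust to your own machine):
# synced = Path.home() / "Google Drive" / "my-project"
# external = Path("/Volumes/BackupDrive/my-project")
# for path in back_up_changes(synced, external):
#     print("backed up", path)
```

Running it again right after a backup should copy nothing, because the backup files carry the same modification times as their sources.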