Research Data

Tools for Finding Data Repositories

  • An extensive list of discipline-specific repositories.


  • Repository Finder: A tool launched by DataCite to help people identify and locate online repositories of research data. It draws on the re3data listings for repository information.



Data Repositories by Discipline


  • Qualitative Data Repository.
    QDR is a dedicated repository for preserving and sharing the digital assets associated with social science and mixed methods projects. It was founded with support from the National Science Foundation and the Center for Qualitative and Multi-Method Inquiry, a unit of the Maxwell School of Citizenship and Public Affairs at Syracuse University.




  • National Neighborhood Data Archive (NaNDA).
    The National Neighborhood Data Archive (NaNDA) is a publicly available data archive containing measures of the physical, economic, demographic, and social environment at multiple spatial scales (e.g., census tract, ZIP code tabulation area, county). Each NaNDA dataset covers all or most of the nation (including both rural and urban areas) and represents a set of measures on a single topic of interest, including socioeconomic disadvantage, healthcare, housing, partisanship, and public transit, with temporal coverage dating back to 2000.




  • The Cell: An Image Library
    Images of all cell types from all organisms, including intracellular structures and movies or animations demonstrating functions. This project relies upon the cell biology community to populate the library. It is a freely accessible, easy-to-search public repository of reviewed and annotated images, videos, and animations of cells from a variety of organisms, showcasing cell architecture, intracellular functionalities, and both normal and abnormal processes.

  • Morphbank
    Holds biological imaging documents from a wide variety of research, including specimen-based research in comparative anatomy, morphological phylogenetics, taxonomy, and related fields focused on increasing our knowledge about biodiversity. The project receives its main funding from the Biological Databases and Informatics program of the National Science Foundation (Grant DBI-0446224).

  • PaleoBiology Database
    "We are bringing together taxonomic and distributional information about the entire fossil record of plants and animals from a large number of researchers at a large number of institutions."


Computer Science

  • GitHub
    Keeps your public and private code available, secure, and backed up.

  • SourceForge
    2.7 million developers create powerful software in over 260,000 projects. Our popular directory connects more than 46 million consumers with these open source projects and serves more than 2,000,000 downloads a day. SourceForge is where open source happens.

  • SNAP
    Stanford Large Network Dataset Collection. The SNAP library has been actively developed since 2004 and is growing organically as a result of our research pursuits in the analysis of large social and information networks. The largest network we have analyzed so far using the library was the Microsoft Instant Messenger network from 2006, with 240 million nodes and 1.3 billion edges.


Environmental Sciences

  • The Marine Geoscience Data System (MGDS)
    The Marine Geoscience Data System (MGDS) provides access to data portals for the NSF-supported Ridge 2000 and MARGINS programs, the Antarctic and Southern Ocean Data Synthesis, the Global Multi-Resolution Topography Synthesis, and the Seismic Reflection Field Data Portal.


  • IRIS (Incorporated Research Institutions for Seismology).
    From 100+ US universities and the National Science Foundation.

Geosciences & Geospatial Data

  • EarthChem
    Holds data systems and services for geochemical, geochronological, and petrological data, developed and maintained by EarthChem, including the EarthChem Library, the EarthChem Portal, PetDB, NAVDAT, SedDB, and Geochron. EarthChem is operated by a joint team of disciplinary scientists, data scientists, data managers and information technology developers who are part of the NSF-funded data facility Integrated Earth Data Applications (IEDA).

  • The Geosciences Network (GEON)
    This project is a collaboration among a dozen PI institutions and a number of other partner projects, institutions, and agencies to develop cyberinfrastructure in support of an environment for integrative geoscience research. GEON is funded by the NSF Information Technology Research (ITR) program.

  • The National Space Science Data Center
    This serves as the permanent archive for NASA space science mission data. "Space science" means astronomy and astrophysics, solar and space plasma physics, and planetary and lunar science. As the permanent archive, NSSDC teams with NASA's discipline-specific space science "active archives", which provide access to data to researchers and, in some cases, to the general public.


  • All of Us Research Hub
    The Research Hub houses one of the largest, most diverse, and most broadly accessible datasets ever assembled. It also provides an interactive Data Browser where anyone can learn about the type and quantity of data that All of Us collects. Users can explore aggregate data including genomic variants, survey responses, physical measurements, electronic health record information, and wearables data.

  • MIRAGE (Middlesex medical Image Repository with a CBIR ArchivinG Environment).
    From JISC and Middlesex University.


  • NIST Atomic Spectra Database
    The Atomic Spectra Database (ASD) contains data for radiative transitions and energy levels in atoms and atomic ions. Data are included for observed transitions of 99 elements and energy levels of 56 elements.

  • CORE Repository (MLA)
    A service offered as part of the MLA Commons, the Commons Open Repository Exchange offers a place to store and publish digital assets and data in the humanities.


  • Humanities Commons
    Humanities Commons is a repository for the humanities. Discover the latest open-access scholarship and teaching materials, make interdisciplinary connections, build a WordPress Web site, and increase the impact of your work by sharing it in the repository.

  • DataONE
    An international federation of data repositories containing earth observations data, including data from fields such as ecology, biology, evolution, and environmental sciences such as hydrology, oceanography, and atmospheric science. DataONE is a federation with participation from hundreds of field stations, universities, and government agencies through the DataONE Member Nodes.

  • Dryad 
    An international repository of data underlying scientific and medical publications, particularly data for which no specialized repository exists. All material in Dryad is associated with a scholarly publication. Most data in the repository are associated with peer-reviewed articles, although data associated with non-peer reviewed publications from reputable academic sources, such as dissertations, are also accepted. Dryad is a non-profit organization.

  • FigShare 
    FigShare allows you to share all of your data, negative results and unpublished figures.

  • KNB
    The Knowledge Network for Biocomplexity (KNB) is an international data repository containing ecology, biology, and environmental science data with a global distribution. The KNB is a grass-roots partnership of collaborating field stations, laboratories, and research networks that openly publish and share data. The KNB is a Member Node within the DataONE data federation.

  • PANGAEA
    Stands for "Publishing Network for Geoscientific & Environmental Data". Open to deposits from any scientist. Most datasets are open; some are restricted. Hosted by the Alfred Wegener Institute for Polar and Marine Research and the University of Bremen's Center for Marine Environmental Sciences.