Citing research data properly ensures that the original data creators receive credit and that others can access and verify the data.
Check the README file! Data authors often provide a recommended citation, along with any additional elements they would like included. They may also have included a citation file that you can import into a citation management tool.
If the data authors do not express a preference, here’s a general guide on how to cite research data:
Key Elements of a Data Citation:
- Author(s): The individual(s) or organization responsible for creating the dataset.
- Title: The name of the dataset.
- Version: The version number of the dataset, if applicable.
- Year of Publication: The year the dataset was made available.
- Repository or Publisher: The platform where the data is stored (e.g., ICPSR, Zenodo, Dryad).
- DOI or Persistent Identifier: A Digital Object Identifier (DOI) or another permanent link to the dataset for easy access.
Example Citation Formats:
AMA Format:
- Author(s). Title of the dataset. Version number. Year of publication. Publisher or Repository. DOI or URL.
Example:
Smith J, Doe A. Environmental data from XYZ forest. Version 2. 2024. Dryad. https://doi.org/10.5061/dryad.xxxxxx.
APA Format:
- Author(s). (Year). Title of the dataset (Version). Publisher. DOI or URL
Example:
Smith, J., & Doe, A. (2024). Environmental data from XYZ forest (Version 2). Dryad. https://doi.org/10.5061/dryad.xxxxxx
Chicago Style:
- Author(s). Title of the dataset. Version. Year of Publication. Publisher. DOI or URL.
Example:
Smith, John, and Alice Doe. Environmental data from XYZ forest. Version 2. 2024. Dryad. https://doi.org/10.5061/dryad.xxxxxx
What if the data is unpublished?
When citing unpublished data, you can still follow the general structure of a data citation, with modifications to account for the lack of formal publication and of permanent links such as DOIs. If you must use a non-permanent link (e.g., an ordinary web address), be aware that URLs can change over time, so note in the citation that the link is temporary. Here’s how you can structure the citation:
Key Elements of an Unpublished Data Citation:
- Author(s): The individual(s) or organization responsible for creating the dataset.
- Title: A descriptive title of the dataset (even if it’s informal).
- Version: If applicable, include the version of the dataset.
- Year of Collection: The year or range of years during which the data was collected.
- Repository or Location: The institution or platform where the data is stored (e.g., a department at a university).
- Persistent Identifier or Link: If there’s no DOI or other persistent identifier, provide the URL if necessary and indicate that the link is temporary. If there is no URL, omit this element.
- Note on Unpublished Status: Clearly indicate that the data is unpublished.
Example for Unpublished Data with a Non-Permanent Link (APA Style):
- Author(s) Last Name, First Initial. (Year Range). Title of dataset [Version if applicable, Unpublished raw data]. Repository or Institution where data is stored. Temporary URL (if applicable).
Example:
Smith, J., & Johnson, M. (2019–2023). Climate change effects on urban tree growth [Version 2, Unpublished raw data]. University of Example, Department of Environmental Sciences. [Temporary link] http://example.com/dataset
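The unpublished-data template can be sketched the same way. In this minimal example the function name `cite_unpublished_apa` and its parameters are hypothetical, and the author string is assumed to be already formatted; the version and the temporary link are optional, matching the "if applicable" parts of the template.

```python
from typing import Optional


def cite_unpublished_apa(
    authors: str,
    years: str,
    title: str,
    location: str,
    version: Optional[str] = None,
    temporary_url: Optional[str] = None,
) -> str:
    """Build an APA-style citation for unpublished data.

    authors is an already formatted author string, such as
    "Smith, J., & Johnson, M."; years is a single year or a range.
    """
    # The bracketed note always flags the unpublished status
    note = "Unpublished raw data"
    if version:
        note = f"Version {version}, {note}"

    citation = f"{authors} ({years}). {title} [{note}]. {location}."

    # Only add a link when one exists, and label it as temporary
    if temporary_url:
        citation += f" [Temporary link] {temporary_url}"
    return citation


print(cite_unpublished_apa(
    authors="Smith, J., & Johnson, M.",
    years="2019–2023",
    title="Climate change effects on urban tree growth",
    location="University of Example, Department of Environmental Sciences",
    version="2",
    temporary_url="http://example.com/dataset",
))
```

Leaving out `version` or `temporary_url` simply drops that element, so the same function covers datasets with no version number or no link at all.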
Key Modifications:
- Repository or Institution: List the university, organization, or department where the data is held, since it’s unpublished.
- Temporary Link: If using a non-permanent link, indicate that it’s temporary or may change. Avoid relying on URLs long-term unless they are persistent identifiers.
- Unpublished Status: Make it clear that the data is unpublished to avoid confusion.
- Note About Future DOI: If a DOI is expected in the future, you can add "DOI to be assigned" or simply update the citation once the DOI is available.
Important Considerations:
- Temporary Links: If a URL is likely to change or disappear, say so in the citation. Wherever possible, use a DOI or other persistent identifier instead of a non-permanent link.
- Repository: Even if the data is unpublished, providing as much detail as possible about where the data is stored can help others access it if needed (e.g., by contacting the researcher or institution).
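To decide whether a link needs the temporary label, a rough check for DOI-style links can help. The sketch below is a heuristic under stated assumptions: the pattern (prefix `10.`, four to nine digits, a slash, then a suffix) is a commonly used approximation rather than an official DOI syntax rule, the function names are hypothetical, and other persistent identifiers such as Handles or ARKs are not recognized. It checks only the form of the link, not whether the DOI actually resolves.

```python
import re
from urllib.parse import urlparse

# Heuristic DOI shape: "10.", a 4-9 digit registrant code, "/", and a suffix
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Hosts that resolve DOIs
DOI_HOSTS = {"doi.org", "dx.doi.org"}


def is_doi_link(link: str) -> bool:
    """Return True if link looks like a DOI or a doi.org URL."""
    parsed = urlparse(link)
    if parsed.netloc.lower() in DOI_HOSTS:
        # For resolver URLs, the DOI is the path without the leading slash
        return bool(DOI_PATTERN.match(parsed.path.lstrip("/")))
    # Otherwise accept only a bare DOI such as "10.5061/dryad.xxxxxx"
    return bool(DOI_PATTERN.match(link))


def format_link(link: str) -> str:
    """Label any link that does not look like a DOI as temporary."""
    if is_doi_link(link):
        return link
    return f"[Temporary link] {link}"


print(format_link("https://doi.org/10.5061/dryad.xxxxxx"))
print(format_link("http://example.com/dataset"))
```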