DMP IDs and the DMPTool: Announcing DMPTool v. 3.1

Image from “The Post-Office annual Glasgow directory” (1828) (https://flic.kr/p/oe2ZFe)

Building on the recent creation of “A Brave New PID” for DMPs, we are excited to announce that DMP creators can now receive IDs for their DMPs within the DMPTool. Since the outset of our NSF-funded EAGER research project, the ability to generate DMP IDs has been on our strategic roadmap for integrating DMPs into the scholarly knowledge-sharing and persistent identifier (PID) ecosystem.

Supporting NSF recommendations for data management 

In its May 2019 Dear Colleague Letter on Effective Practices for Data, the NSF recommends that researchers use PIDs for their data outputs and generate DMPs that allow for automated information exchange (machine-actionable DMPs, “maDMPs”). Expanding on that letter, the recent report by ARL, AAU, APLU and CDL, Implementing Effective Data Practices: Stakeholder Recommendations for Collaborative Research Support, provides key recommendations for effective data practices to support an open research ecosystem, emphasizing the adoption of PIDs and maDMPs.

The DMPTool team continues to work towards supporting these recommendations by building new features and services for an open, automatically updated, interconnected system for data management of research projects. 

Generating IDs for DMPs is tangible progress toward our shared goal of moving DMPs from static text documents to structured, interoperable data that can flow between stakeholders, linking metadata, repositories, and institutions, and allowing for notifications, verification, and reporting in real time.

What’s included in this latest release? 

Below is an outline of three new features included in this release. For technical details and a few additional features, please see our v3.1.0 documentation. These improvements have also been distributed to the larger community through our shared open source codebase, DMPRoadmap. Thank you to the DMPTool Editorial Board for their guidance and feedback as we developed this feature set. We also appreciate the DMPTool Administrators who submitted feedback on an early iteration of this release. We intend to incorporate many of these suggestions in future releases and to build on the many good ideas shared by all as we continue to expand our support for Networked DMPs.

1. IDs for DMPs

Within the Finalize/Publish tab, users can “Register” their plan and generate a DMP ID. The DMP ID will then display within the tool and link to a landing page for the plan. For further details on this feature, please see our DMP ID documentation.

2. DMP ID Landing Page

Once a plan receives a DMP ID, the system generates a DMP landing page that includes high-level details about the plan. The DMP ID metadata does not include the narrative components of a DMP. For an example of a DMP ID landing page, please see this DMP.

The landing pages also demonstrate the types of connections made possible by tracking a research project over time from the point of DMP creation. As a project progresses over time, updates to the plan can be connected to the DMP ID and will display on the associated landing page. 

3. Research Outputs Tab

The new Research Outputs tab allows researchers to describe specific project outputs in a more granular and controlled manner than was previously possible via the text narrative alone. In designing this new section, we strove to use as many controlled vocabularies and PIDs as possible. Here are some highlights of the new tab:

  • Repository selector tool, drawing on the Registry of Research Data Repositories (re3data), that allows researchers to define where they anticipate depositing a specific output
  • License selector (from SPDX) that allows researchers to define the associated license for specific outputs 
  • Ability to flag outputs as containing sensitive data and/or PII

Researchers can create an unlimited number of specific research outputs. All entered outputs are included in the downloaded version of the plan, placed after the narrative component of the plan so as not to interfere with funder page count limits. 
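As a rough illustration of what describing an output in a "controlled" way can look like, the sketch below models one research output as structured data rather than free text. It is not the DMPTool data model: the class name, every field name, and the example repository URL are hypothetical, chosen only to mirror the highlights listed above (a repository, an SPDX license identifier, and sensitive data and PII flags).

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class ResearchOutput:
    """One planned project output. All field names are hypothetical."""
    title: str
    output_type: str              # e.g. "dataset" or "software"
    repository_url: str           # where the output is expected to be deposited
    license_spdx_id: str          # an SPDX license identifier, e.g. "CC-BY-4.0"
    sensitive_data: bool = False  # flag for sensitive data
    personal_data: bool = False   # flag for PII


def to_json(outputs):
    """Serialize a list of ResearchOutput objects to a JSON string."""
    return json.dumps([asdict(o) for o in outputs], indent=2)


# A made-up example output; the URL is a placeholder, not a real repository.
outputs = [
    ResearchOutput(
        title="Survey responses",
        output_type="dataset",
        repository_url="https://example.org/repository",
        license_spdx_id="CC-BY-4.0",
        personal_data=True,
    )
]
print(to_json(outputs))
```

Because each output is a record with fixed fields, a plan can hold any number of them, and they can be rendered after the narrative or exchanged with other systems without parsing prose.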

What’s up next?

With the ability to generate DMP IDs now in place, we are one step closer to creating networked, living DMPs. While this is a great start, we have many additional features in development that will extend the usability and interoperability of this new generation of DMPs. In the coming months, we will be working on developing these additional features:

  • Connecting DMPs to other related research outputs such as datasets and journal articles via the PID Graph
  • Connecting DMP IDs to corresponding ORCID records 
  • Incorporating additional PIDs including research resource identifiers (RRIDs)
  • Sponsor and funder approval workflow wherein these stakeholders can review, comment, and approve submitted DMPs
  • Integration with the Electronic Lab Notebook, RSpace
  • Adding the ability for DMPTool admins to curate a list of recommended repositories for the new repository selector tool 

Additionally, in response to several DMPTool admin requests for outreach materials supporting adoption of the DMP ID, we are developing materials to share with the DMPTool admin community in order to promote these data practices amongst their users. 

We will continue to share details on this work and on the development of new features to support the networked DMP. Stay tuned for further developments over the coming months.

As always, feedback or questions are most welcome and can be sent directly to maria.praetzellis@ucop.edu.

A Brave New PID: DMP-IDs

Cross-posted from DataCite; written by Kristian Garza and Matt Buys

Original post: https://doi.org/10.5438/j22a-5d79

Despite the challenges of the last year, we are pleased to share some exciting news: the launch of a brave new PID, the DMP ID. Two years ago, we set out a plan in collaboration with the University of California Curation Center and the DMPTool to bring DMP IDs to life. The work was part of the NSF EAGER grant DMP Roadmap: Making Data Management Plans Actionable, and it allowed us to explore the potential of machine-actionable DMPs as a means to transform the DMP into a critical component of networked research data management.

The plan was to develop a persistent identifier (PID) for Data Management Plans (DMPs). We already have PIDs for many entities, such as articles and datasets (DOIs), people (ORCID iDs), and places (ROR IDs). We knew that it would be important for DataCite to support the community in establishing a unique persistent identifier for DMPs. Until now, we had no PID for the document that “describes data that will be acquired or produced during research; how the data will be managed, described, and stored, what standards you will use, and how data will be handled and protected during and after the completion of the project”. There was no such thing as a DMP-ID, and today that changes.

Over the last few years, there has been considerable community effort toward establishing a standard data model under the Research Data Alliance (RDA) DMP Common Standards Working Group, and we are now able to bring this all together in the form of a new identifier.

DMP schema example

At a fundamental level, DMP IDs are registered as DOIs with the resourceTypeGeneral “OutputManagementPlan.” As of DataCite metadata schema 4.4, the resourceTypeGeneral controlled vocabulary includes this value. DMP IDs are created in the same way as any other DOI, with the same required fields, but must carry the “OutputManagementPlan” resourceTypeGeneral to be identifiable as DMPs.
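To make the registration requirement concrete, here is a minimal sketch of a metadata payload for a DMP ID. It assumes the JSON layout used by DataCite's REST API (a "data" object of type "dois" with an "attributes" block) and builds the payload only, without sending it anywhere. The controlled value is spelled `OutputManagementPlan` in the DataCite 4.4 schema. The function name, the DOI prefix, and all example values are placeholders invented for illustration.

```python
import json

# Attributes this sketch treats as required before registration.
REQUIRED = ("doi", "creators", "titles", "publisher", "publicationYear")


def build_dmp_id_payload(doi, title, creator_names, publisher, year, landing_url):
    """Build a DOI metadata payload marked as a data management plan.

    Assumes the JSON layout of DataCite's REST API; nothing is sent.
    """
    attributes = {
        "doi": doi,
        "titles": [{"title": title}],
        "creators": [{"name": name} for name in creator_names],
        "publisher": publisher,
        "publicationYear": year,
        # This value is what identifies the DOI as a DMP ID.
        "types": {"resourceTypeGeneral": "OutputManagementPlan"},
        "url": landing_url,
    }
    missing = [key for key in REQUIRED if not attributes.get(key)]
    if missing:
        raise ValueError("missing required metadata: " + ", ".join(missing))
    return {"data": {"type": "dois", "attributes": attributes}}


# Placeholder values only; "10.1234" is not a real registrant prefix.
payload = build_dmp_id_payload(
    doi="10.1234/example-dmp",
    title="Example data management plan",
    creator_names=["Doe, Jane"],
    publisher="Example University",
    year=2021,
    landing_url="https://example.org/dmps/example-dmp",
)
print(json.dumps(payload, indent=2))
```

Apart from the resourceTypeGeneral value, nothing in this payload is specific to DMPs, which is the point: a DMP ID is an ordinary DOI with one distinguishing type.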

Generating a DMP ID creates a persistent link between a data management plan and the project’s outputs, and gives access to DataCite’s supporting services, such as Event Data, which facilitate connections via the PID Graph.

Assigning DOIs to persistently identify DMPs is a trend that we have seen already. Since 2019, more than 200 DMPs have been assigned a DOI for their identification. Repositories such as Zenodo made this possible by allowing users to select Data Management Plans as one of the many types of resources.

Distribution of DOIs assigned to DMPs by year (creation of the DOI).

We know through our work with the DMP community that the introduction of the formal DMP ID will allow DMP IDs to proliferate and serve downstream use cases.

Besides persistently identifying DMPs, the assignment of DMP IDs helps realize the promise of machine-actionable DMPs. The DataCite GraphQL API can now expose Data Management Plans and all their connections. Other applications can use the same APIs to build applications on top of machine-actionable DMPs, such as visualizations or summary statistics.
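As one small example of the kind of summary statistic an application might compute, the sketch below counts DMP records by the year their DOI was created. It assumes records shaped like DataCite API results, with `attributes.created` and `attributes.types.resourceTypeGeneral`; it does not call any API, and the sample records are invented purely to exercise the function.

```python
from collections import Counter

# resourceTypeGeneral values this sketch treats as DMPs.
DMP_TYPES = {"OutputManagementPlan"}


def dmp_counts_by_year(records):
    """Count DMP records per DOI creation year, sorted by year."""
    counts = Counter()
    for record in records:
        attrs = record.get("attributes", {})
        resource_type = attrs.get("types", {}).get("resourceTypeGeneral")
        if resource_type not in DMP_TYPES:
            continue  # skip datasets, articles, and other resource types
        created = attrs.get("created", "")
        # Expect an ISO 8601 timestamp; keep only a four-digit year prefix.
        if len(created) >= 4 and created[:4].isdigit():
            counts[int(created[:4])] += 1
    return dict(sorted(counts.items()))


# Invented sample records, not real DataCite data.
sample = [
    {"attributes": {"created": "2019-05-02T10:00:00Z",
                    "types": {"resourceTypeGeneral": "OutputManagementPlan"}}},
    {"attributes": {"created": "2021-03-15T08:30:00Z",
                    "types": {"resourceTypeGeneral": "OutputManagementPlan"}}},
    {"attributes": {"created": "2021-04-01T12:00:00Z",
                    "types": {"resourceTypeGeneral": "OutputManagementPlan"}}},
    {"attributes": {"created": "2021-04-02T09:00:00Z",
                    "types": {"resourceTypeGeneral": "Dataset"}}},
]
print(dmp_counts_by_year(sample))  # {2019: 1, 2021: 2}
```

Filtering on the resource type is what makes this possible at all: without a dedicated type, a DMP's DOI is indistinguishable from any other DOI in the results.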

From today, DataCite members can use the MDS API and Fabrica to assign DMP IDs to their Data Management Plans. Our team has created documentation to support the community in registering DMP IDs, understanding best practices, and exploring related connections in the PID Graph.

We are really pleased to have reached this milestone and look forward to tracking the downstream impact.