Supporting the upcoming NIH data sharing requirements with the DMPTool

Immediately following the new NIH Policy for Data Management and Sharing (DMS) announcement, academic medical centers started to contact the DMPTool team with questions about how best to prepare for the pending policy change. These centers faced new and seemingly daunting requirements for managing and sharing biomedical data; fortunately, the DMPTool offers a simplified way of providing access to federal and local guidance regarding data sharing practices and features to connect researchers and data librarians. As a result, new medical centers join the DMPTool community every week. 

The need for the DMPTool arose in 2011, as transformative requirements for data management plans (DMPs) for NSF proposals began to take shape. In light of the upcoming policy changes for NIH research and the DMS plan requirement, we thought it would be helpful to highlight the work the DMPTool community is doing to help address the needs of the NIH research community.

Community supported work

Given that the DMPTool is based out of the University of California, it seemed fitting that we begin work supporting the NIH policy changes in collaboration with data librarians from the UC system. Our group included librarians from UC Berkeley, UC Irvine, UC San Francisco, and honorary UC, Stanford. The informal Working Group (WG) gathered to develop and share language for a generic NIH template and an FAQ document addressing questions the librarians had fielded thus far regarding the policy. This collaboration resulted in sample text for each of the questions asked by the NIH. Where the group found that the NIH requirements needed clearer explanation, the WG also developed plain-language guidance.

Following this work on the generic NIH DMP template, Nina Exner from Virginia Commonwealth University brought up the need to revise templates as guidance comes out of the NIH, and to assess the need for individual templates for specific NIH Institutes and Centers (ICs). NIH has stated that while the overarching policy is the “minimum,” “ICs may supplement the DMS Policy as appropriate.”

Several NIH ICs have released specific requirements on top of the overarching policy, and we expect additional new IC policies as we get closer to January 25, 2023, the effective date. Exner spearheaded a WG of data librarians to analyze IC policies and develop templates for ICs with different or more restrictive DMS policies when necessary.  Some ICs may not require the development of individual templates, and the WG will make this assessment. There are twenty-seven ICs, so composing individual templates for each necessitates this community effort. 

The WG is also using their experience with researcher needs to craft expanded example language for different disciplines. The new policy affects all disciplines, from basic sciences to clinical and biobehavioral, and the WG wants to help researchers see how the new Plan format connects to their disciplines. With this growing need for NIH templates, we also have a growing need for training. To support the increasing number of new medical centers joining the DMPTool, the WG is developing educational materials to support institutions that want to train local researchers on using the DMPTool templates. We are grateful to the NIH DMP Working Group for taking on this work and creating a valuable set of templates for the larger DMPTool community to utilize.

Ongoing work

The DMPTool has been a community-supported tool from inception, and the new NIH requirements highlight the need for ongoing support to keep the DMPTool up to date with policy changes. The WG is now focused on building and updating the NIH DMP templates and will publish them as work is completed. In recent weeks, NIH has announced implementation details for the DMS policy, including an optional DMS format page and harmonized genomic data sharing requirements, as well as the new FORMS-H grant application forms that will incorporate a new DMS plan section. The DMPTool NIH WG is now adding these to the template. 

How to access the templates

The NIH Generic template with community-developed guidance and sample language is available in the DMPTool by logging in and selecting the “NIH-GEN DMSP (Forthcoming 2023) template” from the Create Plan section. 

Screenshot of selecting the NIH-GEN DMSP (Forthcoming 2023) template in the DMPTool

We will continue to blog on our work as it continues and share the individual IC templates as they are published. Please reach out with any feedback, comments, or suggestions!

Members of the initial UC/Stanford Working Group

  • Anna Sackmann, University of California, Berkeley
  • Ariel Deardorff, University of California, San Francisco
  • Erin Foster, University of California, Berkeley
  • John Borghi, Stanford University
  • Wasila Dahdul, University of California, Irvine

Members of the current NIH DMP Working Group

  • Amy Yarnell, University of Maryland, Baltimore
  • Betsy Gunia, Johns Hopkins University
  • Genevieve Milliken, NNLM National Center for Data Services
  • Jim Martin, University of Arizona
  • Katy Smith, Saint Louis University
  • Lesley Skalla, Duke University Medical Center Library
  • Matt Covey, The Rockefeller University
  • Megan O’Donnell, Iowa State University
  • Megan Potterbusch, George Washington University
  • Melissa Ratajeski, University of Pittsburgh
  • Nina Exner, Virginia Commonwealth University
  • Seonyoung Kim, Washington University School of Medicine in St. Louis
  • Wasila Dahdul, University of California, Irvine
  • Will Dean, Temple University

Things to know about the updated DMPTool website

DMPTool Home Page

We’re starting 2022 with updates to the DMPTool website designed to make using the site easier and more efficient. 

Improved authentication process. We’ve streamlined the authentication process. A user enters their email address, and the system automatically directs them to the next step. This removes a source of complexity: about half of our users authenticate through SSO, while the other half manage their own credentials.
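As a rough illustration of the email-first flow described above, the sketch below routes a user to SSO or to a password prompt based on the domain of the address they enter. It is not the DMPTool implementation: the `SSO_DOMAINS` set, the `NextStep` type, and the `next_auth_step` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Hypothetical: email domains of institutions assumed to be configured for SSO.
SSO_DOMAINS = {"example.edu"}


@dataclass(frozen=True)
class NextStep:
    kind: str    # "sso" or "password"
    domain: str  # the normalized email domain


def next_auth_step(email: str, sso_domains=SSO_DOMAINS) -> NextStep:
    """Decide where to send a user after they enter their email address."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {email!r}")
    # Users from SSO institutions go to their identity provider;
    # everyone else is asked for locally managed credentials.
    kind = "sso" if domain in sso_domains else "password"
    return NextStep(kind=kind, domain=domain)
```

The point of the design is that the user never has to know which group they belong to: the address alone determines the next screen.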

New Public Plans page search functionality. Discovering sample DMPs just got a lot easier. The new public plans listing includes faceted search functionality to narrow results by Funder, Institution, Language, and Subject. Additionally, DMPTool Administrators can now flag a plan as “Featured.” Featured plans are automatically displayed at the top of the page and provide a simple way to highlight exemplars.

If you have plans in your organizational account that you think would be useful to the community, please consider requesting that the plan owner make them public and then flagging them as “Featured.” We hope to build a listing of exemplar DMPs to help showcase best practices and serve as educational aids for those writing their first plans. Of course, we’ll need your help to do this, so I’m hoping we can build on this together as a community by flagging exemplars as “Featured.” 
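To make the faceted search and “Featured” ordering concrete, here is a minimal sketch of how such a listing could be filtered and sorted. It is an illustration only, not the DMPTool code: the `PublicPlan` record and the `search_plans` function are hypothetical, and the four facet names simply mirror the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicPlan:
    # Hypothetical record for one public plan.
    title: str
    funder: str
    institution: str
    language: str
    subject: str
    featured: bool = False


def search_plans(plans, **facets):
    """Keep plans matching every given facet; list featured plans first."""
    allowed = {"funder", "institution", "language", "subject"}
    unknown = set(facets) - allowed
    if unknown:
        raise ValueError(f"unknown facets: {sorted(unknown)}")
    matches = [
        p for p in plans
        if all(getattr(p, name) == value for name, value in facets.items())
    ]
    # sorted() is stable, so plans within each group keep their original order.
    return sorted(matches, key=lambda p: not p.featured)
```

Calling `search_plans(plans, funder="NIH", subject="Biology")` would narrow the listing by two facets at once while still surfacing any featured exemplars at the top.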

New Public Plans listing page with improved tools for discovery.
DMPTool Admins can mark public plans from their organization as “Featured.”

Navigation updates. Admins will notice a change to the administrative functions drop-down list. Administrative functions are now located in the top right-hand corner of the page.

Updated navigation points.

Please see our detailed Release Notes for a complete list of all the changes and new features. 

If you have any questions or comments, please contact us.

Supporting FAIR Data with Integrations

As part of our work to extend the usability and interoperability of the new Networked DMP, we have partnered with Research Space (RSpace), a connected electronic lab notebook, and developed a prototype integration that allows users to track research data throughout the research life cycle. This new integration enables tri-directional data flows between RSpace, DMPTool, and data repositories, facilitating higher quality and more comprehensive research data capture and tracking.

The lack of interoperability between tools is arguably the most significant barrier to streamlining workflows throughout the research lifecycle. This gap prevents the comprehensive collection and incorporation of research data and metadata into the research record captured during the active research phase. Furthermore, it limits the scope for passing this data and metadata on to data repositories, thus undermining FAIR data principles and reproducibility. Bridging this gap is precisely what the integration seeks to address. 

Researchers are now able to reference and update their DMPs from within RSpace. The aims of this feature are:

  1. Add value to DMPs by linking them to datasets generated throughout a study, so that they become ‘living documents.’
  2. Reduce the burden on researchers in keeping their DMPs up to date.
  3. Append DOIs and links for datasets exported from RSpace to a repository, so that they are associated with a DMP.

Learn More

To see this workflow in action, check out this short demo from FORCE 2021, where Julie Goldman from the Harvard Medical School explains how the DMPTool – RSpace – Dataverse workflow enhances research data capture in the Harvard environment. The user documentation also contains a walk-through of the feature with instructions. 

Additionally, an upcoming FORCE11 Community Call on January 25, 2022, will showcase this and other integrations and demonstrate ways in which data capture and reproducibility are enhanced by tool interoperability.  

Please reach out with any comments, suggestions, or ideas for future integrations!

Announcing the FASEB Dataworks! Data Management Plan Challenge

Crossposted from the FASEB website

Biomedical and biological researchers writing data management plans frequently ask for examples of excellent DMPs that incorporate their data types and specific storage, access, and IP requirements. To respond to this need, and to celebrate excellence in managing and sharing biomedical and biological data, FASEB announces the DataWorks! Data Management Plan Challenge.

We are seeking DMPs that incorporate image data types, human subjects data, and big data into a template based on the 2023 NIH Data Management and Sharing Plan Requirements (NOT-OD-014). We invite you to amend your existing DMP or create one specifically for this competition for a chance to win one of 10 Outstanding NIH DMSP awards, each of which includes a prize of $500. 

The competition is a joint initiative with DMPTool, the Research Data Access and Preservation Association, and the Network of the National Library of Medicine.

How to Enter

Rules for Submission

  • The DMP must describe a current or past biological or biomedical research project that uses or generates images or human subjects data and generates big data. 
  • The DMP must use the FASEB DataWorks! template in the DMPTool. You will need to register for an account (it’s free) to access the template.
  • The DMP should be in English and no more than two single-spaced pages in length.
  • The DMP must be prepared using the DMPTool and made publicly available online. 
  • DMP Challenge submissions must use the FASEB submissions form and include a DMP-ID generated through the DMPTool.
  • The author(s) must complete the submission form by 11:59 pm EDT January 31.

Because of legal restrictions beyond our control, while anyone submitting a DMP will be considered for an award, the monetary component of the award is only available to participants who are eligible to work in the United States.

Evaluation

Valid submissions will be reviewed by a panel of expert judges, guided by a DMP rubric based on the Belmont Forum and DART Project, and aligned with the NIH Policy for Data Management and Sharing requirements.  We expect to notify the winners via email by March 1 and publicly announce them at the March DataWorks! Salon and on our website and social media. Contact the Challenge organizers with any questions.

The evaluation rubric is available here.

Our panel judges are:

  • Nicole Contaxis, MLIS, Data Librarian and Lead, NYU Data Catalog, New York University Health Sciences Library
  • Ariel Deardorff, MLIS, Data Services Librarian, University of California San Francisco 
  • Nina Exner, PhD, Research Data Librarian and Associate Professor, Scholarly Communications and Publishing, Virginia Commonwealth University
  • Lisa Federer, PhD, Data Science and Open Science Librarian 
  • Carrie Iwema, PhD, Coordinator of Basic Science Research Support, Health Sciences Library System, University of Pittsburgh 
  • Genevieve Milliken, MSLIS, Data Services Librarian, New York University Health Sciences Library
  • Elizabeth Suelzer, MLIS, Scholarly Communications Librarian, Medical College of Wisconsin

FAIR Island Project Receives NSF Funding

FAIR Island

Crossposted from the FAIR Island website

The California Digital Library (CDL), University of California Gump South Pacific Research Station, Berkeley Institute for Data Science (BIDS), Metadata Game Changers, and DataCite are pleased to announce that they have been awarded a 2-year NSF EAGER grant entitled “The FAIR Island Project for Place-based Open Science” (full proposal text).

The FAIR Island project examines the impact of implementing optimal research data management policies and requirements, affording us the unique opportunity to look at the outcomes of strong data policies at a working field station. Building on the Island Digital Ecosystem Avatars (IDEA) Consortium (see Davies et al. 2016), the FAIR Island Project leverages collaboration between the Gump Station on the island of Moorea in French Polynesia (host of the NSF Moorea Coral Reef Long-Term Ecological Research site), and Tetiaroa Society, which operates a newly established field station located on the atoll of Tetiaroa a short distance from Moorea. 

The FAIR Island project builds interoperability between pieces of critical research infrastructure (DMPs, research practice, PIDs, data policy, and publications), contributing to the advancement and adoption of Open Science. In the global context, there are ongoing efforts to make science Open and FAIR to bring more rigor to the research process, in turn increasing the reproducibility and reusability of scientific results. DataCite, as a global partner in the project, has been working to recognize the importance of better management of research entities. This has led to critical advances concerning the development of infrastructure for Open Science. Increased availability of the different research outputs of a project (datasets, pre-registrations, software, protocols, etc.) would enable the reuse of research to aggregate findings across studies to evaluate discoveries in the field and ultimately assess and accelerate progress.

Key outcomes the FAIR Island team will develop include: 

  1. CDL, BIDS, and the University of California Natural Reserve System will work together to build an integrated system for linking research data to their associated publications via PIDs. We will develop a provenance dashboard from field to publication, documenting all research data and research outcomes derived from that data.
  2. The project also facilitates further development of the DataCite Commons interface and extends connections made possible via the networked DMP that allows users to track relationships between DMPs, investigators, outputs, organizations, research methods, and protocols; and display citations throughout the research lifecycle.
  3. Developing an optimal data policy for place-based research by CDL, BIDS, and Metadata Game Changers is the cornerstone component of the FAIR Island project. A reusable place-based data policy template will be shared and implemented amongst participating UC-managed field stations and marine labs. In addition, we will be incorporating these policies into a templated data management plan within the DMPTool application and sharing it with the broader community via our website, whitepapers, and conferences such as the Research Data Alliance (RDA) Plenaries.

The FAIR Island project is in a unique position to demonstrate how we can advance open science by creating optimal FAIR data policies governing all research conducted at field stations. Starting with the field station on Tetiaroa, the project team plans to demonstrate how FAIR data practices can make the reuse of data and collaboration around data more efficient. Data Management Plans (DMPs) in this “FAIR data utopia” will be utilized as key documents for tracking provenance, attribution, compliance, deposit, and publication of all research data collected on the island by implementing mandatory registration requirements, including extensive use of controlled vocabularies, persistent identifiers (PIDs), and other identifiers.

The project will make significant contributions to international Open Science standards and collaborate with open infrastructure providers to provide a scalable implementation of best practices across services. In addition, DataCite seeks to extend the infrastructure services developed in the project to their member community across 48 countries and 2,500 repositories globally. 

We will continue to share details and feature developments related to the FAIR Island project via our blog. You can join the conversation at the next RDA plenary in November 2021. Feedback or questions are most welcome and can be sent directly to info@fairisland.org

Connecting the DMP ID to an ORCID record

Recently we announced that the DMPTool can now generate persistent, unique IDs (the DMP ID) for plans created within the application. Building on this development, we are thrilled to share that the scholarly identifier service for researchers, ORCID, recently adopted the DMP as a resource type. As a result, DMPs are now a defined work type within an ORCID record and listed on an individual’s ORCID record. The connection between a DMP ID and ORCID is crucial for the Networked DMP, as ORCIDs play a key role in facilitating connections between researchers, institutions, outputs, and projects. It is precisely these types of relationships that we are enabling through our work on Networked DMPs.

Screenshot of manually adding a DMP as a work to an ORCID record

Additionally, DMP IDs generated via the DMPTool are now automatically linked to the DMP creator’s ORCID record. This means that when a DMPTool user “Registers” their plan, a DMP ID is generated, and this record is automatically pushed to ORCID and included as a work on their ORCID profile page. 

“Registering” a DMP will generate a DMP ID and push this work to the associated ORCID record
After a DMP ID is generated this work will be listed as a work on the researcher’s ORCID record

Together with Liz Krznarich from DataCite and DMPTool Editorial Board member Nina Exner from Virginia Commonwealth University, I recently participated in an ORCID Community Call demonstrating this new integration and discussing our approach to building the Networked DMP. A recording of the webinar is available here, and our combined slide deck is available here.

The DMPTool team continues to expand the Networked DMP. Development is currently underway for additional features within the DMPTool, including DMP versioning and advancing our API to facilitate external integrations. We look forward to sharing updates with you soon about these exciting advancements. In the meantime, as always, feedback or questions are most welcome and can be sent directly to maria.praetzellis@ucop.edu.

DMP Competition Winners: DMPs so good they go to 11

Last December we announced the inaugural Qualitative Data Management Plan (DMP) Competition, sponsored jointly by The Qualitative Data Repository, Princeton Research Data Service, and the DMPTool. As qualitative researchers writing such plans frequently ask for examples of excellent DMPs for qualitative research, we hoped that this competition would assemble a trove of exemplar DMPs that we could share with the research community. 

We received a wealth of excellent submissions. Many of the DMPs were so good, in fact, that for that extra push over the cliff we decided to expand our pool of awardees from 10 to 11 outstanding Qualitative DMPs from a wide range of disciplines. We couldn’t be more excited to announce these winners today. We’re hugely thankful to everyone who submitted a DMP, and, of course, to the five data management experts who judged the entries (listed below).

Each entry was reviewed by three expert judges. They assessed DMPs on a 1-4 (not adequate to exemplary) scale for each item in an 18-item rubric based on the DART Project as well as guidance from the DMPTool. Judges also assigned an overall quality score from 1-10 to each DMP. You can find our rubric on OSF. Rubric scores and overall scores were closely correlated (r=.89), suggesting that the rubric closely aligned with experts’ assessments of overall quality. We also asked judges to include some overall observations about each DMP: we have included excerpts from these for each winner.
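For readers who want to run the same kind of check on their own rubric, the sketch below computes a Pearson correlation coefficient between two lists of scores. We are assuming here that the reported r is Pearson's r, which the post does not state explicitly; the function name `pearson_r` is ours, and no competition data is reproduced.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and the root sum of squared deviations of each list.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return co_dev / (dev_x * dev_y)
```

If item scores are summed (an assumption about how the totals were formed), an 18-item rubric scored 1-4 yields totals between 18 and 72; passing those totals as `xs` and the 1-10 overall scores as `ys` gives a value between -1 and 1, where values near 1 mean the two measures rank plans similarly.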

And the awards for Outstanding Qualitative DMP go to:

Listed alphabetically by first author with summary comments from the judges

1. Amelia Acker, Ashley Bower, Emily Simpson, Bethany Radcliff, University of Texas at Austin, School of Information, “COVID-19 Oral Histories Project,” developed for research by Whitney Chappell, University of Texas at San Antonio 

“Wonderful DMP and approach to community-centered work”

2. Nicholas Bell, University of Pennsylvania and Georgetown University, “Why Do So Few Workers Take Trade Adjustment Assistance” 

“This is a strong DMP, and it’s clear the author has thought through and begun implementing good data management principles even in the composition of the DMP itself. Clear descriptions of data collection and plans for storing and sharing.”

3. Patricia Condon, Louise Buckley and Eleta Exline, University of New Hampshire, “Teaching Quantitative Data in the Social Sciences at the University of New Hampshire: Data Management Plan”

“Concise and straightforward descriptions of data formats, plans for storing and preserving … Wonderful DMP and acknowledgement that it’s a living document!”

4. Dayna Cueva Alegría, University of Kentucky, NSF SBE, “Water Pollution Governance in Lake Titicaca: Creating Political Spaces of Democratization”

“Strong DMP with a lot of attention and detail paid to data formats, storage, preservation, and sharing”

5. Laura Garbes, Brown University, NSF SBE, with Andrew Creamer, Science Data Specialist, Brown University,  “Analyzing Diversity Efforts in Public Radio Organizations – A comparative approach to performance standards in the workplace” 

“…this DMP is pretty much perfect. Includes different measures to avoid issues related to confidentiality and security as well as it is clearly committed to data discoverability, accessibility and reusability, specially when articulates about the storage/archiving options”

6. Christopher Hale, University of Alabama, NSF-SBE, “Ethnic Diversity and Public Goods Provision Across Latin America”

“Strong plan for description of data collection, storage, and sharing, with good attention to considerations for de-identifying data during the entire process, prior to depositing with the repository. This DMP has a lot of great detail about the security and anonymity practices of the PI…”

7. Jaeci Hall, University of Oregon, NSF-SBE, “Text Analysis of Taldash (GAL) in Support of Nuu-wee-ya’ Language Revitalization: Indigenous-based linguistic analysis and methodological reflections”

“This DMP is an excellent example of cultural sensitivity when working with indigenous materials… Good plans for handling sensitive data and the role of partner institutions with regards to data ownership and rights to share.”

8. Tina Nabatchi, PARCC, Syracuse University and  Rebecca McLain, Portland State University, NSF SBE, “The Atlas of Collaboration: Building the World’s First Large N Database on Collaborative Governance” 

“This DMP is strong in describing both how data will be gathered and maintained now, and how it will be appropriately archived in the future. Provides a great description of the expected data and roles and responsibilities with regard to data in a multi-institutional project. Fantastic DMP.”

9. Joshua Rubin, Bates College, NSF SBE, with Pete Schlax, Science and Data Librarian, Bates College, “Possibility Spaces and Possible Things”

“Overall an excellent DMP… [T]he overall plan is strengthened by inclusion of QDR selection for data sharing”

10. Carolina Seigler, Princeton University, Department of Sociology, NSF-SBE, “Religion and Sexual Violence” 

“Compelling DMP, really made the case why the data cannot be shared well and the security provisions were exemplary.” 

11. Ieva Zumbyte, Brown University, NSF-SES, with Andrew Creamer, Science Data Specialist, Brown University, “Tracing the Quality of Public Childcare in the Neighborhoods of Chennai, India”

“…carefully considers issues such as licensing and re-identification of de-identified data… Very good description of the chosen repository and the characteristics that backup such a choice, even though the raw data won’t be shared.”

Our panel of expert judges

  • Renata G. Curty, Social Sciences Research Facilitator, UCSB Library’s Research Data Services, University of California, Santa Barbara
  • Jennifer Doty, Research Data Librarian, Emory University
  • Celia Emmelhainz, Anthropology & Qualitative Research Librarian, University of California, Berkeley
  • Megan O’Donnell, Data Services Librarian, Iowa State University
  • Vicky Rampin, Research Data Management and Reproducibility Librarian, New York University Libraries

DMP IDs and the DMPTool: Announcing DMPTool v. 3.1

Image from “The Post-Office annual Glasgow directory” (1828) (https://flic.kr/p/oe2ZFe)

Building on the recent creation of “A Brave New PID” for DMPs, we are excited to announce that DMP creators can now receive IDs for their DMPs within the DMPTool. From the outset of our NSF-funded EAGER research project, the ability to generate DMP IDs has been on the strategic roadmap for integrating DMPs into the scholarly knowledge sharing and persistent identifier ecosystem.

Supporting NSF recommendations for data management 

As the NSF detailed in their May 2019 Dear Colleague Letter on Effective Practices for Data, it is recommended that researchers utilize PIDs for their data outputs and generate DMPs that allow for automated information exchange (machine-actionable DMPs, “maDMPs”). Expanding on the NSF Dear Colleague Letter, the recent report by ARL, AAU, APLU and CDL, Implementing Effective Data Practices: Stakeholder Recommendations for Collaborative Research Support, provided key recommendations for effective data practices to support an open research ecosystem emphasizing the adoption of PIDs and maDMPs. 

The DMPTool team continues to work towards supporting these recommendations by building new features and services for an open, automatically updated, interconnected system for data management of research projects. 

Our new feature of generating IDs for DMPs represents tangible progress towards achieving our shared goal of moving DMPs from static text documents into structured, interoperable data that is able to be fed across stakeholders, linking metadata, repositories, and institutions, and allowing for notifications, verification, and reporting in real-time. 

What’s included in this latest release? 

Below is an outline of three new features included in this release. For technical details and a few additional features included, please see our v3.1.0 documentation. These improvements have also been distributed to the larger community within our shared open source codebase, DMPRoadmap. Thank you to the DMPTool Editorial Board for their guidance and feedback as we developed this feature set. We are also appreciative of the DMPTool Administrators who submitted feedback on an early iteration of this release. We intend to incorporate many of these suggestions in future releases and to build on the many good ideas shared by all as we continue to expand our support for Networked DMPs.

1. IDs for DMPs

Within the Finalize/Publish tab users can “Register” their plan and generate a DMP ID. The DMP ID will then display within the tool and link to a landing page for the plan. For further details on this feature please see our DMP ID documentation.

2. DMP ID Landing Page

After a plan receives a DMP ID, the system will generate a DMP landing page that includes high-level details about the plan. The DMP ID metadata does not include the narrative components of a DMP. For an example of a DMP ID landing page, please see this DMP.

The landing pages also demonstrate the types of connections made possible by tracking a research project over time from the point of DMP creation. As a project progresses over time, updates to the plan can be connected to the DMP ID and will display on the associated landing page. 

3. Research Outputs Tab

The new Research Outputs tab allows researchers to describe specific project outputs in a more granular and controlled manner than was previously possible solely via the text narrative. In designing this new section, we strove to utilize as many controlled vocabularies and PIDs as possible. Here are some highlights of the new tab:

  • Repository selector tool utilizing the Registry of Research Data Repositories (re3data) that allows researchers to define where they anticipate depositing a specific output
  • License selector (from SPDX) that allows researchers to define the associated license for specific outputs 
  • Ability to flag outputs as containing sensitive data and/or PII

Researchers can create an unlimited number of specific research outputs. All entered outputs are included in the downloaded version of the plan, placed after the narrative component of the plan so as not to interfere with funder page count limits. 
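As an illustration of what a structured research output might look like, and of how outputs can be appended after the narrative in a download, here is a small sketch. The `ResearchOutput` fields and the `render_plan` function are hypothetical and are not taken from the DMPTool or DMPRoadmap codebase; the only external convention assumed is that licenses are recorded as SPDX identifiers such as "CC-BY-4.0".

```python
from dataclasses import dataclass


@dataclass
class ResearchOutput:
    # Hypothetical fields mirroring the highlights listed above.
    title: str
    repository: str             # repository name chosen from a registry
    license_id: str             # SPDX license identifier, e.g. "CC-BY-4.0"
    sensitive: bool = False     # flagged as containing sensitive data
    contains_pii: bool = False  # flagged as containing PII


def render_plan(narrative: str, outputs: list[ResearchOutput]) -> str:
    """Place the structured outputs after the narrative text."""
    lines = [narrative.rstrip(), "", "Planned Research Outputs"]
    for number, out in enumerate(outputs, start=1):
        flags = []
        if out.sensitive:
            flags.append("sensitive")
        if out.contains_pii:
            flags.append("PII")
        suffix = f" [{', '.join(flags)}]" if flags else ""
        lines.append(
            f"{number}. {out.title} | {out.repository} | {out.license_id}{suffix}"
        )
    return "\n".join(lines)
```

Keeping the outputs in a separate trailing section is what lets the narrative stay within a funder's page limit no matter how many outputs are listed.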

What’s up next?

With the ability to generate DMP IDs now in place, we are one step closer to creating networked, living DMPs. While this is a great start, we have many additional features in development that will extend the usability and interoperability of this new generation of DMPs. In the coming months, we will be working on developing these additional features:

  • Connecting DMPs to other related research outputs such as datasets and journal articles via the PID Graph
  • Connecting DMP IDs to corresponding ORCID records 
  • Incorporating additional PIDs including research resource identifiers (RRIDs)
  • Sponsor and funder approval workflow wherein these stakeholders can review, comment, and approve submitted DMPs
  • Integration with the Electronic Lab Notebook, RSpace
  • Adding the ability for DMPTool admins to curate a list of recommended repositories for the new repository selector tool 

Additionally, in response to several DMPTool admin requests for outreach materials supporting adoption of the DMP ID, we are developing materials to share with the DMPTool admin community in order to promote these data practices amongst their users. 

We will continue to share details on this work and the development of new features to support the networked DMP. Stay tuned for further advancements over the coming months.

As always, feedback or questions are most welcome and can be sent directly to maria.praetzellis@ucop.edu.

A Brave New PID: DMP-IDs

Cross-posted from DataCite written by Kristian Garza and Matt Buys

Original post: https://doi.org/10.5438/j22a-5d79

Despite the challenges over the last year, we are pleased to share some exciting news about launching the brave new PID, DMP IDs. Two years ago we set out a plan in collaboration with the University of California Curation Center and the DMPTool to bring DMP IDs to life. The work was part of the NSF EAGER grant DMP Roadmap: Making Data Management Plans Actionable and allowed us to explore the potential of machine-actionable DMPs as a means to transform the DMP into a critical component of networked research data management.

The plan was to develop a persistent identifier (PID) for Data Management Plans (DMPs). We already have PIDs for many entities: articles and datasets (DOIs), people (ORCID iDs), and research organizations (ROR IDs). We knew that it would be important for DataCite to support the community in establishing a unique persistent identifier for DMPs. Until now, we had no PID for the document that “describes data that will be acquired or produced during research; how the data will be managed, described, and stored, what standards you will use, and how data will be handled and protected during and after the completion of the project”. There was no such thing as a DMP ID; today that changes.

Over the last few years, there has been lots of community effort towards establishing a standard data model under the Research Data Alliance (RDA) DMP Common Standards Working Group and we are now able to bring this all together in the form of a new identifier.

DMP schema example

DMP IDs at a fundamental level are registered as a DOI with the resourceTypeGeneral “OutputsManagementPlan.” Since the DataCite release of schema 4.4, the resourceTypeGeneral controlled vocabulary now includes this as a controlled list item. DMP IDs are created in the same way as registering any DOI, with the same required fields, but must include the “OutputsManagementPlan” resourceTypeGeneral to be identifiable.
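As a rough illustration of the paragraph above, the sketch below assembles DOI metadata for a DMP and checks whether a record would be identifiable as a DMP ID. Only the "OutputsManagementPlan" value comes from this post. The helper names, the placeholder DOI and URL, and the exact attribute layout are assumptions made for illustration and should be checked against the DataCite documentation before use.

```python
import json

# The resourceTypeGeneral value named in the post; everything else here is illustrative.
DMP_RESOURCE_TYPE = "OutputsManagementPlan"


def build_dmp_record(doi, title, creators, publisher, year, url):
    """Assemble DOI metadata for a DMP (hypothetical helper, not a DataCite API)."""
    return {
        "doi": doi,
        "titles": [{"title": title}],
        "creators": [{"name": name} for name in creators],
        "publisher": publisher,
        "publicationYear": year,
        # This is the field that marks the DOI as a DMP ID.
        "types": {"resourceTypeGeneral": DMP_RESOURCE_TYPE},
        "url": url,
    }


def is_dmp_id(record):
    """Return True if the record carries the DMP resource type."""
    return record.get("types", {}).get("resourceTypeGeneral") == DMP_RESOURCE_TYPE


record = build_dmp_record(
    doi="10.1234/example-dmp",           # placeholder DOI, not a registered identifier
    title="Example Data Management Plan",
    creators=["Researcher, Example"],
    publisher="Example University",
    year=2021,
    url="https://example.org/dmps/1",    # placeholder landing page
)
print(json.dumps(record, indent=2))
```

Keeping the resource type in a single constant means that a record built with the same fields but a different type, such as "Dataset", is simply not treated as a DMP ID by the check.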

Generating a DMP ID creates a persistent link between a data plan and the project outputs, and it allows access to DataCite’s supporting services such as Event Data to facilitate connections via the PID Graph.

Assigning DOIs to persistently identify DMPs is a trend that we have seen already. Since 2019, more than 200 DMPs have been assigned a DOI for their identification. Repositories such as Zenodo made this possible by allowing users to select Data Management Plans as one of the many types of resources.

Distribution of DOIs assigned to DMPs by year (creation of the DOI).

We know through our work with the DMP community that the introduction of the formal DMP ID will allow DMP IDs to proliferate and serve downstream use cases.

Besides persistently identifying DMPs, the assignment of DMP IDs realizes the promise of machine-actionable DMPs. The DataCite GraphQL API can now expose Data Management Plans and all their connections. Other applications can use the same APIs to build on machine-actionable DMPs, for example to produce visualizations or summary statistics.

From today, DataCite members can use the MDS API and Fabrica to assign DMP IDs to their Data Management Plans. Our team has created documentation to support the community in registering DMP IDs, understanding best practices and exploring related connections in the PID Graph.

We are really pleased to have reached this milestone and look forward to tracking the downstream impact.

DMPRoadmap Annual Planning Meeting

This is a joint blog post between DMPonline and the DMPTool

In February we conducted our annual strategic planning meeting between DCC and CDL to discuss joint plans for the upcoming year. From DCC we were joined by Kevin Ashley, Patricia Herterich, Magdalena Drafiova, Marta Nicholson, Ray Carrick, Angus Whyte and Diana Sisu, and from CDL by John Chodacki, Marisa Strong, Catherine Nancarrow, Brian Riley and Maria Praetzellis.

This meeting was a follow-up to our 2019 meeting, where we had a chance to meet with our colleagues for three days. This time around we had to swap the lovely city of Edinburgh for Zoom, and we met for only a half day instead of three. Nonetheless, we managed to hold some important high-level planning discussions about continuing our collaboration on the Roadmap codebase. In this blog post we summarize what we discussed and share our plans for the coming months.

Celebrating the achievements of 2020

We all agreed that despite the many challenges of 2020 (not to mention the departure of Sarah Jones and Sam Rust), this was a very successful year for our collaboration. Our team of developers completed several large pieces of work, a few of which are highlighted below:

  • Completed the Rails5 migration 
  • Developed an API that is compliant with the RDA Common Standard for DMPs
  • Released a new feature allowing for conditional questions and notifications within DMP templates
  • Improved the usage dashboard
  • Integrated with Google Analytics
  • Integrated with translation.io to facilitate several languages

Several new features supporting machine-actionable DMPs were also released over the past year, including:

  • ROR identifiers for research organizations
  • Funder Registry identifiers for funders
  • ORCID iDs for DMP creators and collaborators
  • API compliant with RDA Common Standard Metadata Schema 
  • Ability to export plans as RDA Common Standard compliant JSON
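
To give a feel for what an RDA Common Standard style JSON export might contain, here is a minimal sketch. The field names are assumed from the RDA DMP Common Standard and cover only a small subset. The helper functions, the list of checked fields, and all values are hypothetical, and a real export from DMPTool or DMPonline will contain more detail.

```python
import json

# Top-level fields this sketch checks for (an illustrative subset, not the full standard).
CHECKED_DMP_FIELDS = ("title", "created", "modified", "dmp_id", "contact", "dataset")


def build_madmp(title, dmp_doi, contact_name, contact_email, dataset_title):
    """Build a minimal machine-actionable DMP as a dict (hypothetical helper)."""
    return {
        "dmp": {
            "title": title,
            "created": "2021-03-01T00:00:00Z",   # placeholder timestamps
            "modified": "2021-03-01T00:00:00Z",
            "language": "eng",
            "ethical_issues_exist": "unknown",
            "dmp_id": {"identifier": dmp_doi, "type": "doi"},
            "contact": {"name": contact_name, "mbox": contact_email},
            "dataset": [{"title": dataset_title}],
        }
    }


def missing_fields(plan):
    """List the checked top-level fields that are absent from the plan."""
    dmp = plan.get("dmp", {})
    return [field for field in CHECKED_DMP_FIELDS if field not in dmp]


plan = build_madmp(
    title="Example Data Management Plan",
    dmp_doi="https://doi.org/10.1234/example-dmp",  # placeholder DMP ID
    contact_name="Example Researcher",
    contact_email="researcher@example.org",
    dataset_title="Survey responses",
)
exported = json.dumps(plan, indent=2)  # the JSON text a client would receive
print(exported)
print(missing_fields(json.loads(exported)))
```

Because the plan is plain JSON, any system that understands the shared field names can read the DMP ID, contact, and planned datasets without scraping a PDF or Word document.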

Highlights of our 2021 Development Plans 

During the first quarter of 2021, DMPonline will focus on consolidating the codebase, making sure the various changes both the DMPTool and DMPonline teams have developed over the past year are integrated and that any new work is carried out on top of a shared codebase.

UX Improvements 

Based on the extensive usability testing that both DMPTool and DMPonline have conducted over the past year, we will select pieces of work that will have significant impact for both services. Initially we will focus on a new plan creation wizard that makes it easier to create new plans and to select templates and appropriate guidance.

Expanded machine-actionable DMP features

  • The ability to generate a unique identifier for a DMP with an associated landing page that connects the DMP to eventual research outputs
  • A new Research Outputs tab will allow for more granular description of specific research outputs 
  • Integration with the Registry of Research Data Repositories (re3data)
  • Integration with FAIRsharing
  • Plan versioning

DMPRoadmap for funders

In 2021, we will also work on making DMPRoadmap more useful to funders. This will include:

  • A different dashboard view
  • Easier ways to integrate grant numbers and other funder specific information
  • Tagging of institutional DMP templates as funder compliant

Other collaborations

The DMPonline team will also work with TU Delft on a project to integrate the system more closely with institutional login options, so that it can automatically gather more information about users and use it to improve workflows and reporting for institutional admins.

RSpace integration

The electronic lab notebook, RSpace, and the DMPTool are currently working on an integration allowing for the bi-directional linking of data between DMPTool and RSpace. The first phase of this work is currently in development and utilizes OAuth so that users can connect accounts. Once we get this initial connection running, the team will look at bi-directional notifications and updates between the two systems.

For a more detailed description of our upcoming development plans please see our wiki page. This promises to be another busy but exciting year of work for both teams and we look forward to continuing to share our progress with you!