Latest DMPTool Release 

The DMPTool team is hard at work developing a suite of new features to facilitate the creation of optimized and structured DMPs. In response to growing federal mandates for data sharing, the DMPTool is focused on supporting these new policies by exposing the information contained within a DMP in a machine-readable format, making it easier to track research projects as they progress through the research lifecycle. Two National Science Foundation EAGER grants have supported this work: DMP Roadmap: Making Data Management Plans Actionable and the current FAIR Island project.

This latest DMPTool release includes several updates focused on UX improvements and streamlining workflows.

New Follow-up tab 

Follow-up tab to update funding status & link research outputs

This new tab on the Create Plan workflow allows plan creators (or administrators with permission to modify a plan) to add information about funded projects. The new fields include the funding status and a grant number or ID. This tab will only appear on plans that have associated DMP-IDs.

A new section allows users to connect Research Outputs from their work to their associated DMP. Research Outputs can be anything related to a project with a DOI or other URL-based identifier. For example, a project output could be a dataset, protocol, or software connected to a research project. 

These research outputs will be recorded in the metadata of the DMP-ID as a related identifier and passed back to DataCite. By submitting updated metadata to DataCite, this workflow facilitates tracking scholarly outputs and is openly available for consumption and ingest by other systems.
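As a rough sketch of what such an update might look like: the DataCite REST API accepts JSON:API payloads whose attributes follow the DataCite metadata schema, including a `relatedIdentifiers` list. The DOIs below are placeholders and the helper function is purely illustrative, not the DMPTool's actual implementation:

```python
# Illustrative sketch of a DataCite-style metadata update that records a
# research output as a related identifier on a DMP-ID. DOIs are placeholders.
def related_identifier_payload(dmp_doi: str, output_doi: str) -> dict:
    return {
        "data": {
            "type": "dois",
            "attributes": {
                "relatedIdentifiers": [
                    {
                        "relatedIdentifier": output_doi,
                        "relatedIdentifierType": "DOI",
                        # "References" indicates the DMP points to this output
                        "relationType": "References",
                    }
                ]
            },
        }
    }

payload = related_identifier_payload("10.48321/D1ABCD", "10.5281/zenodo.123456")
```

The `relationType` value comes from the controlled list defined in the DataCite metadata schema; which relation the DMPTool actually uses is an implementation detail not shown here.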

UX changes to the Finalize tab

In response to user feedback, we have streamlined the UI to clarify what a researcher needs to do to get a persistent identifier for their DMP and why they would want one. UI changes include: 

  • New explanatory text explains what a DMP-ID is, what identifiers do, and why researchers should get one for their DMP. 
  • The “Register” button is present but disabled if the preconditions are not met. 
  • We moved the Finalize tab before Download to better reflect the logical workflow of creating a DMP. 

Improved DMP-ID Landing Page Design

Based on user feedback, we have redesigned the DMP-ID landing page for improved accessibility and to make it clear where a user can view a PDF version of a plan. This redesign also allows us to build new plan versioning features in the coming months. 

Sample DMP-ID Landing Page with a link to the full-text narrative 
Example of Project Outputs as they appear on the DMP ID landing page

Other miscellaneous updates & bug fixes included in this release

  • Research Outputs now appear in the CSV and TXT versions #406
  • Fixed an issue that caused the DOCX version of the plan to display an error in MS Word when opening the document
  • Fixed an issue with the sans-serif font used in PDF generation. Switched from Helvetica (which is no longer downloadable for free) to Roboto and also updated spacing between questions/sections.
  • Fixed an issue that was preventing an institutional admin from adding more than one URL/link on the Org Details page #413 #405
  • Fixed an issue that was preventing associated research outputs from being deleted #372
  • Fixed an issue with the emails sent out after the plan’s visibility changes #416
  • Other updates are detailed in the release notes.

How to get involved

We welcome contributions or collaborations. For those interested in following our work’s technical development, please see our GitHub project board. Please contact us if you have suggestions or ideas for pilot partnerships or if you’re interested in being an early tester.

Updates on DMPTool Support for the NIH DMSP Requirements 

A few months back, we announced a new DMPTool NIH Template Working Group focused on supporting the upcoming NIH requirements for data sharing. Since then, this hard-working group, chaired by Nina Exner of Virginia Commonwealth University, has collaborated to develop several new resources for the community. 

Updated NIH-GEN Template

The updated NIH-GEN DMSP (forthcoming 2023) template (v6) follows the structure laid out by the NIH in the optional DMS Plan format and aligns with the NIH-recommended Elements of a DMS Plan. The DMPTool NIH Template Working Group augmented NIH notices and other policy documents with additional sample language and guidance designed to help researchers create DMS plans. The new NIH-GEN DMSP (forthcoming 2023) template also includes guidance for data covered under the Genomic Data Sharing (GDS) policy, as NIH now expects a single data sharing plan to satisfy both the GDS Policy and the DMS Policy (NOT-OD-22-198). 

The new NIH-GEN DMSP (forthcoming 2023) template includes answer prompts and sample answer text. 

Test out the new template by creating a plan. Or, preview the new template by downloading a PDF version with sample language and guidance included. 

DMPTool administrators can customize this (or any DMPTool template) and add institution-specific guidance and sample language. Instructions on how to customize templates and a short video tutorial are available. Any institutions with existing customizations will need to migrate to this new version of the template by publishing the template. Please see our documentation on the two steps required to transfer existing customizations. 

The DMPTool will deprecate the older NIH templates (NIH-GDS: Genomic Data Sharing and NIH-GEN: Generic (Current until 2023)) on January 24. Any plans based on these older templates will still be available, but new plans for NIH will be directed to the new template. After we make this switch, the NIH-GEN DMSP (Forthcoming 2023) template title will change to NIH-GEN. 

New educational materials

The Education Sub-committee of the DMPTool NIH Template Working Group developed materials that institutions can use to promote the NIH requirements and use of the DMPTool. To support the increasing number of new medical centers and other institutions joining the DMPTool community, the Sub-committee produced a slide deck and flyers that institutions can utilize to train local researchers on using the DMPTool templates.

The Education Sub-committee also collaborated on a DMPTool training workshop held by the Network of the National Library of Medicine’s National Center for Data Services (NCDS). The first DMPTool workshop was held in December and broke attendance records. Betsy Gunia of Johns Hopkins led this training webinar, giving an excellent overview and DMPTool demonstration. 

If you missed this first session, never fear! A recording of the session is available via the NNLM, and Jim Martin of the University of Arizona will give a repeat session on February 15. Registration is available via the NNLM.

Ongoing work

We will continue to iterate on the new templates, including the sample language and guidance provided, and welcome feedback from the community. As the NIH releases additional recommendations and guidance, we’ll continue incorporating these into NIH templates. As always, please reach out with any questions, suggestions, or feedback! 

Supporting the upcoming NIH data sharing requirements with the DMPTool

Immediately following the new NIH Policy for Data Management and Sharing (DMS) announcement, academic medical centers started to contact the DMPTool team with questions about how best to prepare for the pending policy change. These centers faced new and seemingly daunting requirements for managing and sharing biomedical data; fortunately, the DMPTool offers a simplified way of providing access to federal and local guidance regarding data sharing practices and features to connect researchers and data librarians. As a result, new medical centers join the DMPTool community every week. 

The need for the DMPTool arose in 2011, as transformative requirements for data management plans (DMPs) for NSF proposals began to take shape. In light of the upcoming policy changes for NIH research and the DMS plan requirement, we thought it would be helpful to highlight the work the DMPTool community is doing to help address the needs of the NIH research community.

Community supported work

Given that the DMPTool is based out of the University of California, it seemed fitting that we begin work supporting the NIH policy changes in collaboration with data librarians from the UC system. Our group included librarians from UC Berkeley, UC Irvine, UC San Francisco, and honorary UC, Stanford. The informal Working Group (WG) gathered to develop and share language for a generic NIH template and an FAQ document addressing questions the librarians had fielded thus far regarding the policy. This collaboration resulted in sample text for each of the questions asked by the NIH. In a few areas the group found that the NIH requirements needed clearer explanation, so the WG also developed plain-language guidance where necessary.  

Following this work on the generic NIH DMP template, Nina Exner from Virginia Commonwealth University raised the need to revise templates as guidance comes out of the NIH and to assess the need for individual templates for specific NIH Institutes and Centers (ICs). NIH has stated that while the overarching policy is the “minimum,” “ICs may supplement the DMS Policy as appropriate.” 

Several NIH ICs have released specific requirements on top of the overarching policy, and we expect additional IC policies as we approach January 25, 2023, the effective date. Exner spearheaded a WG of data librarians to analyze IC policies and develop templates for ICs with different or more restrictive DMS policies where necessary. Some ICs may not require individual templates, and the WG will make this assessment. There are twenty-seven ICs, so composing individual templates for each necessitates this community effort. 

The WG is also using their experience with researcher needs to craft expanded example language for different disciplines. The new policy affects all disciplines, from basic sciences to clinical and biobehavioral, and the WG wants to help researchers see how the new Plan format connects to their disciplines. With this growing need for NIH templates, we also have a growing need for training. To support the increasing number of new medical centers joining the DMPTool, the WG is developing educational materials to support institutions that want to train local researchers on using the DMPTool templates. We are grateful to the NIH DMP Working Group for taking on this work and creating a valuable set of templates for the larger DMPTool community to utilize.

Ongoing work

The DMPTool has been a community-supported tool from inception, and the new NIH requirements highlight the need for ongoing support to keep the DMPTool up to date with policy changes. The WG is now focused on building and updating the NIH DMP templates and will publish them as work is completed. In recent weeks, NIH has announced implementation details for the DMS policy, including an optional DMS format page and harmonized genomic data sharing requirements, as well as the new FORMS-H grant application forms that will incorporate a new DMS plan section. The DMPTool NIH WG is now adding these to the template. 

How to access the templates

The NIH Generic template with community-developed guidance and sample language is available in the DMPTool by logging in and selecting the “NIH-GEN DMSP (Forthcoming 2023) template” from the Create Plan section. 

Screenshot of selecting the NIH-GEN DMSP (Forthcoming 2023) template in the DMPTool

We will continue to blog on our work as it continues and share the individual IC templates as they are published. Please reach out with any feedback, comments, or suggestions!

Members of the initial UC/Stanford Working Group

  • Anna Sackmann, University of California, Berkeley
  • Ariel Deardorff, University of California, San Francisco
  • Erin Foster, University of California, Berkeley
  • John Borghi, Stanford University
  • Wasila Dahdul, University of California, Irvine

Members of the current NIH DMP Working Group

  • Amy Yarnell, University of Maryland, Baltimore
  • Betsy Gunia, Johns Hopkins University
  • Genevieve Milliken, NNLM National Center for Data Services
  • Jim Martin, University of Arizona
  • Katy Smith, Saint Louis University
  • Lesley Skalla, Duke University Medical Center Library
  • Matt Covey, The Rockefeller University
  • Megan O’Donnell, Iowa State University
  • Megan Potterbusch, George Washington University
  • Melissa Ratajeski, University of Pittsburgh
  • Nina Exner, Virginia Commonwealth University
  • Seonyoung Kim, Washington University School of Medicine in St. Louis
  • Wasila Dahdul, University of California, Irvine
  • Will Dean, Temple University

Things to know about the updated DMPTool website

DMPTool Home Page

We’re starting 2022 with updates to the DMPTool website designed to make using the site easier and more efficient. 

Improved authentication process. We’ve streamlined sign-in: a user enters their email address, and the system automatically directs them to the next step. This removes complexity that arose because about half of our users authenticate via SSO while the other half manage their own credentials. 

New Public Plans page search functionality. Discovering sample DMPs just got a lot easier. The new public plans listing includes faceted search functionality to narrow results by funder, institution, language, and subject. Additionally, DMPTool administrators can now flag a plan as “Featured.” Featured plans are automatically displayed at the top of the page and provide a simple way to highlight exemplars. 

If you have plans in your organizational account that you think would be useful to the community, please consider requesting that the plan owner make them public and then flagging them as “Featured.” We hope to build a listing of exemplar DMPs to help showcase best practices and serve as educational aids for those writing their first plans. Of course, we’ll need your help to do this, so I’m hoping we can build on this together as a community by flagging exemplars as “Featured.” 

New Public Plans listing page with improved tools for discovery.
DMPTool Admins can mark public plans from their organization as “Featured.”

Navigation updates. Admins will notice a change to the administrative functions drop-down list. Administrative functions are now located in the top right-hand corner of the page. 

Updated navigation points.

Please see our detailed Release Notes for a complete list of all the changes and new features. 

If you have any questions or comments, please contact us.

DMP IDs and the DMPTool: Announcing DMPTool v. 3.1

Image from "The Post-Office annual Glasgow directory" (1828)
Image from “The Post-Office annual Glasgow directory” (1828) (https://flic.kr/p/oe2ZFe)

Building on the recent creation of “A Brave New PID” for DMPs, we are excited to announce that DMP creators can now receive IDs for their DMPs within the DMPTool. From the outset of our NSF-funded EAGER research project, the ability to generate DMP IDs has been on the strategic roadmap for integrating DMPs into the scholarly knowledge-sharing and persistent identifier ecosystem. 

Supporting NSF recommendations for data management 

As the NSF detailed in its May 2019 Dear Colleague Letter on Effective Practices for Data, researchers are encouraged to use PIDs for their data outputs and to create DMPs that allow for automated information exchange (machine-actionable DMPs, or “maDMPs”). Expanding on the Dear Colleague Letter, the recent report by ARL, AAU, APLU, and CDL, Implementing Effective Data Practices: Stakeholder Recommendations for Collaborative Research Support, provided key recommendations for effective data practices to support an open research ecosystem, emphasizing the adoption of PIDs and maDMPs. 

The DMPTool team continues to work towards supporting these recommendations by building new features and services for an open, automatically updated, interconnected system for data management of research projects. 

Our new feature for generating IDs for DMPs represents tangible progress towards our shared goal of moving DMPs from static text documents into structured, interoperable data that can flow across stakeholders, linking metadata, repositories, and institutions, and allowing for notifications, verification, and reporting in real time. 

What’s included in this latest release? 

Below is an outline of three new features included in this release. For technical details and a few additional features, please see our v3.1.0 documentation. These improvements have also been distributed to the larger community within our shared open source codebase, DMPRoadmap. Thank you to the DMPTool Editorial Board for their guidance and feedback as we developed this feature set. We are also appreciative of the DMPTool administrators who submitted feedback on an early iteration of this release. We intend to incorporate many of these suggestions in future releases and to build on the many good ideas shared as we continue to expand our support for Networked DMPs.

1. IDs for DMPs

Within the Finalize/Publish tab, users can “Register” their plan and generate a DMP ID. The DMP ID will then display within the tool and link to a landing page for the plan. For further details on this feature, please see our DMP ID documentation.

2. DMP ID Landing Page

After receiving a DMP ID, the system will generate a DMP landing page that includes high-level details about the plan. The DMP ID metadata does not include the narrative components of a DMP. For an example of a DMP ID landing page, please see this DMP.

The landing pages also demonstrate the types of connections made possible by tracking a research project over time from the point of DMP creation. As a project progresses over time, updates to the plan can be connected to the DMP ID and will display on the associated landing page. 

3. Research Outputs Tab

The new Research Outputs tab allows researchers to describe specific project outputs in a more granular and controlled manner than was previously possible solely via the text narrative. In designing this new section, we strove to use as many controlled vocabularies and PIDs as possible. Here are some highlights of the new tab:

  • Repository selector tool utilizing the Registry of Research Data Repositories (re3data) that allows researchers to define where they anticipate depositing a specific output 
  • License selector (from SPDX) that allows researchers to define the associated license for specific outputs 
  • Ability to flag outputs as containing sensitive data and/or PII

Researchers can create an unlimited number of specific research outputs. All entered outputs are included in the downloaded version of the plan, placed after the narrative component of the plan so as not to interfere with funder page count limits. 
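Conceptually, each research output becomes a small structured record rather than free text. A minimal sketch of that shape follows; the class and field names are hypothetical illustrations, not the DMPTool's actual data model, and the re3data ID shown is one example value:

```python
from dataclasses import dataclass

@dataclass
class ResearchOutput:
    """Hypothetical record mirroring the controlled fields described above."""
    title: str
    output_type: str        # e.g. "dataset", "software", "protocol"
    repository_id: str      # re3data identifier for the anticipated repository
    license_id: str         # SPDX license identifier, e.g. "CC-BY-4.0"
    sensitive_data: bool = False
    contains_pii: bool = False

output = ResearchOutput(
    title="Interview transcripts",
    output_type="dataset",
    repository_id="r3d100010468",  # a re3data registry ID (example value)
    license_id="CC-BY-4.0",        # an SPDX identifier
    sensitive_data=True,
)
```

Because repository and license are drawn from registries (re3data, SPDX) rather than free text, downstream systems can interpret the values without parsing prose.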

What’s up next?

With the ability to generate DMP IDs now in place, we are one step closer to creating networked, living DMPs. While this is a great start, we have many additional features in development that will extend the usability and interoperability of this new generation of DMPs. In the coming months, we will be working on developing these additional features:

  • Connecting DMPs to other related research outputs such as datasets and journal articles via the PID Graph
  • Connecting DMP IDs to corresponding ORCID records 
  • Incorporating additional PIDs, including research resource identifiers (RRIDs)
  • Sponsor and funder approval workflow wherein these stakeholders can review, comment, and approve submitted DMPs
  • Integration with the Electronic Lab Notebook, RSpace
  • Adding the ability for DMPTool admins to curate a list of recommended repositories for the new repository selector tool 

Additionally, in response to several DMPTool admin requests for outreach materials supporting adoption of the DMP ID, we are developing materials to share with the DMPTool admin community in order to promote these data practices amongst their users. 

We will continue to share details on this work and the development of new features to support the networked DMP. Stay tuned for more over the coming months.

As always, feedback or questions are most welcome and can be sent directly to maria.praetzellis@ucop.edu.

Effective Data Practices: new recommendations to support an open research ecosystem

We are pleased to announce the release of a new report written with our partners at the Association of Research Libraries (ARL), the Association of American Universities (AAU), and the Association of Public and Land-grant Universities (APLU): Implementing Effective Data Practices: Stakeholder Recommendations for Collaborative Research Support.  

The report brings together information and insights shared during a December 2019 National Science Foundation sponsored invitational conference on implementing effective data practices. In this report, experts from library, research, and scientific communities provide key recommendations for effective data practices to support a more open research ecosystem. 

During the December conference, the project team developed a set of recommendations for the broad adoption and implementation of NSF’s recommended data practices as described in the NSF’s May 2019 Dear Colleague Letter.  The report focuses on recommendations for research institutions and also provides guidance for publishers, tool builders, and professional associations. The AAU-APLU Institutional Guide to Accelerating Public Access to Research Data, forthcoming in spring 2021, will include the recommendations.

The conference focused on designing guidelines for (1) using persistent identifiers (PIDs) for datasets, and (2) creating machine-readable data management plans (DMPs), both data practices recommended by NSF. 


Five key takeaways from the report are:

  • Center the researcher by providing tools, education, and services that are built around data management practices that accommodate the scholarly workflow.
  • Create closer integration of library and scientific communities, including researchers, institutional offices of research, research computing, and disciplinary repositories.
  • Provide sustaining support for the open PID infrastructure that is a core community asset and essential piece of scholarly infrastructure. Beyond adoption and use of PIDs, organizations that sustain identifier registries need the support of the research community.
  • Unbundle the DMP, because the DMP as currently understood may be overloaded with too many expectations (for example, simultaneously a tool within the lab, among campus resource units, and with repositories and funding agencies). Unbundling may allow for different parts of a DMP to serve distinct and specific purposes.
  • Unlock discovery by connecting PIDs across repositories to assemble diverse data to answer new questions, advance scholarship, and accelerate adoption by researchers.

The report also identifies five core PIDs that are fundamental and foundational to an open data ecosystem. Using these PIDs will ensure that basic metadata about research is standardized, networked, and discoverable in scholarly infrastructure: 

  1. Digital object identifiers (DOIs) from DataCite to identify research data, as well as from Crossref to identify publications
  2. Open Researcher and Contributor (ORCID) iDs to identify researchers
  3. Research Organization Registry (ROR) IDs to identify research organization affiliations 
  4. Crossref Funder Registry IDs to identify research funders 
  5. Crossref Grant IDs to identify grants and other types of research awards
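For illustration, each of these identifier types has a recognizable bare form. The patterns below are simplified sketches, not full validators (Funder Registry IDs and Grant IDs are themselves DOIs, so the DOI pattern covers them, and both ORCID and ROR define checksums a regex alone cannot verify):

```python
import re

# Simplified, illustrative patterns for the core PID types named above.
PID_PATTERNS = {
    "doi": re.compile(r"^10\.\d{4,9}/\S+$"),            # e.g. 10.5281/zenodo.123456
    "orcid": re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[0-9X]$"),  # e.g. 0000-0002-1825-0097
    "ror": re.compile(r"^0[a-z0-9]{8}$"),               # 9 chars, always starts with 0
}

def pid_kind(value: str):
    """Return which PID pattern a bare identifier matches, if any."""
    for kind, pattern in PID_PATTERNS.items():
        if pattern.match(value):
            return kind
    return None
```

For example, `pid_kind("0000-0002-1825-0097")` returns `"orcid"`, while a bare ROR ID such as `"03yrm5c26"` returns `"ror"`.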

The report is intended to encourage collaboration and conversation among a wide range of stakeholder groups in the research enterprise by showcasing how collaborative processes help with implementing PIDs and machine-actionable DMPs (maDMPs) in ways that can advance public access to research.

The full report is now available online.

This material is based upon work supported by the National Science Foundation under Grant Number 1945938. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Project team:

  • John Chodacki, California Digital Library
  • Cynthia Hudson-Vitale, Pennsylvania State University
  • Natalie Meyers, University of Notre Dame
  • Jennifer Muilenburg, University of Washington
  • Maria Praetzellis, California Digital Library
  • Kacy Redd, Association of Public and Land-grant Universities
  • Judy Ruttenberg, Association of Research Libraries
  • Katie Steen, Association of American Universities

 

Additional report and conference contributors:

  • Joel Cutcher-Gershenfeld, Brandeis University
  • Maria Gould, California Digital Library

The DMPTool needs you!


The DMPTool, which as of February 2020 has supported 44,415 users across 266 participating institutions, currently maintains thirty-eight templates for twenty-two different federal and private funders. It is no secret that funder requirements and the accompanying application and data management guidance change regularly, and it is crucial that the DMPTool reflect these changes in real time to remain a high-integrity resource. Staying up to date with rapidly changing grant requirements and RDM best practices is no small task, however; there is no magic wand or automated system to keep the DMPTool in sync with current requirements. How, then, does it happen? The key factor in the success of the DMPTool over the last nine years has been, and continues to be, the contributions of and collaboration with the user community.

As the capabilities of the DMPTool expand and the need for it grows, we are calling for additional community members to contribute to its continued success. We are therefore pleased to announce the formation of, and the invitation to join, the DMPTool Editorial Board. Our goal for the Board is to formalize existing community involvement in the tool and to have representation across disciplines with varied areas of expertise, from a wide range of institutions, including librarians along the full career spectrum. Experience working with DMPs is desirable, and we welcome applications from any individuals committed to supporting effective research data management.

Responsibilities for members of the Editorial Board include the following:

  • One-year term of service (with the opportunity to extend if desired)
  • Short bi-monthly (or as-needed) meetings
  • Individual ownership of specific funder templates, linked to your area of focus
  • Creation of new templates as needed
  • Suggestions for default guidance and best practices
  • Identification and publication of example DMPs to include in the tool
  • An estimated 1-4 hours of work a month checking for requirement updates from funders, reporting to the Board, and updating templates and guidance in the DMPTool

Joining the DMPTool Editorial Board is an excellent opportunity to meet fellow research data management professionals, actively contribute to the community, help support a critical piece of open-source technology advancing research data management, and keep abreast of larger changes in funding requirements and the funding ecosystem. Editorial Board members will work to ensure the tool provides current information about grant requirements and corresponding guidance.

We hope you will consider this invitation to contribute and apply! We have opened applications and responses are due Friday, March 13. All questions, comments, concerns, or advice are welcome: maria.praetzellis@ucop.edu. We look forward to hearing from you!

DMP services unite!

This November the DMPRoadmap team conducted a series of strategic planning meetings. Meeting in person was highly productive and a great way to energize the team for the ambitious work we have planned for the upcoming year. Read more about the meeting and our development goals below. This blog post was originally published by Magdalena Drafiova from DMPonline on 3 December 2019.

From left to right: Brian Riley, Benjamin Faure, Marta Nicholson, Maria Praetzellis, Sarah Jones, Sam Rust and Ray Carrick.

In the middle of November we were joined for three days by our colleagues Maria Praetzellis and Brian Riley from DMPTool and Benjamin Faure from OPIDoR. On our end Sarah Jones, Sam Rust, Ray Carrick, Marta Nicholson, Diana Sisu and Magdalena Drafiova represented DMPonline. We’ve had a number of new people join the team over the past year so the meetings were a great opportunity to get to know one another and discuss where to take things next.

Over the three days we had a mix of group discussions to plan the future development roadmap (results of that later), as well as developer/project manager sessions and discussions with the wider DCC and CDL team on machine-actionable DMPs. Below we report on the results of our sessions and the future development roadmap.

Developer team meeting

The tech team had a separate meeting to give more time to discussing changes to the codebase and development procedures. They walked through the data model and key functionality to bring new developers up to speed and discussed major pieces of infrastructure work to schedule over the coming year (e.g., upgrading to Rails 5 and building a more robust test infrastructure). They also reviewed the current development project management processes and will be revising our PR review workflow and incorporating a continuous integration approach. This will allow developers to work more atomically: a single bug fix or feature enhancement will now be handled individually instead of as a component of a larger single release. Each issue will be merged into the codebase as a single point release, allowing the team to work more efficiently and making it easier to accept contributions from external developers.

Project management meeting
Magdalena, Maria, Sarah and Diana discussed procedures for prioritizing tickets, managing the team, and conducting User Acceptance Testing (UAT). Sarah and Diana will share expertise in weekly PM meetings to bring Magdalena and Maria up to speed. We have also decided to change our sprint schedule as we will be joined by more developers: we want to release more often and keep fewer tickets on the board so we can review them all in each call. This, coupled with the continuous integration approach, should get fixes and features out more quickly. We have assigned a developer to each area we want to work on, although we want to ensure that knowledge is shared and everyone has an opportunity to work across the codebase so we don’t create dependencies.

We also discussed the need to conduct user testing, especially on the administrative area of the tool. This will involve setting some tasks and observing users complete them to see what issues they encounter and where the tool is not intuitive. We hope to run these tests in Summer 2020. If you would be interested in getting people from your organization involved, please let us know.

Development roadmap
We agreed on the development roadmap by dividing our key areas of work into time phases. Some activities are ongoing system improvements and will continue throughout these periods. The first phase, which we hope will run until February 2020, addresses the feedback we have received in our user groups. This work will finalize the conditional questions functionality, improve search for administrators, and make the usage dashboard more insightful so you can get better analytics about how the tool is used at your institution. We will also integrate a new feature from DMP OPIDoR to enable one-click plan creation: from the public templates page, users will be able to click an icon and create a plan based on that template. We are also planning integrations so you can export DMPs to Zenodo and RIO Journal, and we will complete our work on regional filtering to separate funders, templates, and organizations by country.

The second phase will focus on making our default template machine-actionable by adding integrations with controlled vocabularies, a re3data repository selector, a license selector, fewer free-text fields, and persistent identifiers for users (ORCID iDs) and organizations (ROR IDs). We will also update our API so that it conforms to the RDA Common Standard.

We will finish the year by adding features that allow administrators to pre-define a curated subset of good, institutionally shared plans. We will also improve plan versioning and introduce a plan lifecycle so you can indicate the status of a plan. We will also work on incorporating multiple datasets into DMPs so you can get better insights about varying storage requirements, license requirements, etc. Enabling static pages to be edited is also on the to-do list. Lots to look forward to!

Meet the new DMPTool Product Manager

Today, August 19, marks my seventh week as the new DMPTool Product Manager and the latest Research Data Specialist to join the team at UC3. I'm thrilled to be joining such an active and engaged community of professionals committed to the principles of open science, open infrastructure, and public access to research and knowledge.

As I take the reins from Stephanie Simms, I’m grateful for her instrumental work in rethinking the capabilities of a data management plan (DMP) and her work with the community in developing the conceptual frameworks and use cases for the creation of machine-actionable DMPs. As I’ve learned more in these first weeks, I am invigorated by the plans for machine-actionable DMPs, seeing the critical role they could play in research and data sharing and the exciting potential for expanding their dynamism, utility, and centrality to research data workflows. 

Prior to joining CDL, I was a Program Manager in the Web Archiving and Data Services group at the Internet Archive. At the Internet Archive, I managed domain-scale web harvesting, dataset and indexing services, and computational access to large-scale data for researchers. I bring a strong background in product management for services used by a global set of partners and a commitment to community-driven feature development and system integrations. 

I’m looking forward to expanding upon this experience as I begin work on furthering development of the DMPTool, keeping in step with what can be useful to and benefit the community, and advancing our shared commitment to open access to research and research data.

Please feel free to reach out and introduce yourself! I’m eager to receive any feedback or questions. You can reach me directly at maria.praetzellis@ucop.edu.

Representing time in machine-actionable DMPs

In this next installment of the machine-actionable DMP blog series, we want to address the broader context of time and home in on answering the following question:

How and when do you update some piece of information in a DMP?

This happens to be the substance of Principle 9 from our preprint (Miksa et al. 2018, forthcoming in PLOS): maDMPs should be versioned, updatable, living documents.

DMPs should not just be seen as a “plan” but as updatable, versioned documents representing and recording the actual state of data management as the project unfolds. The act of planning is far more important than the plan itself, and to derive value for researchers and other stakeholders, the plan needs to evolve. DMPs should track the course of research activities from planning to sharing and preserving outputs, recording key events over the course of a project to become an evolving record of activities related to the implementation of the plan.

We can all agree that it’s important to treat maDMPs as living documents, but there are multiple approaches we might take to updating them, and multiple stakeholders who should be able to provide updates for particular pieces of information at particular points along the way. First we’ll provide a quick overview of the current state of DMP-time as represented in systems and policies related to our NSF EAGER project, plus a handful of other relevant systems and policies that extend the geographical and organizational scope. Then, we’ll pitch an idea for how we can handle DMP-time using Crossref/DataCite Event Data Service. We welcome, nay encourage your feedback about this and other ideas as we experiment and iterate and prove things out in practice.

Representing time in DMPs

So we built a graph database with seed data from our partners at BCO-DMO and the UC Gump Field Station on Moorea, and enriched it with information from the NSF Awards API and public plans created with the DMPTool. All of the projects represented in the database correspond with NSF awards and therefore the DMPs have an associated timeline of:

  1. Create DMP and submit grant proposal (via institutional Office of Research, NSF Fastlane system)
  2. Grant awarded (grant number issued by NSF)
  3. Grant period ends, final report due (data deposited at appropriate repository)

This current grant/DMP workflow fails to capture information about actual data management activities as they unfold over the course of a project. However, data management staff at BCO-DMO and the Gump Field Station intervene and provide manual updates in their own repository systems opportunistically. These updates can occur during active stages of multi-year projects, but most happen at the grant closeout stage, when researchers are engaged with reporting activities and aware that they must deposit their data. Relevant NSF program officers from the Geosciences Directorate conduct manual compliance checks to ensure that grantees have deposited data prior to issuing a new award, which is a very useful feature of this case study.

In addition to the data repository systems, information about these projects flows through institutional grant management systems and NSF's Fastlane system, and a subset is made publicly available via the NSF Awards API (example of our award). Each of these systems records the start date and end date for the award, and some include interim reporting dates. Our ongoing analysis for maDMP prototyping is focused on identifying additional milestones during the course of a project and which stakeholders should be responsible for updating which pieces of information…drilling into the original question of how and when do you update things?

DMP-time in European contexts

To avoid an overly narrow focus on one national context and one funding agency in this larger thematic discussion about time, we'll also consider some European examples. The European Commission's Horizon 2020 program acknowledges that information about research data changes from the planning to final preservation stages; as a result, its DMPs have built-in versioning. Projects that receive a Horizon 2020 award must submit a first version of the DMP within the first 6 months of the project. The DMP then needs to be updated over the course of the project whenever significant changes arise, although this "requirement" is somewhat vague and reads more like a best practice. Updated versions of the DMP are required at any periodic reporting deadline and at the time of the final report. DMPonline provides an optional set of Horizon 2020 templates that includes 1) an Initial DMP, 2) a Detailed DMP, and 3) a Final review DMP.

Our maDMP collaborators at the Technical University of Vienna are forging ahead with their own institutional prototyping efforts to automate DMPs and integrate them with local infrastructure. They just released this excellent interactive “mockups” tool and invite your feedback. Within the mockups system, time is represented through the concept of DMP Granularity and in some cases this is related to funding status. The level of granularity corresponds roughly with versions, which carry the labels “initial, detailed, or sophisticated.”

Representing time in maDMPs: Ideas for the future

The ability to update DMPs is central to our own plans for realizing machine-actionability and relies on infrastructure that already exists. In a nutshell, our idea is to insert DMPs and corresponding grant numbers into the sprawling web of information connecting people and their published outputs. We think the mechanism for accomplishing this is to issue DataCite DOIs for DMPs: this creates an identifier against which we can assert things programmatically. In addition, this hooks DMPs into Crossref/DataCite Event Data, which is a stream of assertions of relationships between research-related things. Existing and emerging registries of information are already leveraging this infrastructure—Scholix, ORCID, Wikidata, Make Data Count, etc. DMPs and grant numbers would provide a view of the connections between everything at the project level.

Documentation for Event Data explains that it “is a hub for the collection and distribution of a variety of Events and contains data from a selection of Sources. Every Event has a time at which it was created. This is usually soon after the Event was observed. In addition to this, every Event has a theoretical date on which it occurred…dates are represented as the occurred_at, timestamp and updated_date fields on each Event. The Query API has two views which allow you to find Events filtered by both occurred_at and timestamp timescales. It also lets you query for Events that have been updated since a given date.” This hub of information would therefore support versioning of the DMP as well as dynamic updating of key pieces of information (e.g. data types, volumes, licenses, repositories) by various stakeholders over time. Stakeholders could rely on this open hub of information and begin to make plans based on it (e.g., a named repository learns that a TB of data is expected within a specific timeframe).
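To make this concrete, here is a minimal sketch of how a stakeholder service might poll the Query API for new assertions about a DMP's DOI. The endpoint and the `obj-id`, `from-updated-date`, and `occurred_at` fields come from the Event Data documentation quoted above; the DOI, dates, and contact address are placeholders.

```python
from datetime import datetime
from urllib.parse import urlencode

# Base endpoint from the Crossref/DataCite Event Data documentation.
EVENT_DATA_API = "https://api.eventdata.crossref.org/v1/events"

def build_event_query(dmp_doi, from_updated_date, mailto="you@example.org"):
    """Build a Query API URL for all Events about a DMP's DOI that have
    been updated since a given date (YYYY-MM-DD)."""
    params = {
        "obj-id": dmp_doi,                       # the DMP's DataCite DOI
        "from-updated-date": from_updated_date,  # only Events updated since then
        "mailto": mailto,                        # polite contact address
    }
    return f"{EVENT_DATA_API}?{urlencode(params)}"

def events_since(events, since_iso):
    """Filter a list of Event records by their occurred_at timestamp,
    keeping only those at or after the given ISO 8601 instant."""
    since = datetime.fromisoformat(since_iso.replace("Z", "+00:00"))
    kept = []
    for ev in events:
        occurred = datetime.fromisoformat(ev["occurred_at"].replace("Z", "+00:00"))
        if occurred >= since:
            kept.append(ev)
    return kept
```

A repository could run a query like this on a schedule and act only on Events whose `occurred_at` falls after its last check (e.g., provisioning storage when a new data deposit is asserted).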

In this scenario, the DMP would become an assertion store (cf. Wikidata and Wikibase). The assertion store would have a timeline component and anyone could use the DMP identifier to ping/query the Event Data Query API and find out what's been asserted about the project. Various DMP stakeholders could also assert things about the project and update information over time. Each stakeholder could query and model DMP information based on the types of relationships and get the specific details they're interested in… so an institution could discover who their PIs are collaborating with, a funder could check if a dataset has been deposited in a named repository, a repository manager could search for any changes to a specific project or all relevant projects within a specific date range, etc. In fact, Wikidata has already begun indexing policies; once this happens at scale and is integrated with indexing of datasets, we could have automated dashboards displaying policy compliance and project progress.
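A stakeholder question like "which datasets has this project deposited?" could then reduce to filtering the returned Events by relation type. Event Data records carry `subj_id`, `obj_id`, and `relation_type_id` fields; the specific relation names asserted will vary by source, so the one used below is illustrative only.

```python
def assertions_by_relation(events, relation_type):
    """From a list of Event Data records, return the object identifiers
    asserted with the given relation type, e.g. the DOIs of datasets a
    DMP 'references'."""
    return [
        ev["obj_id"]
        for ev in events
        if ev.get("relation_type_id") == relation_type
    ]
```

A funder's compliance dashboard, for instance, might call this with the relation type its policy cares about and flag projects where the result is empty.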

That’s about it. Please tell us what you think about this approach to transforming a DMP into something active and updated, versioned and linked to research outputs.