What’s new with our machine actionable DMP work?

Featured

Building on the conceptual framework laid out in articles such as Ten principles for machine-actionable data management plans and prior blog posts covering such topics as what maDMPs are, what they can do to support automation, utilizing common standards and PIDs, and maDMPs as living documents, we are now moving into active development on the technical aspects of our NSF funded EAGER research project

A phased approach: building a plan for maDMPs

The goal of our EAGER research project is to explore the potential of machine-actionable DMPs as a means to transform the DMPs from a compliance exercise based on static text documents into a key component of a networked research data management. This ecosystem will not only facilitate, but also improve the research process for all stakeholders. 

We will be laying out the phases of work in the coming months and will continue to use this blog to keep the community informed of our progress, and to solicit your feedback and ideas.

Phase 1 Workplan

maDMP_phase1

Phase 1 of of our research entails exploring the following three high level ideas:

  1. How to best restructure the DMPTool metadata to utilize the RDA Working Group Common Standard
  2. How to optimize the Digital Object Identifiers (DOI) metadata schema for DMPs 
  3. How to best incorporate other Persistent identifiers (PIDs) into DMPs

Common Standards

The common data model for the creation of machine-actionable DMPs, produced by the RDA working group on DMP Common Standards, was recently released for community feedback. Our partners at the Digital Curation Center (DCC) have now implemented this model into the DMPRoadmap codebase. A big thank you to Sam Rust from DCC for his work on this! Those interested in learning more about the Common Standard in DMPRoadmap may want to view a recent webinar recording of Sam detailing this work. This was a fundamental step towards machine actionable DMPs, as it forms the foundation to enable information flow between DMPs and affiliated external systems in a standardized manner.

DOIs for DMPs

With our partners at the Digital Curation Center (DCC), we are working to incorporate the common standards into the shared DMPRoadmap codebase and our DMPTool development plans. As part of this work, we have partnered with DataCite to update their metadata schema to better support DMPs and to optimize a workflow for generating DOIs for DMPs. By relying on the DOI infrastructure, we will then be able to utilize the Event Data service from DataCite to record when assertions have been made on the DOI. More on the workflows surrounding this aspect of the project below. 

DMPs and the PID graph

Projects such as Freya have been working to connect research outputs through a PID graph.  A key question underpinning much of our work is how we can best leverage the PID graph (see Principle 5: Use PIDs and controlled vocabularies) within the DMP ecosystem. To connect DMPs to the larger PID ecosystem, our first phase will also include incorporating the following persistent identifiers into the DMP as a baseline for future work:

Phase 1 workflows

As discussed above, in Phase 1, we are building a system to mint DOIs for DMPs and creating a landing page for DMP DOIs to record updates to the DOI that occur over time. Although the system can be thought of as a giant API, pulling and pushing data from various sources, we are also building a landing page for these DOIs in order to visually demonstrate the types of connections made possible by tracking a research project over time from the point of DMP creation. 

Below is a high level overview of this workflow and whiteboarding of its potential architecture. (For those that would like a more detailed view, please check out our GitHub).maDMPRegistry

  1. maDMP system accepts common standard metadata from DMPTool (DMP Roadmap) 
  2. maDMP system sends that metadata to DataCite to mint a DOI (which it then returns to the DMPTool)
  3. A landing page is generated for the DMP DOI
  4. A separate harvester application queries outside APIs to check for assertions recorded against the DOI. For this phase of work we will work with the NSF awards API, and return any award information into the maDMP system. 
  5. The maDMP system then sends any award info returned to DataCite 

Our goal is to leverage the work being done by the RDA Exposing DMP working group to help inform the privacy concerns of exposing certain types of assertions on this landing page.  

Next Steps

Looking ahead, we plan to produce a basic prototype ready for testing and feedback by the end of October. I will be presenting on our work thus far at the upcoming RDA and CODATA meetings. During these meetings, I look forward to continuing our work with the RDA Common Standards Working Group (and to meeting many of those active in this space for the first time in-person)! 

Once we establish the workflow to record assertions to a DMP DOI, our next phase of work will include pilot projects with domain-specific and institutional stakeholders to test the flow and integration of relevant information across services and systems. With these partners we plan to test how maDMPs can help track data management activities as they occur during the course of a grant project. 

Finally, it’s important to note that all of our development work is being done in a test environment where we will continue to iterate for the next several months as we determine how best to deploy new features to the DMPTool and DMPRoadmap codebase. 

Interested in contributing?

Lastly, we realize that maDMP is far from the most euphonious or creative name for this service (nor is our original idea of the DMPHub much better). We are open to any and all ideas for naming this work so if you have any ideas, however strange or off the wall, please do let us know. If we use your idea we promise to shower you with accolades for your denomination genius. Also, free stickers galore.

To review or contribute to the technical components of the project check out our GitHub. And most importantly, please send any and all feedback, questions, or ideas for names to maria.praetzellis@ucop.edu.

 

What’s new with the DMPTool?

The past few months have been quite fruitful in terms of pushing forward on the technical details surrounding machine-actionable DMPs.

The common data model for the creation of machine-actionable DMPs, produced by the RDA working group on DMP Common Standards, was recently released for community feedback. With our partners at the Digital Curation Center (DCC), we are now actively incorporating this model into the DMPRoadmap codebase and our DMPTool development plans.

As part of our NSF EAGER grant, CDL has partnered with DataCite to explore how DOI infrastructure could enable the passing of information between RDM systems and supporting integration between various related systems. The initial phase of this work includes piloting workflows that efficiently move information between stakeholders, systems, and researcher workflows. Our goal is a working prototype developed by mid-October of this year. This is exciting as it represents the first step towards realizing our long-term goal of machine-actionable DMPs as critical infrastructure in the research process.


Community involvement 

Another key goal for the coming months is to re-engage the DMPTool community via regular virtual user meetings, the re-creation of advisory boards, and most importantly hearing more from you about how the DMPTool is working (or not) and gathering feedback on future developments and ideas for new areas of making the DMPTool even more useful and vital. In the coming weeks, we will reach out with more details on the above. However, in the meantime, please feel free to contact me and introduce yourself!

I am interested in hearing any input, questions, comments or feedback. You can contact me directly at maria.praetzellis@ucop.edu.

Meet the new DMPTool Product Manager

MariaPraetzellisHeadshotToday, August 19, marks my seventh week as the new DMPTool Product Manager, and the latest Research Data Specialist to join the team at UC3. I’m thrilled to be joining such an active and engaged community of professionals committed to the principles of open science, open infrastructure, and public access to research and knowledge.

As I take the reins from Stephanie Simms, I’m grateful for her instrumental work in rethinking the capabilities of a data management plan (DMP) and her work with the community in developing the conceptual frameworks and use cases for the creation of machine-actionable DMPs. As I’ve learned more in these first weeks, I am invigorated by the plans for machine-actionable DMPs, seeing the critical role they could play in research and data sharing and the exciting potential for expanding their dynamism, utility, and centrality to research data workflows. 

Prior to joining CDL, I was a Program Manager in the Web Archiving and Data Services group at the Internet Archive. At the Internet Archive, I managed domain-scale web harvesting, dataset and indexing services, and computational access to large-scale data for researchers. I bring a strong background in product management for services used by a global set of partners and a commitment to community-driven feature development and system integrations. 

I’m looking forward to expanding upon this experience as I begin work on furthering development of the DMPTool, keeping in step with what can be useful to and benefit the community, and advancing our shared commitment to open access to research and research data.

Please feel free to reach out and introduce yourself! I’m eager to receive any feedback or questions. You can reach me directly at maria.praetzellis@ucop.edu.