Scoping Machine-Actionable DMPs

Machine-actionable data management plans (maDMPs) are happening. Over the past several years we’ve contributed to community discussions and various events to suss out what we all mean by this term and why we think maDMPs are important. In the midst of these efforts, we (California Digital Library) also received an NSF EAGER grant to prototype maDMPs and are now in the process of designing that work.

To connect our prototyping with the constantly evolving maDMP landscape, we remain active in the Research Data Alliance, Force11, domain-based efforts (e.g., AGU Enabling FAIR Data), and of course we run the DMPTool service as part of an international policy/support initiative called the DMP Roadmap project. We also recently helped launch a website, activedmps.org, to identify all of the people and projects across the globe working on maDMPs.

In keeping with this community thread, as well as for our own edification, we’re kicking off an maDMP blog series. The primary goal is to offer some framing documents so other stakeholders, especially those who’ve invested as much time as we have thinking about such an obscure topic (!), can help us ask and answer the many outstanding questions about maDMPs. A secondary motivation is to respond to the frequent queries from our users and other stakeholders about how to envision and plan for an maDMP future, which seems inevitable as more of us begin to prototype in different directions.

For this inaugural scoping piece we want to address the following high-level questions. And just to reiterate, the answers herein are distilled from our own thinking; by no means do we think that these are the correct or only answers. We invite others to challenge our ideas at any/every step along the way.

  1. What are maDMPs?
  2. What are they not? 
  3. Who are they for?
  4. How are they different from “traditional” DMPs?
  5. What does this mean for the future of DMPs and support services?

…What comes next?

 

1. What are maDMPs?
maDMPs are a vehicle for reporting on the intentions and outcomes of a research project that enables information exchange across relevant parties and systems. They contain an inventory of key information about a project and its outputs (not just data), with a change history that stakeholders can query for updated information about the project over its lifetime. The basic framework requires common data models for exchanging information, currently under development in the RDA DMP Common Standards WG, as well as a shared ecosystem of services that send notifications and act on behalf of humans. Other components of the vision include machine-actionable policies, persistent identifiers (PIDs) (e.g., ORCID iDs, funder IDs, forthcoming Org IDs, RRIDs for biomedical resources, protocols.io, IGSNs for geosamples, etc.), and the removal of barriers for information sharing.
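
To make the "inventory plus change history" idea a little more concrete, here is a minimal sketch in Python. It is only an illustration of the concept: the class names, field names, and identifier values are our own placeholders, not the data model being developed by the RDA DMP Common Standards WG.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, List, Optional


@dataclass
class Change:
    """One entry in the plan's change history."""
    when: date
    field_name: str
    old: Any
    new: Any


@dataclass
class Output:
    """A research output listed in the plan (not only datasets)."""
    title: str
    kind: str                         # e.g. "dataset", "software", "protocol"
    identifier: Optional[str] = None  # a PID such as a DOI, once one exists


@dataclass
class MaDMP:
    """Hypothetical, minimal maDMP record (not the RDA common standard)."""
    title: str
    contact_orcid: str
    funder_id: str
    outputs: List[Output] = field(default_factory=list)
    history: List[Change] = field(default_factory=list)

    def update(self, field_name: str, new: Any, when: date) -> None:
        """Set a top-level field and record the change in the history."""
        old = getattr(self, field_name)
        setattr(self, field_name, new)
        self.history.append(Change(when, field_name, old, new))

    def changes_since(self, cutoff: date) -> List[Change]:
        """Return the changes made on or after the cutoff date."""
        return [c for c in self.history if c.when >= cutoff]


plan = MaDMP(
    title="Reef survey",
    contact_orcid="0000-0000-0000-0000",  # placeholder, not a real ORCID iD
    funder_id="funder:example",           # placeholder funder identifier
)
plan.outputs.append(Output(title="Transect counts", kind="dataset"))
plan.update("title", "Reef survey, phase 2", when=date(2018, 6, 1))
recent = plan.changes_since(date(2018, 1, 1))
```

A stakeholder system could call something like `changes_since` to ask what has changed in a plan since it last looked, which is the kind of query the change history is meant to support.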

2. What are they not?
maDMPs are not a collection of best practices for creating a data management plan (those exist already, Michener 2015) nor are they a comprehensive record of every detail about a research project and how it was conducted (i.e., they are not the Open Science Framework). It is out of scope to use maDMPs to connect all the things in the universe and try to solve reproducibility. Instead they are a plan and instructions about how to implement the plan, as well as a report about the completion of the plan; this plan includes an inventory/registry of research outputs and information about what to do with each thing (e.g., length of time to retain a dataset in a repository).

3. Who are they for?
maDMPs are focused primarily on infrastructure providers, systems, and those responsible for creating and enforcing research data policies. maDMPs are not focused primarily on researchers, data librarians, or other research support staff. However, broad adoption by all stakeholders in the research enterprise is required to achieve the goals of the policies, and ideally everyone will reap the benefits. Here is a (roughly) ranked-order list of the target audience for maDMPs:

  • Funder: Funding agencies and foundations that specify requirements for DMPs and monitor compliance.
  • Repository Operator: General (e.g., Zenodo, Dryad), disciplinary (e.g., GenBank, ICPSR), and institutional data repositories.
  • Infrastructure Provider: Providers of systems for creating DMPs (DMPTool, DMPonline), grants administration, researcher profiles (RIMS/CRIS), etc.
  • Institutional Administrator: Office of Research/Sponsored Programs, Chief Information Officers, University Librarians, others.
  • Ethics Review: Institutional Review Boards (IRB)/Research Ethics Boards (REB) that authorize human subjects research.
  • Legal Expert: Technology transfer offices; copyright and patent experts.
  • Publisher: Purveyors of article and data publication services.
  • Researcher: Principal Investigator and collaborators, including postdoctoral researchers, graduate and undergraduate students.
  • Research Support Staff: Data managers/curators, research administrators, and data librarians.
machine-actionable DMP info flows

Examples of stakeholder interactions within the ecosystem of machine-actionable DMPs. Stakeholders communicate with each other by exchanging information through DMPs. For example, a repository operator can select a proper repository, set an embargo period, and assign a correct license to data submitted by researchers. In return, a system acting on behalf of a repository operator provides a list of DOIs assigned to the data and provides information on costs of storage and preservation. This in turn can be accessed by a funder to check how the DMP was implemented.
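
The exchange described in this example can be sketched roughly as follows. This is a simplified illustration under our own assumptions: `DatasetEntry`, `record_deposit`, and `missing_for_funder` are hypothetical names, and the repository name, license string, and DOI are placeholders rather than real values.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DatasetEntry:
    """One dataset listed in a DMP; fields start empty and fill in over time."""
    title: str
    repository: Optional[str] = None
    license: Optional[str] = None
    embargo_until: Optional[date] = None
    doi: Optional[str] = None
    storage_cost: Optional[float] = None


def record_deposit(entry: DatasetEntry, repository: str, license: str,
                   doi: str, storage_cost: float,
                   embargo_until: Optional[date] = None) -> None:
    """Repository side: write deposit details back into the plan entry."""
    entry.repository = repository
    entry.license = license
    entry.doi = doi
    entry.storage_cost = storage_cost
    entry.embargo_until = embargo_until


def missing_for_funder(entries: List[DatasetEntry]) -> List[str]:
    """Funder side: list titles of datasets still lacking a DOI or a license."""
    return [e.title for e in entries if not (e.doi and e.license)]


entries = [DatasetEntry("Transect counts"), DatasetEntry("Water chemistry")]
record_deposit(
    entries[0],
    repository="Example Repository",  # placeholder repository name
    license="CC0-1.0",
    doi="10.1234/example.1",          # placeholder DOI
    storage_cost=0.0,
    embargo_until=date(2019, 1, 1),
)
outstanding = missing_for_funder(entries)
```

The point of the sketch is that each party reads and writes the same structured record, so a funder's compliance check becomes a simple query instead of a manual review of a text document.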

4. How are they different from “traditional” DMPs?
The vision for maDMPs is to automate certain pieces of the DMP process, especially to alleviate the administrative burden of entering the same information in multiple places (e.g. it would be great if a researcher could recycle part or all of an IRB application for a DMP, or generate a Biosketch/CV automatically from their ORCID profile, or automatically generate a data availability statement when publishing data/articles). There is still a need for a human-readable narrative that describes digital research methods and outputs, but the main difference is that it should be updatable so that DMPs can become useful beyond the grant application stage.
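
As a small illustration of this kind of reuse, the sketch below generates a data availability statement from structured dataset entries. The function name, the dictionary keys, the wording of the statement, and the example values are all assumptions of ours for the sake of the example.

```python
from typing import Dict, List


def availability_statement(datasets: List[Dict[str, str]]) -> str:
    """Build a plain-text data availability statement from dataset entries.

    Each entry is expected to have "title", "repository", and "doi" keys,
    plus an optional "embargo_until" date string.
    """
    if not datasets:
        return "No datasets were generated for this project."
    sentences = []
    for d in datasets:
        sentence = (
            f"{d['title']} is available from {d['repository']} "
            f"at https://doi.org/{d['doi']}"
        )
        if d.get("embargo_until"):
            sentence += f" (under embargo until {d['embargo_until']})"
        sentences.append(sentence + ".")
    return " ".join(sentences)


statement = availability_statement([
    {
        "title": "Reef transect counts",
        "repository": "Example Repository",  # placeholder repository name
        "doi": "10.1234/example.1",          # placeholder DOI
    },
])
```

The same entries could feed other documents as well, which is where the savings in repeated data entry would come from.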

5. What does this mean for the future of DMPs and support services?
We get asked this question often, most recently in the form of a provocative email from Dr. Devan Ray Donaldson as he was designing the curriculum for his digital curation course at Indiana University Bloomington.

Our response: Librarians and other digital curation experts absolutely have a role to play in supporting researchers with DMPs and data management issues more broadly. At CDL we spend a lot of time digging into the weeds of digital curation issues with librarians and researchers at all 10 UC campuses and we noticed that a major barrier to effectively supporting researchers is that they don’t recognize the language/jargon of digital curation. At the risk of self-promotion I’ll direct you to this guide that we created based on our collective experiences as researchers, and now as people who support researchers, called “Support Your Data.” John Borghi was the main driver of the project (more details from him here) and we’re now developing more attractive resources and a website to adapt for your purposes if you find these materials useful. The goal is to educate researchers about good data management practices by relating to their current practices, and demonstrate how small habits (e.g., file naming conventions) can amount to better/more efficient research.

… What comes next?
maDMPs present an opportunity to move DMPs beyond a compliance exercise by providing needed structure, interoperability, and added-value functionality to support open, reusable research data. We’re designing and developing an open framework for maDMPs that builds on existing initiatives and infrastructure. There are numerous efforts focused on connecting people and outputs (e.g., ORCID, Wikidata, Scholix, NCBI accession numbers). We want to link this information with grant numbers to create a dynamic inventory of assertions about a grant-funded research project (note: in the future we’ll also consider DMPs not associated with grants).
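
One way to picture such an inventory is as a list of assertions grouped by grant number. The sketch below is only an illustration: the `Assertion` fields, the grant numbers, the identifiers, and the source labels are placeholders we made up, and it does not reflect how any of the services named above actually expose their data.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Assertion(NamedTuple):
    """A single statement about a grant-funded project (hypothetical shape)."""
    grant: str      # grant number the assertion is linked to
    subject: str    # identifier of a person or output
    predicate: str  # what is being asserted
    source: str     # where the assertion came from


def inventory_by_grant(assertions: List[Assertion]) -> Dict[str, List[Assertion]]:
    """Group assertions by grant number, preserving their original order."""
    grouped: Dict[str, List[Assertion]] = defaultdict(list)
    for a in assertions:
        grouped[a.grant].append(a)
    return dict(grouped)


# All values below are illustrative placeholders.
assertions = [
    Assertion("GRANT-0001", "orcid:0000-0000-0000-0000", "is investigator on", "profile service"),
    Assertion("GRANT-0001", "doi:10.1234/example.1", "is output of", "repository"),
    Assertion("GRANT-0002", "doi:10.1234/example.2", "is output of", "link service"),
]
inventory = inventory_by_grant(assertions)
```

Because new assertions can be appended at any time, regrouping them gives an up-to-date view of each project, which is what we mean by a dynamic inventory.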

Step 1 for us is to get seed data from our partners at BCO-DMO and the UC Berkeley Gump Field Station on Moorea and structure it to define native maDMPs. We’ll discuss subsequent steps in future blog posts. Stay tuned!

Set the controls for the heart of the sun

Our DMPTool and DMPonline services have been humming along with the same underlying code for a couple of months now. Since our MVP release, we’ve shifted gears to more regular sprints. We’re also pleasantly surprised by how eager the wider DMP community has been to join forces in migrating, translating, and even contributing new features already! Here’s a brief retrospective and a glimpse into the future.

Post MVP Backlog
There is a modest backlog of work that didn’t make it into the MVP release. We’ve prioritized these issues and are focused on tying up the loose ends over the coming months. Those following the DMPRoadmap GitHub repository will notice regular releases. The goal is to settle into a steady two-week rhythm, but in the near term we’re working on slightly shorter or longer cycles to address critical bugs and some minor refactoring. Many thanks to our users on both sides of the pond who have reported issues and provided overwhelmingly positive feedback so far!

Evolving processes
We’ve been communicating with our respective user communities about new fixes and features as things pertain to them. Some things to note about our evolving development process:

  • DMPRoadmap GitHub repo: this is where most development work happens since the majority of fixes and features apply to the core codebase. This repository also contains all technical documentation, release notes, and other info for those interested in deploying their own instances or contributing to the project.
  • The DMPRoadmap wiki has a list of potential future enhancements. We’re collating ideas here and will define priorities and requirements in consultation with the community via user groups and listserv discussions. If you have other desired new features please let us know.
  • Any service-specific customizations reside in separate GitHub repos. For example, you can find the custom Single-Sign-On code in the DMPTool GitHub repo. The way that we handle helpdesk functions varies too. DMPTool users can report issues directly in the DMPTool repo or via the helpdesk. If something pertains to the common codebase, Stephanie will tag the issue and transfer it to DMPRoadmap. For DMPonline users we ask you to report issues via the helpdesk.

External contributions
Our core dev team is test driving the external contributor guidelines with the French team from DMP OPIDoR. They developed a new feature for a global notification system (e.g., to display maintenance messages, updates to funder templates) that happens to be in our backlog. The new feature looks great and is exactly the kind of contribution we’d like from others. You’ll see it in the next release. Thanks Benjamin and Quentin!

We’re also keen to commence monthly community dev calls to learn about other new features that folks might be planning and keep track of how we collaborate on DMP support across the globe.

Translations
We’ll be adding new translations for Brazilian Portuguese (thanks to Benilton de Sá Carvalho and colleagues at UNICAMP) and Finnish thanks to DMPTuuli. We’re also reaching out to fill in missing portions of existing translations for other languages since we added so many new features. New translations are always welcome; more information is available on the GitHub wiki and/or contact us.

A machine-actionable future
With the launch milestone behind us, we’re devoting more attention and resources to creating a machine-actionable future for DMPs. Two working groups hosted productive sessions at the recent RDA plenary (DMP Common Standards, Exposing DMPs) that included lightning talk presentations by members of the DMPRoadmap project (slides 1 and slides 2). Both of the groups are on track to provide actionable outputs in the next 12 months that will bolster wider community efforts on this front. We’ll continue participating in both groups as well as begin prototyping things with the NSF EAGER grant awarded to the California Digital Library. Stay tuned for more details via future updates and check out the activedmps.org site to get involved.

Prepare for launch in 3… 2… 1…

In about two weeks we will launch the new DMPTool on Tues, 27 Feb. The much-anticipated third version of the tool represents an exciting next step in what has always been a community-driven project. We’ve now successfully merged the primary US- and UK-based data management planning tools into a single codebase (DMP Roadmap): the engine under the new DMPTool hood.

Why are we doing this?

A little background for those who haven’t been following along with our co-development journey: in 2016 the University of California Curation Center (UC3) decided to join forces with the Digital Curation Centre (DCC) to maintain a single open-source platform for DMPs. We took this action to extend our reach beyond national boundaries and move best practices forward, with a lofty goal to begin making DMPs machine actionable (i.e., useful for managing data). We’ll continue to run our own branded services (DMPTool, DMPonline, DMPTuuli, DMPMelbourne) on the shared codebase, and incorporate partners in Canada, Argentina, South Africa, and throughout Europe who are already running their own instances (full list).

In parallel with our co-development efforts we’ve been making the rounds of Research Data Alliance, Force11, IDCC, and disciplinary meetings to collect use cases for machine-actionable DMPs (details here) and help define common standards (RDA Working Group; just posted pre-print for 10 Simple Rules for Machine-Actionable DMPs). We also got an NSF EAGER grant so we can begin prototyping muy pronto.

The new version of the DMPTool will enable us to implement and test machine-actionable things in a truly global open science ecosystem. Successful approaches to making DMPs a more useful exercise will require input from and adoption by many stakeholders so we look forward to working with our existing DMP Roadmap community (an estimated 50k+ users, 400+ participating institutions, and a growing list of funder contacts across the globe) and welcoming others into the fold!

Preparing for Launch

To help DMPTool administrators prepare themselves and their institutional users for the upcoming launch, we will host a webinar on:

Mon, 26 Feb 2018, 9-10 AM Pacific Time
Zoom link (recording on Vimeo; Q&A and slides)

By that time we’ll have a new user guide for administrators, a new Quick Start Guide for researchers, and refreshed promo materials. Everyone will have seamless access to their existing DMPTool accounts, just through a new user interface that looks and feels more like DMPonline (spoiler alert: we made it blue). And one of the most exciting things about the new tool is that it contains 34 freshly updated funder templates with links to additional funder guidance.

Stay tuned to the DMPTool communication channels in the coming weeks (blog, admin email list, Twitter) for more news and updates. We look forward to seeing you at the webinar and welcome your feedback at any point.

First annual funder template pizza party!

template editors

As we approach our target release date of Feb 2018 for the DMP Roadmap platform, the DMPTool team has embarked on a major housekeeping effort. A top-to-bottom content review is underway, and last week we began an audit of the funder templates and guidance. Ten participants gathered for an all-day, pizza-fueled event that amounted to a huge template success (but an epic pizza fail, see evidence below). We were so productive and gratified by the opportunity to analyze multiple DMP policies in a group setting that we decided to make it an annual event. Read on for more DMPTool funder template news + migration plans, followed by brief updates on the DMP Roadmap project and machine-actionable DMPs.

DMPTool funder templates

The DMPTool is a hugely popular community resource in part because it serves as a central clearinghouse of information about DMP requirements and guidance for researchers applying for grants from U.S. funding agencies. Migrating the DMPTool data to the new platform provides an opportunity to update and normalize things to maintain this value. [Side note: we’re also adding a “Last updated” field to the DMP Requirements table as an enhancement in the new platform per your feedback.]

At present the tool contains 32 templates for 16 different federal and private funders. This top 10 templates list demonstrates that our users are especially keen on getting support with NSF and NIH grant proposals, although the NEH is #7, and DOE and others aren’t far behind. Some global usage statistics to put these numbers in context: 26.8k users have created 20k plans; and we have 216 participating institutions (mostly U.S. colleges and universities).

funder-template-table

Our goals for the pizza party included: 1) ensuring that template language comes directly from the most recent versions of funder policy documents; and 2) applying themes (more on themes here). Staying up to date with DMP requirements remains a crowdsourced effort spearheaded by data librarians using the Twitter hashtag #OSTPResp and a Google spreadsheet. In the past year, two additional resources entered the scene: a list of public access plans from U.S. federal agencies at CENDI.gov and this lovely SPARC tool. Using these reference materials and some additional internet research, we updated 7 links to policy documents in the current DMPTool platform (NIH-GDS, NEH-ODH, NSF-CHE, NOAA, USDA-NIFA, Joint Fire Science Program, Sloan) and made some revisions to templates in the new platform (mostly formatting). We also identified some templates that require deeper investigation and/or consultation with agency contacts to verify the best way to present DMP requirements; between now and the release date we’ll continue to work on these templates. In addition, Jackie Wilson is contracting with us to finalize the clean-up of templates and guidance (checking links and guidance text provided by funders).

#pizzafail

By January we aim to have a beta DMPTool-branded version of the new platform ready for training and testing purposes. Stay tuned for a rollout plan in the new year that includes webinars for institutional administrators, with an orientation to templates and themes. Also, please note that we will be disabling template editing functionality on 18 Dec in the current version of DMPTool to maintain the integrity of template data in the new platform. For admin users who wish to make changes to templates and guidance after that date, you can contact the helpdesk, but it would be great if you can keep changes to a minimum. All other functionality in the current DMPTool will remain the same up to the final migration date (adding new users, institutions, creating and editing plans, etc.).

A million thanks to the 2017 template fixing team: Amy Neeser, Joan Starr, Alana Miller, Jackie Wilson, Marisa Strong, Daniella Lowenberg, Perry Willett, John Chodacki, and Stephen Abrams.

DMP Roadmap update

The co-development team is busy building and refining the final MVP features. The usage dashboard is the last new feature left to add. In the meantime, parallel data migration efforts are underway at DCC to move from the existing 28 DMPonline themes to the new set of 14. By January both service teams will be working on new user guides, updating other content, testing and branding. If all continues to go smoothly, we’ll be on track for a DMP Roadmap demo at IDCC in Barcelona (19–22 Feb) and an official code release. Stay tuned!

Machine-actionable DMPs

On the machine-actionable DMP front, there are two items to report:

  1. We’ll be emailing the various DMP lists shortly to encourage everyone to participate in working meetings for the RDA WGs (DMP Common Standards & Exposing DMPs) at the next plenary. For now mark your calendars for 21–23 Mar and join us in Berlin!
  2. Following on a productive session at FORCE2017, we’re finishing a draft of the 10 Simple Rules for Machine-Actionable DMPs that we will circulate soon.

As always, we encourage you to contact us to get involved!

Roll up, roll up. Get yer DMP update here!

Paper seller and bench. From Flickr by henry… CC-BY-NC-ND

by Sarah Jones

Last month saw a busy Active DMPs and Domain Repositories Interest Groups joint session at the RDA Plenary in Montreal. Two new working groups have been launched to advance work in this area: one on developing Common Standards for DMPs and another on Exposing DMPs. In addition, there are multiple active projects in this space including ezDMP, the University of Queensland’s Data Management Records approach, FAIRsharing and our own DMPRoadmap project. All the slides and notes from the RDA session are available from the link above if you want to find out more. The working groups are just starting to get underway too, so please review their plans and contribute if you can.

We’ve been progressing the machine-actionable DMP agenda through the DMPRoadmap team too. With support from an RDA Europe collaboration award, we integrated the disciplinary Metadata Standards Directory (MSD) into the tool. Template administrators can choose the MSD as an answer format for metadata questions so users can browse the directory from within the tool. We’d love your feedback on this – both admins trialling it on templates and end users selecting standards. Can you find relevant standards easily? Is the functionality intuitive? Are there other features or additions you would like to see? Please try it out at https://dmponline-test.dcc.ac.uk and let us know.

RDA metadata standards directory screenshot

Integrating the MSD is just one small step on the path to improving the DMP experience. We also plan to surface other registries, such as FAIRsharing and re3data, to recommend appropriate standards and services. Experimentation in this area will also aim to facilitate the exchange of information between systems and alert services to data in the pipeline. The DMPTool team have just received a 2-year NSF EAGER grant to address these bigger aims! The work plan includes pilot projects with the Biological and Chemical Oceanographic Data Management Office (BCO-DMO) at Woods Hole, MA and understanding the institutional workflow in collaboration with Purdue and others. Find out more on the DMPTool blog; additional details forthcoming as we refine the work plan.

The next stop for us is FORCE2017 in Berlin next week. We’ll be running a session on 10 Simple Rules for Active DMPs on Friday morning (27 Oct) in collaboration with the FAIR DMP group. The session will introduce participants to the concepts of FAIR and machine-actionable DMPs and then build community consensus around common goals and definitions. We’ve been working on a draft that we’ll share and iterate on at the meeting. Join us there if you can!

We’re also looking forward to the International Digital Curation Conference (IDCC) in Barcelona next February. The call for papers is out now and closes later this month. Last year we outlined ideas for Next-Generation DMPs (here) and hosted a workshop that resulted in this white paper with community-generated use cases for machine-actionable DMPs. Thanks again to all those who contributed to defining these preliminary requirements for the work now being addressed by us and the RDA working groups. IDCC is a great opportunity to get international input on your ideas so share what you’ve been working on and join us in Barcelona!

NSF EAGER Grant for Actionable DMPs

We’re delighted to announce that the California Digital Library has been awarded a 2-year NSF EAGER grant to support active, machine-actionable data management plans (DMPs). The vision is to convert DMPs from a compliance exercise based on static text documents into a key component of a networked research data management ecosystem that not only facilitates, but improves the research process for all stakeholders.

Machine-actionable “refers to information that is structured in a consistent way so that machines, or computers, can be programmed against the structure” (DDI definition). Through prototyping and pilot projects we will experiment with making DMPs machine-actionable.

Imagine if the information contained in a DMP could flow across other systems automatically (e.g., to populate faculty profiles, monitor grants, notify repositories of data in the pipeline) and reduce administrative burdens. What if DMPs were part of active research workflows, and served to connect researchers with tailored guidance and resources at appropriate points over the course of a project? The grant will enable us to extend ongoing work with researchers, institutions, data repositories, funders, and international organizations (e.g., Research Data Alliance, Force11) to define a vision of machine-actionable DMPs and explore this enhanced DMP future. Working with a broad coalition of stakeholders, we will implement, test, and refine machine-actionable DMP use cases. The work plan also involves outreach to domain-specific research communities (environmental science, biomedical science) and pilot projects with various partners (full proposal text).

Active DMP community

Building on our existing partnership with the Digital Curation Centre, we look forward to incorporating new collaborators and aligning our work with wider community efforts to create a future world of machine-actionable DMPs. We’re aware that many of you are already experimenting in this arena and are energized to connect the dots, share experiences, and help carry things forward. These next-generation DMPs are a key component in the globally networked research data management ecosystem. We also plan to provide a neutral forum (not tied to any particular tool or project or working group) to ground conversations and community efforts.

Follow the conversation @ActiveDMPs #ActiveDMPs and activedmps.org (forthcoming). You can also join the active, machine-actionable DMP community (live or remote participation) at the RDA plenary in Montreal and Force11 meeting in Berlin to contribute to next steps.

Contact us to get involved!

On the right track(s) – DCC release draws nigh

blog post by Sarah Jones

Eurostar from Flickr by red hand records CC-BY-ND

Preliminary DMPRoadmap out to test

We’ve made a major breakthrough this month, getting a preliminary version of the DMPRoadmap code out to test on DMPonline, DMPTuuli and DMPMelbourne. This has taken longer than expected but there’s a lot to look forward to in the new code. The first major difference users will notice is that the tool is now lightning quick. This is thanks to major refactoring to optimise the code and improve performance and scalability. We have also reworked the plan creation wizard, added multi-lingual support, ORCID authentication for user profiles, on/off switches for guidance, and improved admin controls to allow organisations to upload their own logos and assign admin rights within their institutions. We will run a test period for the next 1-2 weeks and then move this into production for DCC-hosted services.

Work also continues on additional features needed to enable the DMPTool team to migrate to the DMPRoadmap codebase. This includes additional enhancements to existing features, adding a statistics dashboard, email notifications dashboard, enabling a public DMP library, template export, creating plans and templates from existing ones, and flagging “test” plans (see the Roadmap to MVP on the wiki to track our progress). We anticipate this work will be finished in August and the DMPTool will migrate over the summer. When we issue the full release we’ll also provide a migration path and documentation so those running instances of DMPonline can join us in the DMPRoadmap collaboration.

Machine-actionable DMPs

Stephanie and Sarah are also continuing to gather requirements for machine-actionable DMPs. Sarah ran a DMP workshop in Milan last month where we considered what tools and systems need to connect with DMPs in an institutional context, and Stephanie has been working with Purdue University and UCSD to map out the institutional landscape. The goal is to produce maps/diagrams for two specific institutions and extend the exercise to others to capture more details about practices, workflows, and systems. All the slides and exercise from the DMP workshop in Milan are on the Zenodo RDM community collection, and we’ll be sharing a write-up of our institutional mapping in due course. I’m keen to replicate the exercise Stephanie has been doing with some UK unis, so if you want to get involved, drop me a line. We have also been discussing potential pilot projects with the NSF and Wellcome Trust, and have seen the DMP standards and publishing working groups proposed at the last RDA plenary host their initial calls. Case statements will be out for comment soon – stay tuned for more!

We have also been discussing DMP services with the University of Queensland in Australia who are doing some great work in this area, and will be speaking with BioSharing later this month about connecting up so we can start to trial some of our machine-actionable DMP plans.

The travelling roadshow

Our extended network has also been helping us to disseminate DMPRoadmap news. Sophie Hou of NCAR (National Center for Atmospheric Research) took our DMP poster to the USGS Community for Data Integration meeting (Denver, CO 16–19 May) and Sherry Lake will display it next at the Dataverse community meeting (Cambridge, MA 14–16 June). We’re starting an inclusive sisterhood of the travelling maDMPs poster. Display the poster, take a picture, and go into the Hall of Fame! Robin Rice and Josh Finnell have also been part of the street team taking flyers to various conferences on our behalf. If you would like a publicity pack, Stephanie will send out stateside and Sarah will share through the UK and Europe. Just email us your contact details and we’ll send you materials. The next events we’ll be at are the Jisc Research Data Network in York, the EUDAT and CODATA summer schools, the DataONE Users Group and Earth Science Information Partners meetings (Bloomington, IN), the American Library Association Annual Conference (Chicago, IL), and the Ecological Society of America meeting (Portland, OR). Catch up with us there!

RDA-DMP movings and shakings

RDA Plenary 9

We had another productive gathering of #ActiveDMPs enthusiasts at the Research Data Alliance (RDA) plenary meeting in Barcelona (5–7 Apr). Just prior to the meeting we finished distilling all of the community’s wonderful ideas for machine-actionable DMP use cases into a white paper that’s now available in RIO Journal. Following on the priorities outlined in the white paper, the RDA Active DMPs Interest Group session focused on establishing working groups to carry things forward. There were 100+ participants packed into the session, both physically and virtually, representing a broad range of stakeholders and national contexts, and many volunteered to contribute to five proposed working groups (meeting notes here):

  • DMP common standards: define a standard for expression of machine-readable and -actionable DMPs
  • Exposing DMPs: develop use cases, workflows, and guidelines to support the publication of DMPs via journals, repositories, or other routes to making them open
  • Domain/infrastructure specialization: explore disciplinary tailoring and the collection of specific information needed to support service requests and use of domain infrastructure
  • Funder liaison: engage with funders, support DMP review ideas, and develop specific use cases for their context
  • Software management plans: explore the remit of DMPs and the inclusion of other output types, e.g., software and workflows

The first two groups are already busy drafting case statements. And just a note about the term “exposing” DMPs: everyone embraced using this term to describe sharing, publishing, depositing, etc. activities that result in DMPs becoming open, searchable, useful documents (also highlighted in a recent report on DMPs from the University of Michigan by Jake Carlson). If you want to get involved, you can subscribe to the RDA Active DMPs Interest Group mailing list and connect with these distributed, international efforts.

Another way to engage is by commenting on recently submitted Horizon2020 DMPs exposed on the European Commission website (unfortunately, the commenting period is closed here and here — but one remains open until 15 May).

DMPRoadmap update

Back at the DMPRoadmap ranch, we’re busy working toward our MVP (development roadmap and other documentation available on the GitHub wiki). The MVP represents the merging of our two tools with some new enhancements (e.g., internationalization) and UX contributions to improve usability (e.g., redesign of the create plan workflow) and accessibility. We’ve been working through fluctuating developer resources and will update/confirm the estimated timelines for migrating to the new system in the coming weeks; current estimates are end of May for DMPonline and end of July for DMPTool. Some excellent news is that Bhavi Vedula, a seasoned contract developer for UC3, is joining the team to facilitate the DMPTool migration and help get us to the finish line. Welcome Bhavi!

In parallel, we’re beginning to model some active DMP pilot projects to inform our work on the new system and define future enhancements. The pilots are also intertwined with the RDA working group activities, with overlapping emphases on institutional and repository use cases. We will begin implementing use cases derived from these pilots post-MVP to test the potential for making DMPs active and actionable. More details forthcoming…

Upcoming events

The next scheduled stop on our traveling roadshow for active DMPs is the RDA Plenary 10 meeting in Montreal (19–21 Sept 2017), where working groups will provide progress updates. We’re also actively coordinating between the RDA Active DMPs IG and the FORCE11 FAIR DMPs group to avoid duplication of effort. So there will likely be active/FAIR/machine-actionable DMP activities at the next FORCE11 meeting in Berlin (25–27 Oct)—stay tuned for details.

And there are plenty of other opportunities to maintain momentum, with upcoming meetings and burgeoning international efforts galore. We’d love to hear from you if you’re planning your own active DMP things and/or discover anything new so we can continue connecting all the dots. To support this effort, we registered a new Twitter handle @ActiveDMPs and encourage the use of the #ActiveDMPs hashtag.

Until next time.

Active, actionable DMPs

IDCC workshop participants

Roadmap project IDCC debriefing
We had a spectacularly productive IDCC last month thanks to everyone who participated in the various meetings and events focused on the DMPRoadmap project and machine-actionable DMPs. Thank you, thank you! Sarah has since taken the traveling road show onward to a meeting at CERN (slides) and Stephanie discussed institutional infrastructure for DMPs at a meeting of California data librarians. In the midst of travels we’ve been wrangling the mountain of inputs into a draft white paper on machine-actionable DMP use cases. For now, we offer a preview of the report and an invitation to keep the momentum going at the RDA plenary in Barcelona, which is just around the corner (5–7 April).

The white paper represents the outputs of the IDCC workshop: “A postcard from the future: Tools and services from a perfect DMP world” (slides, etc. here). We convened 47 participants from 16 countries representing funders, educational institutions, data service providers, and the research community. There was so much interest in the topic that we added an overflow session to accommodate everyone who wanted to weigh in. We’re gratified to discover how many folks have been thinking about DMPs as much as we have, and aim to continue synthesizing your stakeholder-balanced, community-driven solutions for improving the data management enterprise.

mind map exercise

Solving DMPs with rainbow stickies

The contributions from IDCC align with previously gathered information and drive the agenda summarized here. Consensus emerged to:

  • Focus on integrating existing systems (interoperability was the top-voted topic at the workshop)
  • Integrate DMPs into active research workflows to emphasize benefits of planning to researchers, but keep in mind that funders still drive demand
  • Consider the potential of persistent identifiers (ORCID iDs, Crossref Funder Registry, etc.)
  • Explore ways to offer tailored, discipline-specific guidance at appropriate points
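One small illustration of why persistent identifiers are attractive for machine-actionable DMPs: an ORCID iD ends in a check digit (computed with the ISO 7064 MOD 11-2 algorithm), so a tool can catch a mistyped iD without contacting any external service. The sketch below is ours, not part of any DMP tool; the function names are hypothetical, and it only checks the format and check digit, not whether the iD is actually registered.

```python
import re

# Four hyphen-separated groups of four characters; the last may be "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check digit matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD ORCID uses in its documentation.
    for candidate in ("0000-0002-1825-0097", "0000-0002-1825-0098"):
        print(candidate, is_valid_orcid(candidate))
```

A check like this could run when a plan is created or imported, before any richer lookup against the ORCID registry.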

Next steps…
All stakeholders expressed a need for common standards and protocols to enable information to flow between plans and systems in a standardized manner. This would support APIs that both read from and write to DMPs, and would create a framework for developing new use cases over time. Therefore, it is a top priority to define a minimum data model with a core set of elements for DMPs. The model should incorporate existing standards and avoid inventing something new; it could potentially be based on a template structure and/or use the DMPRoadmap themes. Additional requirements are that the model:

  • Must make use of existing vocabularies and ontologies whenever possible
  • Must employ common exchange formats (e.g., JSON)
  • Must be open to support new data types, models, and descriptions
  • Should be available in a format that can be rendered for human use
  • Should accommodate versioning to support actively updated DMPs
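To make these requirements a little more concrete, here is a minimal sketch of what a core record might look like in practice. Everything in it is an assumption for illustration: the `Plan` and `Dataset` classes, their field names, and the placeholder values are hypothetical and do not represent an agreed data model or the DMPRoadmap schema. It shows a plan serialized to JSON, rendered for human reading, and revised with an incremented version number.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Dataset:
    title: str
    repository: str  # hypothetical field: where the data will be deposited


@dataclass(frozen=True)
class Plan:
    title: str
    contact_orcid: str     # persistent identifier for the contact person
    funder_id: str         # e.g., a Crossref Funder Registry identifier
    version: int = 1       # incremented each time the plan is updated
    datasets: tuple = ()   # tuple of Dataset records


def to_json(plan: Plan) -> str:
    """Serialize a plan to JSON for exchange between systems."""
    return json.dumps(asdict(plan), indent=2, sort_keys=True)


def from_json(text: str) -> Plan:
    """Rebuild a plan from its JSON form."""
    raw = json.loads(text)
    datasets = tuple(Dataset(**item) for item in raw.pop("datasets"))
    return Plan(datasets=datasets, **raw)


def revise(plan: Plan, **changes) -> Plan:
    """Return an updated copy with the version number bumped."""
    return replace(plan, version=plan.version + 1, **changes)


def render(plan: Plan) -> str:
    """Produce a plain-text view of the plan for human readers."""
    lines = [
        f"{plan.title} (version {plan.version})",
        f"Contact: {plan.contact_orcid}",
        f"Funder: {plan.funder_id}",
    ]
    lines += [f"- {d.title} -> {d.repository}" for d in plan.datasets]
    return "\n".join(lines)


if __name__ == "__main__":
    plan = Plan(
        title="Example plan",
        contact_orcid="0000-0002-1825-0097",  # ORCID's documented sample iD
        funder_id="example-funder-id",        # placeholder, not a real identifier
        datasets=(Dataset("Survey data", "Example Repository"),),
    )
    print(to_json(plan))
    print(render(revise(plan, title="Example plan (revised)")))
```

Keeping the record immutable and producing a new copy on each revision is one simple way to accommodate versioning; a real model would also need to draw its element names from existing vocabularies rather than the ad hoc ones used here.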

At the RDA 9th Plenary meeting in Barcelona during the Active DMPs IG session (6 April, 9:30-11:00) we propose establishing a working group to develop standards for DMPs. This isn’t our particular area of expertise so once again we’re relying on all of you to help steer the DMP ship. We hope that additional working groups might spin out from the session and invite your ideas and contributions (e.g., publishing DMPs).

…and beyond
The DCC and UC3 will continue to pursue international collaborations related to DMPRoadmap through pilot projects. As part of an iterative process for developing, implementing, testing, and refining these use cases we’re beginning to model domain-specific and institutional pilot projects to determine what information can realistically move between stakeholders, systems, and research workflows. We have some existing funds to support a subset of this work and are actively seeking additional sources of funding to carry the project forward. In addition to technical solutions, these projects will expand our capacity to connect with key stakeholders, with particular emphasis on addressing the needs and practices of researchers and funders. Stay tuned for more details in the coming weeks and months.

You can also track our progress and find oodles of documentation on the DMPRoadmap GitHub wiki.

Roadmap retrospective: 2016

be kind rewind

2016 in review

The past year has been a wild ride, in more ways than one… Despite our respective political climates, UC3 and DCC remain enthusiastic about our partnership and the future of DMPs. Below is a brief retrospective about where we’ve been in 2016 and a roadmap (if you will…we also wish we’d chosen a different name for our joint project) for where we’re going in 2017. Jump to the end if you just want to know how to get involved with DMP events at the International Digital Curation Conference (IDCC 2017, 20–23 Feb in Edinburgh, register here).

In 2016 we consolidated our UC3-DCC project team and our plans for the merged platform (see the roadmap to MVP), and began testing a co-development process that will provide a framework for community contributions down the line. We’re plowing through the list of features and adding documentation to the GitHub repo. All are invited to join us at IDCC 2017 for presentations and demos of our progress to date (papers, slides, etc. will all be posted after the event). For those not attending IDCC, please let us know if you have ideas, questions, anything at all to contribute ahead of the event!

DMPs sans frontières

Now we’d like to take a minute and reflect on events of the past year, particularly in the realm of open data policies, and the implications for DMPs and data management writ large. The open scholarship revolution has progressed to a point where top-level policies mandate open access to the results of government-funded research, including research data, in the US, UK, and EU, with similar principles and policies gaining momentum in Australia, Canada, South Africa, and elsewhere. DMPs are the primary vehicle for complying with these policies, and because research is a global enterprise, awareness of DMPs has spread throughout the research community. Another encouraging development is the ubiquity of the term FAIR data (Findable, Accessible, Interoperable, Reusable), which suggests that we’re all in agreement about what we’re trying to achieve.

On top of the accumulation of national data policies, 2016 ushered in a series of related developments in openness that contribute to the DMP conversation. To name a few:

  • More publishers articulated clear data policies, e.g., Springer Nature Research Data Policies apply to over 600 journals.
  • PLOS and Wiley now require an ORCID iD for all corresponding authors at the time of manuscript submission to promote discoverability and credit. Funders (e.g., Wellcome Trust, Swedish Research Council, and US Department of Transportation) are also getting on the ORCID bandwagon.
  • The Gates Foundation reinforced support for open access and open data by preventing funded researchers from publishing in journals that do not comply with its policy, which came into force at the beginning of 2017; this includes non-compliant high-impact journals such as Science, Nature, PNAS, and NEJM.
  • Researchers throughout the world continued to circumvent subscription access to scholarly literature by using Sci-Hub (Bohannon 2016).
  • Library consortia in Germany and Taiwan canceled (or threatened to cancel) subscriptions to Elsevier journals because of open-access related conflicts, and Peru canceled over a lack of government funding for expensive paid access (Schiermeier and Rodríguez Mega 2017).
  • Reproducibility continued to gain prominence, e.g., the US National Institutes of Health (NIH) Policy on Rigor and Reproducibility came into force for most NIH and AHRQ grant proposals received in 2016.
  • The Software Citation Principles (Smith et al. 2016) recognized software as an important product of modern research that needs to be managed alongside data and other outputs.

This flurry of open scholarship activity, both top-down and bottom-up, across all stakeholders continues to drive adoption of our services. DMPonline and the DMPTool were developed in 2011 to support open data policies in the UK and US, respectively, but today our organizations engage with users throughout the world. An upsurge in international users is evident from email addresses for new accounts and web analytics. In addition, local installations of our open source tools, as both national and institutional services, continue to multiply (see a complete list here).

Over the past year, the DMP community has validated our decision to consolidate our efforts by merging our technical platforms and coordinating outreach activities. The DMPRoadmap project feeds into a larger goal of harnessing the work of international DMP projects to benefit the entire community. We’re also engaged with some vibrant international working groups (e.g., Research Data Alliance Active DMPs, FORCE11 FAIR DMPs, Data Documentation Initiative DMP Metadata group) that have provided the opportunity to begin developing use cases for machine-actionable DMPs. So far the use cases encompass a controlled vocabulary for DMPs; integrations with other systems (e.g., Zenodo, Dataverse, Figshare, OSF, PURE, grant management systems, electronic lab notebooks); passing information to/from repositories; leveraging persistent identifiers (PIDs); and building APIs.

2017 things to come

This brings us to outlining plans for 2017 and charting a course for DMPs of the future. DCC will be running the new Roadmap code soon. And once we’ve added everything from the development roadmap, the DMPTool will announce our plans for migration. At IDCC we’ll kick off the conversation about bringing the many local installations of our tools along for the ride to actualize the vision of a core, international DMP infrastructure. A Canadian and a French team are our gracious guinea pigs for testing the draft external contributor guidelines.

IDCC DMP/BoF session

There will be plenty of opportunities to connect with us at IDCC. If you’re going to be at the main conference, we encourage you to attend our practice paper and/or join a DMP session we’ll be running in parallel with the BoFs on Wednesday afternoon, 22 Feb. The session will begin with a demo and update on DMPRoadmap; then we’ll break into two parallel tracks. One track will be for developers to learn more about recent data model changes and developer guidelines if they want to contribute to the code. The other track will be a buffet of DMP discussion groups. Given the overwhelming level of interest in the workshop (details below), one of these groups will cover machine-actionable DMPs. We’ll give a brief report on the workshop and invite others to feed into discussion. The other groups are likely to cover training/supporting DMPs, evaluation cribsheets for reviewing DMPs, or other topics per community requests. If there’s something you’d like to propose please let us know!

IDCC DMP utopia workshop

We’re also hosting a workshop on Monday, 20 Feb entitled “A postcard from the future: Tools and services from a perfect DMP world.” The focus will be on machine-actionable DMPs and how to integrate DMP tools into existing research workflows and services.

The program includes presentations, activities, and discussion to address questions such as:

  • Where and how do DMPs fit in the overall research lifecycle (i.e., beyond grant proposals)?
  • Which data could be fed automatically from other systems into DMPs (or vice versa)?
  • What information can be validated automatically?
  • Which systems/services should connect with DMP tools?
  • What are the priorities for integrations?

We’ve gathered an international cohort of diverse players in the DMP game—repository managers, data librarians, funders, researchers, developers, etc.—to continue developing machine-actionable use cases and craft a vision for a DMP utopia of the future. We apologize again that we weren’t able to accommodate everyone who wanted to participate in the workshop, but rest assured that we plan to share all of the outputs and will likely convene similar events in the future.

Keep a lookout for more detailed information about the workshop program in the coming weeks and feel free to continue providing input before, during, and afterward. This is absolutely a community-driven effort and we look forward to continuing our collaborations into the new year!