Getting our ducks in a row

From Flickr by Cliff Johnson, CC BY-SA 2.0

Recent activity on the Roadmap project encompasses two major themes: 1) machine-actionable data management plans and 2) kicking off co-development of the shared codebase.

Machine-actionable DMPs

The first of these has been a hot topic of conversation among stakeholders in the data management game for some time now, although most use the phrase “machine-readable DMPs.” So what do we mean by machine-actionable DMPs? Per the Data Documentation Initiative definition, “this term refers to information that is structured in a consistent way so that machines can be programmed against the structure.” The goal of machine-actionable DMPs, then, is to better facilitate good data management and reuse practices (think FAIR: Findable, Accessible, Interoperable, Reusable) by enabling:

  • Institutions to manage their data
  • Funders to mine the DMPs they receive
  • Infrastructure providers to plan their resources
  • Researchers to discover data

This term is consistent with the Research Data Alliance Active DMPs Interest Group and the FORCE11 FAIR DMPs group mission statements, and it seems to capture what we’re all thinking: i.e., we want to move beyond static text files to create a dynamic inventory of digital research methods, protocols, environments, software, articles, data… One reason for the DMPonline-DMPTool merger is to develop a core infrastructure for implementing use cases that make this possible. We still need a human-readable document with a narrative, but underneath the DMP could have more thematic richness with value for all stakeholders.
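To make "structured in a consistent way so that machines can be programmed against the structure" a little more concrete, here is a minimal Python sketch. It is purely illustrative: the class names, field names, and sample values are assumptions invented for this example and do not reflect any actual DMPonline or DMPTool schema. It shows how a funder might mine a set of structured plans to see where researchers intend to deposit data, which is hard to do with static text files.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    # Hypothetical structured fields describing one planned research output.
    title: str
    data_format: str
    repository: str  # where the researcher intends to deposit the data


@dataclass
class DataManagementPlan:
    # The human-readable narrative stays; structured fields sit underneath it.
    project: str
    funder: str
    narrative: str
    datasets: List[Dataset] = field(default_factory=list)


def repositories_by_funder(plans, funder):
    """Count intended deposit locations across all plans sent to one funder."""
    counts = Counter()
    for plan in plans:
        if plan.funder == funder:
            for dataset in plan.datasets:
                counts[dataset.repository] += 1
    return counts


# Made-up sample plans, for illustration only.
plans = [
    DataManagementPlan(
        "Project 1", "Funder A", "Narrative text...",
        [Dataset("Survey", "CSV", "Repo X"), Dataset("Images", "TIFF", "Repo Y")],
    ),
    DataManagementPlan(
        "Project 2", "Funder A", "Narrative text...",
        [Dataset("Sensor readings", "NetCDF", "Repo X")],
    ),
    DataManagementPlan(
        "Project 3", "Funder B", "Narrative text...",
        [Dataset("Interviews", "TXT", "Repo Z")],
    ),
]

funder_a_counts = repositories_by_funder(plans, "Funder A")
```

The same structured records could serve the other stakeholders listed above, for example an infrastructure provider summing expected deposits per repository to plan resources.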

A recent CERN/RDA workshop presented the perfect opportunity to consolidate our notes and ideas. In addition to the Roadmap project members, Daniel Mietchen (NIH) and Angus Whyte (DCC) participated in the exercise. We conducted a survey of previous work on the topic (we know we didn’t capture everything so please alert us to things we missed) and began outlining concrete use cases for machine-actionable DMPs, which we plan to develop further through community engagement over the coming months. Another crucial piece of our presentation was a call to make DMPs public, open, discoverable resources. We highlighted existing efforts to promote public DMPs (e.g., the DMPTool Public DMPs list, publishing exemplary DMPs in RIO Journal, Dataverse collections that include DMPs), but these are just a drop in the bucket compared to what we might be able to do if all DMPs were open by default.

You can review our slides here. And please send feedback—we want to know what you think!

Let the co-development begin!

Now for the second news item: our ducks are all in a row and work is underway on the shared Roadmap codebase.

We open with a wistful farewell to Marta Ribeiro, who is moving on to an exciting new gig at the Urban Big Data Centre. DCC has hired two new developers to join our ranks—Ray Carrick and Jimmy Angelakos—both from their sister team at EDINA. The finalized co-development team commenced weekly check-in calls and in the next week or two we’ll begin testing the draft co-development process by adding three features from the roadmap:

  1. Enhanced institutional branding
  2. Funder template export
  3. OAuth link to ORCID

In the meantime, Brian completed the migration to Rails 4.2 and both teams are getting our development environments in place. Our intention is to iterate on the process for a few sprints, iron out the kinks, and then use it and the roadmap as the touchstones for a monthly community developer check-in call. We hope this will provide a forum for sharing use cases and plans for future work (on all instances of the tool) in order to prioritize, coordinate, and alleviate duplication of effort.

The DCC interns have also been plugging away at their respective projects. Sam Rust just finished building some APIs for creating plans and extracting guidance, and is now starting work on the usage statistics use case. Damodar Sójka meanwhile is completing the internationalization project, drawing from work done by the Canadian DMP Assistant team. We’ll share more details about their work once we roll it all back into the main codebase.

Next month the UC Berkeley Web Services team will evaluate the current version of DMPonline to flag any accessibility issues that need to be addressed in the new system. We’ve also been consulting with Rachael Hu on UX strategy. We’re keeping track of requests for the new system and invite you to submit feedback via GitHub issues.

Stay tuned to GitHub and our blog channels for more documentation and regular progress updates.

The 20:51 sprint (Roadmap team-building: UK edition)

This week we hosted the DMPTool team to flesh out our plans for ‘roadmap’ – the joint codebase we’re building together based on DMPonline and DMPTool. The key focus was reviewing and prioritising tasks for an initial release. Building on discussions from the earlier US visit, we confirmed what work was to be done and agreed to begin with some well-defined, short tasks as a test of our co-development procedures. With everyone taking leave over the coming weeks, the first sprint will start in mid-July, at which point we’ll begin adding documentation to the GitHub repository.

We also discussed communication plans. Stephanie and I will take turns to do monthly blog posts so you can stay in the loop with what’s happening, and we aim to start regular calls in a few months with others who are actively working on the code, such as the Portage group in Canada. This will allow everyone to share their plans for future enhancements and to coordinate development activities. We’re always learning about new people who have picked up on our software – the latest being a group in Germany who have extended the DMPTool code to offer a bi-lingual interface – so we want to do more to bring these efforts together. While we get our work underway, we encourage people to join the developers list as a place to start discussions and form a community of interest.

Machine-actionable DMPs was another key theme for which Daniel Mietchen joined our discussions. Stephanie has an RDA/US Data Share Fellowship to pursue work in this area and we’re planning to give talks at some upcoming events highlighting our ideas. We’ve started to refine the themes used for guidance in DMPonline. Currently, DCC defines 28 themes corresponding with UK funder questions that are often addressed in a DMP (e.g., Data format, Metadata, Ethical issues, etc.). These themes offer the perfect starting point for standardising and structuring DMPs so it’s more feasible to identify and mine relevant text. We’ll be seeking comments from our user communities and key working groups such as CASRAI, RDA and FORCE11 shortly on this. We’re also keen to capture more data in a controlled way so it can be put to better uses. One idea is to provide an actionable list of repositories to allow researchers to select where they are going to deposit research outputs, and then to use this data to push notifications out to alert repositories and/or monitor compliance. Machine-actionable DMPs have been part of the future plans of both teams for some years, and they are currently a hot topic. We’re excited that we now have the resources to develop those ideas and a system that will allow us to test them via deployment. We also want to collect additional use cases and explore integrations with other systems so please don’t hesitate to get in touch.
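As a rough illustration of the two ideas above (using themes to find and mine relevant text, and using a researcher's repository selections to drive notifications), here is a small Python sketch. The plan structure, key names, and sample values are assumptions made up for this example; only the theme names (Data format, Metadata, Ethical issues) come from the DCC list mentioned above. Nothing here describes how DMPonline actually stores plans.

```python
from collections import defaultdict


def text_by_theme(plans):
    """Group answer text from all plans under each theme it is tagged with."""
    index = defaultdict(list)
    for plan in plans:
        for answer in plan["answers"]:
            for theme in answer["themes"]:
                index[theme].append((plan["title"], answer["text"]))
    return dict(index)


def deposit_notifications(plans):
    """Map each selected repository to the plans that intend to deposit there."""
    notices = defaultdict(list)
    for plan in plans:
        for repository in plan.get("repositories", []):
            notices[repository].append(plan["title"])
    return dict(notices)


# Hypothetical plans: each answer carries theme tags, and each plan records
# the repositories the researcher selected from a controlled list.
plans = [
    {
        "title": "Plan A",
        "answers": [
            {"text": "CSV and NetCDF files", "themes": ["Data format"]},
            {"text": "Consent forms will be collected", "themes": ["Ethical issues"]},
        ],
        "repositories": ["Repo X"],
    },
    {
        "title": "Plan B",
        "answers": [
            {"text": "Dublin Core records as XML", "themes": ["Metadata", "Data format"]},
        ],
        "repositories": ["Repo X", "Repo Y"],
    },
]

themed = text_by_theme(plans)
notices = deposit_notifications(plans)
```

With plans captured this way, a repository could be told which deposits to expect, and a funder could pull every "Ethical issues" answer without reading each plan in full.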

The eagle-eyed among you may have spotted some new faces in the team photos. The DCC has two student interns from Informatics working on DMPonline over the summer. Damodar is doing the internationalisation work that we consulted with the user group on at IDCC and Sam is busy developing an API. Both are making great progress so we’ll be looking for input from the user group again soon to try out the new features. The DMPTool team includes a new developer called Brian joining as technical lead. The visit was a great team-building opportunity for our transatlantic DMP roadmap project.

It was a jam-packed week with lots of meetings, brainstorming sessions and time working together on the code. We had new culinary experiences (deep-fried haggis balls no less!), heard some hilarious tales from the adventures of John Chodacki, and initiated the US team in the Glasgow-Edinburgh commute, including a quick dash one evening to make the 20:51 train home. Here’s the photographic proof of our first successful joint sprint. Stay tuned for what else we deliver over the coming months.

Roadmaps galore

Data management planning is moving and shaking at all scales—local, national, international—these days. We had excellent conversations at IDCC about coordinating responses to proliferating data policies and sharing experiences across borders and disciplines. All of the slides and materials from the international DMP workshop are available here.

So far the community has responded positively to our proposal for building a global infrastructure for all things DMP. Our big-picture plans include a merged platform based on the DMPonline codebase and incorporating recent internationalization work by the Portage Network in Canada (check out their bilingual DMP Assistant). We’re completing a gap analysis to add existing functionality from the DMPTool to DMPonline and will issue a joint roadmap in the coming months. Drawing together these disparate development efforts also presents an opportunity to set best practices for future work (stay tuned). This will allow us to consolidate value upstream and ensure maximum benefits to the entire community.

To facilitate our capacity-building efforts, we submitted a proposal entitled (what else) “Roadmap” to the Open Science Prize. You can read the Executive Summary on their website here and peruse the full proposal here (also view our snazzy promo video below). The prize seemed like the perfect opportunity to reposition DMPs as living documents using the biomedical research community as a pilot group. We’ll know by the end of April whether our bid is successful. Regardless of the outcome, we would love to know what you think about the proposal.

And finally, an update on the near-future roadmap for the DMPTool. We just added some new API calls in response to requests for more usage statistics and to facilitate integration projects with other data management systems. Admins can now get info about templates used to create plans at their institution (including private plans!) and a list of institutional templates. Check out the updated documentation on the GitHub wiki. The next order of business is working through the backlog of bug fixes. You can follow our progress in the GitHub issue tracker. Once the bugs are eliminated, we’ll circle back to high priority feature enhancements that contribute to our long-range plans.


DMPs are going global

…well international at least, with global aspirations. The US-based DMPTool and UK-based DMPonline have collaborated from the beginning to provide data management planning services and training on our respective sides of the pond. As more and more funders, institutions, and nations—the entire EU, for instance—adopt data sharing policies, we find ourselves supporting data management planning initiatives farther and wider.

To meet the exploding demand and facilitate connecting the dots (e.g., promoting community standards for DMPs), we’ve decided to formalize our partnership and move toward a single platform for all things DMP. You can learn more about our evolving partnership in this joint paper that we’ll be presenting at the International Digital Curation Conference (IDCC) at the end of Feb. Stay tuned for updates about a joint roadmap and timeline in the coming months. Our individual roadmaps will remain in place for now.

As always, we invite your feedback! And if you happen to be attending IDCC, consider joining us and the DART Project for an international DMP workshop on Thurs, Feb 25 (registration info).

NASA’s Global Digital Selfie 2014 http://www.gigapan.com/gigapans/155294

DMPTool2 Project – Functional Requirements

In our last post, we mentioned that we would be talking more about the development work on the next generation of the DMPTool: the DMPTool2. We have now made available our current draft of the functional requirements. For those of you who haven’t read a functional requirements document, these are fairly detailed documents that specify the capabilities and, well, functions of a system. But this document should begin to give those of you who are interested a sense of where we’re expanding and adding functionality to the DMPTool.

Over the next few weeks, we’ll be highlighting very specific areas to talk in more detail (and in more colloquial language!) about these developments.

DMPTool2 Project – April 2013 Report

Overview
While the coding has not yet begun, we have made significant progress over the past month in continuing to get pieces in place to strengthen the overall project and community effort. We continue to make frequent announcements on the blog and social media presences to share the status on various project fronts, so please consult those sources for more detail on issues outlined below.

Technical Team
Much of the past month’s technical effort has been spent on review and revision of specifications and hiring of staff. A revised specification document is now complete, along with basic use cases, and over the next week both will feed into the development of initial wireframes. In the area of staffing, one of two application developers began on 4/1 at CDL, and interviews continue to fill the second position. In addition, we now have a UI designer on board at Purdue who will be working on the wireframe and UI aspects during this process. We are presently determining the best process for community review and feedback on technical specifications.

Communications/Outreach Team
The Sloan outreach coordinator position has been posted at UIUC and candidate review will begin 5/2. Both Researcher and Administrative User Advisory Boards are now fully composed, posted on the public website, and initial phone meetings are scheduled for May. Initial meetings will consist of a high-level project introduction and review of technical specifications and plans. The governance structure is now being implemented, and most of the originating institutions have signed the agreement already. We are just beginning the process of enrolling institutions currently set up with authentication to shift them to “Partner” status. We hope to get most of them on board within a month.

Metrics
Nothing new to report. We have not yet fully settled on metrics for use in evaluating 1) project success or 2) impact of the new DMPTool. We have an extensive list produced at the kick-off meeting, but are still sorting and determining best choices. We plan to gather feedback from the advisory boards on this topic.

Overall Project
While we expect most of the work to be completed within 2013, final activities may carry into early 2014 given the delays experienced in receiving and processing awards at the main institutions, and in posting and hiring for several key positions. Allowing for that startup delay, we remain generally on track to finish within 12 months, and work is now beginning to move more smoothly.

-Andrew Sallans, University of Virginia Library, DMPTool2 Project Manager

Kickoff Meetings for Newly Funded DMPTool Projects

The meetings were held in Downtown Berkeley, near Durant Ave. This image of the area was taken in 1978. From Calisphere, contributed by Berkeley Public Library and Betty Marvin.

Two weeks ago, a meeting of the data management minds took place in Berkeley, California. There were two back-to-back meetings to kick off projects funded by the Alfred P. Sloan Foundation (read the blog post about it) and the Institute of Museum and Library Services. Here we provide a report of each meeting.

Alfred P. Sloan Foundation Project: “DMPTool2: Responding to the Community”

The primary goal of this project is to improve on the DMPTool (a free, easy-to-use application that guides researchers through the process of creating data management plans). To accomplish this, we aim to build on the success of the tool to create DMPTool2, and use this improved version as a centerpiece for encouraging collaboration in data management efforts across all stakeholder groups (researchers, institutions, funders, libraries). In support of the project goals, we convened a meeting of DMPTool partners to synchronize the project kickoff efforts and revisit our planned activities. The meeting aimed to review:

  • Current DMPTool status
  • Community engagement plans
  • Functional development plans
  • Metrics for impact and success

Meeting participants were mainly from founding DMPTool institutions. Over the course of the 1.5-day meeting, participants reviewed the course of the DMPTool thus far, the expectations and plans for the project, and then specific activities for the next 12 or so months. Some highlights include:

  • Observations that the DMPTool has had significant use, but should put increased emphasis on gaining repeat users and providing more value to users. Underlying this point, while the team aims to address user needs and demands, it is important to stress that the goal should be making data management planning EASIER, rather than just EASY. Research data lives in a complex environment and this must not be underestimated.
  • Community engagement in coming months will be on many fronts, including the development of two advisory boards, one focused on administrative users and one on researchers. The team will also implement the planned governance structure to give the user community greater access to and participation in future directions and ownership of the DMPTool; this will be in the very near term.
  • Functionality for this project ranges far and wide, but fits into two broad categories: functions for the researcher (e.g., writing plans, finding resources, getting advice) and functions for the administrative user (e.g., reporting on institutional use, adding institutional guidance). The team will offer blog posts on specific technical elements, request feedback, and conduct user testing as the project moves along. Expect first posts in coming weeks.
  • The last discussion of the meeting was around metrics for impact and success, what’s possible, what’s easy versus hard, and what matters to our different constituents. We have many ideas in this area, and will have blog posts to outline some of these points and request feedback in coming weeks.

IMLS Grant Project: “Improving Data Stewardship with the DMPTool: Empowering Libraries to Seize Data Management Education”

The meeting funded by the IMLS grant took place over February 21-22. The primary goal of this project is to provide librarians with the tools and resources to claim the data management education space. In an effort to ensure the tools and resources developed meet the needs of librarians, we convened a meeting of DMPTool partners, as well as librarians from five University of California campuses. We had three goals for the meeting:

  1. Identify the resources most useful for helping librarians use the DMPTool for outreach.
  2. Prioritize resources based on user profiles and use cases.
  3. Create timelines and brainstorm dissemination tactics for resources to be developed.

Participants were primarily librarians, along with members of the DMPTool partner institutions. Over the course of the two-day meeting, we discussed the barriers and solutions associated with using the DMPTool as a librarian, especially for outreach. Common themes emerged related to a lack of support and education, as well as limited resources including time, money, personnel, and institution-level services. Poor communication among institutional partners and stakeholders was also often mentioned. The solutions proposed to eliminate these barriers became the template for potential products from the IMLS grant. Here we present a list of proposed outcomes and tasks for the project, i.e., things that will help librarians use the DMPTool effectively on their campuses:

  1. Checklist/talking points documents & brown bag kit for librarians to talk to campus partners and stakeholders, including researchers, VCRs, Special Projects/Grants offices, IT, and other librarians
  2. Slide deck for presenting to researchers
  3. Promotional materials (posters, pamphlets, bookmarks, postcards, flyers) that can be customized for the institution
  4. Startup Kit for undergoing an environmental scan of institutional resources and services
  5. DMPTool Webinar Series for librarians
  6. DMPTool Screencasts for users, librarians
  7. A collection of case studies of institutions using the DMPTool successfully
  8. A collection of data management success and horror stories
  9. A calendar of funder deadlines
  10. DMPTool Libguide

A larger outcome of the IMLS grant will be an online common space that allows for sharing customizations of the tool, provides a forum for user conversation streams, provides access to materials developed by the grant project, and can be used as a platform for collecting use cases and success and horror stories. The list above is only a subset of the long list of suggestions that emerged from our meeting. Stay tuned to this blog for more updates as the project progresses.

Download the full IMLS meeting report