NASA template update & bug fix

NASA template

Last week NASA launched a new Research Portal, with consolidated information regarding data management plans and publications. There are no changes to the DMP requirements as the public access plan remains the same. The big news concerns the creation of PubSpace, an open access article repository that is part of the NIH-managed PubMed Central. Beginning with 2016 awards, all NASA-funded authors and co-authors will be required to deposit copies of their peer-reviewed scientific publications and associated data into PubSpace.

Another new resource is the NASA Data Portal, which bears the following description:

“The NASA data catalog serves not as a repository of study data, but as a registry that has information describing the dataset (i.e., metadata) and information about where and how to access the data. The public has access to the catalog and associated data free of charge. NASA will continue to identify additional approaches involving public and private sector entities and will continue efforts to improve public access to research data. NASA will explore the development of a research data commons—a federated system of research databases—along with other departments and agencies for the storage, discoverability, and reuse of data, with a particular focus on making the data underlying the conclusions of federally funded peer-reviewed scientific research publications available for free at the time of publication.”

In response to the announcement, we’ve updated a few guidance links for the NASA template and reached out to the NASA Open Innovation Team (part of the office of the CIO), which appears to be in charge of these new initiatives.

Review workflow: Refinements and fixes

After releasing the review workflow enhancements, we encountered a bug that prevented the system from sending out an email notification if an institution did not create a customized message. Only one user was affected and we have since fixed the issue. We also added a grayed-out default message to the box on the Institution Profile page. We apologize if any emails went awry and invite you to test again and let us know if things are working as expected. You can also check out the updated documentation on the GitHub wiki.
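The fix described above amounts to falling back to a default message when an institution has not written its own. The following is only a minimal sketch of that idea, not the tool's actual code (which is a Rails application); the function name and the default wording are hypothetical.

```python
# Hypothetical default text; the real default message is not quoted in this post.
DEFAULT_MESSAGE = "Your plan has been submitted for feedback."


def feedback_email_body(custom_message):
    """Return the institution's custom message, or the default if none was set.

    custom_message may be None, empty, or whitespace-only when an
    institution has not customized the text on its profile page.
    """
    if custom_message and custom_message.strip():
        return custom_message.strip()
    return DEFAULT_MESSAGE


# An institution without a custom message still gets a notification body.
print(feedback_email_body(None))
print(feedback_email_body("  Our library will respond within 3 days.  "))
```

With a fallback like this, a missing custom message no longer prevents the notification from being sent.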

Review workflow enhancements

We deployed some enhancements to the review workflow in response to feedback. With increasing use of this functionality, we appreciate you letting us know what works for you and what doesn’t. In the next version of the tool, we plan to dispense with the term “review” altogether and replace it with more informal language to avoid confusing researchers (e.g., “feedback” or “comments”). The following changes to the current tool should hopefully improve things for all users. And as always, we want to know what you think!

One more small thing to note: we updated the generic slide decks (PDF and Google doc) on the promotional materials page.

  • Replaced “Submit for Review” button with “Request Feedback” for templates enabled for Informal Review


  • Provided complete history of reviewed plans in admin dashboard. Admins and plan owners can add new comments to previously reviewed plans.


  • Added a field to the Institution Profile page where admins can customize the automated email message that users receive when they Request Feedback on a plan


New templates: DOT and NASA

We just added two new funder templates in response to user requests. Both the U.S. Department of Transportation (DOT) and the National Aeronautics and Space Administration (NASA) have required a data management plan with grant proposals since 2015, but for various reasons (detailed below) we held off on creating templates.

Next on the list are DOD and NIJ templates. Please let us know if you need a specific template and we’ll bump it to the front of the line.

DOT Template

Via conversations with members of the National Transportation Library (NTL) and the American Association of State Highway and Transportation Officials (AASHTO), we learned more about the bureaucratic hurdles that stand between an agency issuing a public access plan in compliance with the OSTP memo and being able to enforce that plan legally. Suffice it to say, it’s complicated (for the DOT it involves the Paperwork Reduction Act). The DOT lawyers requested that we not provide a public DOT template until they cleared these hurdles, but then they softened their stance on the condition that we include the following disclaimer:

“This tool serves to provide guidance for how to prepare a Data Management Plan (DMP). The output of this tool does not constitute an approved government form. Those preparing DMPs for submission to the U.S. Department of Transportation (USDOT) should use their best judgment in determining what information to include. USDOT has identified five (5) broad areas that should be addressed in a DMP, but is not requiring any specific information to be included in any submitted DMP. USDOT may, at its discretion, establish an Office of Management and Budget-approved information collection. Once approved, the information collection will become a form with a control number, and certain DMP elements may become mandatory.”

Throughout these conversations, we gained valuable insight into the vibrant DOT community and became fans of the NTL for providing such helpful guidance (links included in the template). The NTL also hosts a regular webinar series on data management and invited me to give a DMPTool presentation (past recordings available on their website). One noteworthy feature of the DOT plan is that it requires researchers to obtain an ORCID, which will be used in the reporting workflow to identify research outputs. We look forward to working with the NTL to maintain the DOT template in the future!

NASA Template

NASA also seems to be in limbo regarding enforcement of their public access plan. This blog post is instructive and various NASA webpages contain general information about data management plans, often infused with humor, e.g.:

“Remember, this is a directive from the white house and if you are really bad The President will call your dean and shame you. Just kidding, but awardees who do not fulfill the intent of their DMPs may have continuing funds withheld and this may be considered in the evaluation of future proposals, which may be even worse…” (DMP FAQ Roses)

Because we received so many requests for a NASA template, we decided to go ahead and create one with the information at hand (official Public Access Plan), and with the expectation that there will be revisions and updates to come. If you have suggestions of additional resources to include in the NASA template, please let us know.

Getting our ducks in a row

From Flickr by Cliff Johnson, CC BY-SA 2.0


Recent activity on the Roadmap project encompasses two major themes: 1) machine-actionable data management plans and 2) kicking off co-development of the shared codebase.

Machine-actionable DMPs

The first of these has been a hot topic of conversation among stakeholders in the data management game for some time now, although most use the phrase “machine-readable DMPs.” So what do we mean by machine-actionable DMPs? Per the Data Documentation Initiative definition, “this term refers to information that is structured in a consistent way so that machines can be programmed against the structure.” The goal of machine-actionable DMPs, then, is to better facilitate good data management and reuse practices (think FAIR: Findable, Accessible, Interoperable, Reusable) by enabling:

  • Institutions to manage their data
  • Funders to mine the DMPs they receive
  • Infrastructure providers to plan their resources
  • Researchers to discover data

This term is consistent with the Research Data Alliance Active DMPs Interest Group and the FORCE11 FAIR DMPs group mission statements, and it seems to capture what we’re all thinking: i.e., we want to move beyond static text files to create a dynamic inventory of digital research methods, protocols, environments, software, articles, data… One reason for the DMPonline-DMPTool merger is to develop a core infrastructure for implementing use cases that make this possible. We still need a human-readable document with a narrative, but underneath the DMP could have more thematic richness with value for all stakeholders.
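To make the contrast with static text files concrete, here is a minimal sketch of a plan that keeps its narrative but also carries structured fields that a program can query. All field names and sample values are assumptions made for illustration; they do not reflect an agreed schema or the data model of either tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    title: str
    repository: str  # where the data are expected to be deposited
    size_gb: float   # rough storage estimate


@dataclass
class Plan:
    title: str
    funder: str
    institution: str
    narrative: str   # the human-readable document is still there
    datasets: List[Dataset] = field(default_factory=list)


def plans_for_funder(plans, funder):
    """Funders mining the DMPs they receive."""
    return [p.title for p in plans if p.funder == funder]


def storage_by_repository(plans):
    """Infrastructure providers planning their resources."""
    totals = {}
    for plan in plans:
        for ds in plan.datasets:
            totals[ds.repository] = totals.get(ds.repository, 0.0) + ds.size_gb
    return totals


# Invented sample data.
plans = [
    Plan("Plan A", "Funder 1", "University X", "We will collect...",
         [Dataset("Survey results", "Repository R", 10.0),
          Dataset("Interview audio", "Repository S", 3.0)]),
    Plan("Plan B", "Funder 2", "University Y", "Our project will...",
         [Dataset("Sensor readings", "Repository R", 2.5)]),
]

print(plans_for_funder(plans, "Funder 1"))
print(storage_by_repository(plans))
```

The same structured record could serve the other stakeholders listed above, for example an institution filtering plans by the institution field, or a researcher searching dataset titles.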

A recent CERN/RDA workshop presented the perfect opportunity to consolidate our notes and ideas. In addition to the Roadmap project members, Daniel Mietchen (NIH) and Angus Whyte (DCC) participated in the exercise. We conducted a survey of previous work on the topic (we know we didn’t capture everything so please alert us to things we missed) and began outlining concrete use cases for machine-actionable DMPs, which we plan to develop further through community engagement over the coming months. Another crucial piece of our presentation was a call to make DMPs public, open, discoverable resources. We highlighted existing efforts to promote public DMPs (e.g., the DMPTool Public DMPs list, publishing exemplary DMPs in RIO Journal, Dataverse collections that include DMPs) but these are just a drop in the bucket compared to what we might be able to do if all DMPs were open by default.

You can review our slides here. And please send feedback—we want to know what you think!

Let the co-development begin!

Now for the second news item: our ducks are all in a row and work is underway on the shared Roadmap codebase.

We open with a wistful farewell to Marta Ribeiro, who is moving on to an exciting new gig at the Urban Big Data Centre. DCC has hired two new developers to join our ranks—Ray Carrick and Jimmy Angelakos—both from their sister team at EDINA. The finalized co-development team commenced weekly check-in calls and in the next week or two we’ll begin testing the draft co-development process by adding three features from the roadmap:

  1. Enhanced institutional branding
  2. Funder template export
  3. OAuth link for ORCID

In the meantime, Brian completed the migration to Rails 4.2 and both teams are getting our development environments in place. Our intention is to iterate on the process for a few sprints, iron out the kinks, and then use it and the roadmap as the touchstones for a monthly community developer check-in call. We hope this will provide a forum for sharing use cases and plans for future work (on all instances of the tool) in order to prioritize, coordinate, and alleviate duplication of effort.

The DCC interns have also been plugging away at their respective projects. Sam Rust just finished building some APIs for creating plans and extracting guidance, and is now starting work on the usage statistics use case. Damodar Sójka meanwhile is completing the internationalization project, drawing from work done by the Canadian DMP Assistant team. We’ll share more details about their work once we roll it all back into the main codebase.

Next month the UC Berkeley Web Services team will evaluate the current version of DMPonline to flag any accessibility issues that need to be addressed in the new system. We’ve also been consulting with Rachael Hu on UX strategy. We’re keeping track of requests for the new system and invite you to submit feedback via GitHub issues.

Stay tuned to GitHub and our blog channels for more documentation and regular progress updates.

The 20:51 sprint (Roadmap team-building: UK edition)


This week we hosted the DMPTool team to flesh out our plans for ‘roadmap’ – the joint codebase we’re building together based on DMPonline and DMPTool. The key focus was reviewing and prioritising tasks for an initial release. Building on discussions from the earlier US visit, we confirmed what work was to be done and agreed to begin with some well-defined, short tasks as a test of our co-development procedures. With everyone taking leave over the coming weeks, the first sprint will start in mid-July, at which point we’ll begin adding documentation to the GitHub repository.

We also discussed communication plans. Stephanie and I will take turns to do monthly blog posts so you can stay in the loop with what’s happening, and we aim to start regular calls in a few months with others who are actively working on the code, such as the Portage group in Canada. This will allow everyone to share their plans for future enhancements and to coordinate development activities. We’re always learning about new people who have picked up on our software – the latest being a group in Germany who have extended the DMPTool code to offer a bi-lingual interface – so we want to do more to bring these efforts together. While we get our work underway, we encourage people to join the developers list as a place to start discussions and form a community of interest.

Machine-actionable DMPs was another key theme for which Daniel Mietchen joined our discussions. Stephanie has an RDA/US Data Share Fellowship to pursue work in this area and we’re planning to give talks at some upcoming events highlighting our ideas. We’ve started to refine the themes used for guidance in DMPonline. Currently, DCC defines 28 themes corresponding with UK funder questions that are often addressed in a DMP (e.g., Data format, Metadata, Ethical issues, etc.). These themes offer the perfect starting point for standardising and structuring DMPs so it’s more feasible to identify and mine relevant text. We’ll be seeking comments from our user communities and key working groups such as CASRAI, RDA and FORCE11 shortly on this. We’re also keen to capture more data in a controlled way so it can be put to better uses. One idea is to provide an actionable list of repositories to allow researchers to select where they are going to deposit research outputs, and then to use this data to push notifications out to alert repositories and/or monitor compliance. Machine-actionable DMPs have been part of the future plans of both teams for some years, and they are currently a hot topic. We’re excited that we now have the resources to develop those ideas and a system that will allow us to test them via deployment. We also want to collect additional use cases and explore integrations with other systems so please don’t hesitate to get in touch.
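As a rough illustration of why themes and a controlled repository list help, the sketch below groups answers by theme so that relevant text can be mined, and groups plans by chosen repository so that each repository could be notified. The theme names come from the examples above; the plans, answers, repository names, and function names are invented for this sketch and are not part of DMPonline.

```python
from collections import defaultdict

# Answers tagged with a theme (sample content is invented).
answers = [
    {"plan": "Plan A", "theme": "Data format", "text": "Images are stored as TIFF."},
    {"plan": "Plan A", "theme": "Metadata", "text": "We will use Dublin Core."},
    {"plan": "Plan B", "theme": "Data format", "text": "Tabular data are kept as CSV."},
]

# Repository choices made from a controlled list (names are placeholders).
selections = [
    ("Plan A", "Repository X"),
    ("Plan B", "Repository X"),
    ("Plan C", "Repository Y"),
]


def text_by_theme(answers):
    """Collect (plan, text) pairs under each theme."""
    grouped = defaultdict(list)
    for answer in answers:
        grouped[answer["theme"]].append((answer["plan"], answer["text"]))
    return dict(grouped)


def plans_per_repository(selections):
    """List the plans that intend to deposit in each repository."""
    notify = defaultdict(list)
    for plan, repository in selections:
        notify[repository].append(plan)
    return {repo: sorted(names) for repo, names in notify.items()}


print(text_by_theme(answers)["Data format"])
print(plans_per_repository(selections))
```

Because the repository is chosen from a list rather than typed as free text, the grouping needs no text matching, which is what would make notifications or compliance checks practical.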

The eagle-eyed among you may have spotted some new faces in the team photos. The DCC has two student interns from Informatics working on DMPonline over the summer. Damodar is doing the internationalisation work that we consulted with the user group on at IDCC and Sam is busy developing an API. Both are making great progress so we’ll be looking for input from the user group again soon to try out the new features. The DMPTool team includes a new developer called Brian joining as technical lead. The visit was a great team-building opportunity for our transatlantic DMP roadmap project.

It was a jam-packed week with lots of meetings, brainstorming sessions and time working together on the code. We had new culinary experiences (deep-fried haggis balls no less!), heard some hilarious tales from the adventures of John Chodacki, and initiated the US team in the Glasgow-Edinburgh commute, including a quick dash one evening to make the 20:51 train home. Here’s the photographic proof of our first successful joint sprint. Stay tuned for what else we deliver over the coming months.

Mini release (and a mini maintenance window)

We’re deploying another batch of bug fixes this Wednesday, 8 June 2016 at 4:00pm PST. The release will involve a mini maintenance window to update the database—30 min max downtime, 4:00–4:30pm PST. Most of the bugs involve patching security vulnerabilities and database clean-up so they won’t be apparent to users. We did fix one issue that prevented work from being saved when users clicked to other parts of a plan without first selecting “Save Response” or “Save and Next” for the final question in a given DMP.

Over the coming weeks we’ll be focusing our efforts on the Roadmap project. Stay tuned for another progress report following our exchange visit to Glasgow/Edinburgh next week to commence co-development with our DCC partners!

NIH Policy on Rigor and Reproducibility

You’ve all heard about the reproducibility crisis in science. But you may not be aware of a (relatively) new National Institutes of Health (NIH) policy designed to address the issue. The NIH Policy on Rigor and Reproducibility became effective for proposals received on or after January 25, 2016 and applies to most NIH and Agency for Healthcare Research and Quality (AHRQ) grant applications. We just learned about the policy ourselves thanks to the combined efforts of UCSD library and research staff to raise awareness on their campus (and here’s a noteworthy mention in a Nature review of 2015 science news). To aid researchers in meeting the new criteria, UCSD produced this handy guide that we (and they) would like to share with the wider community.

The new policy does not involve any changes to data sharing plans. It is related and important enough, however, that we inserted a statement and link in the “NIH-GEN: Generic” template (Please note the Rigor and Reproducibility requirements that involve updates to grant application instructions and review criteria [but not Data Sharing Plans]).

The policy does involve:

  • Revisions to application guide instructions for preparing your research strategy attachment
  • Use of a new “Authentication of Key Biological and/or Chemical Resources” attachment (example from UCSD library website)
  • Additional rigor and transparency questions reviewers will be asked to consider when reviewing applications

These policies are all meant to achieve basically the same goals: to promote openness, transparency, reproducibility, access to, and reuse of the results of scientific research. We’re grateful to the folks at UCSD—Dr. Anita Bandrowski, Ho Jung Yoo, and Reid Otsuji—for helping to consolidate the message and for providing some new educational resources.

Roadmap team-building exercises: US edition

Last week we hosted Marta Ribeiro, the lead developer for DMPonline, for an intense, donut-fueled planning meeting to define our co-development process and consolidate our joint roadmap. The following is a debriefing on what we accomplished and what we identified as our next steps.

The project team is established, with Brian Riley joining as the DMPTool technical lead. Marta is busy completing the migration of DMPonline to Rails 4.2 so that we can deposit the code in our new GitHub repository: DMPRoadmap. There’s nothing to see just yet; we’re in the midst of populating it with documentation about our process, roadmap, issues, etc. As soon as everything is in place, we’ll send word so that anyone who’s interested can track our progress. This will also allow us to begin sussing out how to incorporate external development efforts to benefit the larger DMP community. In addition, Marta is mentoring a pair of summer interns who are undertaking the internationalization work and building APIs. Meanwhile, Brian will finish building the servers for the Roadmap development and staging environments on AWS with another new member of the UC3 team: Jim Vanderveen (DevOps/Developer). Additional core team members include Stephanie Simms and Sarah Jones as Service/Project Managers, Marisa Strong as the Technical Manager, and the CDL UX team (many thanks to our UX Design Manager, Rachael Hu, for spending so much time with us!). UC3 and DCC will also rely on their existing user groups for testing and feedback on both sides of the pond.

Other groundwork includes a web accessibility evaluation for DMPonline, to ensure that the new system is accessible for users with disabilities, and an exploration of what we (and others) mean when we talk about “machine-readable DMPs.” Stephanie just received an RDA/US Data Share Fellowship to develop use cases for making DMPs machine readable, in consultation with the Active DMPs Interest Group and the research community at large. In line with this effort, she’ll be participating in an interdisciplinary and international workshop on active DMPs next month, co-hosted by CERN and the RDA group. We’re actively seeking and summarizing thoughts on the topic so please send us your ideas!

We conclude this edition with a draft of our project roadmap (below); it lists all of the features that we’ll be adding to the DMPonline codebase before we release the new platform. Most of the features already exist in the DMPTool and were slated for future enhancements to DMPonline. Stay tuned for our next update following a UC3 exchange visit to Glasgow/Edinburgh in mid June to prioritize the roadmap and commence co-development work.

Roadmap

  • Migration to Rails v.4.2
  • Bring DMP Assistant’s internationalization upstream for multi-lingual support
  • Adding the concept of locales so specific organizations, funders, and templates can be defined and filtered out for certain users/contexts
  • Shibboleth support through eduGAIN
  • OAuth link for ORCID
  • APIs to create plans, extract guidance, and generate usage statistics
  • More robust institutional branding
  • A lifecycle to indicate the status of plans and allow institutional access to plans
  • Support for reviewing plans
  • Public sharing option > Public DMPs library
  • Flag test plans (to exclude them from usage stats)
  • Email notification system
  • Admin controls for assigning admin rights to others
  • Export template with guidance
  • Copy template option for creating new templates
  • Copy plan option for creating new plans
  • Toggle switch for navigating between Plan area and Admin area
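One item on the list above, flagging test plans so they can be excluded from usage stats, is simple enough to picture in a few lines. This is only a sketch of the idea under assumed names: the `is_test` field and the sample plans are hypothetical and say nothing about how the feature will be built.

```python
# Sample plans; "is_test" is a hypothetical flag set by the plan owner.
plans = [
    {"title": "Ocean survey DMP", "is_test": False},
    {"title": "Trying out the tool", "is_test": True},
    {"title": "Genomics DMP", "is_test": False},
]


def usage_count(plans):
    """Count plans for usage statistics, leaving out flagged test plans."""
    return sum(1 for plan in plans if not plan["is_test"])


print(usage_count(plans))  # test plans are not counted
```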

Maintenance and bugs

We just announced a maintenance window for Wednesday, 4 May 2016, 3:00–4:00pm (PST). We’ll be updating some support scripts related to last month’s migration to AWS that help ensure the stability of the system. Following the maintenance, these bug fixes will also be in place:

Bug fixes

  • We fixed an issue that prevented admins from uploading files for institutional logos.
  • We created a static FAQ page on GitHub to replace the unreliable askbot service.
  • We added a date field to the API call to get a list of users. This call enables admins to track new users over time.
  • We updated links on the Help pages and other outdated info throughout the platform (e.g., we have a new CDL Director: Günter Waibel!).
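The date field added to the user list call is what makes it possible to track new users over time. The sketch below shows one way an admin might summarize such a list by month; it assumes the response has already been fetched and parsed, and the `created` field name, its ISO date format, and the sample records are all assumptions rather than the documented API response.

```python
from collections import Counter
from datetime import date

# Stand-in for a parsed API response (field names are hypothetical).
users = [
    {"email": "a@example.edu", "created": "2016-03-14"},
    {"email": "b@example.edu", "created": "2016-04-02"},
    {"email": "c@example.edu", "created": "2016-04-20"},
]


def new_users_per_month(users):
    """Count users by the year and month in which they were created."""
    counts = Counter()
    for user in users:
        created = date.fromisoformat(user["created"])
        counts[created.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


print(new_users_per_month(users))
```

The GitHub wiki documentation remains the place to check the actual call and the real field names.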

We’ll continue on a 2-week release cycle until all of the bugs are wiped out. You can follow our progress in the GitHub issue tracker and keep a lookout for updates. We also invite direct bug reporting and enhancement requests via GitHub issues.

Roadmaps galore

Data management planning is moving and shaking at all scales—local, national, international—these days. We had excellent conversations at IDCC about coordinating responses to proliferating data policies and sharing experiences across borders and disciplines. All of the slides and materials from the international DMP workshop are available here.

So far the community has responded positively to our proposal for building a global infrastructure for all things DMP. Our big-picture plans include a merged platform based on the DMPonline codebase and incorporating recent internationalization work by the Portage Network in Canada (check out their bilingual DMP Assistant). We’re completing a gap analysis to add existing functionality from the DMPTool to DMPonline and will issue a joint roadmap in the coming months. Drawing together these disparate development efforts also presents an opportunity to set best practices for future work (stay tuned). This will allow us to consolidate value upstream and ensure maximum benefits to the entire community.

To facilitate our capacity-building efforts, we submitted a proposal entitled (what else) “Roadmap” to the Open Science Prize. You can read the Executive Summary on their website here and peruse the full proposal here (also view our snazzy promo video below). The prize seemed like the perfect opportunity to reposition DMPs as living documents using the biomedical research community as a pilot group. We’ll know by the end of April whether our bid is successful. Regardless of the outcome, we would love to know what you think about the proposal.

And finally, an update on the near-future roadmap for the DMPTool. We just added some new API calls in response to requests for more usage statistics and to facilitate integration projects with other data management systems. Admins can now get info about templates used to create plans at their institution (including private plans!) and a list of institutional templates. Check out the updated documentation on the GitHub wiki. The next order of business is working through the backlog of bug fixes. You can follow our progress in the GitHub issue tracker. Once the bugs are eliminated, we’ll circle back to high priority feature enhancements that contribute to our long-range plans.