Readme Documentation for the Data Management Plan (DMP) Review Project
Lisa Johnston and Carolyn Bishoff
Date Created: 3-31-2015

GENERAL INFORMATION

1. Title of Dataset: Analyzed Data Management Plans from Successful University of Minnesota National Science Foundation (NSF) Grants, 2011-2014

2. File Information:
   A. Filename: UMN_DMPRevoew_2011-2014.xls
   B. Short description: Spreadsheet of Deidentified and Analyzed DMPs
   C. Filename: UMN_DMPReviewInstrument.pdf
   D. Short description: UMN Review Instrument for Analyzing DMPs
   E. Filename: Screenshot_UMN_DMPReviewInstrument.png
   F. Short description: Screenshot of UMN DMP Review Instrument in Google Forms

3. Principal Investigator Contact Information
   A. Name: Lisa Johnston
   B. Institution: University of Minnesota - Twin Cities
   C. Address: 108 Walter Library, Minneapolis, MN 55455
   D. Email: ljohnsto@umn.edu

4. Date files were created: 2015-03-31

METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data: The response to the call for DMPs from faculty was strong considering the opt-in nature of the study. The libraries received 182 data management plans emailed from PIs between June 25 and September 2, 2014. This accounted for 41% of the total number of plans solicited. As the DMPs were received via email, they were downloaded and any identifiers relating the DMP to the grant recipient were removed, including PI names and grant award titles. Next, each DMP file was renamed using a standard file name schema in the form of University_CollegeAbrv_Department_000.ext (e.g., UMN_CSE_Physics_001.pdf). Most DMPs arrived as Microsoft Word files or PDFs. However, some DMPs arrived as part of an entire grant application; in those cases the two-page DMP was extracted and saved as a new file, and the rest of the application was deleted.

2.
Methods for processing the data: For the NSF, DMPs must be written plans of no more than two pages that address the data management criteria provided by NSF directorates and sub-directorates. According to the NSF (2013), the DMPs should describe: the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project; the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies); policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements; policies and provisions for re-use, re-distribution, and the production of derivatives; and plans for archiving data, samples, and other research products, and for preservation of access to them.

Since 2010, the University Libraries have offered training to researchers on data management topics, including the workshop “Creating a Data Management Plan for a Grant Application,” which provided a digital template for a data management plan. More than 774 faculty, researchers, and students have attended the libraries’ data-themed workshops. In addition, our web resource on DMP tools (http://lib.umn.edu/datamanagement/DMP) is easily accessible on the web, and the site appears as the first hit when searching for “data management plans” at http://umn.edu.

In order to review the plans, we created a survey instrument to capture and standardize our findings. We developed a set of binary questions and controlled vocabulary for five sections based on the NSF guidelines.
This tool was created from several sources: an internally developed DMP checklist (UMN Libraries, 2015) and DMP resources developed at other institutions, including Cornell University Libraries (Wright and Andrews, 2015), Columbia University Libraries (2014), Johns Hopkins Libraries (2014), and Purdue University Libraries (2011), as well as research done at Syracuse University School of Information Studies (Curty, Kim, & Qin, 2013) and the University of Illinois Urbana-Champaign (Mischo, Schlembach, & O'Donnell, 2014). We also referred to a draft DMP rubric that was developed as part of an IMLS research grant (Whitmire et al., 2014).

References

National Science Foundation (NSF). (2013, January). Plans for data management and sharing of the products of research. GPG Chapter II - Proposal Preparation Instructions. Retrieved from http://www.nsf.gov/pubs/policydocs/pappguide/nsf13001/gpg_2.jsp#dmp.

Columbia University Libraries. (2014). “Data Management Plan Self‐Assessment Questionnaire, Adapted from Purdue University Libraries (2011).” Data Management Plan Templates - Scholarly Communications Program. Retrieved from http://scholcomm.columbia.edu/wp-content/uploads/2014/04/DMPSelfAssessment_v01.pdf.

Curty, R., Kim, Y., & Qin, J. (2013). “What have Scientists Planned for Data Sharing and Reuse? A Content Analysis of NSF Awardees’ Data Management Plans [presentation slides].” Research Data Access & Preservation Summit. Baltimore, 4-5 April 2013. Retrieved from http://www.slideshare.net/asist_org/rdap13-curty-what-have-scientists-planned-for-data-sharing-and-reuse-a-content-analysis-of-nsf-awardees-data-management-plans.

Johns Hopkins Sheridan Libraries. (2014). “Questionnaire to Help with the Creation of a Data Management Plan.” JHU Data Management Services. Retrieved from http://dmp.data.jhu.edu/assistance/nsf-data-management-plans.

Mischo, W. H., Schlembach, M. C., & O'Donnell, M. N. (2014). “An Analysis of Data Management Plans in University of Illinois National Science Foundation Grant Proposals.” Journal of eScience Librarianship. 3(1): Article 3. http://dx.doi.org/10.7191/jeslib.2014.1060.

Purdue University Libraries. (2011). “Data Management Plan Self‐Assessment Questionnaire.” Purdue University, West Lafayette, IN. Retrieved from https://www.purdue.edu/research/vpr/rschdev/documents/DMP_Self-Assess_14Feb2011.pdf.

University of Minnesota Libraries. (2015). U of M DMP Template (Google Doc). Creating a Data Management Plan. Retrieved from https://www.lib.umn.edu/datamanagement/DMP.

Whitmire, A. L., Carlson, J., Hswe, P. M., Wells Parham, S., Rolando, E., & Westra, B. (2014, Unpublished Draft). “Rubric for assessment of NSF data management plans.” A product of IMLS National Leadership Grant LG-07-13-0328, “Analysis of data management plans as a means to inform and empower academic librarians in providing research data support.” Personal Communication.

Wright, S. J., & Andrews, C. (2015). Developing a For-Credit Course to Teach Data Information Literacy Skills: A Case Study in Natural Resources. In J. Carlson and L. Johnston (Eds.), Data Information Literacy: Librarians, Data, and the Education of a New Generation of Researchers (pp. 94-96). West Lafayette, IN: Purdue University Press.

3. Instrument-specific information needed to interpret the data: Our review instrument was created in Google Forms and consisted of seven sections that map to the NSF requirements. The instrument was not intended to critique the plan, create subjective measures of quality, or provide feedback directly to researchers. As much as possible, our methods were intended to review each plan as is, while applying a controlled vocabulary to the content so that we could group and analyze the plans as a whole.
Our analysis identifies the college and department of origin, the types of data produced, and how they are stored, and then analyzes the methods mentioned for data sharing, archiving, and preservation. If the DMP indicated that the grant would not produce research data (which the NSF specifically allows), or if the PI intended to use additional external datasets, this information was also recorded.

4. Standards and calibration information, if appropriate: n/a

5. Environmental/experimental conditions: n/a

6. Describe any quality-assurance procedures performed on the data: Because the DMPs were written narratives and, in a sense, qualitative data, reviewing their contents was a subjective challenge. Once collected, the DMPs were reviewed by two independent graduate research assistants who were hired as Scientific Data Curators for the University Libraries. This analysis took place in September-October 2014. With each plan reviewed twice, the authors were able to compare the analyses and, when incongruities occurred, make a final decision on how the plan should be classified.

7. Codes or symbols used to note or characterize low quality/questionable outliers that people should be aware of
   A. Code/symbol: n/a
   B. Definition: When a plan stated that no data were to be managed under the grant (e.g., a math grant proposal or a workshop), n/a was specified rather than answering No to the binary questions.

8. People involved with sample collection, processing, analysis and/or submission: John McGrory, Christine Storino, and Anders Swendsrud (University of Minnesota Libraries)

Credits: Template provided by the University of Minnesota Libraries, http://lib.umn.edu/datamanagement
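
Illustrative sketch (not part of the original project files): the quality-assurance step in item 6, where two independent reviews of each plan are compared and incongruities are passed to the authors for a final decision, could be approximated in Python as shown here. The data layout, the question names (shares_data, names_repository), the second file name, and the function find_incongruities are all hypothetical and chosen only for illustration; the project's actual comparison process is not documented beyond the description above.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: {dmp_id: {question: answer}}.
# Answers use the codes described in this readme: "Yes", "No", or "n/a".
Review = Dict[str, Dict[str, str]]


def find_incongruities(review_a: Review, review_b: Review) -> List[Tuple[str, str, str, str]]:
    """Return (dmp_id, question, answer_a, answer_b) wherever two reviews differ."""
    differences = []
    for dmp_id in sorted(set(review_a) | set(review_b)):
        answers_a = review_a.get(dmp_id, {})
        answers_b = review_b.get(dmp_id, {})
        for question in sorted(set(answers_a) | set(answers_b)):
            # "missing" marks a question that one reviewer left unanswered.
            a = answers_a.get(question, "missing")
            b = answers_b.get(question, "missing")
            if a != b:
                differences.append((dmp_id, question, a, b))
    return differences


# Made-up example data; the IDs follow the file name schema from item 1.
reviewer_1 = {
    "UMN_CSE_Physics_001": {"shares_data": "Yes", "names_repository": "No"},
    "UMN_CSE_Math_002": {"shares_data": "n/a", "names_repository": "n/a"},
}
reviewer_2 = {
    "UMN_CSE_Physics_001": {"shares_data": "Yes", "names_repository": "Yes"},
    "UMN_CSE_Math_002": {"shares_data": "n/a", "names_repository": "n/a"},
}

# Each printed row is one disagreement that would need a final decision.
for row in find_incongruities(reviewer_1, reviewer_2):
    print(row)
```

In this sketch a plan coded n/a by both reviewers produces no rows, while any differing answer is listed so that a person can decide how the plan should be classified.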