This codebook.txt file was generated on 20201016 by wilsonkm

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset
   Data for "3D Printing-Enabled DNA Extraction for Long-Read Genomics" published as ACS Omega 2020, 5, 20817-20824

2. Author Information

   Principal Investigator Contact Information
      Name: Paridhi Agrawal
      Institution: University of Minnesota
      Email: agraw135@umn.edu

   Associate or Co-investigator Contact Information
      Name: Jeffrey G. Reifenberger
      Institution: Bionano Genomics Inc.

   Associate or Co-investigator Contact Information
      Name: Kevin D. Dorfman
      Institution: University of Minnesota

3. Date of data collection: 2019-07-23 to 2019-11-23

4. Geographic location of data collection: Minneapolis, MN and San Diego, CA

5. Information about funding sources that supported the collection of the data:
   Sponsorship: NIH (R21-HG009208)

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0 1.0 Universal

2. Links to publications that cite or use the data:
   Agrawal, P., Reifenberger, J. G., & Dorfman, K. D. (2020). 3D Printing-Enabled DNA Extraction for Long-Read Genomics. ACS Omega, 5(33), 20817–20824. https://doi.org/10.1021/acsomega.0c01912

3. Recommended citation for the data:
   Agrawal, Paridhi; Reifenberger, Jeffrey G; Dorfman, Kevin D. (2020). Data for "3D Printing-Enabled DNA Extraction for Long-Read Genomics" published as ACS Omega 2020, 5, 20817-20824. Retrieved from the Data Repository for the University of Minnesota, https://doi.org/10.13020/brk2-4t69.

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: AgrawalACSOmega_QubitData.xlsx
      Short description: "QubitData" is DNA concentration measurement.

   B. Filename: AgrawalACSOmega_SizingData.xlsx
      Short description: "SizingData" is DNA size measurement

2. Relationship between files: None

3. Are there multiple versions of the dataset?
   No

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:
   Qubit data were generated using the standard Qubit fluorometer protocol. Sizing data were generated using Bionano nanochannels. Method descriptions for both can be found in the publication ACS Omega 2020, 5, 20817-20824.

2. Methods for processing the data:
   Instrument readouts. More information is available in the publication ACS Omega 2020, 5, 20817-20824.

3. Instrument- or software-specific information needed to interpret the data: None

4. Standards and calibration information, if appropriate: NA

5. Environmental/experimental conditions: Room temperature

6. Describe any quality-assurance procedures performed on the data: NA

7. People involved with sample collection, processing, analysis and/or submission:
   Agrawal, Paridhi and Reifenberger, J.G.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: AgrawalACSOmega_QubitData.xlsx
-----------------------------------------

1. Number of variables: 14

2. Number of cases/rows: 15

3. Missing data codes: (none listed)

4. Variable List

   A. Name: Run ID
      Description: ID generated by the equipment

   B. Name: Test Name
      Description: Sample name generated by the equipment

   C. Name: Test Date
      Description: Date

   D. Name: Qubit® tube conc.
      Description: Tube concentration measured by the fluorometer

   E. Name: Units
      Description: Concentration units for the Qubit tube concentration (column D)

   F. Name: Original sample conc.
      Description: Concentration of the original undiluted sample

   G. Name: Units
      Description: Concentration units for the original sample concentration (column F)

   H. Name: Sample Volume (µL)
      Description: Volume of original sample added to the Qubit tube (200 µL total)

   I. Name: DNA (ng)
      Description: DNA amount in the original sample extracted from the device

   J. Name: avg DNA yield
      Description: DNA yield averaged over multiple measurements of the same sample

   K. Name: sd yield
      Description: Standard deviation (SD) of DNA yield over multiple measurements of the same sample

   L. Name: # samples
      Description: Number of measurements of the same sample

   M. Name: avg sample conc
      Description: Average DNA concentration over multiple measurements of the same sample

   N. Name: sd conc
      Description: SD of DNA concentration over multiple measurements of the same sample

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: AgrawalACSOmega_SizingData.xlsx
-----------------------------------------

There are three sheets within the workbook:
   - Circulomics Nanobind
   - Device DNA
   - consolidated

1. Number of variables: (not specified)

2. Number of cases/rows: (not specified)

3. Missing data codes: (none listed)

4. Variable List for sheets "Circulomics Nanobind" and "Device DNA"

   A. Name: molecule_Length
      Description: DNA length in bp (base pairs)

   B. Name: kbp length
      Description: DNA length in kbp

   C. Name: NumEvents
      Description: Number of DNA molecules in the sample corresponding to the DNA size in that bin

   D. Name: cum length kbp
      Description: Cumulative DNA length in the sample corresponding to the DNA size in that bin

   E. Name: Norm100
      Description: Normalized percentage based on cumulative DNA length corresponding to the DNA size in that bin

   F. Name: cdf A*B filt
      Description: Cumulative distribution function (CDF) of DNA length when considering only molecules greater than 150 kbp in size

   G. Name: norm cdf A*B filt
      Description: Normalized CDF

   H. Name: cdf A*B unfilt
      Description: Cumulative distribution function of DNA length when considering all molecules greater than 15 kbp in size

   I. Name: norm cdf A*B unfilt
      Description: Normalized CDF

   J. Column J is empty on both sheets.

   K. Name: # Mols >= 150kbp
      Description: Total number of DNA molecules longer than 150 kbp

   L. Name: Coverage >= 150kbp(Gbp)
      Description: Genomic coverage from all molecules greater than 150 kbp

   M. Name: n50_20kbp
      Description: DNA length value such that the summed length of all DNA fragments longer than the N50 value equals at least 50% of the total genomic length measured. The suffix _20kbp corresponds to the dataset in which all molecules greater than 20 kbp are analyzed. This value can also be read from column A at the row where the corresponding "norm cdf A*B unfilt" value is 0.5.

   N. Name: n50_150kbp
      Description: DNA length value such that the summed length of all DNA fragments longer than the N50 value equals at least 50% of the total genomic length measured. The suffix _150kbp corresponds to the dataset in which all molecules greater than 150 kbp are analyzed. This value can also be read from column A at the row where the corresponding "norm cdf A*B filt" value is 0.5.

   O. Name: Green_AvgDensity
      Description: NA

   P. Name: Red_AvgDensity
      Description: NA

   Q. Name: Blue_AvgIntensity
      Description: NA

   R. Name: Blue_AvgSNR
      Description: NA

   S. Name: Green_AvgIntesity
      Description: NA

   T. Name: Green_AvgSNR
      Description: NA

   U. Name: Red_AvgIntensity
      Description: NA

   V. Name: Red_AvgSNR
      Description: NA

5. Variable List for sheet "consolidated"

   In this sheet, C stands for Circulomics data and D stands for device DNA data.

   A. Name: mol len C
      Description: DNA length bin size for the Circulomics data sheet

   B. Name: mol len D
      Description: DNA length bin size for the device DNA data sheet

   C. Name: num C1
      Description: Same as NumEvents in the other 2 sheets

   D. Name: num D1
      Description: Same as NumEvents in the other 2 sheets

   E. Name: C cum len
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   F. Name: D cum len
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   G. Name: C cdf
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   H. Name: C cdf norm
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   I. Name: D cdf
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   J. Name: D cdf norm
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   K. Name: C cdf filt
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   L. Name: C cdf filt norm
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   M. Name: D cdf filt
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data

   N. Name: D cdf filt norm
      Description: Same as other 2 sheets. C stands for Circulomics data and D for device DNA data
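
-----------------------------------------
ILLUSTRATIVE NOTE: N50 FROM BINNED SIZING DATA
-----------------------------------------

The N50 columns described above can be illustrated with a minimal Python sketch. This is not the authors' spreadsheet formula; it is one common way to compute a length-weighted N50 from binned data, under stated assumptions. The function name n50_from_bins, its parameters, and the example bin values are hypothetical. The sketch assumes each bin is summarized by a single length in kbp (as in "kbp length") and a molecule count (as in "NumEvents"), and that the size cutoff is inclusive.

```python
def n50_from_bins(lengths_kbp, counts, min_length_kbp=0.0):
    """Return the length-weighted N50 (in kbp) of a binned size distribution.

    lengths_kbp    -- representative DNA length of each bin (hypothetical input)
    counts         -- number of molecules observed in each bin
    min_length_kbp -- bins shorter than this cutoff are ignored (assumed inclusive)
    """
    # Keep only populated bins at or above the cutoff, longest first.
    bins = sorted(
        ((length, n) for length, n in zip(lengths_kbp, counts)
         if length >= min_length_kbp and n > 0),
        reverse=True,
    )
    # Total DNA length contributed by the retained bins.
    total = sum(length * n for length, n in bins)
    if total == 0:
        raise ValueError("no molecules at or above the cutoff")

    # Walk from the longest bin down until half of the total length is covered.
    running = 0
    for length, n in bins:
        running += length * n
        if running >= total / 2:
            return length


# Made-up example bins, not values from the dataset.
lengths = [20, 50, 100, 200]
counts = [10, 4, 2, 1]
print(n50_from_bins(lengths, counts))                      # 100
print(n50_from_bins(lengths, counts, min_length_kbp=150))  # 200
```

Raising the cutoff changes which bins contribute to the total, which is why the codebook lists separate N50 values for the filtered and unfiltered datasets.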