This codebook.txt file was generated on 20190614 by wilsonkm

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset 
TnSeq of Himar1 transposon library of Mycobacterium tuberculosis H37Rv grown in vitro (on Mtb YM rich or Mtb minimal). 

2. Author Information

  Principal Investigator Contact Information
        Name: Yusuke Minato
           Institution: University of Minnesota Medical School
           Address: 689 23rd Avenue S.E. Minneapolis MN 55455
           Email: yminato@umn.edu

  Associate or Co-investigator Contact Information
        Name: Anthony D. Baughn
           Institution: University of Minnesota Medical School
           Address: 689 23rd Avenue S.E. Minneapolis MN 55455
           Email: abaughn@umn.edu

3. Date of data collection: 2015/09/25
4. Geographic location of data collection (where was data collected?): University of Minnesota Genomics Center 

5. Information about funding sources that supported the collection of the data: This study was supported by funds from the Minnesota Partnership for Biotechnology and Medical Genomics (ML2012, chapter 5, article 1, section 5, subdivision 5e, to A.D.B.) and the American Lung Association (to A.D.B.).

--------------------------
SHARING/ACCESS INFORMATION
-------------------------- 

1. Licenses/restrictions placed on the data: N/A


2. Links to publications that cite or use the data: https://msystems.asm.org/content/4/4/e00070-19


3. Links to other publicly accessible locations of the data: N/A


4. Links/relationships to ancillary data sets: N/A


5. Was data derived from another source? No
           If yes, list source(s):


6. Recommended citation for the data: Minato Y, Gohl DM, Thiede JM, Chacón JM, Harcombe WR, Maruyama F, Baughn AD. 2019. Genomewide assessment of Mycobacterium tuberculosis conditionally essential metabolic pathways. mSystems 4:e00070-19. https://doi.org/10.1128/mSystems.00070-19

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List
	A. Filename: Rich_Plate_1_GCCAAT_L007_R1_001.fastq       
	Short description: 28,894,656 lines. 


	B. Filename: Rich_Plate_2_ATGTCA_L007_R1_001.fastq       
	Short description: 29,121,632 lines.       


	C. Filename: Minimal_Plate_1_AGTCAA_L007_R1_001.fastq       
	Short description: 61,347,792 lines. 


	D. Filename: Minimal_Plate_2_GTGAAA_L007_R1_001.fastq
	Short description: 26,015,788 lines. 


2. Relationship between files: A and B are biological replicates (rich medium); C and D are biological replicates (minimal medium).


3. Additional related data collected that was not included in the current data package: N/A


4. Are there multiple versions of the dataset? no
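
The line counts in the file list can be converted to read counts, because each FASTQ entry occupies four lines (see the note on .fastq files at the end of this codebook). The following is a minimal sketch of that arithmetic, not part of the original data package; the names LINE_COUNTS, reads_from_lines and READ_COUNTS are hypothetical and chosen only for illustration.

```python
# Line counts copied from the file list in this codebook.
LINE_COUNTS = {
    "Rich_Plate_1_GCCAAT_L007_R1_001.fastq": 28_894_656,
    "Rich_Plate_2_ATGTCA_L007_R1_001.fastq": 29_121_632,
    "Minimal_Plate_1_AGTCAA_L007_R1_001.fastq": 61_347_792,
    "Minimal_Plate_2_GTGAAA_L007_R1_001.fastq": 26_015_788,
}

LINES_PER_RECORD = 4  # one FASTQ entry = identifier, sequence, "+", quality


def reads_from_lines(line_count: int) -> int:
    """Convert a FASTQ line count to a read count.

    Raises ValueError if the count is not a multiple of four, which
    would suggest a truncated or malformed file.
    """
    if line_count % LINES_PER_RECORD != 0:
        raise ValueError(f"{line_count} is not a multiple of {LINES_PER_RECORD}")
    return line_count // LINES_PER_RECORD


# Read count per file, derived only from the listed line counts.
READ_COUNTS = {name: reads_from_lines(n) for name, n in LINE_COUNTS.items()}

if __name__ == "__main__":
    for name, reads in READ_COUNTS.items():
        print(f"{name}\t{reads:,} reads")
```

All four listed line counts are divisible by four, so under this assumption the files hold 7,223,664, 7,280,408, 15,336,948 and 6,503,947 reads respectively (A through D).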

--------------------------
METHODOLOGICAL INFORMATION
--------------------------


1. Description of methods used for collection/generation of data: 
https://msystems.asm.org/content/4/4/e00070-19


2. Environmental/experimental conditions:
https://msystems.asm.org/content/4/4/e00070-19


3. Describe any quality-assurance procedures performed on the data:


4. People involved with sample collection, processing, analysis and/or submission: University of Minnesota Genomics Center


-----------------------------------------
NOTE REGARDING .FASTQ FILES (from https://help.basespace.illumina.com/articles/descriptive/fastq-files/)
-----------------------------------------

FASTQ files are named with the sample name and the sample number, which is a numeric assignment based on the order that the sample is listed in the sample sheet. Example: Data\Intensities\BaseCalls\samplename_S1_L001_R1_001.fastq.gz

    A. samplename — The sample name provided in the sample sheet. If a sample name is not provided, the file name includes the sample ID, which is a required field in the sample sheet and must be unique.
    B. S1 — The sample number based on the order that samples are listed in the sample sheet starting with 1. In this example, S1 indicates that this sample is the first sample listed in the sample sheet.
    C. L001 — The lane number.
    D. R1 — The read. In this example, R1 means Read 1. For a paired-end run, there is at least one file with R2 in the file name for Read 2.
    E. 001 — The last segment is always 001.
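
The naming fields listed above can be pulled apart with a regular expression. This is a minimal sketch, assuming the fields are joined by underscores as they are in this dataset's own file names. The files in this package appear to carry a 6-base index sequence (e.g. GCCAAT) where the sample number would be, so a second pattern is included for that layout; that interpretation is an assumption. The names ILLUMINA_NAME, INDEXED_NAME and parse_fastq_name are hypothetical.

```python
import re

# Convention described in the note: <samplename>_S<n>_L<lane>_R<read>_001.fastq[.gz]
ILLUMINA_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_number>\d+)"
    r"_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq(?:\.gz)?$"
)

# Assumed layout of this dataset's files: an index sequence replaces the
# S-number, e.g. Rich_Plate_1_GCCAAT_L007_R1_001.fastq
INDEXED_NAME = re.compile(
    r"^(?P<sample>.+)_(?P<index>[ACGTN]+)"
    r"_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq(?:\.gz)?$"
)


def parse_fastq_name(filename: str) -> dict:
    """Return the named fields of a FASTQ file name as strings.

    Tries the sample-number layout first, then the index-sequence layout.
    Raises ValueError if neither pattern matches.
    """
    for pattern in (ILLUMINA_NAME, INDEXED_NAME):
        match = pattern.match(filename)
        if match:
            return match.groupdict()
    raise ValueError(f"unrecognised FASTQ file name: {filename}")


if __name__ == "__main__":
    print(parse_fastq_name("samplename_S1_L001_R1_001.fastq.gz"))
    print(parse_fastq_name("Rich_Plate_1_GCCAAT_L007_R1_001.fastq"))
```

Pass only the base file name (without any directory prefix); all four files in this package are lane 007, Read 1.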

Each entry in a FASTQ file consists of four lines:

    A. Sequence identifier
    B. Sequence
    C. Quality score identifier line (consisting only of a +)
    D. Quality score
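
The four-line layout above can be read with a few lines of code. This is a minimal sketch using only the Python standard library, not the pipeline used in the associated publication. It assumes the identifier line starts with "@" (standard for FASTQ) and that no sequence is wrapped across lines; FastqRecord, read_fastq and the example read are hypothetical.

```python
import io
from itertools import islice
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    identifier: str  # identifier line without the leading "@"
    sequence: str    # base calls
    quality: str     # quality string, one character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one FastqRecord per four-line entry in an open text handle."""
    while True:
        # Take the next four lines; an empty list means end of file.
        lines = [line.rstrip("\r\n") for line in islice(handle, 4)]
        if not lines:
            return
        if len(lines) < 4:
            raise ValueError("truncated FASTQ entry at end of file")
        header, sequence, separator, quality = lines
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ entry: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Hypothetical one-read example, held in memory instead of a file.
EXAMPLE = "@read1\nACGT\n+\nIIII\n"
records = list(read_fastq(io.StringIO(EXAMPLE)))

if __name__ == "__main__":
    for record in records:
        print(record.identifier, record.sequence, record.quality)
```

To use it on one of the files in this package, open the file in text mode, for example with open("Rich_Plate_1_GCCAAT_L007_R1_001.fastq") as handle, and iterate over read_fastq(handle).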