This codebook.txt file was generated on 20201203 by Mohd Farid Abdul Halim

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset

   Formate dependent heterodisulfide reduction in a Methanomicrobiales archaeon

2. Author Information

   Principal Investigator Contact Information
      Name: Kyle Costa
      Institution: University of Minnesota
      Address: 670 Biological Sciences Center, 1445 Gortner Avenue, St. Paul, MN 55108
      Email: kcosta@umn.edu
      ORCID: https://orcid.org/0000-0003-0407-1431

   Associate or Co-investigator Contact Information
      Name: Mohd Farid Abdul Halim
      Institution: University of Minnesota
      Address: 685 Biological Sciences Center, 1445 Gortner Avenue, St. Paul, MN 55108
      Email: faridh@umn.edu
      ORCID: https://orcid.org/0000-0002-2327-3621

   Associate or Co-investigator Contact Information
      Name: Leslie Day
      Institution: University of Minnesota
      Address: 685 Biological Sciences Center, 1445 Gortner Avenue, St. Paul, MN 55108
      Email: day00094@umn.edu
      ORCID:

3. Date of data collection: 20191220 - 20200307

4. Geographic location of data collection (where was data collected?):

   University of Minnesota Center for Mass Spectrometry and Proteomics, St Paul, Minnesota, USA

5. Information about funding sources that supported the collection of the data:

   U.S. Department of Energy, Office of Science, Basic Energy Sciences under grant number DE-SC0019148.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0 1.0 Universal

2. Links to publications that cite or use the data:

   Mohd Farid Abdul Halim et al., 2020 "Formate dependent heterodisulfide reduction in a Methanomicrobiales archaeon" (submitted)

3. Links to other publicly accessible locations of the data:

4. Links/relationships to ancillary data sets:

5. Was data derived from another source? No
   If yes, list source(s):

6. Recommended citation for the data:

   Abdul Halim, Mohd Farid; Day, Leslie; Costa, Kyle. (2020).
   Data for Formate dependent heterodisulfide reduction in a Methanomicrobiales archaeon. Retrieved from the Data Repository for the University of Minnesota, https://doi.org/10.13020/g0xp-gf10.

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: kcosta_faridh_20191220_17571_MutE1.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus HdrB-His strain. First independent sample.

   B. Filename: kcosta_faridh_20191220_17571_MutE2.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction II from His-purification of M. thermophilus HdrB-His strain. First independent sample.

   C. Filename: kcosta_faridh_20191220_17571_WTE1.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus wild-type strain. First independent sample.

   D. Filename: kcosta_faridh_20191220_17571_WTE2.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction II from His-purification of M. thermophilus wild-type strain. First independent sample.

   E. Filename: kcosta_faridh_20200306_17692_HisB.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus HdrB-His strain. Second independent sample.

   F. Filename: kcosta_faridh_20200306_17692_HisE.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus HdrB-His strain. Third independent sample.

   G. Filename: kcosta_faridh_20200306_17692_MutHisA.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus MvhA-His strain. First independent sample.

   H. Filename: kcosta_faridh_20200306_17692_MutHisB.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M.
      thermophilus MvhA-His strain. Second independent sample.

   I. Filename: kcosta_faridh_20200306_17692_WTA.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus wild-type strain. Second independent sample.

   J. Filename: kcosta_faridh_20200307_17692_HisB_reinj.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus HdrB-His strain. Second independent sample. Repeat mass spectrometry run of the same sample.

   K. Filename: kcosta_faridh_20200307_17692_MutHisA_reinj.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus MvhA-His strain. First independent sample. Repeat mass spectrometry run of the same sample.

   L. Filename: kcosta_faridh_20200307_17692_MutHisB_reinj.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus MvhA-His strain. Second independent sample. Repeat mass spectrometry run of the same sample.

   M. Filename: kcosta_faridh_20200307_17692_WTA_reinj.raw
      Short description: Raw data file of mass spectrometry analysis on Elution fraction I from His-purification of M. thermophilus wild-type strain. Second independent sample. Repeat mass spectrometry run of the same sample.

2. Relationship between files:

   The files contain the raw mass spectrometry data of independent samples (biological replicates) of the elution fractions from His-purification of the wild-type, MvhA-His, or HdrB-His strains.

3. Additional related data collected that was not included in the current data package:

4. Are there multiple versions of the dataset? No
   If yes, list versions:
   Name of file that was updated:
      i. Why was the file updated?
      ii. When was the file updated?
   Name of file that was updated:
      i. Why was the file updated?
      ii. When was the file updated?
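The sample labels in the filenames do not map to strains in an obvious way (for example, "Mut" denotes HdrB-His in the 20191220 files but "MutHis" denotes MvhA-His in the 2020 files). The following Python sketch, offered only as an illustration, looks up the strain, elution fraction, and independent sample number for a filename using the File List above. The filename layout encoded in the regular expression is inferred from the thirteen names listed, the meaning of the five-digit number is not stated in this codebook, and the names `SAMPLES`, `RawFile`, and `parse_raw_filename` are hypothetical.

```python
import re
from datetime import date, datetime
from typing import NamedTuple

# Sample label -> (strain, elution fraction, independent sample number),
# transcribed from the File List in this codebook.
SAMPLES = {
    "MutE1": ("HdrB-His", "I", 1),
    "MutE2": ("HdrB-His", "II", 1),
    "WTE1": ("wild-type", "I", 1),
    "WTE2": ("wild-type", "II", 1),
    "HisB": ("HdrB-His", "I", 2),
    "HisE": ("HdrB-His", "I", 3),
    "MutHisA": ("MvhA-His", "I", 1),
    "MutHisB": ("MvhA-His", "I", 2),
    "WTA": ("wild-type", "I", 2),
}

# Assumed layout: kcosta_faridh_<YYYYMMDD>_<number>_<label>[_reinj].raw
# The meaning of <number> is not documented here; it is kept as a string.
FILENAME_RE = re.compile(
    r"^kcosta_faridh_(?P<date>\d{8})_(?P<number>\d+)"
    r"_(?P<label>[A-Za-z0-9]+)(?P<reinj>_reinj)?\.raw$"
)


class RawFile(NamedTuple):
    run_date: date
    number: str
    label: str
    strain: str
    fraction: str
    replicate: int
    reinjection: bool


def parse_raw_filename(name: str) -> RawFile:
    """Split a raw-file name into its parts and look up the sample metadata."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected filename layout: {name!r}")
    label = match["label"]
    if label not in SAMPLES:
        raise KeyError(f"sample label not in the File List: {label!r}")
    strain, fraction, replicate = SAMPLES[label]
    return RawFile(
        run_date=datetime.strptime(match["date"], "%Y%m%d").date(),
        number=match["number"],
        label=label,
        strain=strain,
        fraction=fraction,
        replicate=replicate,
        reinjection=match["reinj"] is not None,
    )
```

For example, `parse_raw_filename("kcosta_faridh_20200307_17692_HisB_reinj.raw")` yields the HdrB-His strain, elution fraction I, second independent sample, flagged as a repeat run.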
--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

   Each concentrated protein sample (10 µl) was mixed with 10 µl of 4X SDS load buffer and heated for 10 min at 95 °C. The samples were loaded on a 10% BioRad Criterion Tris-HCl gel and run at 25 mA constant current for 35 min. The gel was stained with Thermo Scientific’s Imperial Protein stain. Stained gel regions for each sample were excised and proteolytically digested with trypsin as previously described (34), except that iodoacetamide was used instead of methyl methanethiosulfonate during the reduction and alkylation step of the digestion. Extracted peptides were dried in vacuo and then cleaned with a C18 stage tip (35).

   The dried peptide pellets were resuspended in load solvent (97.99:2:0.01, water:acetonitrile:formic acid), and approximately 0.2 – 0.5 micrograms of material was loaded on the LTQ Orbitrap Velos (Thermo Scientific) as previously described (36) with the following revisions: the LC gradient was 2 - 5% B solvent from 0 - 2 minutes and 5 - 30% B solvent from 2 - 67 minutes with a flow rate of 330 nl/min; lock mass was not invoked; MS1 survey scan was 380 – 1800 m/z; dynamic exclusion list size was 200, duration was 45 seconds, and window was +/- 15 ppm; the top 12 most intense ions were selected for MS2 fragmentation; MS1 maximum injection time (IT) was 150 milliseconds and MS2 maximum IT was 200 milliseconds.

   The MS/MS data were analyzed using Sequest (37) (Thermo Fisher Scientific, San Jose, CA, USA; version ISE 1.1.0.189, x64 in Proteome Discoverer 2.4.0.305). Sequest was set up to search the Methanoculleus thermophilus (taxon ID 2200) Reference Sequence protein database downloaded from NCBI on December 21, 2019, after concatenation with the common lab contaminant protein sequences from https://www.thegpm.org/crap/. The total number of protein sequences was 2256.
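   The acquisition settings above give only the endpoints of each gradient segment and a relative (ppm) exclusion window. As a rough illustration, the Python sketch below interpolates the percentage of B solvent at a given time and converts the +/- 15 ppm window to absolute m/z limits. It assumes each gradient segment is a linear ramp, which the text does not state, and the function names `percent_b` and `ppm_window` are hypothetical.

```python
def percent_b(t_min: float) -> float:
    """Percent B solvent at time t_min (minutes).

    Assumption: both segments (2-5% B over 0-2 min, 5-30% B over 2-67 min)
    are linear ramps between the stated endpoints.
    """
    if not 0.0 <= t_min <= 67.0:
        raise ValueError("time is outside the described 0-67 min gradient")
    if t_min <= 2.0:
        return 2.0 + (5.0 - 2.0) * t_min / 2.0
    return 5.0 + (30.0 - 5.0) * (t_min - 2.0) / (67.0 - 2.0)


def ppm_window(mz: float, ppm: float = 15.0) -> tuple:
    """Lower and upper m/z limits of a +/- ppm window centred on mz."""
    delta = mz * ppm * 1e-6
    return (mz - delta, mz + delta)
```

   Under these assumptions, the midpoint of the second segment (34.5 min) corresponds to 17.5% B, and a precursor at m/z 1000 is excluded over roughly m/z 999.985 to 1000.015.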
   The search parameters included: trypsin enzyme with full specificity; fragment ion mass tolerance of 0.1 Da; precursor ion tolerance of 20 ppm; carbamidomethyl cysteine as a fixed amino acid modification; and acetylation of protein N-terminus, protein N-terminal loss of methionine or acetylated methionine, oxidation of methionine, pyroglutamic acid modification of glutamine, and asparagine deamidation as variable modifications.

   Scaffold (version 4.9, Proteome Software Inc., Portland, OR) was used to validate MS/MS based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 91.0% probability to achieve a false discovery rate (FDR) less than 1.0% by the Scaffold Local FDR algorithm. Protein identifications were accepted if they could be established at greater than 53.0% probability to achieve an FDR less than 1.0% and contained at least 2 identified peptides. Protein identity probabilities were assigned by the Protein Prophet algorithm (38). Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony. Proteins sharing significant peptide evidence were grouped into clusters.

2. Methods for processing the data:

   The resulting MS/MS data from Scaffold were analyzed using R (version 3.6.2) (39). The total spectral counts (SC) corresponding to each protein were normalized by multiplying each protein total SC by its protein identity probability and dividing by the whole total spectral counts of the respective sample, which was then divided by 1000, resulting in a counts per thousand (CPT) ratio for each protein accession number. Proteins with a maximum CPT value less than 20 across samples were removed from the analysis. Significant differences of mean CPT values (p < 0.05) were assessed using a two-sample t-test comparing HdrB-His to the combined MvhA-His and WT-His samples.

3. Instrument- or software-specific information needed to interpret the data:

4. Standards and calibration information, if appropriate:

5. Environmental/experimental conditions:

6. Describe any quality-assurance procedures performed on the data:

7. People involved with sample collection, processing, analysis and/or submission:

   Mohd Farid Abdul Halim (Costa Lab)
   LeAnn Higgins (University of Minnesota Center for Mass Spectrometry and Proteomics)
   Todd Markowski (University of Minnesota Center for Mass Spectrometry and Proteomics)

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [FILENAME]
-----------------------------------------

1. Number of variables:

2. Number of cases/rows:

3. Missing data codes:
      Code/symbol    Definition
      Code/symbol    Definition

4. Variable List

   A. Name:
      Description:
         Value labels if appropriate

   B. Name:
      Description:
         Value labels if appropriate

   C. Name:
      Description:
         Value labels if appropriate
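
As an illustration of the normalization described under "Methods for processing the data", the following Python sketch computes a CPT value per protein and applies the maximum-CPT filter. The authors' analysis was done in R and is not reproduced here, so this is a minimal sketch under stated assumptions. It assumes that the "whole total spectral counts" of a sample is the sum of the unweighted SC over all proteins in that sample. The text says the ratio was "divided by 1000", whereas the name "counts per thousand" would more naturally imply multiplying by 1000, so the scaling is exposed as a `scale` argument whose default (multiply by 1000) is an assumption, not a confirmed detail. The function names and the accession labels used in the usage note are hypothetical, and the t-test step is not covered.

```python
from typing import Dict


def cpt(spectral_counts: Dict[str, float],
        identity_prob: Dict[str, float],
        scale: float = 1000.0) -> Dict[str, float]:
    """Normalize one sample: SC * identity probability / sample total * scale.

    Assumptions: the sample total is the sum of unweighted spectral counts,
    and scale=1000.0 (multiplication) matches the "per thousand" naming.
    Pass scale=0.001 to follow the "divided by 1000" wording literally.
    """
    total = sum(spectral_counts.values())
    if total <= 0:
        raise ValueError("sample has no spectral counts")
    return {
        acc: sc * identity_prob[acc] * scale / total
        for acc, sc in spectral_counts.items()
    }


def drop_low_cpt(cpt_by_sample: Dict[str, Dict[str, float]],
                 min_max_cpt: float = 20.0) -> Dict[str, Dict[str, float]]:
    """Remove proteins whose maximum CPT across all samples is below the cutoff.

    A protein absent from a sample is treated as CPT 0 in that sample.
    """
    accessions = set().union(*(s.keys() for s in cpt_by_sample.values()))
    keep = {
        acc for acc in accessions
        if max(s.get(acc, 0.0) for s in cpt_by_sample.values()) >= min_max_cpt
    }
    return {
        name: {acc: v for acc, v in sample.items() if acc in keep}
        for name, sample in cpt_by_sample.items()
    }
```

With the default scaling, a sample with 50 and 950 spectral counts for two hypothetical proteins "protA" and "protB" (identity probabilities 1.0 and 0.5) gives CPT values of 50 and 475.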