This readme.txt file was generated on <20231115> by

Recommended citation for the data:
Interactive transcriptome analyses of Northern Wild Rice (Zizania palustris L.) and Bipolaris oryzae (Cochliobolus miyabeanus) at 24 h and 48 h of their interaction. https://doi.org/10.13020/9kja-aj88

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Transcriptome analysis (RNA-sequencing) of cultivated wild rice (Zizania palustris L.) and Bipolaris oryzae at 24 h and 48 h of their interaction

2. Author Information:
   Castell-Miller, Claudia V.; Researcher 5, Department of Plant Pathology, University of Minnesota

   Principal Investigator Contact Information
   Name: Kimball, Jennifer A.
   Institution: University of Minnesota, Department of Agronomy and Plant Genetics
   Address: 411 Borlaug Hall, 1991 Upper Buford Circle, St. Paul, MN 55108
   Email: jkimball@umn.edu
   ORCID: 0000-0002-1210-8161

   Associate or Co-investigator Contact Information
   Name: Samac, Deborah A.
   Institution: United States Department of Agriculture-Agricultural Research Service, Plant Science Research Unit, St. Paul, MN
   Address: 495 Borlaug Hall, 1991 Upper Buford Cir, St. Paul, MN 55108
   Email: debby.samac@usda.gov
   ORCID: 0000-0002-1847-0508

   Associate or Co-investigator Contact Information
   Name: Castell-Miller, Claudia V.
   Institution: University of Minnesota, Department of Plant Pathology
   Address: 495 Borlaug Hall, 1991 Upper Buford Cir, St. Paul, MN 55108
   Email: caste007@umn.edu
   ORCID: 0000-0002-5730-5863

3. Date published or finalized for release: November 2024

4. Date of data collection (single date, range, approximate date): <2014; 2017>

5. Geographic location of data collection (where was data collected?): Greenhouse at the UMN Plant Growth Facility, St. Paul, MN, 55108

6. Information about funding sources that supported the collection of the data: Informatics Institute, UMII MnDRIVE Updraft Grant (CFS is 1000-12165-MNI11-1806486) and USDA-ARS project: 5026-12210-004-00D

7. Overview of the data (abstract):
   We designed a transcriptome study to understand global gene expression during the interaction between the Northern Wild Rice (NWR) cultivar Itasca-C12 and the Bipolaris oryzae isolate TG12Lb2 at 24 h and 48 h of their encounter. NWR activated numerous plant recognition receptors, followed by active transcriptional reprogramming of signaling mechanisms driven by Ca2+ and its sensors, mitogen-activated protein kinase cascades, activation of an oxidative burst, and phytohormone signaling mechanisms. Several transcription factors associated with plant defenses were found to be expressed. Importantly, evidence of diterpenoid phytoalexin biosynthesis, especially of phytocassanes, was found, among other defense genes. In B. oryzae, predicted genes associated with pathogenicity, including secreted effectors that could target plant defense mechanisms, were expressed.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)

2. Links to publications that cite or use the data:

   Castell-Miller CV, Gutierrez-Gonzalez JJ, Tu ZJ, Bushley KE, Hainaut M, et al. (2016) Genome Assembly of the Fungus Cochliobolus miyabeanus, and Transcriptome Analysis during Early Stages of Infection on American Wildrice (Zizania palustris L.). PLOS ONE 11(6): e0154122. https://doi.org/10.1371/journal.pone.0154122

   Castell-Miller Claudia, Ranjan Ashish, Kono Thomas, Samac Deborah, Kimball Jennifer. 2023, August 12-16. Global expression analysis of Bipolaris oryzae genes in cultivated wild rice (Zizania palustris) leaves during early stages of colonization – Claudia Castell-Miller [Conference Presentation]. Plant Health 2023, Denver, CO, United States. https://events.rdmobile.com/Lists/Details/1874310

   Castell-Miller, CV, Kono, TYJ, Ranjan A, Schlatter, CD, Samac, DA, and Kimball, JA. (2023). Interactive transcriptome analyses of Northern Wild Rice and Bipolaris oryzae shows convoluted communications during the early stages of fungal brown spot development.

3. Was data derived from another source? No
   If yes, list source(s):

4. Terms of Use: Data Repository for the U of Minnesota (DRUM). By using these files, users agree to the Terms of Use. https://conservancy.umn.edu/pages/drum/policies/#terms-of-use

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: Boiv_ip_24h_DEG_Annotated_LongestPeptide_Feb2022_Collapsed.csv
      Short description: Annotation of Bipolaris oryzae expressed genes (fungus grown in Northern Wild Rice vs. in vitro) at 24 hours after inoculation

   B. Filename: Boiv_ip_48h_DEG_Annotated_LongestPeptide_Feb2022_Collapsed.csv
      Short description: Annotation of Bipolaris oryzae expressed genes (fungus grown in Northern Wild Rice vs. in vitro) at 48 hours after inoculation

   C. Filename: WRi_vs_WRm_24h_DEG_Annotated_LongestPeptide_Collapsed.csv
      Short description: Annotation of Northern Wild Rice expressed genes (plant infected by B. oryzae vs. mock (water) inoculated) at 24 hours after inoculation

   D. Filename: WRi_vs_WRm_48h_DEG_Annotated_LongestPeptide_Collapsed.csv
      Short description: Annotation of Northern Wild Rice expressed genes (plant infected by B. oryzae vs. mock (water) inoculated) at 48 hours after inoculation

2. Relationship between files: Files A and B contain B. oryzae transcripts at two time points (24 h and 48 h after inoculation of Northern Wild Rice), while files C and D contain Northern Wild Rice transcripts in response to fungal infection at the same time points.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:
   The flag leaf and flag leaf-1 of fungal inoculated (+water +Tween20) and mock (water +Tween20) treatments were collected at 24 h and 48 h, immediately frozen in liquid nitrogen, and kept at -80 °C until used for RNA extraction. Total RNA was extracted using the RNeasy Mini Kit (Qiagen Inc, Valencia, CA) according to the manufacturer’s instructions. Genomic DNA was removed using the DNA-free™ Kit (ThermoFisher Scientific, Waltham, MA). RNA concentration and integrity assessment, cDNA preparation, and sequencing were done at the Biomedical Genomics Center at the University of Minnesota. Single pass reads (SP) were sequenced as 50 bp runs on either an Illumina HiSeq2500 (24 h) or a HiSeq2000 (48 h) machine.

2. Methods for processing the data:
   The two Illumina platforms (HiSeq2000 and HiSeq2500) were compared using Pearson’s and Spearman’s correlations of the log2 normalized relative expression of two (technical) libraries. Reads were inspected, and low-quality reads, adapters, and ribosomal RNA contamination were removed. High-quality reads were assembled into a single gene representation (“SuperTranscripts”). Transcriptome assembly quality was assessed by representation of the input reads. Separation of fungal and plant transcripts was carried out with BLAST searches. Clustering of redundant sequences was performed with CD-HIT. Transcripts were annotated with publicly available resources such as Ensembl Plants and Fungi databases, BLAST+/SwissProt, Pfam, SignalP, eggNOG, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG). Data visualization and dimensionality reduction of expression data were carried out with R statistical software. Plant and fungal differential gene expression analysis was performed with the DESeq2 package in R. Enrichment of GO terms and KEGG pathways was tested with hypergeometric tests. P-values from differential expression tests and enrichment tests were used for false discovery rate estimation with the Benjamini-Hochberg method. Per-sample relative gene expression values were used in a variance partitioning analysis and a weighted gene co-expression network analysis. Plant and fungal candidate genes were selected with an in-house R script. Gene expression was validated with RT-qPCR assays.

3. Instrument- or software-specific information needed to interpret the data:

   Illumina HiSeq2500 (24 h) and HiSeq2000 (48 h) machines.

   Unix [GNU, P. (2007). Free Software Foundation. Bash (3.2.48)].

   FastQC version 0.11.8 [Andrews, S. (2010). FastQC: a quality control tool for high throughput sequence data. http://www.bioinformatics.babraham.ac.uk/projects/fastqc]

   Trimmomatic version 0.33 [Bolger, A. M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 30, 2114–2120. https://doi.org/10.1093/bioinformatics/btu170]

   BBDuk version 38.84 (https://sourceforge.net/projects/bbmap)

   Trinity version 2.10.0 [Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., et al. (2011). Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nature Biotechnology. 29, 644–652. https://doi.org/10.1038/nbt.1883; Davidson, N. M., Hawkins, A. D. K., and Oshlack, A. (2017). SuperTranscripts: a data driven reference for analysis and visualisation of transcriptomes. Genome Biology. 18, 148–148. https://doi.org/10.1186/s13059-017-1284-1; Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., et al. (2013). De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature Protocols. 8, 1494–1512. https://doi.org/10.1038/nprot.2013.084]

   BUSCO version 4.0.6 [Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V., and Zdobnov, E. M. (2015). BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 31, 3210–3212. https://doi.org/10.1093/bioinformatics/btv351]

   RSEM [Li, B., and Dewey, C. N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 12, 323. https://doi.org/10.1186/1471-2105-12-323]

   Bowtie2 version 2.3.4.1 [Langmead, B., and Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods. 9, 357–359. https://doi.org/10.1038/nmeth.1923]

   SAMtools version 1.9 (http://www.htslib.org)

   BLASTn [Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.]

   Python [Van Rossum, G., and Drake Jr, F. L. (1995). Python reference manual. Centrum voor Wiskunde en Informatica, Amsterdam.]

   CD-HIT version 4.6.1 [Fu, L., Niu, B., Zhu, Z., Wu, S., and Li, W. (2012). CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 28, 3150–3152. https://doi.org/10.1093/bioinformatics/bts565]

   Trinotate [Bryant, D. M., Johnson, K., DiTommaso, T., Tickle, T., Couger, M. B., Payzin-Dogru, D., Lee, T. J., Leigh, N. D., Kuo, T. H., Davis, F. G., and Bateman, J. (2017). A tissue-mapped axolotl de novo transcriptome enables identification of limb regeneration factors. Cell Reports. 18(3), 762–776.]

   R software (several packages) [R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/]

   DESeq2 [Love, M. I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 15, 550–550. https://doi.org/10.1186/s13059-014-0550-8]

   variancePartition, R package (Bioconductor: http://bioconductor.org/packages/variancePartition) [Hoffman, G. E., and Schadt, E. E. (2016). variancePartition: interpreting drivers of variation in complex gene expression studies. BMC Bioinformatics. 17, 483–483. https://doi.org/10.1186/s12859-016-1323-z]

   Weighted gene co-expression network analysis (WGCNA), R package [Langfelder, P., and Horvath, S. (2008). WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 9, 559–559. https://doi.org/10.1186/1471-2105-9-559]

   EffectorP-fungi 3.0 (https://effectorp.csiro.au) [Sperschneider, J., and Dodds, P. N. (2022). EffectorP 3.0: Prediction of apoplastic and cytoplasmic effectors in fungi and oomycetes. Molecular Plant-Microbe Interactions. 35, 146–156. https://doi.org/10.1094/MPMI-08-21-0201-R]

4. Standards and calibration information, if appropriate: No

5. Environmental/experimental conditions:
   Plants used for inoculations were between the principal phenological stages of stem elongation and booting. Leaves of 10 plants were sprayed with 1.5 to 2 ml of spore solution at 15,000 to 20,000 conidia/ml and 0.01% Tween20, while sterile deionized water with Tween20 was sprayed on another 10 plants (“mock”). Plants were immediately placed in a mist chamber and received 20 min of continuous mist, followed by 2 min of mist every 60 min during a period of 16 h at 24 °C (± 1 °C). Plants were later moved back to the greenhouse, under an air temperature of 22 °C (± 2 °C) and lighting from 450-watt high pressure sodium halogen lamps supplemented with three 60-watt incandescent lights for 16 hours/day.

6. Describe any quality-assurance procedures performed on the data: Please see points 4 and 5.

7. People involved with sample collection, processing, analysis and/or submission: Castell-Miller, Claudia V., and Kono, Thomas YJ

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: Boiv_ip_24h_DEG_Annotated_LongestPeptide_Feb2022_Collapsed.csv
-----------------------------------------

1. Number of variables: 113

2. Number of cases/rows: 113 (data availability varies per row)

3. Missing data codes:
   NA = not available
   “.” = no matches

4. Variable List and Description: Please see attached spreadsheet file (Description of the variables_Boiv_ip_24h)

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: Boiv_ip_48h_DEG_Annotated_LongestPeptide_Feb2022_Collapsed.csv
-----------------------------------------

1. Number of variables: 113

2. Number of cases/rows: 113 (data availability varies per row)

3. Missing data codes:
   NA = not available
   “.” = no matches

4. Variable List and Description: Please see attached spreadsheet file (Description of the variables_Boiv_ip_48h)

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: WRi_vs_WRm_24h_DEG_Annotated_LongestPeptide_Collapsed.csv
-----------------------------------------

1. Number of variables: 97

2. Number of cases/rows: 97 (data availability varies per row)

3. Missing data codes:
   NA = not available
   “.” = no matches

4. Variable List and Description: Please see attached spreadsheet file (Description of the variables_WRmi_24h)

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: WRi_vs_WRm_48h_DEG_Annotated_LongestPeptide_Collapsed.csv
-----------------------------------------

1. Number of variables: 97

2. Number of cases/rows: 97 (data availability varies per row)

3. Missing data codes:
   NA = not available
   “.” = no matches

4. Variable List and Description: Please see attached spreadsheet file (Description of the variables_WRmi_48h)

Data sets descriptions:
The datasets contain expressed genes of Northern Wild Rice (Zizania palustris L.) obtained during Bipolaris oryzae infection compared to those expressed by the mock-inoculated (water) plants at 24 h and 48 h. They also contain Bipolaris oryzae transcripts during infection of wild rice compared to those expressed by the fungus in vitro at 24 h and 48 h.
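
Example of reading the files:
The sketch below shows one way the missing data codes listed above (NA and ".") might be handled when reading any of the four CSV files. It assumes Python 3 and uses only its standard library; it is not part of the authors' pipeline. The function names and the column names in the small example are hypothetical placeholders; the real variable names are given in the variable-description spreadsheets.

```python
import csv
import io

# Missing-data codes listed in this readme:
# "NA" (not available) and "." (no matches).
MISSING_CODES = {"NA", "."}


def _clean(value):
    """Return None for a missing-data code, otherwise the value unchanged."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip() in MISSING_CODES:
        return None
    return value


def read_annotation_table(handle):
    """Read one annotation CSV from an open text handle.

    Returns a list of dicts (one per row) with missing codes mapped to None.
    """
    reader = csv.DictReader(handle)
    return [
        {column: _clean(value) for column, value in record.items()}
        for record in reader
    ]


def count_missing(rows):
    """Count missing cells per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


# Hypothetical two-row table; these column names are placeholders only.
example = io.StringIO(
    "gene_id,log2FoldChange,Pfam\n"
    "g1,2.5,.\n"
    "g2,NA,PF00069\n"
)
rows = read_annotation_table(example)
print(count_missing(rows))
```

To apply the same functions to one of the real files, open it with open(filename, newline="") and pass the handle to read_annotation_table. Numeric columns are left as text here, so they would still need converting before analysis.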