Applications of Genomewide Selection in a New Plant Breeding Program

Neyhart, Jeffrey
2019-09-17
2019-09-17
2019-07
https://hdl.handle.net/11299/206638

University of Minnesota Ph.D. dissertation. July 2019. Major: Applied Plant Sciences. Advisor: Kevin Smith. 1 computer file (PDF); vii, 120 pages.

Newly established breeding programs must undergo population improvement and determine superior germplasm for deployment in diverse growing environments. More rapid progress towards these goals may be made by incorporating genomewide selection, or the use of genomewide molecular markers to predict the merit of unphenotyped individuals. Within the context of a new two-row barley (Hordeum vulgare L.) breeding program, my objectives were to i) investigate various methods of updating training population data and their impact on long-term genomewide recurrent selection, ii) assess genomewide prediction accuracy with informed subsetting of data across diverse environments, and iii) validate genomewide predictions of the mean, genetic variance, and superior progeny mean of potential breeding crosses. My first study relied on simulations to examine the impact on prediction accuracy and response to selection when updating the training population each cycle with lines selected based on predictions (best, worst, both best and worst), model criteria (PEVmean and CDmean), random sampling, or no selections. In the short term, we found that updating with the best or both best and worst predicted lines resulted in high prediction accuracy and genetic gain; in the long term, all methods (besides not updating) performed similarly. In an actual breeding program, a breeder may want phenotypic data on lines predicted to be the best, and our results suggest that this method may be effective for long-term genomewide selection and practical for a breeder. In my second study, a 183-line training population and 50-line offspring validation population were phenotyped in 29 location-year environments for grain yield, heading date, and plant height.
Environmental relationships were measured using phenotypic data, geographic distance, or environmental covariables. When adding data from increasingly distant environments to a training set, we observed diminishing gains in prediction accuracy; in some cases, accuracy declined with additional data. Clustering environments led to a small but non-significant gain in prediction accuracy compared to simply using data from all environments. Our results suggest that informative environmental subsets may improve genomewide selection within a single population, but not when predicting a new generation under realistic breeding circumstances. Finally, my third study used genomewide marker effects from the same training population to predict the mean (μ), genetic variance (VG), and superior progeny mean (μSP; mean of the best 10% of lines) of 330,078 possible crosses for Fusarium head blight (FHB) severity, heading date, and plant height. Twenty-seven of these crosses were developed as validation populations. Predictions of μ and μSP were moderate to high in accuracy (rMP = 0.46 – 0.69), while predictions of VG were less accurate (rMP = 0.01 – 0.48). Predictive ability was likely a function of trait heritability, as rMP estimates for heading date (the most heritable) were highest and rMP estimates for FHB severity (the least heritable) were lowest. Accurate predictions of VG and μ are feasible, but, like any implementation of genomewide selection, reliable phenotypic data is critical.

en
barley, genomewide prediction, genomewide selection, genomics, quantitative genetics, simulation
Thesis or Dissertation
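
The abstract defines the superior progeny mean (μSP) as the mean of the best 10% of lines in a cross. The following is a minimal illustrative sketch of that definition applied to a list of progeny values; it is not the dissertation's code. The function name `superior_progeny_mean`, its parameters, and the rounding rule for the number of selected lines are assumptions made for this example. The `lower_is_better` flag is also an assumption, added because "best" may mean the lowest values for a trait such as FHB severity.

```python
def superior_progeny_mean(values, fraction=0.10, lower_is_better=False):
    """Return the mean of the best `fraction` of progeny values.

    Hypothetical helper for illustration only. The number of selected
    lines is fraction * len(values), rounded to the nearest integer,
    with at least one line always selected.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    n_best = max(1, round(fraction * len(values)))
    # Sort so that the "best" values come first.
    ranked = sorted(values, reverse=not lower_is_better)
    best = ranked[:n_best]
    return sum(best) / n_best


# Made-up progeny values (not data from the dissertation).
progeny = list(range(1, 21))  # 20 lines, so the best 10% is 2 lines
high = superior_progeny_mean(progeny)
low = superior_progeny_mean(progeny, lower_is_better=True)
```

With the 20 made-up values above, the two largest values are averaged when higher is better and the two smallest when lower is better.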