This readme.txt file was generated on 20240506 by Dorothy D Sweet

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Genomes to Fields Initiative Flight Data - Missouri C5A 2020

2. Author Information

   Principal Investigator Contact Information
      Name: Dorothy D Sweet
      Institution: University of Minnesota
      Address: 1991 Upper Buford Circle, Saint Paul, MN 55108
      Email: kirsc168@umn.edu
      ORCID: 0000-0002-9614-5436

   Associate or Co-investigator Contact Information
      Name: Candice N Hirsch
      Institution: University of Minnesota
      Address: 1991 Upper Buford Circle, Saint Paul, MN 55108
      Email: cnhirsch@umn.edu
      ORCID: 0000-0002-8833-3023

   Associate or Co-investigator Contact Information
      Name: Cory D Hirsch
      Institution: University of Minnesota
      Address: 1991 Upper Buford Circle, Saint Paul, MN 55108
      Email: cdhirsch@umn.edu
      ORCID: 0000-0002-3409-758X

   Associate or Co-investigator Contact Information
      Name: Sherry A Flint-Garcia
      Institution: University of Missouri
      Address: 301 Curtis Hall, Columbia, MO 65211-7400
      Email: flint-garcias@missouri.edu
      ORCID: 0000-0003-4156-5318

   Associate or Co-investigator Contact Information
      Name: Jacob D Washburn
      Institution: University of Missouri USDA-ARS
      Address: 302-A Curtis Hall, Columbia, MO 65211-7400
      Email: Jacob.Washburn@usda.gov
      ORCID: 0000-0003-0185-7105

3. Date published or finalized for release: 20240515

4. Date of data collection (single date, range, approximate date): 20200617 - 20200914

5. Geographic location of data collection (where was data collected?): University of Missouri

6. Information about funding sources that supported the collection of the data:

7. Overview of the data (abstract):

This dataset (DRUM 4 of 8) is a subset of the flight data collected through the Genomes to Fields Initiative in 2020 and 2021.
In conjunction with equivalent datasets on similar material at alternate locations, these data provide a valuable resource for evaluating the performance and stability of hybrid maize across many environments. Multiple flights were conducted throughout the growing season at these locations (Delaware, Minnesota, Missouri, Nebraska, and Texas), and this dataset includes the orthomosaics, digital elevation models, plot shapefiles, and extracted plant height values for each of those flights, following the pipeline from Anderson et al. (2019): Anderson, Steven L., II, Seth C. Murray, Lonesome Malambo, Colby Ratcliff, Sorin Popescu, Dale Cope, Anjin Chang, Jinha Jung, and J. Alex Thomasson. 2019. “Prediction of Maize Grain Yield before Maturity Using Improved Temporal Height Estimates of Unmanned Aerial Systems.” The Plant Phenome Journal 2 (1): 1–15.

This maize experiment consisted of over 1000 maize hybrids grown in partial replication across 8 environments in 2 years. A set of common hybrids was grown in every location to establish a connection between environments. Within the partially replicated set, hybrids were produced by crossing doubled haploids derived from the WI-SS-MAGIC population to the inbred testers PHK76, PHP02, and PHZ51, with the tester choice depending on the relative maturity zone of the location. For this location (Missouri 2020) all testers were used. A modified randomized complete block design was used for testing.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0 1.0 Universal

2. Terms of Use: Data Repository for the U of Minnesota (DRUM) By using these files, users agree to the Terms of Use. https://conservancy.umn.edu/pages/drum/policies/#terms-of-use

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: YYYYMMDD-G2F-Missouri-C5A-RGB-Orthos_LZW.tif
      Short description: compressed orthomosaic (ortho)
      Projection: WGS 1984 UTM Zone 15N, EPSG 32615

   B. Filename: YYYYMMDD-G2F-Missouri-C5A-RGB-DEMs.zip
      Short description: zipped folder containing Digital Elevation Models (DEMs) for the flight
      Folder contains:
         YYYYMMDD_m2pro_BradC5a.laz
         YYYYMMDD_m2pro_BradC5a_sort.laz
         YYYYMMDD_m2pro_BradC5a_sort_NREM.las
         YYYYMMDD_m2pro_BradC5a_sort_NREM_GRND.las
         YYYYMMDD_m2pro_BradC5a_sort_NREM_GRND_KP.laz
         YYYYMMDD_m2pro_BradC5a_sort_NREM_GRND_KP_BARE.las
         YYYYMMDD_m2pro_BradC5a_sort_NREM_GRND_KP_BARE_ASC.asc
         YYYYMMDD_m2pro_BradC5a_sort_NREM_GRND_KP_BARE_ASC_HGT.las
         YYYYMMDD_m2pro_BradC5a_sort_NREM_GRND_KP_BARE_DTM.dtm
         laz_[individual plot identifier].las (750 files)
      Projection: WGS 1984 UTM Zone 15N, EPSG 32615

   C. Filename: MO_PHTs_2020_merged.csv
      Short description: file containing all extracted height values as well as other observation data for that location

   D. Filename: Plot_Shapefiles.zip
      Short description: zipped folder containing all plot boundary shapefiles for each flight for the location
      Projection: Custom transverse mercator (Datum: North American 1983, Spheroid: GRS 1980)
      *Note: to use with other spatial datasets in this collection, the shapefiles must be reprojected to WGS 1984 UTM Zone 15N (EPSG 32615)*

   E. Filename: Extracted_Values.zip
      Short description: zipped folder containing all csvs of extracted height values for each flight

2. Relationship between files:

YYYYMMDD-G2F-Missouri-C5A-RGB-Orthos_LZW.tif and YYYYMMDD-G2F-Missouri-C5A-RGB-DEMs.zip are compressed versions of the orthomosaics and digital elevation models, respectively. Both are products of the structure from motion process used to extract plant height from 2D RGB images. Plot_Shapefiles.zip contains the shapefiles that define the boundaries of each plot for data extraction (Extracted_Values.zip). Extracted_Values.zip is a folder containing all csv files with the raw extracted plant height data, and MO_PHTs_2020_merged.csv combines all of the data from Extracted_Values.zip for the location with other observational data from the location.
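
The per-flight file names listed above begin with the flight date written as YYYYMMDD. As a minimal sketch in Python, assuming only that naming pattern, the date can be recovered from a file name and converted to days after planting. The function names are hypothetical, and the planting date in the usage lines is a made-up value for illustration, not a value from this dataset.

```python
import re
from datetime import date, datetime

# Assumed from the File List: an 8-digit date followed by "-" or "_".
_DATE_PREFIX = re.compile(r"^(\d{8})[-_]")


def flight_date_from_name(file_name: str) -> date:
    """Return the flight date encoded as a YYYYMMDD prefix of file_name."""
    match = _DATE_PREFIX.match(file_name)
    if match is None:
        raise ValueError(f"no YYYYMMDD prefix in {file_name!r}")
    # strptime also rejects impossible dates such as month 13.
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def days_after_planting(flight: date, planted: date) -> int:
    """Whole days between the planting date and the flight date."""
    return (flight - planted).days


# Usage with a hypothetical planting date (illustration only).
flight = flight_date_from_name("20200617-G2F-Missouri-C5A-RGB-Orthos_LZW.tif")
print(flight)                                         # 2020-06-17
print(days_after_planting(flight, date(2020, 5, 1)))  # 47
```

Files without a date prefix, such as Plot_Shapefiles.zip, raise a ValueError in this sketch rather than returning a guess.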

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

Image collection platforms varied, with each location using its own UAV and camera combination. Guidelines for a standard operating procedure were distributed to and followed by each group, with standards for weather, camera settings, overlap percentages, and image resolution:

   Weather: flights on cloudless, sunny days between 9 A.M. and 4 P.M.
   Camera settings: ISO 100 or 200; exposure time < 1/1000 s; aperture f/2 to f/4
   Image overlap: 70% overlap early in the season, 80% overlap after the canopy fills out
   Image resolution: ground sampling distance <= 0.94 cm/pixel (0.37 in/pixel)

Flights were intended to be conducted at least once a week throughout the growing season until flowering.

2. Methods for processing the data:

UAV images were processed following the protocol from Anderson et al. (2019). In short, structure from motion (SfM) photogrammetry algorithms from either Agisoft PhotoScan Professional (Agisoft LLC 2019) or Pix4Dmapper (PIX4Dmapper) were used to identify common features (tie points) across images, triangulate, and adjust distortion to generate dense 3D point clouds, digital surface models, and orthomosaic images. Ground control points were placed throughout the study area to ensure correct scale, orientation, and geographic location. A processing pipeline using R/UAStools::plotshpcreate (Anderson) was used to construct ESRI shapefiles (.shp) of individual research plots for plot extraction from a data frame containing the experimental design, plot dimensions, and unique plot IDs. After the shapefile was visually overlaid on the orthomosaics, some manual adjustment was needed because of subtle variation in the rows.

Point clouds were clipped to the trial level, and point anomalies above or below the point cloud were manually removed using CloudCompare v2.10 (Girardeau-Montaut). A custom batch script was then run using executable functions from LAStools (Isenburg 2012; LAStools 220107 2022) and FUSION/LDV (McGaughey 2012) software (https://github.com/andersst91/UAS_Height_Pipeline) to sort data points, remove additional erroneous points, identify ground points, identify key points for digital terrain modeling (DTM), and construct the DTM. The noise-filtered point cloud was adjusted to aboveground height using the DTM, and points below the DEM were removed. The plot-level ESRI shapefile was then used to clip individual plot point clouds and estimate height.

3. Instrument- or software-specific information needed to interpret the data:

R is needed to visualize the data and extract values, but the raw data are also provided.

4. Standards and calibration information, if appropriate:

Ground targets (different for each location) were placed around the border of the area of interest for use as ground control points (GCPs).

5. Environmental/experimental conditions:

Weather information, including daily minimum and maximum temperatures (°C) and daily total precipitation (mm), was collected from WatchDog Weather Stations at each location.

6. Describe any quality-assurance procedures performed on the data:

Quality control filtering included the removal of phenotypic records with missing data in terminal plant height, days to anthesis, days to silking, or grain yield, and of plots with a grain yield of more than 500 bu/acre. Plots were also removed based on stand count, stalk lodging, and root lodging if less than 30 percent of the plot germinated or more than 70 percent of the stand count lodged.
Further quality control based on temporal plant height removed plots with no growth throughout the season, as well as plots with too many large dips and peaks in the growth rate throughout the season, where a dip is a decrease of more than 20 percent and a peak is an increase of more than 20 percent. Lastly, whole flight days were removed based on visual inspection of the variation in height across the field for that day, as some flights showed irregular, unexplainable patterns across the field.

7. People involved with sample collection, processing, analysis and/or submission:

Dorothy Sweet, Alper Adak, Mustafa Arik, Jose Varela, Aaron DeSalvio, Seth Murray, Sherry Flint-Garcia, Jacob Washburn, Cory Hirsch, Candice Hirsch

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: MO_PHTs_2020_merged.csv
-----------------------------------------

*Variables with unknown definitions were default outputs of the software and not used for analysis

1. Number of variables: 47

2. Number of cases/rows: 47986

3. Missing data codes:
   Code/symbol: NA

4. Variable List

   B. Name: file_name
      Description: Name of the file for the data in that row
   C. Name: Plot_ID
      Description: Plot identifier
   D. Name: Elev.P90
      Description: 90th percentile of extracted height for specific plot
   E. Name: Elev.P95
      Description: 95th percentile of extracted height for specific plot
   F. Name: Elev.P99
      Description: 99th percentile of extracted height for specific plot
   G. Name: Flight_date
      Description: Date of the flight for the data in that row
   H. Name: DAP
      Description: Days after planting
   I. Name: Year
      Description: Year of flight
   J. Name: Field.Location
      Description: Name of the field location
   K. Name: State
      Description: State of field location
   L. Name: City
      Description: City of field location
   M. Name: Plot.length..center.center.in.feet.
      Description: Length of the plot from center to center in feet
   N. Name: Plot.area..ft2.
      Description: Area of the plot in square feet
   O. Name: Alley.length..in.inches.
      Description: Length of the alley in inches
   P. Name: Row.spacing..in.inches.
      Description: Spacing between plot rows in inches
   Q. Name: Rows.per.plot
      Description: Number of rows in plot
   R. Name: X..Seed.per.plot
      Description: Number of seeds planted in the plot
   S. Name: Experiment
      Description: Name of the experiment for the plot with the tester
   T. Name: Source
      Description: Seed source
   U. Name: Pedigree
      Description: Name of the genotype planted with the female/male parents
   V. Name: Family
      Description: Name of the family of the female parent
   W. Name: Tester
      Description: Name of the tester (male parent) used in the hybrid
   X. Name: Rep
      Description: Replicate number of plot
   Y. Name: Block
      Description: Block number of plot
   Z. Name: Plot
      Description: Plot number
   AA. Name: Range
      Description: Range location of the plot
   AB. Name: Pass
      Description: Number of the pass of the plot
   AC. Name: Date.Plot.Planted..MM.DD.YY.
      Description: Date the plot was planted (MM/DD/YY)
   AD. Name: Anthesis..MM.DD.YY.
      Description: Date of the plot's anthesis (MM/DD/YY)
   AE. Name: Silking..MM.DD.YY.
      Description: Date of the plot's silking (MM/DD/YY)
   AF. Name: Anthesis..days.
      Description: Anthesis of the plot in days after planting
   AG. Name: Silking..days.
      Description: Silking of the plot in days after planting
   AH. Name: Plant.Height..cm.
      Description: Manual plant height measurement at terminal height in centimeters
   AI. Name: Ear.Height..cm.
      Description: Manual ear height measurement of first ear at terminal height in centimeters
   AJ. Name: Stand.Count....of.plants.
      Description: Plant stand count in the plot
   AK. Name: Root.Lodging....of.plants.
      Description: Number of plants root lodged at the end of the season
   AL. Name: Stalk.Lodging....of.plants.
      Description: Number of plants stalk lodged at the end of the season
   AM. Name: Grain.Moisture....
      Description: Grain moisture of the plot at harvest
   AN. Name: Test.Weight..lbs.
      Description: Grain test weight of the plot at harvest in pounds
   AO. Name: Plot.Weight..lbs.
      Description: Weight of grain harvested from the plot in pounds
   AP. Name: Grain.Yield..bu.A.
      Description: Dry grain yield in bushels per acre for the plot
   AQ. Name: Plot.Discarded..enter..yes..or.blank.
      Description: Note on whether or not the plot was discarded; blank if not discarded, yes if discarded
   AR. Name: Comments
      Description: Any comments on the plot
   AS. Name: Filler
      Description: Seed used to replace the hybrid if insufficient seed; blank if no filler used
   AT. Name: Snap....of.plants.
      Description: Number of plants snapped off; NA if none, number if there are some
   AU. Name: Date.Plot.Harvested..MM.DD.YY.
      Description: Date the plot was harvested (MM/DD/YY)

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: YYYYMMDD-G2F-Missouri-C5A-RGB.csv
-----------------------------------------

*Variables with unknown definitions were default outputs of the software and not used for analysis

1. Number of variables: 87

2. Number of cases/rows: ~755

3. Missing data codes:
   Code/symbol: NA

4. Variable List

   A. Name: Identifier
      Description: Plot ID
   B. Name: DataFile
      Description: Location of file for the data in that row
   C. Name: FileTitle
      Description: Name of the file for the data in that row
   D. Name: Total return count
      Description: Number of points in the point cloud for the specific plot
   E. Name: Total return count above 0.00
      Description: Number of points in the point cloud above a height of 0.00
   F. Name: Return 1 count above 0.00
      Description: Unknown
   G. Name: Return 2 count above 0.00
      Description: Unknown
   H. Name: Return 3 count above 0.00
      Description: Unknown
   I. Name: Return 4 count above 0.00
      Description: Unknown
   J. Name: Return 5 count above 0.00
      Description: Unknown
   K. Name: Return 6 count above 0.00
      Description: Unknown
   L. Name: Return 7 count above 0.00
      Description: Unknown
   M. Name: Return 8 count above 0.00
      Description: Unknown
   N. Name: Return 9 count above 0.00
      Description: Unknown
   O. Name: Other return count above 0.00
      Description: Unknown
   P. Name: Elev minimum
      Description: Minimum height value extracted from the point cloud for specific plot
   Q. Name: Elev maximum
      Description: Maximum height value extracted from the point cloud for specific plot
   R. Name: Elev mean
      Description: Average height value extracted from the point cloud for specific plot
   S. Name: Elev mode
      Description: Most common height value extracted from the point cloud for specific plot
   T. Name: Elev stddev
      Description: Standard deviation of height values extracted from the point cloud for specific plot
   U. Name: Elev variance
      Description: Variance of height values extracted from the point cloud for specific plot
   V. Name: Elev CV
      Description: Unknown
   W. Name: Elev IQ
      Description: Unknown
   X. Name: Elev skewness
      Description: Unknown
   Y. Name: Elev kurtosis
      Description: Unknown
   Z. Name: Elev AAD
      Description: Unknown
   AA. Name: Elev MAD median
      Description: Unknown
   AB. Name: Elev MAD mode
      Description: Unknown
   AC. Name: Elev L1
      Description: Unknown
   AD. Name: Elev L2
      Description: Unknown
   AE. Name: Elev L3
      Description: Unknown
   AF. Name: Elev L4
      Description: Unknown
   AG. Name: Elev L CV
      Description: Unknown
   AH. Name: Elev L skewness
      Description: Unknown
   AI. Name: Elev L kurtosis
      Description: Unknown
   AJ. Name: Elev P01
      Description: 1st percentile of extracted height for specific plot
   AK. Name: Elev P05
      Description: 5th percentile of extracted height for specific plot
   AL. Name: Elev P10
      Description: 10th percentile of extracted height for specific plot
   AM. Name: Elev P20
      Description: 20th percentile of extracted height for specific plot
   AN. Name: Elev P25
      Description: 25th percentile of extracted height for specific plot
   AO. Name: Elev P30
      Description: 30th percentile of extracted height for specific plot
   AP. Name: Elev P40
      Description: 40th percentile of extracted height for specific plot
   AQ. Name: Elev P50
      Description: 50th percentile of extracted height for specific plot
   AR. Name: Elev P60
      Description: 60th percentile of extracted height for specific plot
   AS. Name: Elev P70
      Description: 70th percentile of extracted height for specific plot
   AT. Name: Elev P75
      Description: 75th percentile of extracted height for specific plot
   AU. Name: Elev P80
      Description: 80th percentile of extracted height for specific plot
   AV. Name: Elev P90
      Description: 90th percentile of extracted height for specific plot
   AW. Name: Elev P95
      Description: 95th percentile of extracted height for specific plot
   AX. Name: Elev P99
      Description: 99th percentile of extracted height for specific plot
   AY. Name: Canopy relief ratio
      Description: Percent of plot covered by canopy
   AZ. Name: Elev SQRT mean SQ
      Description: Unknown
   BA. Name: Elev CURT mean CUBE
      Description: Unknown
   BB. Name: Int minimum
      Description: Unknown
   BC. Name: Int maximum
      Description: Unknown
   BD. Name: Int mean
      Description: Unknown
   BE. Name: Int mode
      Description: Unknown
   BF. Name: Int stddev
      Description: Unknown
   BG. Name: Int variance
      Description: Unknown
   BH. Name: Int CV
      Description: Unknown
   BI. Name: Int IQ
      Description: Unknown
   BJ. Name: Int skewness
      Description: Unknown
   BK. Name: Int kurtosis
      Description: Unknown
   BL. Name: Int AAD
      Description: Unknown
   BM. Name: Int L1
      Description: Unknown
   BN. Name: Int L2
      Description: Unknown
   BO. Name: Int L3
      Description: Unknown
   BP. Name: Int L4
      Description: Unknown
   BQ. Name: Int L CV
      Description: Unknown
   BR. Name: Int L skewness
      Description: Unknown
   BS. Name: Int L kurtosis
      Description: Unknown
   BT. Name: Int P01
      Description: Unknown
   BU. Name: Int P05
      Description: Unknown
   BV. Name: Int P10
      Description: Unknown
   BW. Name: Int P20
      Description: Unknown
   BX. Name: Int P25
      Description: Unknown
   BY. Name: Int P30
      Description: Unknown
   BZ. Name: Int P40
      Description: Unknown
   CA. Name: Int P50
      Description: Unknown
   CB. Name: Int P60
      Description: Unknown
   CC. Name: Int P70
      Description: Unknown
   CD. Name: Int P75
      Description: Unknown
   CE. Name: Int P80
      Description: Unknown
   CF. Name: Int P90
      Description: Unknown
   CG. Name: Int P95
      Description: Unknown
   CH. Name: Int P99
      Description: Unknown
   CI. Name: Profile area
      Description: Unknown
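
The per-flight variable list above can be read without R. The following is a minimal sketch in Python, not part of the authors' pipeline, that maps each plot Identifier to one height column and treats the NA missing data code as a missing value. It assumes the file is comma-delimited with a header row that matches the variable names listed above; the function name and the sample rows are hypothetical and for illustration only.

```python
import csv
import io
from typing import Dict, Optional, TextIO

MISSING = "NA"  # missing data code stated in this readme


def read_plot_heights(handle: TextIO,
                      column: str = "Elev P95") -> Dict[str, Optional[float]]:
    """Map each plot Identifier to the chosen column (None if missing)."""
    heights: Dict[str, Optional[float]] = {}
    for row in csv.DictReader(handle):
        raw = row[column].strip()
        # Blank cells and the NA code are both treated as missing.
        heights[row["Identifier"]] = None if raw in ("", MISSING) else float(raw)
    return heights


# Made-up rows for illustration only; the real files have 87 columns.
sample = io.StringIO(
    "Identifier,Elev P90,Elev P95,Elev P99\n"
    "101,1.52,1.60,1.71\n"
    "102,NA,NA,NA\n"
)
print(read_plot_heights(sample))
# {'101': 1.6, '102': None}
```

With a real file, the handle would come from open(path, newline=""). The merged file uses different names for the same quantities (Plot_ID and Elev.P95 rather than Identifier and Elev P95), so the column names would need to be changed accordingly.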