This readme.txt file was generated on 20240506 by Dorothy D Sweet

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Genomes to Fields Initiative Flight Data - Delaware 2020

2. Author Information

   Principal Investigator Contact Information
      Name: Dorothy D Sweet
      Institution: University of Minnesota
      Address: 1991 Upper Buford Circle, Saint Paul, MN 55108
      Email: kirsc168@umn.edu
      ORCID: 0000-0002-9614-5436

   Associate or Co-investigator Contact Information
      Name: Candice N Hirsch
      Institution: University of Minnesota
      Address: 1991 Upper Buford Circle, Saint Paul, MN 55108
      Email: cnhirsch@umn.edu
      ORCID: 0000-0002-8833-3023

   Associate or Co-investigator Contact Information
      Name: Cory D Hirsch
      Institution: University of Minnesota
      Address: 1991 Upper Buford Circle, Saint Paul, MN 55108
      Email: cdhirsch@umn.edu
      ORCID: 0000-0002-3409-758X

   Associate or Co-investigator Contact Information
      Name: Erin E Sparks
      Institution: University of Delaware
      Address: 590 Avenue 1743, Room 301, Newark, DE 19713
      Email: esparks@udel.edu
      ORCID: 0000-0003-1543-6950

   Associate or Co-investigator Contact Information
      Name: Jarrod O Miller
      Institution: University of Delaware
      Address: 16483 County Seat Highway (Office 120), Georgetown, DE 19947
      Email: jarrod@udel.edu
      ORCID: 0000-0002-5353-233X

3. Date published or finalized for release: 20240515

4. Date of data collection (single date, range, approximate date): 20200529 - 20200825

5. Geographic location of data collection (where was data collected?): University of Delaware

6. Information about funding sources that supported the collection of the data:

7. Overview of the data (abstract):
This dataset (DRUM 1 of 8) is a subset of the flight data collected through the Genomes to Fields Initiative in 2020 and 2021. In conjunction with equivalent datasets on similar material at alternate locations, this data provides a valuable resource for evaluating the performance and stability of hybrid maize across many environments.
Many flights were conducted throughout the growing season at these locations (Delaware, Minnesota, Missouri, Nebraska, and Texas). This dataset includes the orthomosaics, digital elevation models, plot shapefiles, and extracted plant height values for each of those flights, following the pipeline from:

Anderson, Steven L., II, Seth C. Murray, Lonesome Malambo, Colby Ratcliff, Sorin Popescu, Dale Cope, Anjin Chang, Jinha Jung, and J. Alex Thomasson. 2019. “Prediction of Maize Grain Yield before Maturity Using Improved Temporal Height Estimates of Unmanned Aerial Systems.” The Plant Phenome Journal 2 (1): 1–15.

This maize experiment consisted of over 1000 maize hybrids grown in partial replication across 8 environments in 2 years. A set of common hybrids was grown in every location to establish a connection between environments. Within the partially replicated set, hybrids were produced by crossing doubled haploids derived from the WI-SS-MAGIC population to the inbred testers PHK76, PHP02, and PHZ51, with the tester choice depending on the relative maturity zone of the location. For this location (Delaware 2020) all testers were used. A modified randomized complete block design was used for testing.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0 1.0 Universal

2. Terms of Use: Data Repository for the U of Minnesota (DRUM). By using these files, users agree to the Terms of Use. https://conservancy.umn.edu/pages/drum/policies/#terms-of-use

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: YYYYMMDD-G2F-Delaware-RGB-Orthos_LZW.tif
      Short description: compressed orthomosaic (ortho)
      Projection: WGS 1984 UTM Zone 18N, EPSG 32618

   B. Filename: YYYYMMDD-G2F-Delaware-RGB-DEMs.zip
      Short description: zipped folder containing Digital Elevation Models (DEMs) for the flight
      Folder contains:
         YYYYMMDD-G2F-Delaware-RGB.laz
         YYYYMMDD-G2F-Delaware-RGB_sort.laz
         YYYYMMDD-G2F-Delaware-RGB_sort_NREM.las
         YYYYMMDD-G2F-Delaware-RGB_sort_NREM_GRND.las
         YYYYMMDD-G2F-Delaware-RGB_sort_NREM_GRND_KP.laz
         YYYYMMDD-G2F-Delaware-RGB_sort_NREM_GRND_KP_BARE.las
         YYYYMMDD-G2F-Delaware-RGB_sort_NREM_GRND_KP_BARE_ASC.asc
         YYYYMMDD-G2F-Delaware-RGB_sort_NREM_GRND_KP_BARE_ASC_HGT.las
         YYYYMMDD-G2F-Delaware-RGB_sort_NREM_GRND_KP_BARE_DTM.dtm
         laz_[individual plot identifier].las (1550 files)
      Projection: WGS 1984 UTM Zone 18N, EPSG 32618

   C. Filename: DE_PHTs_2020_merged.csv
      Short description: file containing all extracted height values as well as other observation data for that location

   D. Filename: Plot_Shapefiles.zip
      Short description: zipped folder containing all plot boundary shapefiles for each flight for the location
      Projection: NAD 1983 UTM Zone 18N, EPSG 26918

   E. Filename: Extracted_Values.zip
      Short description: zipped folder containing all csvs of extracted height values for each flight

2. Relationship between files:
YYYYMMDD-G2F-Delaware-RGB-Orthos_LZW.tif and YYYYMMDD-G2F-Delaware-RGB-DEMs.zip are compressed formats of the orthomosaics and digital elevation models, respectively. Both are products of the structure from motion process used to extract plant height from 2D RGB images. Plot_Shapefiles.zip contains shapefiles defining the boundaries of each plot for data extraction (Extracted_Values.zip). Extracted_Values.zip is a folder containing all csv files with the raw extracted plant height data. DE_PHTs_2020_merged.csv contains all of the data from Extracted_Values.zip for the location, together with other observational data from the location.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:
Various platforms were used for image collection, with each location using its own UAV and camera combination. Guidelines for a standard operating procedure were distributed and followed by each group, with standards for weather, camera settings, overlap percentages, and image resolution:
   Weather: Flights on cloudless, sunny days between 9 A.M. and 4 P.M.
   Camera Settings: ISO: 100 or 200; Exposure time: < 1/1000; Aperture: f 2 to 4
   Image Overlap: 70% overlap early in the season, 80% overlap after canopy fills out
   Image Resolution: Ground Sampling Distance <= 0.94 cm/pixel (0.37 in/pixel)
Flights were intended to be collected at least once a week throughout the growing season until flowering.

2. Methods for processing the data:
UAV images were processed following the protocol from Anderson et al. (2019). In short, structure from motion (SfM) photogrammetry algorithms from either Agisoft PhotoScan Professional (Agisoft LLC 2019) or Pix4Dmapper (PIX4Dmapper) were used to identify common features (tie points) across images, triangulate, and adjust distortion to generate dense 3D point clouds, digital surface models, and orthomosaic images. Ground control points were placed throughout the study area to ensure correct scale, orientation, and geographic location. A processing pipeline using R/UAStools::plotshpcreate (Anderson) was used to construct ESRI shapefiles (.shp) of individual research plots for plot extraction, using a data frame containing the experimental design, plot dimensions, and unique plot IDs. After the shapefile is visually overlaid on the orthomosaics, some manual adjustment is needed due to subtle variances in the rows.
Point clouds were clipped to the trial level, and point anomalies above or below the point cloud were manually removed using CloudCompare v2.10 (Girardeau-Montaut). A custom batch script (https://github.com/andersst91/UAS_Height_Pipeline) was then run using executable functions from LAStools (Isenburg 2012; LAStools 220107 2022) and FUSION/LDV (McGaughey 2012) software to sort data points, remove additional erroneous points, identify ground points, identify key points for digital terrain modeling (DTM), and construct the DTM. The noise-filtered point cloud was adjusted to aboveground height using the DTM, and points below the DEM were removed. The plot-level ESRI shapefile was then used to clip individual plot point clouds and estimate height.

3. Instrument- or software-specific information needed to interpret the data:
R is needed to visualize and extract data, but the raw data is also present.

4. Standards and calibration information, if appropriate:
Ground targets (different for each location) were placed around the border of the area of interest for use as ground control points (GCPs).

5. Environmental/experimental conditions:
Weather information, including daily minimum and maximum temperatures (˚C) and daily total precipitation (mm), was collected from WatchDog Weather Stations at each location.

6. Describe any quality-assurance procedures performed on the data:
Quality control filtering included the removal of phenotypic records with missing data in terminal plant height, days to anthesis, days to silking, or grain yield, and of plots with a grain yield of more than 500 bu/acre. Plots were also removed based on stand count, stalk lodging, and root lodging if less than 30 percent of the plot germinated or more than 70 percent of the stand count lodged.
Further quality control based on temporal plant height removed plots with no growth throughout the season, as well as plots with too many large dips and peaks in growth rate throughout the season, where a dip is a decrease of more than 20 percent and a peak is an increase of more than 20 percent. Lastly, whole flight days were removed based on visual inspection of the variation in height across the field for that day, as some flights showed irregular, unexplainable patterns across the field.

7. People involved with sample collection, processing, analysis and/or submission:
Dorothy Sweet, Alper Adak, Mustafa Arik, Jose Varela, Aaron DeSalvio, Seth Murray, Erin Sparks, Jarrod Miller, Cory Hirsch, Candice Hirsch

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: DE_PHTs_2020_merged.csv
-----------------------------------------

*Variables with unknown definitions were default outputs of the software and not used for analysis

1. Number of variables: 127

2. Number of cases/rows: 20087

3. Missing data codes:
   Code/symbol: NA

4. Variable List

   A. Name: file_name
      Description: name of the flight file for that row of data (flight date as YYYYMMDD)
   B. Name: Plot_ID
      Description: plot identification
   C. Name: Total return count
      Description: Number of points in the point cloud for the specific plot
   D. Name: Total return count above 0.00
      Description: Number of points in the point cloud above a height of 0.00
   E. Name: Return 1 count above 0.00
      Description: Unknown
   F. Name: Return 2 count above 0.00
      Description: Unknown
   G. Name: Return 3 count above 0.00
      Description: Unknown
   H. Name: Return 4 count above 0.00
      Description: Unknown
   I. Name: Return 5 count above 0.00
      Description: Unknown
   J. Name: Return 6 count above 0.00
      Description: Unknown
   K. Name: Return 7 count above 0.00
      Description: Unknown
   L. Name: Return 8 count above 0.00
      Description: Unknown
   M. Name: Return 9 count above 0.00
      Description: Unknown
   N. Name: Other return count above 0.00
      Description: Unknown
   O. Name: Elev minimum
      Description: minimum height value extracted from the point cloud for specific plot
   P. Name: Elev maximum
      Description: maximum height value extracted from the point cloud for specific plot
   Q. Name: Elev mean
      Description: average height value extracted from the point cloud for specific plot
   R. Name: Elev mode
      Description: most common height value extracted from the point cloud for specific plot
   S. Name: Elev stddev
      Description: standard deviation of height values extracted from the point cloud for specific plot
   T. Name: Elev variance
      Description: variance of height values extracted from the point cloud for specific plot
   U. Name: Elev CV
      Description: Unknown
   V. Name: Elev IQ
      Description: Unknown
   W. Name: Elev skewness
      Description: Unknown
   X. Name: Elev kurtosis
      Description: Unknown
   Y. Name: Elev AAD
      Description: Unknown
   Z. Name: Elev MAD median
      Description: Unknown
   AA. Name: Elev MAD mode
      Description: Unknown
   AB. Name: Elev L1
      Description: Unknown
   AC. Name: Elev L2
      Description: Unknown
   AD. Name: Elev L3
      Description: Unknown
   AE. Name: Elev L4
      Description: Unknown
   AF. Name: Elev L CV
      Description: Unknown
   AG. Name: Elev L skewness
      Description: Unknown
   AH. Name: Elev L kurtosis
      Description: Unknown
   AI. Name: Elev P01
      Description: 1st percentile of extracted height for specific plot
   AJ. Name: Elev P05
      Description: 5th percentile of extracted height for specific plot
   AK. Name: Elev P10
      Description: 10th percentile of extracted height for specific plot
   AL. Name: Elev P20
      Description: 20th percentile of extracted height for specific plot
   AM. Name: Elev P25
      Description: 25th percentile of extracted height for specific plot
   AN. Name: Elev P30
      Description: 30th percentile of extracted height for specific plot
   AO. Name: Elev P40
      Description: 40th percentile of extracted height for specific plot
   AP. Name: Elev P50
      Description: 50th percentile of extracted height for specific plot
   AQ. Name: Elev P60
      Description: 60th percentile of extracted height for specific plot
   AR. Name: Elev P70
      Description: 70th percentile of extracted height for specific plot
   AS. Name: Elev P75
      Description: 75th percentile of extracted height for specific plot
   AT. Name: Elev P80
      Description: 80th percentile of extracted height for specific plot
   AU. Name: Elev P90
      Description: 90th percentile of extracted height for specific plot
   AV. Name: Elev P95
      Description: 95th percentile of extracted height for specific plot
   AW. Name: Elev P99
      Description: 99th percentile of extracted height for specific plot
   AX. Name: Canopy relief ratio
      Description: percent of plot covered by canopy
   AY. Name: Elev SQRT mean SQ
      Description: Unknown
   AZ. Name: Elev CURT mean CUBE
      Description: Unknown
   BA. Name: Int minimum
      Description: Unknown
   BB. Name: Int maximum
      Description: Unknown
   BC. Name: Int mean
      Description: Unknown
   BD. Name: Int mode
      Description: Unknown
   BE. Name: Int stddev
      Description: Unknown
   BF. Name: Int variance
      Description: Unknown
   BG. Name: Int CV
      Description: Unknown
   BH. Name: Int IQ
      Description: Unknown
   BI. Name: Int skewness
      Description: Unknown
   BJ. Name: Int kurtosis
      Description: Unknown
   BK. Name: Int AAD
      Description: Unknown
   BL. Name: Int L1
      Description: Unknown
   BM. Name: Int L2
      Description: Unknown
   BN. Name: Int L3
      Description: Unknown
   BO. Name: Int L4
      Description: Unknown
   BP. Name: Int L CV
      Description: Unknown
   BQ. Name: Int L skewness
      Description: Unknown
   BR. Name: Int L kurtosis
      Description: Unknown
   BS. Name: Int P01
      Description: Unknown
   BT. Name: Int P05
      Description: Unknown
   BU. Name: Int P10
      Description: Unknown
   BV. Name: Int P20
      Description: Unknown
   BW. Name: Int P25
      Description: Unknown
   BX. Name: Int P30
      Description: Unknown
   BY. Name: Int P40
      Description: Unknown
   BZ. Name: Int P50
      Description: Unknown
   CA. Name: Int P60
      Description: Unknown
   CB. Name: Int P70
      Description: Unknown
   CC. Name: Int P75
      Description: Unknown
   CD. Name: Int P80
      Description: Unknown
   CE. Name: Int P90
      Description: Unknown
   CF. Name: Int P95
      Description: Unknown
   CG. Name: Int P99
      Description: Unknown
   CH. Name: Profile area
      Description: Unknown
   CI. Name: Flight_date
      Description: Date of the flight of the data
   CJ. Name: DAP
      Description: Days After Planting
   CK. Name: Year
      Description: Year of flight
   CL. Name: Field.Location
      Description: Name of the field location
   CM. Name: State
      Description: State of field location
   CN. Name: City
      Description: City of field location
   CO. Name: Plot.length..center.center.in.feet.
      Description: Length of the plot from center to center in feet
   CP. Name: Plot.area..ft2.
      Description: area of the plot in square feet
   CQ. Name: Alley.length..in.inches.
      Description: Length of the alley in inches
   CR. Name: Row.spacing..in.inches.
      Description: Spacing between plot rows in inches
   CS. Name: Rows.per.plot
      Description: Number of rows in plot
   CT. Name: X..Seed.per.plot
      Description: Number of seeds planted in the plot
   CU. Name: Experiment
      Description: Name of the experiment for the plot with the tester e
   CV. Name: Source
      Description: Seed source
   CW. Name: Pedigree
      Description: Name of the genotype planted with the female/male parents
   CX. Name: Family
      Description: Name of the family of the female parent
   CY. Name: Tester
      Description: Name of the tester (male parent) used in the hybrid
   CZ. Name: Rep
      Description: Replicate number of plot
   DA. Name: Block
      Description: Block number of plot
   DB. Name: Plot
      Description: plot number
   DC. Name: Range
      Description: Range location of the plot
   DD. Name: Pass
      Description: Number of the pass of the plot
   DE. Name: Date.Plot.Planted..MM.DD.YY.
      Description: Date the plot was planted MM/DD/YY
   DF. Name: Date.Plot.Harvested..MM.DD.YY.
      Description: Date the plot was harvested MM/DD/YY
   DG. Name: Anthesis..MM.DD.YY.
      Description: Date of the plot's anthesis MM/DD/YY
   DH. Name: Silking..MM.DD.YY.
      Description: Date of the plot's silking MM/DD/YY
   DI. Name: Anthesis..days.
      Description: Date of the plot's anthesis in days after planting
   DJ. Name: Silking..days.
      Description: Date of the plot's silking in days after planting
   DK. Name: Plant.Height..cm.
      Description: Manual plant height measurement at terminal height in centimeters
   DL. Name: Ear.Height..cm.
      Description: Manual ear height measurement of first ear at terminal height in centimeters
   DM. Name: Stand.Count.....of.plants.
      Description: Plant stand count in the plot
   DN. Name: Root.Lodging....of.plants.
      Description: number of plants root lodged at the end of the season
   DO. Name: Stalk.Lodging....of.plants.
      Description: number of plants stalk lodged at the end of the season
   DP. Name: Grain.Moisture....
      Description: grain moisture of the plot at harvest
   DQ. Name: Test.Weight..lbs.
      Description: grain test weight of the plot at harvest in pounds
   DR. Name: Plot.Weight..lbs.
      Description: weight of grain harvested from the plot in pounds
   DS. Name: Grain.Yield..bu.A.
      Description: Dry grain yield in bushels per acre for the plot
   DT. Name: Plot.Discarded..enter..yes..or.blank.
      Description: Note on whether or not the plot was discarded
      Value labels if appropriate: blank if not discarded, yes if discarded
   DU. Name: Comments
      Description: any comments on the plot
      Value labels if appropriate: usually blank
   DV. Name: Filler
      Description: seed used to replace the hybrid if insufficient seed
      Value labels if appropriate: blank if no filler used
   DW. Name: Snap....of.plants.
      Description: Number of plants snapped off
      Value labels if appropriate: NA if none, number if there are some

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: YYYYMMDD-G2F-Delaware-RGB.csv
-----------------------------------------

*Variables with unknown definitions were default outputs of the software and not used for analysis

1. Number of variables: 87

2. Number of cases/rows: ~1500

3. Missing data codes:
   Code/symbol: NA

4. Variable List

   A. Name: Identifier
      Description: Plot ID
   B. Name: DataFile
      Description: Location of file for the data in that row
   C. Name: FileTitle
      Description: Name of the file for the data in that row
   D. Name: Total return count
      Description: Number of points in the point cloud for the specific plot
   E. Name: Total return count above 0.00
      Description: Number of points in the point cloud above a height of 0.00
   F. Name: Return 1 count above 0.00
      Description: Unknown
   G. Name: Return 2 count above 0.00
      Description: Unknown
   H. Name: Return 3 count above 0.00
      Description: Unknown
   I. Name: Return 4 count above 0.00
      Description: Unknown
   J. Name: Return 5 count above 0.00
      Description: Unknown
   K. Name: Return 6 count above 0.00
      Description: Unknown
   L. Name: Return 7 count above 0.00
      Description: Unknown
   M. Name: Return 8 count above 0.00
      Description: Unknown
   N. Name: Return 9 count above 0.00
      Description: Unknown
   O. Name: Other return count above 0.00
      Description: Unknown
   P. Name: Elev minimum
      Description: minimum height value extracted from the point cloud for specific plot
   Q. Name: Elev maximum
      Description: maximum height value extracted from the point cloud for specific plot
   R. Name: Elev mean
      Description: average height value extracted from the point cloud for specific plot
   S. Name: Elev mode
      Description: most common height value extracted from the point cloud for specific plot
   T. Name: Elev stddev
      Description: standard deviation of height values extracted from the point cloud for specific plot
   U. Name: Elev variance
      Description: variance of height values extracted from the point cloud for specific plot
   V. Name: Elev CV
      Description: Unknown
   W. Name: Elev IQ
      Description: Unknown
   X. Name: Elev skewness
      Description: Unknown
   Y. Name: Elev kurtosis
      Description: Unknown
   Z. Name: Elev AAD
      Description: Unknown
   AA. Name: Elev MAD median
      Description: Unknown
   AB. Name: Elev MAD mode
      Description: Unknown
   AC. Name: Elev L1
      Description: Unknown
   AD. Name: Elev L2
      Description: Unknown
   AE. Name: Elev L3
      Description: Unknown
   AF. Name: Elev L4
      Description: Unknown
   AG. Name: Elev L CV
      Description: Unknown
   AH. Name: Elev L skewness
      Description: Unknown
   AI. Name: Elev L kurtosis
      Description: Unknown
   AJ. Name: Elev P01
      Description: 1st percentile of extracted height for specific plot
   AK. Name: Elev P05
      Description: 5th percentile of extracted height for specific plot
   AL. Name: Elev P10
      Description: 10th percentile of extracted height for specific plot
   AM. Name: Elev P20
      Description: 20th percentile of extracted height for specific plot
   AN. Name: Elev P25
      Description: 25th percentile of extracted height for specific plot
   AO. Name: Elev P30
      Description: 30th percentile of extracted height for specific plot
   AP. Name: Elev P40
      Description: 40th percentile of extracted height for specific plot
   AQ. Name: Elev P50
      Description: 50th percentile of extracted height for specific plot
   AR. Name: Elev P60
      Description: 60th percentile of extracted height for specific plot
   AS. Name: Elev P70
      Description: 70th percentile of extracted height for specific plot
   AT. Name: Elev P75
      Description: 75th percentile of extracted height for specific plot
   AU. Name: Elev P80
      Description: 80th percentile of extracted height for specific plot
   AV. Name: Elev P90
      Description: 90th percentile of extracted height for specific plot
   AW. Name: Elev P95
      Description: 95th percentile of extracted height for specific plot
   AX. Name: Elev P99
      Description: 99th percentile of extracted height for specific plot
   AY. Name: Canopy relief ratio
      Description: percent of plot covered by canopy
   AZ. Name: Elev SQRT mean SQ
      Description: Unknown
   BA. Name: Elev CURT mean CUBE
      Description: Unknown
   BB. Name: Int minimum
      Description: Unknown
   BC. Name: Int maximum
      Description: Unknown
   BD. Name: Int mean
      Description: Unknown
   BE. Name: Int mode
      Description: Unknown
   BF. Name: Int stddev
      Description: Unknown
   BG. Name: Int variance
      Description: Unknown
   BH. Name: Int CV
      Description: Unknown
   BI. Name: Int IQ
      Description: Unknown
   BJ. Name: Int skewness
      Description: Unknown
   BK. Name: Int kurtosis
      Description: Unknown
   BL. Name: Int AAD
      Description: Unknown
   BM. Name: Int L1
      Description: Unknown
   BN. Name: Int L2
      Description: Unknown
   BO. Name: Int L3
      Description: Unknown
   BP. Name: Int L4
      Description: Unknown
   BQ. Name: Int L CV
      Description: Unknown
   BR. Name: Int L skewness
      Description: Unknown
   BS. Name: Int L kurtosis
      Description: Unknown
   BT. Name: Int P01
      Description: Unknown
   BU. Name: Int P05
      Description: Unknown
   BV. Name: Int P10
      Description: Unknown
   BW. Name: Int P20
      Description: Unknown
   BX. Name: Int P25
      Description: Unknown
   BY. Name: Int P30
      Description: Unknown
   BZ. Name: Int P40
      Description: Unknown
   CA. Name: Int P50
      Description: Unknown
   CB. Name: Int P60
      Description: Unknown
   CC. Name: Int P70
      Description: Unknown
   CD. Name: Int P75
      Description: Unknown
   CE. Name: Int P80
      Description: Unknown
   CF. Name: Int P90
      Description: Unknown
   CG. Name: Int P95
      Description: Unknown
   CH. Name: Int P99
      Description: Unknown
   CI. Name: Profile area
      Description: Unknown
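
-----------------------------------------
EXAMPLE: STACKING PER-FLIGHT CSVS (ILLUSTRATIVE SKETCH)
-----------------------------------------

The per-flight files described above (YYYYMMDD-G2F-Delaware-RGB.csv) can be stacked into one long table keyed by plot and flight date. The Python sketch below is only an illustration of that idea and is not the code used to build DE_PHTs_2020_merged.csv. Assumptions: the flight date is the first 8 characters of the file name, the per-flight "Identifier" column corresponds to "Plot_ID" in the merged file, and the helper names (flight_date_from_name, stack_flights) and the numeric values in the example are made up for illustration.

```python
import csv
import io
from datetime import datetime


def flight_date_from_name(file_name):
    """Parse the leading YYYYMMDD token of a flight file name into a date."""
    return datetime.strptime(file_name[:8], "%Y%m%d").date()


def stack_flights(flight_csvs, columns=("Elev P95", "Elev P99")):
    """Combine per-flight CSV text into long-format rows.

    flight_csvs maps a file name (YYYYMMDD-G2F-Delaware-RGB.csv) to the CSV
    text of that file. The missing data code 'NA' becomes None.
    """
    rows = []
    for file_name, text in sorted(flight_csvs.items()):
        flight_date = flight_date_from_name(file_name).isoformat()
        for record in csv.DictReader(io.StringIO(text)):
            # Assumption: 'Identifier' plays the role of 'Plot_ID'.
            row = {"Plot_ID": record["Identifier"], "Flight_date": flight_date}
            for col in columns:
                value = record.get(col) or "NA"
                row[col] = None if value == "NA" else float(value)
            rows.append(row)
    return rows


# Hypothetical two-flight example; the height values are invented.
example = {
    "20200529-G2F-Delaware-RGB.csv":
        "Identifier,Elev P95,Elev P99\n101,0.12,0.15\n102,NA,0.14\n",
    "20200825-G2F-Delaware-RGB.csv":
        "Identifier,Elev P95,Elev P99\n101,2.31,2.40\n",
}
long_rows = stack_flights(example)
for row in long_rows:
    print(row)
```

In practice the dictionary would be filled by reading each csv in Extracted_Values.zip; only the standard library is used here so the sketch runs without extra packages.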