This codebook.txt file was generated on 20170710 by K. VanderWaal


-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Weighted edgelist of cattle herd movements

2. Author Information

   Principal Investigator Contact Information
      Name: Kimberly VanderWaal
      Institution: University of Minnesota
      Address: 1365 Gortner Avenue, St. Paul, MN 55108
      Email: kvw@umn.edu

   Associate or Co-investigator Contact Information
      Name: Brian Allan
      Institution: University of Illinois
      Address: Department of Entomology
      Email: ballan@life.illinois.edu

   Associate or Co-investigator Contact Information
      Name: Meggan Craft
      Institution: University of Minnesota
      Address: 1365 Gortner Avenue, St. Paul, MN 55108
      Email: craft@umn.edu

   Associate or Co-investigator Contact Information
      Name: Sharon Okanga
      Institution: University of Illinois
      Address: Department of Entomology
      Email: sokanga@gmail.com

   Associate or Co-investigator Contact Information
      Name: Marie Gilbertson
      Institution: University of Minnesota
      Address: 1365 Gortner Avenue, St. Paul, MN 55108
      Email: mjones029@gmail.com

3. Date of data collection (single date, range, approximate date): 20150201 to 20150630

4. Geographic location of data collection (where was data collected?): Ol Pejeta Conservancy, Laikipia County, Kenya

5. Information about funding sources that supported the collection of the data: This project was funded by NSF Grant CNH-1313822 to B.F.A.; M.G. was funded by the University of Minnesota Veterinary Summer Scholars program; M.E.C. was funded by the National Science Foundation (DEB-1413925) and the University of Minnesota’s Office of the Vice President for Research and Academic Health Center Seed Grant.


--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0 1.0 Universal

2. Links to publications that cite or use the data: VanderWaal et al. “Seasonality and pathogen transmission in pastoral cattle contact networks.” Submitted to Royal Society Open Science

3. Links to other publicly accessible locations of the data: N/A

4. Links/relationships to ancillary data sets: N/A

5. Was data derived from another source? No
   If yes, list source(s):

6. Recommended citation for the data: VanderWaal, K., Gilbertson, M., Okanga, S., Allan, B.F., Craft, M.E. Epidemiological model and weighted edgelists of contacts among cattle herds in the dry and wet season in central Kenya.


---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: edgelist.static.pardry.csv
      Short description: Static edgelist representing contacts among cattle herds in the dry season.

   B. Filename: edgelist.static.wet.csv
      Short description: Static edgelist representing contacts among cattle herds in the wet season.

   C. Filename: edges.dynamic.dry.csv
      Short description: Dynamic edgelist representing contacts by day among cattle herds in the dry season.

   D. Filename: edges.dynamic.wet.csv
      Short description: Dynamic edgelist representing contacts by day among cattle herds in the wet season.

   E. Filename: OPC_cattle_model_DRUM.R
      Short description: R code for running transmission model on dynamic cattle herd contact network.

2. Relationship between files: The dynamic edgelists can be used to simulate disease spread in the wet or dry season using the modeling code in the fifth file (E). Read a dynamic edgelist into the modeling code through its read.csv(file.choose()) call. Run the code in its entirety to load the function “Net.model()”, then run Net.model() to produce the output.

3. Additional related data collected that was not included in the current data package: Daily trajectories of cattle herd movement.

4. Are there multiple versions of the dataset? No
   If yes, list versions:
      Name of file that was updated:
         i. Why was the file updated?
         ii. When was the file updated?
      Name of file that was updated:
         i. Why was the file updated?
         ii. When was the file updated?


--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data: Cattle herds were tracked via GPS units, which recorded the location of each herd every 15 minutes. For each herd, one cow and one herder carried a GPS unit for tracking purposes.

2. Methods for processing the data: For each possible pair of herds, between-herd contact was defined as any two GPS locations, one from each herd, that were within 50 m of each other and recorded within a 60-minute interval. This was considered a reasonable approximation of contact in this system.

3. Instrument- or software-specific information needed to interpret the data: R

4. Standards and calibration information, if appropriate: N/A

5. Environmental/experimental conditions: N/A

6. Describe any quality-assurance procedures performed on the data: N/A

7. People involved with sample collection, processing, analysis and/or submission: All authors listed above.


-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: edgelist.static.pardry.csv
-----------------------------------------

1. Number of variables: 3

2. Number of cases/rows: 165

3. Missing data codes: N/A

4. Variable List

   A. Name: V1
      Description: Each row represents contact between two herds. V1 is the unique identifier of one herd in the pair.

   B. Name: V2
      Description: Each row represents contact between two herds. V2 is the unique identifier of the other herd in the pair.

   C. Name: contact.days
      Description: Number of days on which the two herds were in contact, where contact means the herds were within 50 m of one another at least once during the day.


-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: edgelist.static.wet.csv
-----------------------------------------

1. Number of variables: 3

2. Number of cases/rows: 124

3. Missing data codes: N/A

4. Variable List

   A. Name: V1
      Description: Each row represents contact between two herds. V1 is the unique identifier of one herd in the pair.

   B. Name: V2
      Description: Each row represents contact between two herds. V2 is the unique identifier of the other herd in the pair.

   C. Name: contact.days
      Description: Number of days on which the two herds were in contact, where contact means the herds were within 50 m of one another at least once during the day.


-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: edge.dynamic.dry.csv
-----------------------------------------

1. Number of variables: 3

2. Number of cases/rows: 682

3. Missing data codes: N/A

4. Variable List

   A. Name: V1
      Description: Each row represents contact between two herds. V1 is the unique identifier of one herd in the pair.

   B. Name: V2
      Description: Each row represents contact between two herds. V2 is the unique identifier of the other herd in the pair.

   C. Name: Date
      Description: Day on which the contact occurred, expressed as days since the first day of the season (Feb 3, 2015).


-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: edge.dynamic.wet.csv
-----------------------------------------

1. Number of variables: 3

2. Number of cases/rows: 633

3. Missing data codes: N/A

4. Variable List

   A. Name: V1
      Description: Each row represents contact between two herds. V1 is the unique identifier of one herd in the pair.

   B. Name: V2
      Description: Each row represents contact between two herds. V2 is the unique identifier of the other herd in the pair.

   C. Name: Date
      Description: Day on which the contact occurred, expressed as days since the first day of the season (April 14, 2015).


-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: OPC_cattle_model_DRUM.R
-----------------------------------------

1. Number of variables: N/A

2. Number of cases/rows: 97 lines of code

3. Missing data codes: N/A

4. Variable List (here used to define arguments of model)

   A. Name: beta
      Description: Probability of transmission of a hypothetical disease given contact. User defined.

   B. Name: inc
      Description: Incubation period of disease (number of days between exposure and when the herd is capable of infecting others). User defined.

   C. Name: edges (a .csv file containing a dynamic edgelist)
      Description: The model requires users to upload an edgelist that defines contact among nodes. Columns one and two should be titled V1 and V2 and contain unique IDs for the nodes in contact. Column three should be titled “Date” and indicate the date on which the nodes were in contact.

5. Model output

   The output of the function is a list.

   The first object of the list is a data frame containing:
      Column 1 (name): ID of herd
      Column 2 (state): Infected state of the herd at the end of the simulation: 0=Not infected, 1=Infected
      Column 3 (time.I): Time step in which the herd became infected

   The second object is a data frame tracking the number of susceptible and infected herds over the course of the simulation:
      Column 1 (time): Time step
      Column 2 (S): Number of susceptible herds at that time step
      Column 3 (I): Number of infected herds at that time step
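
The arguments and outputs described above can be illustrated with a small Python sketch. It is not a port of OPC_cattle_model_DRUM.R and makes no claim about how Net.model() works internally. The function name simulate_si, the seed_herd argument, the rule that a herd exposed on a given day cannot transmit on that same day, and the output key time_I (standing in for time.I) are all assumptions made for illustration. Only the meaning of beta, inc, the V1/V2/Date columns, and the shape of the two output tables come from this codebook.

```python
import random
from collections import defaultdict


def simulate_si(edges, beta, inc, seed_herd, rng=None):
    """Toy susceptible-infected simulation on a dynamic edgelist.

    edges: iterable of (v1, v2, day) tuples mirroring columns V1, V2, Date,
           with day given as a non-negative integer.
    beta: probability of transmission given contact on one day.
    inc: days between exposure and being able to infect others.
    seed_herd: herd assumed to be infectious from day 0 (hypothetical choice).
    """
    if rng is None:
        rng = random.Random()

    contacts_by_day = defaultdict(list)
    herds = {seed_herd}
    for v1, v2, day in edges:
        contacts_by_day[day].append((v1, v2))
        herds.update((v1, v2))

    time_i = {seed_herd: 0}           # day each herd was exposed
    infectious_from = {seed_herd: 0}  # first day each herd can transmit
    history = []
    last_day = max(contacts_by_day, default=0)

    for day in range(last_day + 1):
        newly_exposed = set()
        for a, b in contacts_by_day.get(day, []):
            # Contacts are undirected, so try transmission in both directions.
            for src, dst in ((a, b), (b, a)):
                can_infect = src in infectious_from and infectious_from[src] <= day
                if can_infect and dst not in time_i and rng.random() < beta:
                    newly_exposed.add(dst)
        # Apply exposures after the day's contacts, so there are no same-day chains.
        for herd in newly_exposed:
            time_i[herd] = day
            infectious_from[herd] = day + inc
        # "I" counts every exposed herd, whether or not it is infectious yet.
        history.append({"time": day, "S": len(herds) - len(time_i), "I": len(time_i)})

    herd_table = [
        {"name": h, "state": int(h in time_i), "time_I": time_i.get(h)}
        for h in sorted(herds)
    ]
    return herd_table, history


# Made-up contacts among four herds; not taken from the dataset.
toy_edges = [("A", "B", 0), ("B", "C", 1), ("B", "C", 3), ("C", "D", 3)]
herd_table, history = simulate_si(toy_edges, beta=1.0, inc=2, seed_herd="A")
for row in herd_table:
    print(row)
```

With beta=1.0 the toy run is deterministic: herd B is exposed on day 0, is still incubating during its day 1 contact with C, and passes the infection to C on day 3. Herd D stays uninfected because C was only exposed that same day. A real run would read one of the dynamic edgelist files into (V1, V2, Date) tuples and use a beta below 1, in which case results vary between runs unless a seeded random.Random is passed as rng.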