This readme.txt file was generated on 2024-06-17 by Alicia Hofelich Mohr

Recommended citation for the data:
Segijn, C. M., Menheer, P., Lee, G., Kim, E., Olsen, D., & Hofelich Mohr, A. (2024). Data supporting: Automated Object Detection in Mobile Eye-Tracking Research: Comparing Manual Coding with Tag Detection, Shape Detection, Matching, and Machine Learning [Data set]. Data Repository for the University of Minnesota (DRUM). https://doi.org/10.13020/2SMC-3642

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset
Data supporting: Automated Object Detection in Mobile Eye-Tracking Research: Comparing Manual Coding with Tag Detection, Shape Detection, Matching, and Machine Learning

2. Author Information

   Principal Investigator Contact Information
        Name: Claire M. Segijn
        Institution: University of Minnesota
        Email: segijn@umn.edu
        ORCID: https://orcid.org/0000-0002-2424-5737

   Associate or Co-investigator Contact Information
        Name: Pernu Menheer
        Institution: University of Minnesota
        Email: pernu@umn.edu
        ORCID:

   Associate or Co-investigator Contact Information
        Name: Garim Lee
        Institution: Indiana University
        Email: garilee@iu.edu
        ORCID: https://orcid.org/0000-0002-7054-1967

   Associate or Co-investigator Contact Information
        Name: Eunah Kim
        Institution: Mount Royal University
        Email: ekim@mtroyal.ca
        ORCID: https://orcid.org/0000-0003-2087-4648

   Associate or Co-investigator Contact Information
        Name: David Olsen
        Institution: University of Minnesota
        Email: dolsen@umn.edu
        ORCID:

   Associate or Co-investigator Contact Information
        Name: Alicia Hofelich Mohr
        Institution: University of Minnesota
        Email: hofelich@umn.edu
        ORCID: https://orcid.org/0000-0002-7644-4105

3. Date published or finalized for release: 2024-06-18

4. Date of data collection: 2022

5. Geographic location of data collection: Minneapolis, Minnesota, US

6. Information about funding sources that supported the collection of the data:
This work was supported by the Office of the Vice President for Research, University of Minnesota [The Grant-in-Aid of Research, Artistry, and Scholarship].

7. Overview of the data (abstract):
The goal of the current study is to compare different methods for automated object detection (i.e., tag detection, shape detection, matching, and machine learning) with manual coding on different types of objects (i.e., static, dynamic, and dynamic with human interaction) and to describe the advantages and limitations of each method. We tested the methods in an experiment that used mobile eye tracking, both because of the importance of attention in communication science and because this type of data is challenging to analyze: visual parameters change constantly within and between participants. Python scripts, processed videos, R scripts, and processed data files are included for each method.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data:
Attribution-NonCommercial 4.0 International
http://creativecommons.org/licenses/by-nc/4.0/

2. Links to publications that cite or use the data:
Segijn, C. M., Menheer, P., Lee, G., Kim, E., Olsen, D., & Hofelich Mohr, A. Automated Object Detection in Mobile Eye-Tracking Research: Comparing Manual Coding with Tag Detection, Shape Detection, Matching, and Machine Learning. Submitted.

3. Was data derived from another source? No

4. Terms of Use: Data Repository for the U of Minnesota (DRUM)
By using these files, users agree to the Terms of Use.
https://conservancy.umn.edu/pages/policies/#drum-terms-of-use

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: 1_ManualCoding_data.csv
      Short description: Coding file for human raters on each fixation for the subset of videos

   B. Filename: 2_Shape_Detection.zip
      Short description: Scripts, data, and videos for the shape detection method.

   C. Filename: 3_Tag_Detection
      Short description: Scripts and data for the tag detection method. Because this method is built into the Pupil Labs software, no videos or Python scripts are included.

   D. Filename: 4_ML_Yolo7
      Short description: Scripts, data, and videos for the machine learning method.

   E. Filename: 5_TemplateMapping
      Short description: Scripts, data, and videos for the template matching method.

   F. Filename: 5a_FeatureMatching_Static
      Short description: Scripts, data, and videos for the static feature matching method.

   G. Filename: 5b_FeatureMatching_Dynamic
      Short description: Scripts, data, and videos for the dynamic feature matching method.

File tree:

├── 1_ManualCoding_data.csv
├── 2_Shape_Detection
│   ├── 2_FindRectangles3.py
│   ├── 2_Shapes_aoi_data.csv
│   ├── 2_Shapes_fixation_data.csv
│   ├── 2_Shapes_script.R
│   ├── p136_quadrilaterals_world.mp4
│   ├── p152_quadrilaterals_world.mp4
│   ├── p156_quadrilaterals_world.mp4
│   ├── p31_quadrilaterals_world.mp4
│   ├── p44_quadrilaterals_world.mp4
│   ├── p81_quadrilaterals_world.mp4
│   ├── p84_quadrilaterals_world.mp4
│   ├── p89_quadrilaterals_world.mp4
│   ├── p90_quadrilaterals_world.mp4
│   └── p94_quadrilaterals_world.mp4
├── 3_Tag_Detection
│   ├── 3_Tag_marker_data.csv
│   ├── 3_Tag_script.R
│   └── 3_Tag_surface_data.csv
├── 4_ML_Yolo7
│   ├── 4_MLDetectAOI-5_13_4.py
│   ├── 4_ML_aoi_data.csv
│   ├── 4_ML_fixation_data.csv
│   ├── 4_ML_script.R
│   ├── p136_Yv7world.mp4
│   ├── p152_Yv7world.mp4
│   ├── p156_Yv7world.mp4
│   ├── p31_Yv7world.mp4
│   ├── p44_Yv7world.mp4
│   ├── p81_Yv7world.mp4
│   ├── p84_Yv7world.mp4
│   ├── p89_Yv7world.mp4
│   ├── p90_Yv7world.mp4
│   └── p94_Yv7world.mp4
├── 5_TemplateMapping
│   ├── 5_TemplateBounding.py
│   ├── 5_TemplateMatching_aoi_data.csv
│   ├── 5_TemplateMatching_fixation_data.csv
│   ├── 5_TemplateMatching_script.R
│   ├── p136_templateMatching_world.mp4
│   ├── p152_templateMatching_world.mp4
│   ├── p156_templateMatching_world.mp4
│   ├── p31_templateMatching_world.mp4
│   ├── p44_templateMatching_world.mp4
│   ├── p81_templateMatching_world.mp4
│   ├── p84_templateMatching_world.mp4
│   ├── p89_templateMatching_world.mp4
│   ├── p90_templateMatching_world.mp4
│   └── p94_templateMatching_world.mp4
├── 5a_FeatureMatching_Static
│   ├── 5a_FeatureMatching_Static_aoi_data.csv
│   ├── 5a_FeatureMatching_Static_fixation_data.csv
│   ├── 5a_FeatureMatching_Static_script.R
│   ├── 5a_FindBannerGazeMappingMethod.py
│   ├── p136_banner_TM_world.mp4
│   ├── p152_banner_TM_world.mp4
│   ├── p156_banner_TM_world.mp4
│   ├── p31_banner_TM_world.mp4
│   ├── p44_banner_TM_world.mp4
│   ├── p81_banner_TM_world.mp4
│   ├── p84_banner_TM_world.mp4
│   ├── p90_banner_TM_world.mp4
│   └── p94_banner_TM_world.mp4
├── 5b_FeatureMatching_Dynamic
│   ├── 5b_DynGazeMappingc5.py
│   ├── 5b_FeatureMatching_Dynamic_aoi_data.csv
│   ├── 5b_FeatureMatching_Dynamic_fixation_data.csv
│   ├── 5b_FeatureMatching_Dynamic_script.R
│   ├── p136_world_DGM.mp4
│   ├── p152_world_DGM.mp4
│   ├── p156_world_DGM.mp4
│   ├── p31_world_DGM.mp4
│   ├── p44_world_DGM.mp4
│   ├── p81_world_DGM.mp4
│   ├── p89_world_DGM.mp4
│   ├── p90_world_DGM.mp4
│   └── p94_world_DGM.mp4
└── Readme.txt

2. Relationship between files:
Each zip contains the separate files for one method.
Python scripts were used on the raw videos to generate the processed videos and the CSV files. CSV files for the areas of interest (AOI) and the fixation detections are included for each method. R scripts were used to analyze the CSV files and produce the statistics and tables reported in the manuscript. Processed videos are included for each participant.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:
As part of this study, participants were asked to watch TV and use a tablet at the same time while wearing eye-tracking glasses (i.e., Pupil Core). After participants were informed that the study was about media multitasking and had signed the IRB consent form, they put on the eye-tracking glasses. Researchers instructed the participants to watch a 7-minute video clip on the TV (The Good Place, Season 1, Episode 1; Schur & Goddard, 2016) while reading a magazine on the tablet to obtain a general impression of the magazine content. To ensure enough time on both the TV and the tablet, participants were told to divide their attention equally between the two. Researchers then adjusted the eye and world view (i.e., front) cameras of the mobile eye tracker to ensure that the cameras captured both the forward view and the participant's pupils well. For the calibration process, participants looked at numbers on the screens one after another (Figure 1 in the manuscript), while researchers used a laptop running the Pupil Capture software to indicate what the participants were looking at. We used a nine-point calibration (i.e., five points on the TV and four on the tablet) to calibrate the glasses for each participant. After calibration and instructions, participants watched the video clip on the TV and read the magazine on the tablet. The ad appeared on the tablet after 287 seconds and disappeared when the participant closed it, which all participants did almost immediately. The researcher started the recording and left the room when the TV clip started, and came back when it ended to stop the recording.

2. Methods for processing the data:
Videos for 10 participants were processed using each of the methods to detect fixations on the TV, the tablet, and the advertisement. Processed videos were clipped to remove research assistants.

3. Instrument- or software-specific information needed to interpret the data:
R, Python, a text editor, and a video viewer.

4. Standards and calibration information, if appropriate:
The eye tracker was calibrated using the built-in Pupil Labs calibration.

5. People involved with sample collection, processing, analysis and/or submission:
All paper authors were involved in data collection and processing.
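The fixation flags in the *_fixation_data.csv files were produced by the Python scripts listed in the file tree. As a rough illustration of the underlying idea only (this is not the authors' processing script), the sketch below shows how a fixation point could be assigned to an AOI by testing whether it falls inside a detected bounding box. Column names follow the *_aoi_data.csv files documented below; the fixation coordinates, participant value, and frame number are hypothetical.

    # Illustrative sketch only (not the authors' script): assign a fixation
    # point to an AOI by checking whether it falls inside a detected box.
    # Assumes an aoi_data-style table with columns frame, label (or class),
    # x_min, y_min, x_max, y_max, and a fixation at pixel (fx, fy).
    import pandas as pd

    def fixation_hits(aoi_frame_rows: pd.DataFrame, fx: float, fy: float) -> dict:
        """Return {AOI label: True/False} for the boxes detected in one frame."""
        hits = {}
        for _, box in aoi_frame_rows.iterrows():
            inside = (box["x_min"] <= fx <= box["x_max"]) and (box["y_min"] <= fy <= box["y_max"])
            hits[box["label"]] = inside
        return hits

    # Hypothetical usage with the shape detection AOI file (participant,
    # frame, and pixel values here are made up for illustration):
    aoi = pd.read_csv("2_Shape_Detection/2_Shapes_aoi_data.csv")
    one_frame = aoi[(aoi["participant"] == "p31") & (aoi["frame"] == 100)]
    print(fixation_hits(one_frame, fx=640, fy=360))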
-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 1_ManualCoding_data.csv
-----------------------------------------
rows: 15928
cols: 18

A. Name: id
   Description: Frame id in video

B. Name: start_timestamp
   Description: Timestamp of video start

C. Name: duration
   Description: Duration of fixation (in ms)

D. Name: TV_fixation
   Description: Was the center of the fixation point on the TV (rater 1)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

E. Name: TV_radius
   Description: Was the radius of the fixation point on the TV (rater 1)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

F. Name: Tablet_fixation
   Description: Was the center of the fixation point on the Tablet (rater 1)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

G. Name: Tablet_radius
   Description: Was the radius of the fixation point on the Tablet (rater 1)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

H. Name: banner_ad1
   Description: Was the center of the fixation point on the banner ad (rater 1)
   1 = yes, blank = no or ad not present

I. Name: banner_ad2
   Description: Was the radius of the fixation point on the banner ad (rater 1)
   1 = yes, blank = no or ad not present

J. Name: Participant
   Description: Participant ID

K. Name: TV_fixationB
   Description: Was the center of the fixation point on the TV (rater 2)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

L. Name: TV_radiusB
   Description: Was the radius of the fixation point on the TV (rater 2)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

M. Name: Tablet_fixationB
   Description: Was the center of the fixation point on the Tablet (rater 2)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

N. Name: Tablet_radiusB
   Description: Was the radius of the fixation point on the Tablet (rater 2)
   1 = yes, 0 = no, 99 = can't tell or ambiguous

O. Name: banner_ad1B
   Description: Was the center of the fixation point on the Banner Ad (rater 2)
   1 = yes, blank = no or ad not present

P. Name: banner_ad2B
   Description: Was the radius of the fixation point on the Banner Ad (rater 2)
   1 = yes, blank = no or ad not present

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 2_Shape_Detection/2_Shapes_aoi_data.csv
-----------------------------------------
rows: 8799
cols: 8

A. Name: frame
   Description: Frame number of video

B. Name: label
   Description: Location that was detected (TV, Tablet, Banner Ad)

C. Name: x_min
   Description: Minimum x coordinate of the quadrilateral

D. Name: y_min
   Description: Minimum y coordinate of the quadrilateral

E. Name: x_max
   Description: Maximum x coordinate of the quadrilateral

F. Name: y_max
   Description: Maximum y coordinate of the quadrilateral

G. Name: confidence
   Description: Error measure (distance) generated by the Python script

H. Name: participant
   Description: Participant ID

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 2_Shape_Detection/2_Shapes_fixation_data.csv
-----------------------------------------
rows: 15928
cols: 5

A. Name: participant
   Description: Participant ID

B. Name: id
   Description: Frame number of video

C. Name: shp_TV_fixation
   Description: Detected fixation on TV
   0 = no, 1 = yes

D. Name: shp_Tablet_fixation
   Description: Detected fixation on Tablet
   0 = no, 1 = yes

E. Name: shp_Banner_fixation
   Description: Detected fixation on Banner Ad
   0 = no, 1 = yes
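The statistics and tables in the manuscript come from the R scripts in each folder. Purely as an illustrative sketch (not the authors' analysis), the snippet below shows how the shape detection flags could be lined up with rater 1's manual codes by participant and frame to compute simple percent agreement for the TV; treating 99 ("can't tell or ambiguous") as missing is an assumption made only for this example.

    # Illustrative sketch only (not the authors' R analysis): percent
    # agreement between shape detection and rater 1 for fixations on the TV.
    import pandas as pd

    manual = pd.read_csv("1_ManualCoding_data.csv")
    shapes = pd.read_csv("2_Shape_Detection/2_Shapes_fixation_data.csv")

    merged = manual.merge(shapes,
                          left_on=["Participant", "id"],
                          right_on=["participant", "id"],
                          how="inner")
    usable = merged[merged["TV_fixation"].isin([0, 1])]   # drop 99 codes (assumption)
    agreement = (usable["TV_fixation"] == usable["shp_TV_fixation"]).mean()
    print(f"TV agreement (shape detection vs. rater 1): {agreement:.2%}")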
-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 3_Tag_Detection/3_Tag_marker_data.csv
-----------------------------------------
rows: 94
cols: 11

A. Name: world_index
   Description: Frame of video data

B. Name: marker_uid
   Description: Specific ID of marker that was detected

C. Name: corner_0_x
   Description: x location of top left corner of tag

D. Name: corner_0_y
   Description: y location of top left corner of tag

E. Name: corner_1_x
   Description: x location of top right corner of tag

F. Name: corner_1_y
   Description: y location of top right corner of tag

G. Name: corner_2_x
   Description: x location of bottom right corner of tag

H. Name: corner_2_y
   Description: y location of bottom right corner of tag

I. Name: corner_3_x
   Description: x location of bottom left corner of tag

J. Name: corner_3_y
   Description: y location of bottom left corner of tag

K. Name: participant
   Description: Participant ID

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 3_Tag_Detection/3_Tag_surface_data.csv
-----------------------------------------
rows: 21
cols: 4

A. Name: surface_name
   Description: Name of the area of interest
   "Surface 1" = Banner, "Large TV" = TV, "Tablet 2" = Tablet

B. Name: visible_frame_count
   Description: Number of frames that the object was detected in

C. Name: frame_count
   Description: Total number of frames in video

D. Name: participant
   Description: Participant ID

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 4_ML_Yolo7/4_ML_aoi_data.csv
-----------------------------------------
rows: 120270
cols: 7

A. Name: frame
   Description: Frame ID

B. Name: class
   Description: Item of interest

C. Name: x_min
   Description: Minimum x location of item detected

D. Name: y_min
   Description: Minimum y location of item detected

E. Name: x_max
   Description: Maximum x location of item detected

F. Name: y_max
   Description: Maximum y location of item detected

G. Name: participant
   Description: Participant ID

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 4_ML_Yolo7/4_ML_fixation_data.csv
-----------------------------------------
rows: 15928
cols: 5

A. Name: participant
   Description: Participant ID

B. Name: id
   Description: Frame number

C. Name: ml_TV_fixation
   Description: Detected fixation on TV
   0 = no, 1 = yes

D. Name: ml_Tablet_fixation
   Description: Detected fixation on Tablet
   0 = no, 1 = yes

E. Name: ml_Banner_fixation
   Description: Detected fixation on Banner Ad
   0 = no, 1 = yes

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 5_TemplateMapping/5_TemplateMatching_aoi_data.csv
-----------------------------------------
rows: 215502
cols: 8

A. Name: frame
   Description: Frame number

B. Name: class
   Description: Item detected

C. Name: x_min
   Description: Minimum x location of item

D. Name: y_min
   Description: Minimum y location of item

E. Name: x_max
   Description: Maximum x location of item

F. Name: y_max
   Description: Maximum y location of item

G. Name: confidence
   Description: Distance error for object detection

H. Name: participant
   Description: Participant ID

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 5_TemplateMapping/5_TemplateMatching_fixation_data.csv
-----------------------------------------
rows: 15928
cols: 5

A. Name: participant
   Description: Participant ID

B. Name: id
   Description: Frame number

C. Name: tm_TV_fixation
   Description: Detected fixation on TV
   0 = no, 1 = yes

D. Name: tm_Tablet_fixation
   Description: Detected fixation on Tablet
   0 = no, 1 = yes

E. Name: tm_Banner_fixation
   Description: Detected fixation on Banner Ad
   0 = no, 1 = yes
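The per-method fixation files above share the participant and id columns and differ only in their variable prefixes (shp_, ml_, tm_), so they can be combined for side-by-side comparison. The sketch below is illustrative only, not one of the included scripts, and assumes the zip archives have been extracted into the working directory.

    # Illustrative sketch only: combine the per-method fixation flags into
    # one table and compare how often each method flags the TV, per participant.
    import pandas as pd

    shp = pd.read_csv("2_Shape_Detection/2_Shapes_fixation_data.csv")
    ml = pd.read_csv("4_ML_Yolo7/4_ML_fixation_data.csv")
    tm = pd.read_csv("5_TemplateMapping/5_TemplateMatching_fixation_data.csv")

    combined = (shp.merge(ml, on=["participant", "id"])
                   .merge(tm, on=["participant", "id"]))

    # Proportion of rows flagged as a TV fixation by each method
    tv_cols = ["shp_TV_fixation", "ml_TV_fixation", "tm_TV_fixation"]
    print(combined.groupby("participant")[tv_cols].mean())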
-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 5a_FeatureMatching_Static/5a_FeatureMatching_Static_aoi_data.csv
-----------------------------------------
rows: 356
cols: 11

A. Name: frame
   Description: Frame number

B. Name: label
   Description: Item of interest

C. Name: X1
   Description: Top left corner x coordinate, where the origin (0,0) is the top left corner of the frame

D. Name: Y1
   Description: Top left corner y coordinate, where the origin (0,0) is the top left corner of the frame

E. Name: X2
   Description: Top right corner x coordinate, where the origin (0,0) is the top left corner of the frame

F. Name: Y2
   Description: Top right corner y coordinate, where the origin (0,0) is the top left corner of the frame

G. Name: X3
   Description: Bottom right corner x coordinate, where the origin (0,0) is the top left corner of the frame

H. Name: Y3
   Description: Bottom right corner y coordinate, where the origin (0,0) is the top left corner of the frame

I. Name: X4
   Description: Bottom left corner x coordinate, where the origin (0,0) is the top left corner of the frame

J. Name: Y4
   Description: Bottom left corner y coordinate, where the origin (0,0) is the top left corner of the frame

K. Name: participant
   Description: Participant ID

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 5a_FeatureMatching_Static/5a_FeatureMatching_Static_fixation_data.csv
-----------------------------------------
rows: 14284
cols: 3

A. Name: participant
   Description: Participant ID

B. Name: id
   Description: Frame number

C. Name: gm_Banner_fixation
   Description: Detected fixation on Banner Ad
   0 = no, 1 = yes

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 5b_FeatureMatching_Dynamic/5b_FeatureMatching_Dynamic_aoi_data.csv
-----------------------------------------
rows: 36481
cols: 16

A. Name: Frame.Index
   Description: Frame number

B. Name: Width1
   Description: Width of the original video on the TV in pixels

C. Name: Height1
   Description: Height of the original video on the TV in pixels

D. Name: Width2
   Description: Width of the participant world view video in pixels

E. Name: Height2
   Description: Height of the participant world view video in pixels

F. Name: Width3
   Description: 75% of Width2 (artifact from analysis tests)

G. Name: Height3
   Description: 75% of Height2 (artifact from analysis tests)

H. Name: Top.Left.X
   Description: Top left corner x coordinate, where the origin (0,0) is the top left corner of the frame

I. Name: Top.Left.Y
   Description: Top left corner y coordinate, where the origin (0,0) is the top left corner of the frame

J. Name: Top.Right.X
   Description: Top right corner x coordinate, where the origin (0,0) is the top left corner of the frame

K. Name: Top.Right.Y
   Description: Top right corner y coordinate, where the origin (0,0) is the top left corner of the frame

L. Name: Bottom.Right.X
   Description: Bottom right corner x coordinate, where the origin (0,0) is the top left corner of the frame

M. Name: Bottom.Right.Y
   Description: Bottom right corner y coordinate, where the origin (0,0) is the top left corner of the frame

N. Name: Bottom.Left.X
   Description: Bottom left corner x coordinate, where the origin (0,0) is the top left corner of the frame

O. Name: Bottom.Left.Y
   Description: Bottom left corner y coordinate, where the origin (0,0) is the top left corner of the frame

P. Name: participant
   Description: Participant ID

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: 5b_FeatureMatching_Dynamic/5b_FeatureMatching_Dynamic_fixation_data.csv
-----------------------------------------
rows: 14852
cols: 3

A. Name: participant
   Description: Participant ID

B. Name: id
   Description: Frame number

C. Name: dgm_tv_fixation
   Description: Detected fixation on TV
   0 = no, 1 = yes
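Unlike the x_min/x_max files above, the 5a and 5b AOI files store the four corners of each detected quadrilateral. As an illustrative sketch only (not the authors' script), a point can be tested against such a quadrilateral with a same-side cross-product test, assuming the quadrilateral is convex and the corners are given in the order listed above; the corner and point values below are hypothetical.

    # Illustrative sketch only (not the authors' script): test whether a
    # point lies inside a convex quadrilateral given by its four corners in
    # order (top left, top right, bottom right, bottom left).
    def point_in_quad(px, py, corners):
        """corners: [(x1, y1), (x2, y2), (x3, y3), (x4, y4)] in order."""
        signs = []
        for i in range(4):
            x1, y1 = corners[i]
            x2, y2 = corners[(i + 1) % 4]
            # z-component of the cross product of the edge and the point vector
            cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
            signs.append(cross >= 0)
        # Inside if the point is on the same side of every edge
        return all(signs) or not any(signs)

    # Hypothetical corner values in world-camera pixels:
    banner = [(100, 80), (300, 90), (295, 200), (105, 190)]
    print(point_in_quad(200, 150, banner))  # True
    print(point_in_quad(10, 10, banner))    # False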