This readme.txt file was generated on 2023-04-05

Recommended citation for the data:

Haynes, David. (2023). National lung cancer screening estimates of the United States for the year 2020. Retrieved from the Data Repository for the University of Minnesota. https://conservancy.umn.edu/handle/11299/253603.

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: National lung cancer screening estimates of the United States for the year 2020

2. Author Information

   Author Contact: David Haynes (dahaynes@umn.edu)

   Name: David Haynes
   Email: dahaynes@umn.edu
   ORCID: 0000-0002-6858-428X

3. Date published or finalized for release: 2023-04-05

4. Date of data collection (single date, range, approximate date): 2023-03-01

5. Information about funding sources that supported the collection of the data: National Cancer Institute

6. Overview of the data (abstract):

   Annual uptake and utilization of lung cancer screening (LCS) services is less than 15%. One reason for the low use of these services is the difficulty of determining whether a person is eligible for screening, which requires calculating a smoking pack-year history. A gap in the literature is the lack of a national dataset that integrates smoking history data. We present the first dataset to estimate the number of individuals eligible for lung cancer screening by integrating smoking history data. The dataset is publicly available and allows researchers to understand the estimated population eligible for lung cancer screening at the census tract level. Additionally, our approach breaks down the eligible population by age, gender, racial/ethnic grouping, and current or former smoking status. The dataset is flexible and can be filtered by any age, gender, or racial/ethnic category in addition to smoking status. This dataset will greatly enhance future programmatic efforts to identify and direct resources to communities that have the highest burden.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: Academic Free License ("AFL") v. 3.0 (https://opensource.org/license/afl-3-0-php/)

2. Links to publications that cite or use the data:

3. Terms of Use: Data Repository for the U of Minnesota (DRUM) By using these files, users agree to the Terms of Use. https://conservancy.umn.edu/pages/drum/policies/#terms-of-use

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   Filename: national_LCS_estimates_current_former_total.csv
   Short description: National LCS estimates for current smokers, former smokers, and current & former smokers combined

   Filename: national_LCS_estimates_all_categories.csv
   Short description: National LCS estimates for current & former smokers, all categories

   Filename: national_LCS_estimates_all_categories.zip
   Short description: National LCS estimates for current & former smokers, all categories - Shapefile

   Filename: national_LCS_estimates_current_former_total.zip
   Short description: National LCS estimates for current smokers, former smokers, and current & former smokers combined - Shapefile

2. Relationship between files: CSVs and zipped shapefiles that share the same name are representations of the same data in different file formats.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Was data derived from another source? If yes, list source(s): BRFSS datasets 2016-2020; American Community Survey 2016-2020 (downloaded from the National Historical Geographic Information System, NHGIS)

2. Description of methods used for creation and processing of data: Python 3.7 was used to automate the download and extraction of all files. Access to the private GitHub repository is available upon request. The manuscript describing the methods is under review at Health and Place.

3. Version of software used for analysis: Python 3.7, PostgreSQL 13, ArcGIS Pro 3.0.3

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: national_LCS_estimates_current_former_total - CSV and zipped shapefile (CPG, DBF, PRJ, QMD, SHP, SHX, XML)
-----------------------------------------

1. Number of variables: 6

2. Number of features/rows: 84121

3. Variable List

   State_name      = State abbreviation
   County_code     = State and county FIPS code
   Tract_fips      = State, county, and tract FIPS code
   Current_smokers = Total individuals who are current smokers and are eligible for LCS
   Former_smokers  = Total individuals who are former smokers and are eligible for LCS
   Total           = Sum of Current_smokers and Former_smokers; all individuals eligible for LCS

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: national_LCS_estimates_all_categories - CSV and zipped shapefile (CPG, DBF, PRJ, QMD, SHP, SHX, XML)
-----------------------------------------

1. Number of variables: 14

2. Number of cases/rows: 84121

3. Variable List

   Gid             = Geographic identifier
   Geoid           = US FIPS identifier
   White           = Number of individuals who are White and are eligible for LCS
   Black           = Number of individuals who are African American and are eligible for LCS
   AmericanIndian  = Number of individuals who are American Indian and Alaska Native and are eligible for LCS
   PacificIslander = Number of individuals who are Pacific Islander and/or Asian American and are eligible for LCS
   Hispanic        = Number of individuals who are Hispanic and are eligible for LCS
   50-54           = Number of individuals who are aged 50-54 and are eligible for LCS
   55-64           = Number of individuals who are aged 55-64 and are eligible for LCS
   65-74           = Number of individuals who are aged 65-74 and are eligible for LCS
   75-80           = Number of individuals who are aged 75-80 and are eligible for LCS
   Male            = Number of individuals who are Male and are eligible for LCS
   Female          = Number of individuals who are Female and are eligible for LCS
   Total           = Total number of individuals who are eligible for LCS
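
-----------------------------------------
EXAMPLE: READING national_LCS_estimates_current_former_total.csv
-----------------------------------------

The variable lists above can be used roughly as follows. This is a minimal sketch using only the Python standard library, not the author's processing code. It assumes the CSV header row uses the column names exactly as written in the variable list (State_name, Tract_fips, Current_smokers, Former_smokers, Total); the function names and the tolerance parameter are hypothetical. Values are parsed as floats on the assumption that estimates may be fractional.

```python
import csv
from collections import defaultdict


def state_totals(csv_lines):
    """Sum the Total column per state.

    csv_lines: any iterable of CSV lines, e.g. an open file object.
    Column names are assumed to match the README's variable list.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(csv_lines):
        # Parse as float in case the estimates are not whole numbers.
        totals[row["State_name"]] += float(row["Total"])
    return dict(totals)


def inconsistent_tracts(csv_lines, tolerance=0.5):
    """Return Tract_fips values where Total differs from
    Current_smokers + Former_smokers by more than `tolerance`."""
    flagged = []
    for row in csv.DictReader(csv_lines):
        parts = float(row["Current_smokers"]) + float(row["Former_smokers"])
        if abs(parts - float(row["Total"])) > tolerance:
            # csv.DictReader yields strings, so leading zeros in
            # FIPS codes are preserved.
            flagged.append(row["Tract_fips"])
    return flagged
```

To run this against the real file, open it with open("national_LCS_estimates_current_former_total.csv", newline="") and pass the file object to either function. Reading FIPS codes as text rather than numbers matters because codes for some states begin with 0.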