This readme.txt file was generated on 2021-10-04 by Brian Allan Woodcock.
It was modified on 2021-11-06 by Samuel C. Fletcher.
Data-specific information updated on 2021-11-14 by Brian Allan Woodcock.
It was modified on 2021-12-01 by Samuel C. Fletcher.

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset

   Classification of formal methods use, type, sophistication, and subdiscipline in the journal Philosophical Studies, 1999, 2005, 2007, 2009, 2015, 2017, 2019

2. Author Information

   Principal Investigator Contact Information
        Name: Samuel C. Fletcher
        Institution: University of Minnesota
        Address:
        Email: scfletch@umn.edu
        ORCID: 0000-0002-9061-8976

   Associate or Co-investigator Contact Information
        Name: Joshua Knobe
        Institution: Yale University
        Address:
        Email: joshua.knobe@yale.edu
        ORCID: 0000-0003-0733-3775

   Associate or Co-investigator Contact Information
        Name: Gregory Wheeler
        Institution: Frankfurt School of Finance & Management
        Address:
        Email: G.Wheeler@fs.de
        ORCID:

   Associate or Co-investigator Contact Information
        Name: Brian Allan Woodcock
        Institution: University of Minnesota
        Address:
        Email: brianwoodcock99@gmail.com
        ORCID: 0000-0003-1423-939X

3. Date of data collection

   August 2020: Article gathering
   Sep 2020 - July 2021: Article screening and classification

4. Geographic location of data collection:

   Article screening occurred in Minnesota
   Article classification: NA

5. Information about funding sources that supported the collection of the data:

   Grant-in-aid to Samuel C. Fletcher from the Office of the Vice President for Research, University of Minnesota

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data:

2. Links to publications that cite or use the data:

   https://dailynous.com/2021/09/24/evidence-for-a-probabilistic-turn-in-philosophy-guest-post/
   http://philsci-archive.pitt.edu/19575/1/Changing_Use_of_Formal_Methods_in_Philosophy__Late_2000s_vs__Late_2010s(2).pdf

3. Links to other publicly accessible locations of the data:

4. Links/relationships to ancillary data sets:

5. Was data derived from another source?
   If yes, list source(s):

   Data was scraped from the Philpapers.org database.

6. Recommended citation for the data:

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: phil_studies_7y_july_foranalysis_master.csv
      Short description: the final, master data set
      Date completed in current form: 2021-07-29

   B. Filename: phil_studies_7y_july_interrater_reliability.csv
      Short description: data set created from the master for the calculation of classification interrater reliability
      Date completed in current form: 2021-08-02

   C. Filename: formal_methods_interrater_reliability.ipynb
      Short description: Python notebook for generating '...interrater_reliability.csv' from '...foranalysis_master.csv' and calculating interrater reliabilities
      Date completed in current form: 2021-11-14

   D. Filename: formal_methods_interrater_reliability.html
      Short description: HTML version of Python notebook for generating interrater reliability
      Date completed in current form: 2021-11-14

   E. Filename: Formal methods.Rmd
      Short description: R notebook with statistical analyses using the data sets
      Date completed in current form: 2021-11-26

   F. Filename: Formal-methods.html
      Short description: HTML compiled version of R notebook for statistical analyses using the data sets
      Date completed in current form: 2021-11-26

   G. Filename: Stages in the Formal Methods Project.pdf
      Short description: elaboration of the data and code generation stages, with process and code files
      Date completed in current form: 2021-10-02

2. Relationship between files:

   See 'Stages in the Formal Methods Project.pdf'.

3. Additional related data collected that was not included in the current data package:

4. Are there multiple versions of the dataset?

   Intermediary versions were generated with different names. None has this exact name.
   See 'Stages in the Formal Methods Project.pdf'.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

   See the paper. http://philsci-archive.pitt.edu/19575/

2. Methods for processing the data:

   See the paper. http://philsci-archive.pitt.edu/19575/

3. Instrument- or software-specific information needed to interpret the data:

4. Standards and calibration information, if appropriate:

5. Environmental/experimental conditions:

6. Describe any quality-assurance procedures performed on the data:

   See the paper. http://philsci-archive.pitt.edu/19575/

7. People involved with sample collection, processing, analysis and/or submission:

   See the paper. http://philsci-archive.pitt.edu/19575/

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: phil_studies_7y_july_foranalysis_master.csv
-----------------------------------------

1. Number of variables: 54 (excluding ‘key’ as unique identifier column)

2. Number of cases/rows: 975 cases (excluding first row of column titles)

Index: 975 entries, CBNREUXB to VIT964GL
Data columns (total 54 columns):
  #  Column             Non-Null Count  Dtype
---  ------             --------------  -----
  0  author             975 non-null    string
  1  title              975 non-null    string
  2  year               975 non-null    Int64
  3  vol                975 non-null    Int64
  4  iss                975 non-null    string
  5  pg_start           975 non-null    Int64
  6  pg_end             975 non-null    Int64
  7  doi                975 non-null    string
  8  num_authors        975 non-null    Int64
  9  discussion         975 non-null    Int64
 10  screener1          211 non-null    Int64
 11  screener2          330 non-null    Int64
 12  screener3          735 non-null    Int64
 13  screener4          443 non-null    Int64
 14  screener5          73 non-null     Int64
 15  screener6          264 non-null    Int64
 16  screen_num         975 non-null    Int64
 17  screen_ave         975 non-null    Float64
 18  screen_hits        975 non-null    Int64
 19  class_assignments  245 non-null    string
 20  method_a           122 non-null    string
 21  level_a            122 non-null    Int64
 22  subdisc_a          122 non-null    string
 23  method_b           124 non-null    string
 24  level_b            124 non-null    Int64
 25  subdisc_b          124 non-null    string
 26  method_c           122 non-null    string
 27  level_c            122 non-null    Int64
 28  subdisc_c          122 non-null    string
 29  method_d           122 non-null    string
 30  level_d            122 non-null    Int64
 31  subdisc_d          122 non-null    string
 32  resolver           207 non-null    string
 33  method_resolved    245 non-null    string
 34  level_resolved     245 non-null    Int64
 35  subdisc_resolved   245 non-null    string
 36  c                  245 non-null    Int64
 37  d                  245 non-null    Int64
 38  l                  245 non-null    Int64
 39  m                  245 non-null    Int64
 40  o                  245 non-null    Int64
 41  p                  245 non-null    Int64
 42  s                  245 non-null    Int64
 43  t                  245 non-null    Int64
 44  act                245 non-null    Int64
 45  dec                245 non-null    Int64
 46  epi                245 non-null    Int64
 47  lan                245 non-null    Int64
 48  log                245 non-null    Int64
 49  met                245 non-null    Int64
 50  min                245 non-null    Int64
 51  oth                245 non-null    Int64
 52  sci                245 non-null    Int64
 53  val                245 non-null    Int64
dtypes: Float64(1), Int64(37), string(16)

Classifications Key for Method, Level, and Subdiscipline

method                                  level                  subdiscipline
c - Causal modeling                     0 - None               act - Action & free will
d - Decision & game theory              1 - Basic/Fundamental  dec - Decision & Game Theory
l - Logic                               2 - Intermediate       epi - Epistemology
m - Logic: modal                        3 - Advanced           lan - Philosophy of Language
o - Other                                                      log - Logic
p - Probability                                                met - Metaphysics
s - Set theory and theory of relations                         min - Philosophy of Mind
t - Statistics                                                 oth - Other
                                                               sci - Philosophy of Science

VARIABLE GROUP      VARIABLES          DESCRIPTION
article info        author             Self-explanatory article information variables
                    title
                    year
                    vol
                    iss
                    pg_start
                    pg_end
                    doi
                    num_authors
                    discussion         Attempt at: Is this article part of a discussion? (1=yes, 0=no)
article screening   screener1          Uses formal methods? (1=yes, 0=no)
                    screener2
                    screener3
                    screener4
                    screener5
                    screener6
                    screen_num         Screeners per article
                    screen_ave         Average screening score
                    screen_hits        Uses formal methods, based on screen_ave >= 0.5 (1=yes, 0=no)
classifications     class_assignments  Which two classifiers randomly assigned to the article? (2 from a, b, c, or d)
                    method_a           Method classification by a
                    level_a            Level classification by a
                    subdisc_a          Subdiscipline classification by a
                    method_b
                    level_b
                    subdisc_b
                    method_c
                    level_c
                    subdisc_c
                    method_d
                    level_d
                    subdisc_d
                    resolver           Classifier randomly assigned to resolve classification conflicts
compressed results  method_resolved    Combination of c,d,l,m,o,p,s,t
                    level_resolved     0, 1, 2, 3
                    subdisc_resolved   Combination of act,dec,epi,lan,log,met,min,oth,sci
expanded results    c                  Method indicator variable (1=yes, 0=no)
                    d
                    l
                    m
                    o
                    p
                    s
                    t
                    act                Subdiscipline indicator variable (1=yes, 0=no)
                    dec
                    epi
                    lan
                    log
                    met
                    min
                    oth
                    sci

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: phil_studies_7y_july_interrater_reliability.csv
-----------------------------------------

1. Number of variables: 92 (excluding ‘key’ as unique identifier column)

2. Number of cases/rows: 245 cases (excluding first row of column titles)

Index: 245 entries, CBNREUXB to ATMYJ8IU
Data columns (total 92 columns):
  #  Column             Non-Null Count  Dtype
---  ------             --------------  -----
  0  author             245 non-null    string
  1  title              245 non-null    string
  2  year               245 non-null    Int64
  3  vol                245 non-null    Int64
  4  iss                245 non-null    string
  5  pg_start           245 non-null    Int64
  6  pg_end             245 non-null    Int64
  7  doi                245 non-null    string
  8  num_authors        245 non-null    Int64
  9  discussion         245 non-null    Int64
 10  screener1          56 non-null     Int64
 11  screener2          78 non-null     Int64
 12  screener3          193 non-null    Int64
 13  screener4          109 non-null    Int64
 14  screener5          25 non-null     Int64
 15  screener6          61 non-null     Int64
 16  screen_num         245 non-null    Int64
 17  screen_ave         245 non-null    Float64
 18  screen_hits        245 non-null    Int64
 19  class_assignments  245 non-null    string
 20  method_a           122 non-null    string
 21  level_a            122 non-null    Int64
 22  subdisc_a          122 non-null    string
 23  method_b           124 non-null    string
 24  level_b            124 non-null    Int64
 25  subdisc_b          124 non-null    string
 26  method_c           122 non-null    string
 27  level_c            122 non-null    Int64
 28  subdisc_c          122 non-null    string
 29  method_d           122 non-null    string
 30  level_d            122 non-null    Int64
 31  subdisc_d          122 non-null    string
 32  resolver           207 non-null    string
 33  method_resolved    245 non-null    string
 34  level_resolved     245 non-null    Int64
 35  subdisc_resolved   245 non-null    string
 36  c                  245 non-null    Int64
 37  d                  245 non-null    Int64
 38  l                  245 non-null    Int64
 39  m                  245 non-null    Int64
 40  o                  245 non-null    Int64
 41  p                  245 non-null    Int64
 42  s                  245 non-null    Int64
 43  t                  245 non-null    Int64
 44  act                245 non-null    Int64
 45  dec                245 non-null    Int64
 46  epi                245 non-null    Int64
 47  lan                245 non-null    Int64
 48  log                245 non-null    Int64
 49  met                245 non-null    Int64
 50  min                245 non-null    Int64
 51  oth                245 non-null    Int64
 52  sci                245 non-null    Int64
 53  val                245 non-null    Int64
 54  c_1                245 non-null    int64
 55  c_2                245 non-null    int64
 56  d_1                245 non-null    int64
 57  d_2                245 non-null    int64
 58  l_1                245 non-null    int64
 59  l_2                245 non-null    int64
 60  m_1                245 non-null    int64
 61  m_2                245 non-null    int64
 62  o_1                245 non-null    int64
 63  o_2                245 non-null    int64
 64  p_1                245 non-null    int64
 65  p_2                245 non-null    int64
 66  s_1                245 non-null    int64
 67  s_2                245 non-null    int64
 68  t_1                245 non-null    int64
 69  t_2                245 non-null    int64
 70  level_1            245 non-null    int64
 71  level_2            245 non-null    int64
 72  act_1              245 non-null    int64
 73  act_2              245 non-null    int64
 74  dec_1              245 non-null    int64
 75  dec_2              245 non-null    int64
 76  epi_1              245 non-null    int64
 77  epi_2              245 non-null    int64
 78  lan_1              245 non-null    int64
 79  lan_2              245 non-null    int64
 80  log_1              245 non-null    int64
 81  log_2              245 non-null    int64
 82  met_1              245 non-null    int64
 83  met_2              245 non-null    int64
 84  min_1              245 non-null    int64
 85  min_2              245 non-null    int64
 86  oth_1              245 non-null    int64
 87  oth_2              245 non-null    int64
 88  sci_1              245 non-null    int64
 89  sci_2              245 non-null    int64
 90  val_1              245 non-null    int64
 91  val_2              245 non-null    int64
dtypes: Float64(1), Int64(37), int64(38), string(16)
memory usage: 187.1+ KB

The first 54 variables (0 - 53 above) are the same as for phil_studies_7y_july_foranalysis_master.csv.

The added variables (54 - 91 above) are for the calculation of interrater reliability. Since there were two classifiers for each article (but not always the same ones), two separate indicator variables were created for each expanded method variable (c, d, l, m, o, p, s, t) and for each expanded subdiscipline variable (act, dec, epi, lan, log, met, min, oth, sci, val) as well as for level classification. The two new variables (with suffixes '_1' and '_2') indicate the classifications made by the two classifiers assigned to that article. Again, the particular classifiers assigned are not the same from article to article.
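
The construction of the '_1' and '_2' method indicators described above can be illustrated with a minimal Python sketch. It is not the code from formal_methods_interrater_reliability.ipynb. It assumes that class_assignments holds the two classifier letters (for example "ab"), that each method_<classifier> cell lists single-letter method codes, and that the first letter in class_assignments maps to '_1'. The rows are invented toy data, and Cohen's kappa stands in here as one common agreement statistic; the notebook may compute a different measure.

```python
from typing import Dict, List

METHOD_CODES = ["c", "d", "l", "m", "o", "p", "s", "t"]


def expand_methods(row: Dict[str, str]) -> Dict[str, int]:
    """Build <code>_1 / <code>_2 method indicators for one article.

    Assumption: the order of letters in class_assignments decides which
    classifier becomes slot 1 and which becomes slot 2.
    """
    raters = [ch for ch in row["class_assignments"] if ch in "abcd"]
    out: Dict[str, int] = {}
    for slot, rater in enumerate(raters, start=1):
        chosen = row.get(f"method_{rater}") or ""
        for code in METHOD_CODES:
            # Single-letter codes, so a character test is enough here.
            out[f"{code}_{slot}"] = int(code in chosen)
    return out


def cohen_kappa(first: List[int], second: List[int]) -> float:
    """Cohen's kappa for two equally long lists of 0/1 ratings."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    p1 = sum(first) / n
    p2 = sum(second) / n
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (observed - expected) / (1 - expected)


# Hypothetical toy rows; real cell formats may differ.
rows = [
    {"class_assignments": "ab", "method_a": "l", "method_b": "lm"},
    {"class_assignments": "cd", "method_c": "p", "method_d": "p"},
    {"class_assignments": "ad", "method_a": "lp", "method_d": "p"},
    {"class_assignments": "bc", "method_b": "o", "method_c": "l"},
]

expanded = [expand_methods(r) for r in rows]
kappa_l = cohen_kappa([e["l_1"] for e in expanded], [e["l_2"] for e in expanded])
kappa_p = cohen_kappa([e["p_1"] for e in expanded], [e["p_2"] for e in expanded])
print("kappa for l:", kappa_l)
print("kappa for p:", kappa_p)
```

Because the classifiers assigned differ from article to article, the '_1' and '_2' columns do not track a fixed person, so a statistic computed this way describes agreement between the two slots and not between two named raters.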