This readme.txt file was generated on 2021-10-04 by Brian Allan Woodcock. 
It was modified on 2021-11-06 by Samuel C. Fletcher.
Data-specific information updated on 2021-11-14 by Brian Allan Woodcock. 
It was modified on 2021-12-01 by Samuel C. Fletcher.

-------------------
GENERAL INFORMATION
-------------------


1. Title of Dataset 
Classification of formal methods use, type, sophistication, and subdiscipline in the journal Philosophical Studies, 1999, 2005, 2007, 2009, 2015, 2017, 2019

2. Author Information


  Principal Investigator Contact Information
           Name: Samuel C. Fletcher
           Institution: University of Minnesota
           Address:
           Email: scfletch@umn.edu
           ORCID: 0000-0002-9061-8976

  Associate or Co-investigator Contact Information
           Name: Joshua Knobe
           Institution: Yale University
           Address:
           Email: joshua.knobe@yale.edu
           ORCID: 0000-0003-0733-3775

  Associate or Co-investigator Contact Information
           Name: Gregory Wheeler
           Institution: Frankfurt School of Finance & Management
           Address:
           Email: G.Wheeler@fs.de
           ORCID:

  Associate or Co-investigator Contact Information
           Name: Brian Allan Woodcock
           Institution: University of Minnesota
           Address:
           Email: brianwoodcock99@gmail.com
           ORCID: 0000-0003-1423-939X

3. Date of data collection
August 2020: Article gathering
September 2020 - July 2021: Article screening and classification


4. Geographic location of data collection: 
Article screening occurred in Minnesota
Article classification: NA 


5. Information about funding sources that supported the collection of the data:
Grant-in-aid to Samuel C. Fletcher from the Office of the Vice President for Research, University of Minnesota



--------------------------
SHARING/ACCESS INFORMATION
-------------------------- 


1. Licenses/restrictions placed on the data:


2. Links to publications that cite or use the data:
https://dailynous.com/2021/09/24/evidence-for-a-probabilistic-turn-in-philosophy-guest-post/
http://philsci-archive.pitt.edu/19575/1/Changing_Use_of_Formal_Methods_in_Philosophy__Late_2000s_vs__Late_2010s(2).pdf


3. Links to other publicly accessible locations of the data:


4. Links/relationships to ancillary data sets:


5. Was data derived from another source? 
           If yes, list source(s): Data was scraped from the PhilPapers.org database.


6. Recommended citation for the data:




---------------------
DATA & FILE OVERVIEW
---------------------


1. File List
   A. Filename:                         phil_studies_7y_july_foranalysis_master.csv
      Short description:                the final, master data set
      Date completed in current form:   2021-07-29

   B. Filename:                         phil_studies_7y_july_interrater_reliability.csv
      Short description:                data set created from the master for the calculation of classification interrater reliability
      Date completed in current form:   2021-08-02

   C. Filename:                         formal_methods_interrater_reliability.ipynb
      Short description:                Python notebook for generating '...interrater_reliability.csv' from '...foranalysis_master.csv'
                                        and calculating interrater reliabilities
      Date completed in current form:   2021-11-14

   D. Filename:                         formal_methods_interrater_reliability.html
      Short description:                HTML version of Python notebook for generating interrater reliability
      Date completed in current form:   2021-11-14

   E. Filename:                         Formal methods.Rmd
      Short description:                R notebook with statistical analyses using the data sets
      Date completed in current form:   2021-11-26

   F. Filename:                         Formal-methods.html
      Short description:                HTML compiled version of R notebook for statistical analyses using the data sets
      Date completed in current form:   2021-11-26

   G. Filename:                         Stages in the Formal Methods Project.pdf
      Short description:                elaboration of the data and code generation stages, with process and code files
      Date completed in current form:   2021-10-02

2. Relationship between files:        
See 'Stages in the Formal Methods Project.pdf'.



3. Additional related data collected that was not included in the current data package:




4. Are there multiple versions of the dataset? 

Yes. Intermediate versions were generated under different file names; none shares the exact name of a file listed above. See 'Stages in the Formal Methods Project.pdf'.






--------------------------
METHODOLOGICAL INFORMATION
--------------------------


1. Description of methods used for collection/generation of data: 
See the paper.
http://philsci-archive.pitt.edu/19575/


2. Methods for processing the data: 
See the paper.
http://philsci-archive.pitt.edu/19575/


3. Instrument- or software-specific information needed to interpret the data:


4. Standards and calibration information, if appropriate:


5. Environmental/experimental conditions:


6. Describe any quality-assurance procedures performed on the data:
See the paper.
http://philsci-archive.pitt.edu/19575/


7. People involved with sample collection, processing, analysis and/or submission:
See the paper.
http://philsci-archive.pitt.edu/19575/





-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: phil_studies_7y_july_foranalysis_master.csv
-----------------------------------------


1. Number of variables: 54 (excluding ‘key’ as unique identifier column)


2. Number of cases/rows: 975 cases (excluding first row of column titles)


Index: 975 entries, CBNREUXB to VIT964GL
Data columns (total 54 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   author             975 non-null    string 
 1   title              975 non-null    string 
 2   year               975 non-null    Int64  
 3   vol                975 non-null    Int64  
 4   iss                975 non-null    string 
 5   pg_start           975 non-null    Int64  
 6   pg_end             975 non-null    Int64  
 7   doi                975 non-null    string 
 8   num_authors        975 non-null    Int64  
 9   discussion         975 non-null    Int64  
 10  screener1          211 non-null    Int64  
 11  screener2          330 non-null    Int64  
 12  screener3          735 non-null    Int64  
 13  screener4          443 non-null    Int64  
 14  screener5          73 non-null     Int64  
 15  screener6          264 non-null    Int64  
 16  screen_num         975 non-null    Int64  
 17  screen_ave         975 non-null    Float64
 18  screen_hits        975 non-null    Int64  
 19  class_assignments  245 non-null    string 
 20  method_a           122 non-null    string 
 21  level_a            122 non-null    Int64  
 22  subdisc_a          122 non-null    string 
 23  method_b           124 non-null    string 
 24  level_b            124 non-null    Int64  
 25  subdisc_b          124 non-null    string 
 26  method_c           122 non-null    string 
 27  level_c            122 non-null    Int64  
 28  subdisc_c          122 non-null    string 
 29  method_d           122 non-null    string 
 30  level_d            122 non-null    Int64  
 31  subdisc_d          122 non-null    string 
 32  resolver           207 non-null    string 
 33  method_resolved    245 non-null    string 
 34  level_resolved     245 non-null    Int64  
 35  subdisc_resolved   245 non-null    string 
 36  c                  245 non-null    Int64  
 37  d                  245 non-null    Int64  
 38  l                  245 non-null    Int64  
 39  m                  245 non-null    Int64  
 40  o                  245 non-null    Int64  
 41  p                  245 non-null    Int64  
 42  s                  245 non-null    Int64  
 43  t                  245 non-null    Int64  
 44  act                245 non-null    Int64  
 45  dec                245 non-null    Int64  
 46  epi                245 non-null    Int64  
 47  lan                245 non-null    Int64  
 48  log                245 non-null    Int64  
 49  met                245 non-null    Int64  
 50  min                245 non-null    Int64  
 51  oth                245 non-null    Int64  
 52  sci                245 non-null    Int64  
 53  val                245 non-null    Int64  
dtypes: Float64(1), Int64(37), string(16)
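
The listing above can be reproduced approximately with pandas. The following is a minimal sketch, assuming the CSV stores the unique identifier in a column named 'key' (as stated above) and that the nullable dtypes (Int64, Float64, string) come from a call such as convert_dtypes(); the helper name load_dataset and the sample rows are hypothetical and are not taken from the data set.

```python
import io

import pandas as pd


def load_dataset(source):
    """Read one of the CSV files, using the 'key' column as the row index."""
    df = pd.read_csv(source, index_col="key")
    # convert_dtypes() switches to pandas' nullable dtypes, so integer columns
    # with missing values (e.g. screener1) are not left as plain floats.
    return df.convert_dtypes()


# Tiny made-up sample with a similar layout (NOT real rows from the data set).
sample = io.StringIO(
    "key,author,year,screener1,screener2,screen_ave\n"
    "AAAA0001,Hypothetical Author,2009,1,,1.0\n"
    "AAAA0002,Another Author,2019,0,1,0.5\n"
)
demo = load_dataset(sample)
demo.info()
```

For the real file, passing the path 'phil_studies_7y_july_foranalysis_master.csv' in place of the in-memory sample should work the same way, provided the file sits in the working directory.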


Classifications Key for Method, Level, and Subdiscipline

method                                    level                    subdiscipline
c - Causal modeling                       0 - None                 act - Action & free will
d - Decision & game theory                1 - Basic/Fundamental    dec - Decision & Game Theory
l - Logic                                 2 - Intermediate         epi - Epistemology
m - Logic: modal                          3 - Advanced             lan - Philosophy of Language
o - Other                                                          log - Logic
p - Probability                                                    met - Metaphysics
s - Set theory and theory of relations                             min - Philosophy of Mind
t - Statistics                                                     oth - Other
                                                                   sci - Philosophy of Science
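
The single-letter method codes in this key correspond to the indicator columns c, d, l, m, o, p, s, t listed earlier. The sketch below shows one way such indicators could be derived from a method_resolved value. It assumes the value is a string made up of the single-letter codes (any separators are ignored); the exact storage format is not documented here, and the function name expand_methods and the example value "lp" are hypothetical.

```python
METHOD_CODES = ["c", "d", "l", "m", "o", "p", "s", "t"]


def expand_methods(method_resolved):
    """Return a {code: 0 or 1} dict for one method_resolved value.

    Assumption: the value contains only the single-letter codes from the key,
    optionally separated by commas or spaces.
    """
    present = set(method_resolved.lower())
    return {code: int(code in present) for code in METHOD_CODES}


# Hypothetical value: an article classified as using logic and probability.
indicators = expand_methods("lp")
print(indicators)
```

The same idea does not carry over directly to the three-letter subdiscipline codes, where a set of characters is not enough and the separator convention of subdisc_resolved would have to be known.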


VARIABLE GROUP      VARIABLES           DESCRIPTION
article info        author              Self-explanatory article information variables
                    title
                    year
                    vol
                    iss
                    pg_start
                    pg_end
                    doi
                    num_authors

                    discussion          Tentative flag: is this article part of a discussion? (1=yes, 0=no)
                    
article screening   screener1           Uses formal methods? (1=yes, 0=no)
                    screener2
                    screener3
                    screener4
                    screener5
                    screener6
                    
                    screen_num          Screeners per article
                    screen_ave          Average screening score
                    screen_hits         Uses formal methods, based on screen_ave >= 0.5 (1=yes, 0=no) 
                    
classifications     class_assignments   Which two classifiers were randomly assigned to the article? (2 of a, b, c, d)
                    
                    method_a            Method classification by a
                    level_a             Level classification by a
                    subdisc_a           Subdiscipline classification by a
                    method_b
                    level_b
                    subdisc_b
                    method_c
                    level_c
                    subdisc_c
                    method_d
                    level_d
                    subdisc_d

                    resolver            Classifier randomly assigned to resolve classification conflicts
                    
compressed results  method_resolved     Combination of c,d,l,m,o,p,s,t
                    level_resolved      0, 1, 2, 3
                    subdisc_resolved    Combination of act,dec,epi,lan,log,met,min,oth,sci,val
                    
expanded results    c                   Method indicator variable (1=yes, 0=no)
                    d
                    l
                    m
                    o
                    p
                    s
                    t
                    
                    act                 Subdiscipline indicator variable (1=yes, 0=no)
                    dec 
                    epi 
                    lan 
                    log 
                    met 
                    min 
                    oth 
                    sci
                    val
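
The three screening summary variables in the table above can be illustrated with a short sketch. It assumes that screen_num counts the screeners who actually scored the article, that screen_ave is the mean of their 0/1 scores, and that screen_hits applies the documented threshold screen_ave >= 0.5. The function name summarize_screening is hypothetical, and the notebook that produced the data may compute these differently.

```python
def summarize_screening(votes):
    """Summarize one article's screening scores.

    votes: list of 1, 0, or None, one entry per screener column
           (None = that screener did not score the article).
    Returns (screen_num, screen_ave, screen_hits).
    """
    cast = [v for v in votes if v is not None]
    if not cast:
        raise ValueError("at least one screener score is required")
    screen_num = len(cast)
    screen_ave = sum(cast) / screen_num
    screen_hits = int(screen_ave >= 0.5)  # threshold stated in the table
    return screen_num, screen_ave, screen_hits


# Hypothetical article scored by screener1 (yes) and screener3 (no).
print(summarize_screening([1, None, 0, None, None, None]))  # (2, 0.5, 1)
```

Under this reading, a tie between two screeners counts as a hit, because 0.5 meets the threshold.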
                    

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: phil_studies_7y_july_interrater_reliability.csv
-----------------------------------------

1. Number of variables: 92 (excluding ‘key’ as unique identifier column)

2. Number of cases/rows: 245 cases (excluding first row of column titles)


<class 'pandas.core.frame.DataFrame'>
Index: 245 entries, CBNREUXB to ATMYJ8IU
Data columns (total 92 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   author             245 non-null    string 
 1   title              245 non-null    string 
 2   year               245 non-null    Int64  
 3   vol                245 non-null    Int64  
 4   iss                245 non-null    string 
 5   pg_start           245 non-null    Int64  
 6   pg_end             245 non-null    Int64  
 7   doi                245 non-null    string 
 8   num_authors        245 non-null    Int64  
 9   discussion         245 non-null    Int64  
 10  screener1          56 non-null     Int64  
 11  screener2          78 non-null     Int64  
 12  screener3          193 non-null    Int64  
 13  screener4          109 non-null    Int64  
 14  screener5          25 non-null     Int64  
 15  screener6          61 non-null     Int64  
 16  screen_num         245 non-null    Int64  
 17  screen_ave         245 non-null    Float64
 18  screen_hits        245 non-null    Int64  
 19  class_assignments  245 non-null    string 
 20  method_a           122 non-null    string 
 21  level_a            122 non-null    Int64  
 22  subdisc_a          122 non-null    string 
 23  method_b           124 non-null    string 
 24  level_b            124 non-null    Int64  
 25  subdisc_b          124 non-null    string 
 26  method_c           122 non-null    string 
 27  level_c            122 non-null    Int64  
 28  subdisc_c          122 non-null    string 
 29  method_d           122 non-null    string 
 30  level_d            122 non-null    Int64  
 31  subdisc_d          122 non-null    string 
 32  resolver           207 non-null    string 
 33  method_resolved    245 non-null    string 
 34  level_resolved     245 non-null    Int64  
 35  subdisc_resolved   245 non-null    string 
 36  c                  245 non-null    Int64  
 37  d                  245 non-null    Int64  
 38  l                  245 non-null    Int64  
 39  m                  245 non-null    Int64  
 40  o                  245 non-null    Int64  
 41  p                  245 non-null    Int64  
 42  s                  245 non-null    Int64  
 43  t                  245 non-null    Int64  
 44  act                245 non-null    Int64  
 45  dec                245 non-null    Int64  
 46  epi                245 non-null    Int64  
 47  lan                245 non-null    Int64  
 48  log                245 non-null    Int64  
 49  met                245 non-null    Int64  
 50  min                245 non-null    Int64  
 51  oth                245 non-null    Int64  
 52  sci                245 non-null    Int64  
 53  val                245 non-null    Int64  
 54  c_1                245 non-null    int64  
 55  c_2                245 non-null    int64  
 56  d_1                245 non-null    int64  
 57  d_2                245 non-null    int64  
 58  l_1                245 non-null    int64  
 59  l_2                245 non-null    int64  
 60  m_1                245 non-null    int64  
 61  m_2                245 non-null    int64  
 62  o_1                245 non-null    int64  
 63  o_2                245 non-null    int64  
 64  p_1                245 non-null    int64  
 65  p_2                245 non-null    int64  
 66  s_1                245 non-null    int64  
 67  s_2                245 non-null    int64  
 68  t_1                245 non-null    int64  
 69  t_2                245 non-null    int64  
 70  level_1            245 non-null    int64  
 71  level_2            245 non-null    int64  
 72  act_1              245 non-null    int64  
 73  act_2              245 non-null    int64  
 74  dec_1              245 non-null    int64  
 75  dec_2              245 non-null    int64  
 76  epi_1              245 non-null    int64  
 77  epi_2              245 non-null    int64  
 78  lan_1              245 non-null    int64  
 79  lan_2              245 non-null    int64  
 80  log_1              245 non-null    int64  
 81  log_2              245 non-null    int64  
 82  met_1              245 non-null    int64  
 83  met_2              245 non-null    int64  
 84  min_1              245 non-null    int64  
 85  min_2              245 non-null    int64  
 86  oth_1              245 non-null    int64  
 87  oth_2              245 non-null    int64  
 88  sci_1              245 non-null    int64  
 89  sci_2              245 non-null    int64  
 90  val_1              245 non-null    int64  
 91  val_2              245 non-null    int64  
dtypes: Float64(1), Int64(37), int64(38), string(16)
memory usage: 187.1+ KB

The first 54 variables (0 - 53 above) are the same as for phil_studies_7y_july_foranalysis_master.csv.
The added variables (54 - 91 above) are for the calculation of interrater reliability.  
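
This file does not state which reliability statistic the notebook computes. As an illustration only, the sketch below computes Cohen's kappa for one pair of added columns (for example p_1 and p_2); the choice of kappa, the function name cohen_kappa, and the sample values are assumptions.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of articles on which the two ratings agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from each slot's marginal frequencies.
    labels = set(first) | set(second)
    expected = sum((first.count(k) / n) * (second.count(k) / n) for k in labels)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one label is used
    return (observed - expected) / (1 - expected)


# Made-up indicator values for eight articles (NOT taken from the data set).
p_1 = [1, 1, 0, 0, 1, 0, 1, 0]
p_2 = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(p_1, p_2)
print(kappa)  # 0.5
```

Because the '_1' and '_2' columns do not correspond to the same two people across articles, a statistic that assumes two fixed raters should be interpreted with care.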

Since there were two classifiers for each article (but not always the same ones),
two separate indicator variables were created for each expanded method variable (c, d, l, m, o, p, s, t)
and for each expanded subdiscipline variable (act, dec, epi, lan, log, met, min, oth, sci, val)
as well as for level classification.
The two new variables (with suffixes '_1' and '_2') indicate the classifications made by the two classifiers
assigned to that article.  Again, the particular classifiers assigned are not the same from article to article.
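
The pairing described above can be sketched as follows. The non-null counts in the listing (122 + 124 + 122 + 122 = 490 = 2 x 245) are consistent with exactly two classifiers per article. The sketch assumes that '_1' holds the rating of the earlier classifier in a-d order and '_2' the later one; that ordering, the function name pair_ratings, and the sample row are assumptions, and the notebook may order the pair differently.

```python
CLASSIFIERS = ["a", "b", "c", "d"]


def pair_ratings(row, variable="level"):
    """Return the two non-missing ratings for one article, in a-d order.

    row: mapping with keys such as 'level_a' ... 'level_d';
         None marks a classifier who was not assigned to the article.
    """
    values = [row.get(f"{variable}_{who}") for who in CLASSIFIERS]
    given = [v for v in values if v is not None]
    if len(given) != 2:
        raise ValueError(f"expected exactly two ratings, found {len(given)}")
    return given[0], given[1]


# Hypothetical article assigned to classifiers b and d.
row = {"level_a": None, "level_b": 2, "level_c": None, "level_d": 1}
level_1, level_2 = pair_ratings(row)
print(level_1, level_2)  # 2 1
```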