Formal Methods
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] conflicted_1.0.4 forcats_0.5.0    stringr_1.4.0    dplyr_1.0.2     
##  [5] purrr_0.3.4      readr_1.4.0      tidyr_1.1.2      tibble_3.0.4    
##  [9] tidyverse_1.3.0  psych_2.0.9      exact2x2_1.6.5   exactci_1.3-3   
## [13] ssanv_1.1        sjlabelled_1.1.7 sjmisc_2.8.6     sjPlot_2.8.9    
## [17] knitr_1.36       nnet_7.3-15      MASS_7.3-53.1    scales_1.1.1    
## [21] ggplot2_3.3.5   
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-149      fs_1.5.0          lubridate_1.7.9   insight_0.14.5   
##  [5] httr_1.4.2        tools_4.0.3       backports_1.1.10  bslib_0.3.1      
##  [9] R6_2.5.0          DBI_1.1.0         colorspace_1.4-1  withr_2.4.2      
## [13] tidyselect_1.1.0  mnormt_2.0.2      emmeans_1.6.2-1   compiler_4.0.3   
## [17] cli_3.1.0         rvest_0.3.6       performance_0.8.0 xml2_1.3.2       
## [21] sandwich_3.0-0    bayestestR_0.11.5 sass_0.4.0        mvtnorm_1.1-1    
## [25] digest_0.6.27     minqa_1.2.4       rmarkdown_2.11    pkgconfig_2.0.3  
## [29] htmltools_0.5.2   lme4_1.1-25       dbplyr_1.4.4      fastmap_1.1.0    
## [33] rlang_0.4.12      readxl_1.3.1      rstudioapi_0.13   jquerylib_0.1.4  
## [37] generics_0.0.2    zoo_1.8-8         jsonlite_1.7.1    magrittr_2.0.1   
## [41] parameters_0.15.0 Matrix_1.2-18     Rcpp_1.0.7        munsell_0.5.0    
## [45] lifecycle_1.0.1   stringi_1.5.3     multcomp_1.4-15   yaml_2.2.1       
## [49] grid_4.0.3        blob_1.2.1        parallel_4.0.3    crayon_1.3.4     
## [53] lattice_0.20-41   ggeffects_1.0.2   haven_2.4.3       splines_4.0.3    
## [57] sjstats_0.18.0    hms_0.5.3         tmvnsim_1.0-2     pillar_1.4.6     
## [61] boot_1.3-25       estimability_1.3  effectsize_0.5    codetools_0.2-16 
## [65] reprex_0.3.0      glue_1.4.2        evaluate_0.14     modelr_0.1.8     
## [69] vctrs_0.3.4       nloptr_1.2.2.2    cellranger_1.1.0  gtable_0.3.0     
## [73] datawizard_0.2.1  assertthat_0.2.1  cachem_1.0.6      xfun_0.28        
## [77] xtable_1.8-4      broom_0.7.10      coda_0.19-4       survival_3.2-7   
## [81] memoise_2.0.0     statmod_1.4.35    TH.data_1.0-10    ellipsis_0.3.1
0.1 Overall change over time in the two broad families
 
0.2 Breaking things down by more specific methods
0.2.0.1 Proportion of logic papers using each more specific method
 
0.2.0.2 Proportion of probability papers using each more specific method
## # A tibble: 2 x 5
##   Var1      Category      No   Yes Percent
##   <fct>     <chr>      <int> <int>   <dbl>
## 1 2005-2009 Statistics   369     1 0.00270
## 2 2015-2019 Statistics   532     7 0.0130
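The `Percent` column above holds proportions rather than percentages: each value is `Yes / (No + Yes)`. A minimal sketch of that arithmetic, using only the counts printed in the tibble (the data frame name `stats_counts` is illustrative, not taken from the original script):

```r
# Counts copied from the tibble above.
stats_counts <- data.frame(
  period = c("2005-2009", "2015-2019"),
  No     = c(369, 532),
  Yes    = c(1, 7)
)

# Share of papers in each period coded as using statistics.
stats_counts$Proportion <- stats_counts$Yes / (stats_counts$No + stats_counts$Yes)
stats_counts
```

Multiplying `Proportion` by 100 would give the values as percentages.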
 
0.3 Breaking things down by level
0.3.0.1 Proportion of logic papers over time broken down by level
 
0.3.0.2 Proportion of probability papers over time broken down by level
 
0.4 Proportion of papers in 2010s in each family broken down by subdiscipline
  
0.5 Contingency table
2005-2009 (each cell is a count divided by the 370 papers in this period)
##                      Log         Mod         Set        Caus        Prob
## Action       0.002702703 0.002702703 0.000000000 0.000000000 0.000000000
## Decision     0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## Epistemology 0.008108108 0.024324324 0.000000000 0.000000000 0.010810811
## Language     0.029729730 0.013513514 0.005405405 0.000000000 0.002702703
## Logic        0.021621622 0.005405405 0.002702703 0.000000000 0.000000000
## Metaphysics  0.021621622 0.021621622 0.013513514 0.002702703 0.000000000
## Mind         0.002702703 0.002702703 0.002702703 0.000000000 0.000000000
## Science      0.002702703 0.000000000 0.000000000 0.000000000 0.008108108
## Value        0.005405405 0.008108108 0.002702703 0.000000000 0.000000000
##                      Dec       Stats
## Action       0.000000000 0.000000000
## Decision     0.002702703 0.000000000
## Epistemology 0.000000000 0.002702703
## Language     0.000000000 0.000000000
## Logic        0.000000000 0.000000000
## Metaphysics  0.000000000 0.000000000
## Mind         0.000000000 0.000000000
## Science      0.000000000 0.000000000
## Value        0.002702703 0.000000000
2015-2019 (each cell is a count divided by the 539 papers in this period)
##                      Log         Mod         Set        Caus        Prob
## Action       0.001855288 0.001855288 0.000000000 0.003710575 0.000000000
## Decision     0.001855288 0.000000000 0.000000000 0.000000000 0.001855288
## Epistemology 0.012987013 0.018552876 0.001855288 0.000000000 0.025974026
## Language     0.022263451 0.001855288 0.003710575 0.000000000 0.000000000
## Logic        0.020408163 0.007421150 0.003710575 0.000000000 0.000000000
## Metaphysics  0.035250464 0.024118738 0.018552876 0.005565863 0.007421150
## Mind         0.001855288 0.001855288 0.001855288 0.001855288 0.000000000
## Science      0.001855288 0.003710575 0.001855288 0.001855288 0.011131725
## Value        0.007421150 0.003710575 0.005565863 0.000000000 0.003710575
##                      Dec       Stats
## Action       0.001855288 0.003710575
## Decision     0.022263451 0.001855288
## Epistemology 0.005565863 0.000000000
## Language     0.000000000 0.000000000
## Logic        0.000000000 0.000000000
## Metaphysics  0.000000000 0.001855288
## Mind         0.000000000 0.001855288
## Science      0.001855288 0.001855288
## Value        0.011131725 0.005565863
0.6 Change over time in the two broad families
FormalModel <- glm(Formal ~ year, data = data)
summary(FormalModel)
## 
## Call:
## glm(formula = Formal ~ year, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.2274  -0.2216  -0.1987  -0.1872   0.8128  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.560329   5.416186  -1.027    0.305
## year         0.002867   0.002691   1.065    0.287
## 
## (Dispersion parameter for gaussian family taken to be 0.1661282)
## 
##     Null deviance: 150.87  on 908  degrees of freedom
## Residual deviance: 150.68  on 907  degrees of freedom
## AIC: 951.98
## 
## Number of Fisher Scoring iterations: 2
exp(confint(FormalModel))
## Waiting for profiling to be done...
##                        2.5 %     97.5 %
## (Intercept) 0.00000009438748 156.835912
## year        0.99759599228188   1.008173
table(data$TimePeriod, data$Formal)
##            
##               0   1
##   2005-2009 301  69
##   2015-2019 417 122
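The dispersion line in the summary above shows that this model was fitted with `glm()`'s default gaussian family, so the exponentiated interval is not on an odds-ratio scale. A logistic fit needs `family = binomial`. The following is a minimal sketch of such a fit, assuming only the period-level counts in the table above. Because the per-year data are not reproduced here, it compares the two time periods rather than regressing on `year`, and the object names are illustrative.

```r
# Counts copied from the TimePeriod x Formal table above.
counts <- data.frame(
  TimePeriod = factor(c("2005-2009", "2015-2019")),
  formal     = c(69, 122),   # papers coded Formal == 1
  nonformal  = c(301, 417)   # papers coded Formal == 0
)

# family = binomial requests the logit link; without it glm() fits a gaussian model.
fit <- glm(cbind(formal, nonformal) ~ TimePeriod, family = binomial, data = counts)

# The exponentiated slope is the odds ratio for 2015-2019 relative to 2005-2009.
odds_ratio <- unname(exp(coef(fit)[2]))

# Wald 95% interval for the odds ratio, on the exponentiated scale.
wald_ci <- exp(confint.default(fit))[2, ]

odds_ratio
wald_ci
```

With two groups and one binary predictor the model is saturated, so the fitted odds ratio equals the sample odds ratio `(122 * 301) / (417 * 69)`.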
0.6.0.1 Logistic regression asking whether the proportion of papers that use logic is changing over time
## 
## Call:
## glm(formula = Logic ~ year, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.1577  -0.1537  -0.1476  -0.1435   0.8565  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  2.194112   4.744615   0.462    0.644
## year        -0.001016   0.002357  -0.431    0.667
## 
## (Dispersion parameter for gaussian family taken to be 0.1274848)
## 
##     Null deviance: 115.65  on 908  degrees of freedom
## Residual deviance: 115.63  on 907  degrees of freedom
## AIC: 711.31
## 
## Number of Fisher Scoring iterations: 2
## Waiting for profiling to be done...
##                    2.5 %      97.5 %
## (Intercept) 0.0008208564 98065.06425
## year        0.9943805509     1.00361
0.6.0.2 Logistic regression asking whether the proportion of papers that use probability is changing over time
ProbModel <- glm(Probability ~ year, data = data)
summary(ProbModel)
## 
## Call:
## glm(formula = Probability ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.10100  -0.08974  -0.07848  -0.03343   0.97783  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -11.267085   3.307159  -3.407 0.000686 ***
## year          0.005631   0.001643   3.427 0.000637 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.06193934)
## 
##     Null deviance: 56.906  on 908  degrees of freedom
## Residual deviance: 56.179  on 907  degrees of freedom
## AIC: 55.154
## 
## Number of Fisher Scoring iterations: 2
confint(ProbModel)
##                     2.5 %       97.5 %
## (Intercept) -17.748997420 -4.785172102
## year          0.002410505  0.008850601
exp(confint(ProbModel))
##                       2.5 %      97.5 %
## (Intercept) 0.0000000195753 0.008352686
## year        1.0024134126886 1.008889883
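As a rough period-level cross-check on this trend, the two time periods can be compared directly. The sketch below is not part of the original analysis. It assumes the counts implied by the margins of the McNemar tables further down (probability present in 9 + 2 = 11 of 370 papers in 2005-2009 and in 39 + 11 = 50 of 539 papers in 2015-2019) and applies Fisher's exact test from base R.

```r
# Rows: time period; columns: probability family present vs. absent.
# Counts are derived from the margins of the McNemar tables in this document.
prob_by_period <- matrix(
  c(50, 11,     # Present: 2015-2019, 2005-2009
    489, 359),  # Absent:  2015-2019, 2005-2009
  nrow = 2,
  dimnames = list(
    period      = c("2015-2019", "2005-2009"),
    Probability = c("Present", "Absent")
  )
)

# Proportion of papers in each period that use the probability family.
prop_present <- prob_by_period[, "Present"] / rowSums(prob_by_period)

# Fisher's exact test; the estimate is a conditional odds ratio
# for 2015-2019 relative to 2005-2009.
ft <- fisher.test(prob_by_period)

prop_present
ft$estimate
ft$p.value
```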
0.6.0.3 Is there still an increase over time in the probability family when we ignore the decision theory papers?
# Exclude papers coded as using decision theory, then refit.
noDec <- filter(data, Decision != "Yes")

ProbModel <- glm(Probability ~ year, data = noDec)
summary(ProbModel)
## 
## Call:
## glm(formula = Probability ~ year, data = noDec)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.06339  -0.05737  -0.05134  -0.02726   0.97876  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -6.015201   2.779887  -2.164   0.0307 *
## year         0.003011   0.001381   2.180   0.0295 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.04288306)
## 
##     Null deviance: 38.198  on 887  degrees of freedom
## Residual deviance: 37.994  on 886  degrees of freedom
## AIC: -272.53
## 
## Number of Fisher Scoring iterations: 2
confint(ProbModel)
##                      2.5 %       97.5 %
## (Intercept) -11.4636794315 -0.566721589
## year          0.0003039303  0.005717454
exp(confint(ProbModel))
##                     2.5 %    97.5 %
## (Intercept) 0.00001050479 0.5673825
## year        1.00030397651 1.0057338
0.6.0.4 McNemar’s exact test asking whether the proportion of logic papers is higher than the proportion of probability papers in each time period
##            Logic
## Probability Absent Present
##     Absent     304      55
##     Present      9       2
## 
##  Exact McNemar test (with central confidence intervals)
## 
## data:  OldFreqs
## b = 55, c = 9, p-value = 0.000000003542
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##   2.996214 14.066124
## sample estimates:
## odds ratio 
##   6.111111
##            Logic
## Probability Absent Present
##     Absent     421      68
##     Present     39      11
## 
##  Exact McNemar test (with central confidence intervals)
## 
## data:  NowFreqs
## b = 68, c = 39, p-value = 0.006518
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  1.159364 2.655239
## sample estimates:
## odds ratio 
##    1.74359
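The exact McNemar test uses only the discordant cells: under the null hypothesis, `b` follows a binomial distribution with `b + c` trials and success probability 0.5, and the odds ratio estimate is `b / c`. A minimal base-R sketch, using the `b` and `c` values printed above (the name `discordant` and its columns are illustrative, and `binom.test()` stands in for the `exact2x2` routine that produced the output):

```r
# Discordant cells copied from the two tables above:
# n_b = logic present, probability absent; n_c = probability present, logic absent.
discordant <- data.frame(
  period = c("2005-2009", "2015-2019"),
  n_b    = c(55, 68),
  n_c    = c(9, 39)
)

# Two-sided exact binomial test of n_b successes in n_b + n_c trials at p = 0.5.
discordant$p_value <- mapply(
  function(n_b, n_c) binom.test(n_b, n_b + n_c, p = 0.5)$p.value,
  discordant$n_b, discordant$n_c
)

# Conditional odds ratio estimate: ratio of the discordant counts.
discordant$odds_ratio <- discordant$n_b / discordant$n_c

discordant
```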
0.6.0.5 Is there still an increase over time in the probability family when we ignore the advanced papers?
# Exclude papers whose logic level is coded 3 (advanced), then refit.
withoutAdvanced <- data %>%
  filter(LogLevel != 3)

ProbModel <- glm(Probability ~ year, data = withoutAdvanced)
summary(ProbModel)
## 
## Call:
## glm(formula = Probability ~ year, data = withoutAdvanced)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.09763  -0.08699  -0.07635  -0.03378   0.97686  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -10.645350   3.306997  -3.219  0.00133 **
## year          0.005321   0.001643   3.239  0.00125 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.06040846)
## 
##     Null deviance: 54.216  on 888  degrees of freedom
## Residual deviance: 53.582  on 887  degrees of freedom
## AIC: 31.78
## 
## Number of Fisher Scoring iterations: 2
confint(ProbModel)
##                     2.5 %       97.5 %
## (Intercept) -17.126945635 -4.163754681
## year          0.002100944  0.008540942
exp(confint(ProbModel))
##                        2.5 %     97.5 %
## (Intercept) 0.00000003646381 0.01554907
## year        1.00210315288999 1.00857752
# Exclude papers whose probability level is coded 3 (advanced), then refit.
withoutAdvanced <- data %>%
  filter(ProbLevel != 3)

ProbModel <- glm(Probability ~ year, data = withoutAdvanced)
summary(ProbModel)
## 
## Call:
## glm(formula = Probability ~ year, data = withoutAdvanced)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.07791  -0.06988  -0.06186  -0.02975   0.97827  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -8.023895   2.999292  -2.675  0.00760 **
## year         0.004013   0.001490   2.693  0.00721 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.05040602)
## 
##     Null deviance: 45.429  on 895  degrees of freedom
## Residual deviance: 45.063  on 894  degrees of freedom
## AIC: -130.19
## 
## Number of Fisher Scoring iterations: 2
confint(ProbModel)
##                     2.5 %       97.5 %
## (Intercept) -13.902400214 -2.145389778
## year          0.001092424  0.006933136
exp(confint(ProbModel))
##                       2.5 %    97.5 %
## (Intercept) 0.0000009167783 0.1170224
## year        1.0010930213055 1.0069572
0.7 Change over time in the individual methods
0.7.0.1 Non-Modal Logic
## 
## Call:
## glm(formula = Nonmodal ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.08110  -0.08011  -0.07911  -0.07514   0.92586  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.922428   3.569592  -0.258    0.796
## year         0.000497   0.001773   0.280    0.779
## 
## (Dispersion parameter for gaussian family taken to be 0.07215951)
## 
##     Null deviance: 65.454  on 908  degrees of freedom
## Residual deviance: 65.449  on 907  degrees of freedom
## AIC: 193.98
## 
## Number of Fisher Scoring iterations: 2
Odds ratio and confidence interval
## [1] 1.000497
##                    2.5 %     97.5 %
## (Intercept) 0.0003638751 434.347253
## year        0.9970259050   1.003981
0.7.0.2 Modal Logic
## 
## Call:
## glm(formula = Modal ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.07404  -0.06726  -0.05708  -0.05369   0.94970  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.473791   3.169707   1.096    0.273
## year        -0.001696   0.001575  -1.077    0.282
## 
## (Dispersion parameter for gaussian family taken to be 0.05689767)
## 
##     Null deviance: 51.672  on 908  degrees of freedom
## Residual deviance: 51.606  on 907  degrees of freedom
## AIC: -22.021
## 
## Number of Fisher Scoring iterations: 2
Odds ratio and confidence interval
## [1] 0.9983058
##                  2.5 %       97.5 %
## (Intercept) 0.06465309 16095.611914
## year        0.99522956     1.001392
0.7.0.3 Set theory
## 
## Call:
## glm(formula = Set ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.03457  -0.03295  -0.03133  -0.02487   0.97674  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.5961673  2.2577626  -0.707    0.480
## year         0.0008077  0.0011216   0.720    0.472
## 
## (Dispersion parameter for gaussian family taken to be 0.02886775)
## 
##     Null deviance: 26.198  on 908  degrees of freedom
## Residual deviance: 26.183  on 907  degrees of freedom
## AIC: -638.8
## 
## Number of Fisher Scoring iterations: 2
Odds ratio and confidence interval
## [1] 1.000808
##                   2.5 %    97.5 %
## (Intercept) 0.002426511 16.927951
## year        0.998610367  1.003011
0.7.0.4 Probability theory
## 
## Call:
## glm(formula = Prob ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.04360  -0.03971  -0.03583  -0.02029   0.98360  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -3.878176   2.334296  -1.661   0.0970 .
## year         0.001942   0.001160   1.675   0.0943 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.03085802)
## 
##     Null deviance: 28.075  on 908  degrees of freedom
## Residual deviance: 27.988  on 907  degrees of freedom
## AIC: -578.2
## 
## Number of Fisher Scoring iterations: 2
Odds ratio and confidence interval
## [1] 1.001944
##                    2.5 %   97.5 %
## (Intercept) 0.0002131934 2.007639
## year        0.9996696777 1.004224
0.7.0.5 Game theory and decision theory
## 
## Call:
## glm(formula = Decision ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.04000  -0.03439  -0.02877  -0.00632   0.99930  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -5.6274915  1.9896648  -2.828  0.00478 **
## year         0.0028071  0.0009884   2.840  0.00461 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.022419)
## 
##     Null deviance: 20.515  on 908  degrees of freedom
## Residual deviance: 20.334  on 907  degrees of freedom
## AIC: -868.61
## 
## Number of Fisher Scoring iterations: 2
Odds ratio and confidence interval
## [1] 1.002811
##                     2.5 %    97.5 %
## (Intercept) 0.00007284601 0.1776713
## year        1.00087020054 1.0047556
0.7.0.6 Statistics
## 
## Call:
## glm(formula = Statistics ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.01409  -0.01233  -0.01058  -0.00355   0.99470  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.7595067  1.2411039  -1.418    0.157
## year         0.0008785  0.0006165   1.425    0.155
## 
## (Dispersion parameter for gaussian family taken to be 0.008723136)
## 
##     Null deviance: 7.9296  on 908  degrees of freedom
## Residual deviance: 7.9119  on 907  degrees of freedom
## AIC: -1726.6
## 
## Number of Fisher Scoring iterations: 2
Odds ratio and confidence interval
## [1] 1.000879
##                  2.5 %   97.5 %
## (Intercept) 0.01511564 1.960133
## year        0.99967009 1.002089
0.7.0.7 Causal modeling
## 
## Call:
## glm(formula = Causal ~ year, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.01141  -0.01018  -0.00895  -0.00402   0.99598  
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.2325464  1.1621602  -1.061    0.289
## year         0.0006161  0.0005773   1.067    0.286
## 
## (Dispersion parameter for gaussian family taken to be 0.007648714)
## 
##     Null deviance: 6.9461  on 908  degrees of freedom
## Residual deviance: 6.9374  on 907  degrees of freedom
## AIC: -1846.1
## 
## Number of Fisher Scoring iterations: 2
Odds ratio and confidence interval
## [1] 1.000616
##                  2.5 %   97.5 %
## (Intercept) 0.02988679 2.844098
## year        0.99948471 1.001749
0.8 Interrater reliabilities
0.8.0.1 Families
agreement(data$Logic_1, data$Logic_2)
##   agreement     kappa
## 1 0.9192825 0.8063302
agreement(data$Probability_1, data$Probability_2)
##   agreement     kappa
## 1 0.9327354 0.8392986
0.8.0.2 Specific methods
agreement(data$c_1, data$c_2)
##   agreement     kappa
## 1 0.9820628 0.6574501
agreement(data$p_1, data$p_2)
##   agreement     kappa
## 1 0.9282511 0.7081629
agreement(data$d_1, data$d_2)
##   agreement     kappa
## 1 0.9461883 0.7093831
agreement(data$t_1, data$t_2)
## Warning in cohen.kappa1(x, w = w, n.obs = n.obs, alpha = alpha, levels =
## levels): upper or lower confidence interval exceed abs(1) and set to +/- 1.
##   agreement     kappa
## 1 0.9955157 0.9310238
agreement(data$l_1, data$l_2)
##   agreement     kappa
## 1 0.7713004 0.5163101
agreement(data$m_1, data$m_2)
##   agreement     kappa
## 1 0.8430493 0.5856999
agreement(data$s_1, data$s_2)
##   agreement     kappa
## 1 0.8430493 0.3367043
0.8.0.3 Subdisciplines
agreement(data$act_1, data$act_2)
##   agreement     kappa
## 1 0.9506726 0.5360318
agreement(data$dec_1, data$dec_2)
##   agreement     kappa
## 1 0.9641256 0.6737381
agreement(data$epi_1, data$epi_2)
##   agreement     kappa
## 1 0.9013453 0.7219136
agreement(data$lan_1, data$lan_2)
##   agreement     kappa
## 1 0.9058296 0.6706519
agreement(data$log_1, data$log_2)
##   agreement     kappa
## 1 0.9058296 0.5352784
agreement(data$met_1, data$met_2)
##   agreement     kappa
## 1 0.9147982 0.7929029
agreement(data$min_1, data$min_2)
##   agreement     kappa
## 1 0.9596413 0.5874615
agreement(data$sci_1, data$sci_2)
##   agreement    kappa
## 1 0.9461883 0.618912
agreement(data$val_1, data$val_2)
##   agreement     kappa
## 1 0.9461883 0.7395874
0.8.0.4 Levels
cor.test(data$level_1, data$level_2, method = "spearman")
## Warning in cor.test.default(data$level_1, data$level_2, method = "spearman"):
## Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  data$level_1 and data$level_2
## S = 585529, p-value < 0.00000000000000022
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.6831935
# Collapse each rater's level into formal (1) vs. not formal (0).
data <- data %>%
  mutate(
    Formal1 = case_when(
      level_1 == 0 ~ 0,
      TRUE ~ 1
    ),
    Formal2 = case_when(
      level_2 == 0 ~ 0,
      TRUE ~ 1
    )
  )

table(data$Formal1, data$Formal2)
##    
##       0   1
##   0  17  17
##   1  17 172
agreement(data$Formal1, data$Formal2)
##   agreement     kappa
## 1 0.8475336 0.4100529
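The `agreement()` helper is not defined in this output. Assuming it reports raw agreement and Cohen's kappa, both numbers can be recomputed from the 2 x 2 table above. A minimal base-R sketch (all object names are illustrative):

```r
# Rater 1 (rows) by rater 2 (columns), copied from the table above.
tab <- matrix(
  c(17, 17,    # column "0": Formal1 = 0, Formal1 = 1
    17, 172),  # column "1": Formal1 = 0, Formal1 = 1
  nrow = 2,
  dimnames = list(Formal1 = c("0", "1"), Formal2 = c("0", "1"))
)

n <- sum(tab)

# Observed agreement: share of papers on the diagonal.
observed <- sum(diag(tab)) / n

# Chance agreement: sum over categories of the product of the raters' marginal shares.
expected <- sum(rowSums(tab) * colSums(tab)) / n^2

# Cohen's kappa: agreement beyond chance, scaled by the maximum possible.
cohen_kappa <- (observed - expected) / (1 - expected)

c(agreement = observed, kappa = cohen_kappa)
```

Raw agreement is high largely because most papers fall in the cell where both raters coded the paper as formal, and kappa corrects for that imbalance.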
# Keep only papers whose resolved level is non-zero.
Formal <- data %>%
  filter(level_resolved != 0)

cor.test(Formal$level_1, Formal$level_2, method = "spearman")
## Warning in cor.test.default(Formal$level_1, Formal$level_2, method =
## "spearman"): Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  Formal$level_1 and Formal$level_2
## S = 492706, p-value < 0.00000000000000022
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.5757215