Causal Network Analysis in the Human Brain: Applications in Cognitive Control and Parkinson’s Disease

A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Satya Venkata Sandeep Avvaru IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Prof. Keshab K. Parhi, Advisor

April, 2022

© Satya Venkata Sandeep Avvaru 2022 ALL RIGHTS RESERVED

Acknowledgements

I would like to express my sincere gratitude to my advisor Prof. Keshab Parhi for his guidance throughout my Ph.D. journey and his continued support. I would like to thank Prof. Alik Widge for his mentorship, encouragement, and invaluable feedback. I would like to thank Prof. Emad Ebbini and Prof. Andrew Lamperski for serving as members of my Ph.D. committee and for taking the time to review my work.

I would like to thank all my current and former lab mates, especially Sandhya Koteshwara, Bhaskar Sen, Nanda Kumar Unnikrishnan, and Lulu Ge for their friendship, encouragement and suggestions. I am also very thankful for my friends in Minnesota, Krishna Sandeep Prata, Anirudh Reddy Ravula, Teja Dasari, Vasanth Ravikumar, Prabhavant Reddy Chilukuri, Nicole Shaffer and Ankit Moldgy, for their constant moral support and ongoing friendship.

Lastly, I am incredibly grateful to my family, especially my parents, for their unconditional love and belief in me. Without their support, this work would not have been possible.

Dedication

I dedicate this dissertation to my mother, Prof. Annapurna Nowduri, and my grandfather, Prof. Rajagopala Rao Nowduri, who are tremendous sources of strength and inspiration in every sphere of my life.

Abstract

The human brain is an efficient organization of 100 billion neurons anatomically connected by about 100 trillion synapses over multiple scales of space and functionally interactive over multiple scales of time.
The recent mathematical and conceptual development of network science combined with the technological advancement of measuring neuronal dynamics motivated the field of network neuroscience. Network science provides a particularly appropriate framework to study several mechanisms in the brain by treating neural elements (a population of neurons, a sub-region) as nodes in a graph and neural interactions (synaptic connections, information flow) as its edges. The central goal of network neuroscience is to link macro-scale human brain network topology to cognitive functions and pathology. Although interactions between any two neural elements are inherently asymmetrical, few techniques characterize directional/causal connectivity. This dissertation proposes model-free techniques to estimate and analyze nonlinear causal interactions in the human brain. The proposed methods were employed to build machine learning models that decode the network organization using electrophysiological signals.

Mental disorders constitute a significant source of disability, with few effective treatments. Dysfunctional cognitive control is a common element in various psychiatric disorders. The first part of the dissertation addresses the challenge of decoding human cognitive control. To this end, we analyze local field potentials (LFP) from 10 human subjects to discover network biomarkers of cognitive conflict. We utilize cortical and subcortical LFP recordings from the subjects during a cognitive task known as the Multi-Source Interference Task (MSIT). We propose a novel method called maximal variance node merging (MVNM) that merges nodes within a brain region to construct informative inter-region brain networks. Region-level effective (causal) networks computed using convergent cross-mapping and MVNM differentiate task engagement from background neural activity with 85% median classification accuracy.
We also derive task engagement networks (TENs) that constitute the most discriminative inter-region connections. Subsequent graph analysis illustrates the crucial role of the dorsolateral prefrontal cortex (dlPFC) in task engagement, consistent with a widely accepted model for cognition. We also show that task engagement is linked to the theta (4-8 Hz) oscillations in the prefrontal cortex. Thus, we decode the task engagement and discover biomarkers that may facilitate closed-loop neuromodulation to enhance cognitive control.

In the second part of the dissertation, the main goal is to use network features derived from non-invasive electroencephalography (EEG) to develop neural decoders that can differentiate Parkinson’s disease (PD) patients from healthy controls (HC). We introduce a novel causality measure called frequency-domain convergent cross-mapping (FDCCM) that utilizes frequency-domain dynamics through nonlinear state-space reconstruction. Using synthesized chaotic timeseries, we investigate the general applicability of FDCCM at different causal strengths and noise levels. We show that FDCCM is resilient to additive Gaussian noise, making it suitable for real-world data. We used FDCCM networks estimated from scalp-EEG signals to classify the PD and HC groups with approximately 97% accuracy. The classifiers achieve high accuracy, independent of the patients’ medication status. More importantly, our spectral-based causality measure can significantly improve classification performance and reveal useful network biomarkers of Parkinson’s disease.

Overall, this dissertation provides valuable techniques for causal network construction and analysis. Their usage is demonstrated on two applications: decoding cognitive control and detecting Parkinson’s disease. These methods can be extended to other neurological and psychiatric conditions to elucidate their network mechanisms.
Contents

Acknowledgements
Dedication
Abstract
List of Tables
List of Figures
1 Introduction
  1.1 Challenges in Brain Network Analysis
  1.2 Research Overview
    1.2.1 Part I: Decoding Cognitive Control
    1.2.2 Part II: Parkinson’s Disease Detection
  1.3 Dissertation Structure and Outline
  1.4 Summary of Contributions
2 Background: Measures of Effective Connectivity
  2.1 Brain Networks
  2.2 Summary of Effective Connectivity Measures
    2.2.1 Granger Causality
    2.2.2 Directed Transfer Function
    2.2.3 Partial Directed Coherence
    2.2.4 Transfer Entropy
  2.3 Model-Free Effective Connectivity Measures
    2.3.1 Directed Information
    2.3.2 Convergent Cross-Mapping
PART I: DECODING COGNITIVE CONTROL
3 Introduction to Cognitive Control
  3.1 Human Cognitive Control Definition
  3.2 Physiological Correlates of Cognitive Control
  3.3 Cognitive Control Deficits and Psychiatric Disorders
  3.4 The Multi-Source Interference Task (MSIT)
  3.5 Data Description
    3.5.1 Experimental Setup
    3.5.2 Subjects
    3.5.3 Signal Acquisition and Preprocessing
    3.5.4 Defining Task and Non-Task Segments
4 Decoding Cognitive Control Using Spectral Features
  4.1 Introduction
  4.2 Materials and Methods
    4.2.1 Data Description
    4.2.2 Features and Classification
  4.3 Results
    4.3.1 Classifier Performance
    4.3.2 The Role of Theta and High Gamma Bands
    4.3.3 Comparison of Regions
  4.4 Conclusion
5 Decoding Cognitive Control Using Channel Level Networks
  5.1 Introduction
  5.2 Materials and Methods
    5.2.1 Data Description
    5.2.2 Class Label Assignment
    5.2.3 Feature Extraction
  5.3 Results
    5.3.1 PCA Features
    5.3.2 Classification Results
    5.3.3 Classifier Runtimes
  5.4 Comparison with Effective Networks
  5.5 Discussion and Conclusion
6 Maximal Variance Node Merging and Decoding Cognitive Control Using Region-Level Networks
  6.1 Introduction
  6.2 Materials and Methods
    6.2.1 Data Description
    6.2.2 Connectivity Measures
    6.2.3 Maximal Variance Node Merging and Region-Level Networks
    6.2.4 Edge Importance Score and Task Engagement Networks (TENs)
    6.2.5 Task vs. Non-Task Classification
    6.2.6 Subband Networks
  6.3 Results
    6.3.1 Identifying Task States
    6.3.2 Classification Results Using Mean Region Networks
    6.3.3 Task Engagement Networks
    6.3.4 Subject-Specific Task Engagement Networks
    6.3.5 TENs in Left and Right Hemispheres
    6.3.6 Increased Theta Band Activity During Task Performance
    6.3.7 Theta Band Network Interactions Differentiate Task and Non-task States
  6.4 Discussion
  6.5 Conclusion
PART II: PARKINSON’S DISEASE DETECTION
7 Introduction to Parkinson’s Disease
  7.1 Parkinson’s Disease
  7.2 Network Biomarkers of Parkinson’s Disease
  7.3 Parkinson’s Data Used in the Dissertation
    7.3.1 Subjects
    7.3.2 EEG Recordings
8 Distinguishing Parkinson’s Disease Patients Using Functional Brain Networks
  8.1 Introduction
  8.2 Materials and Methods
    8.2.1 Data
    8.2.2 Feature Extraction Using Network Analysis
    8.2.3 Feature Selection and Classification
  8.3 Results
    8.3.1 Single-Channel Classification
    8.3.2 Multi-Channel Classification
  8.4 Discussion
  8.5 Conclusion
9 Frequency-Domain Convergent Cross-Mapping: A Novel Causal Connectivity Measure
  9.1 Introduction
  9.2 A Brief Introduction to Convergent Cross-Mapping (CCM)
  9.3 Frequency-Domain Convergent Cross-Mapping (FDCCM)
    9.3.1 The Basic Concept
    9.3.2 Cross-Mapping in FDCCM
    9.3.3 Convergence
    9.3.4 Algorithm
  9.4 Toy Data Analysis
    9.4.1 Data Generation
    9.4.2 Toy Data
    9.4.3 Convergence
    9.4.4 Effect of Coupling Strength
    9.4.5 Effect of Volume Conduction
    9.4.6 Effect of Noise
10 Distinguishing Parkinson’s Patients Using Causal Networks
  10.1 Introduction
  10.2 Materials and Methods
    10.2.1 Data
    10.2.2 Network Features
    10.2.3 Classification
  10.3 Results
  10.4 Discussion
  10.5 Conclusion
11 Conclusion and Future Directions
  11.1 Implications for Causal Network Analysis
  11.2 Implications for Neurological and Psychiatric Disorders
References

List of Tables

2.1 Data-driven effective connectivity measures summarized from [1]
4.1 Summary of task vs. non-task classification results. Random label-assignment would result in a baseline accuracy of 50%.
5.1 Accuracy of task vs. non-task classifiers for the ten subjects. Sensitivity and specificity represent the task and non-task accuracies, respectively. Random label-assignment would result in a baseline accuracy of 50%.
5.2 The time taken to predict whether the participant is engaged in the MSIT or not. Each column shows the mean and standard deviation of 100 runs of the algorithm on a 3.8 second multidimensional segment.
5.3 Summary of task vs. non-task classification results. The classifiers were trained separately for each subject. The table presents mean and standard deviation values of classification accuracy, sensitivity and specificity for each of the six network construction methods. Random label-assignment would result in a baseline accuracy of 50%.
6.1 Summary of task vs. non-task classification results. The classifiers are subject specific. The table presents median and interquartile range values of classification accuracy, sensitivity (task accuracy) and specificity (non-task accuracy) for each of the three network construction methods. Random label-assignment would result in a baseline accuracy of 50%. The highest accuracy in each row is presented in bold.
6.2 Summary of task vs.
non-task classification results. The table presents median and interquartile range values of classification accuracy, sensitivity (task accuracy) and specificity (non-task accuracy) for each of the three network connectivity measures: R, DI and CCM.
7.1 Parkinson’s dataset-1 subject demographics (mean ± SD) reproduced from [2]. Each control was matched to a person with PD with respect to their age, sex and handedness.
7.2 Parkinson’s dataset-2 subject demographics (mean ± SD) reproduced from [3]. Each control was age and sex matched to a person with PD.
8.1 Leave-one-subject-out cross validation results summary for HC vs. PD classification. Sensitivity and specificity represent the healthy control and PD accuracies, respectively. Random label-assignment would result in a baseline accuracy of 50%.
10.1 Summary of HC vs. PD classification results for PD Dataset-1. The table presents classification accuracy, sensitivity (PD accuracy), specificity (HC accuracy) and AUC for each of the three network construction methods. Random label-assignment would result in a baseline accuracy of 50%. The highest values between the three methods are shown in bold.
10.2 Summary of HC vs. PD classification results for PD Dataset-2. The table presents classification accuracy, sensitivity (PD accuracy), specificity (HC accuracy) and AUC for each of the three network construction methods. Random label-assignment would result in a baseline accuracy of 50%. The highest values between the three methods are shown in bold.

List of Figures

1.1 Commonly used effective connectivity metrics.
3.1 The Multi-Source Interference Task.
3.2 Glass brain models showing the electrode locations. Colors represent different subjects.
4.1 Classification accuracy as a function of the number of features (k). The reported values are mean 10-fold cross-validation accuracy. Each plot represents a different subject.
4.2 Proportion of optimal features from a specific frequency band.
4.3 Task vs. non-task classification accuracy using recordings from a specific region.
5.1 Functional networks of randomly chosen task and non-task segments from one subject, constructed using local field potential signals from 64 channels.
5.2 Two-dimensional scatter plot of two features before and after PCA from task and non-task data of subject-1.
5.3 Accuracy, sensitivity and specificity of task vs. non-task classification.
5.4 Channel-level effective (causal) networks of subject-1. The networks were constructed using local field potential signals from 64 channels.
6.1 The MVNM algorithm and graph visualization of a sample causal network before and after MVNM. The causal network has three regions with two channels (each) in regions 1 and 2, and three channels in region 3.
6.2 Functional (correlative) and effective (causal) networks of subject-1 constructed from a randomly chosen task segment.
6.3 Flowchart showing the key steps of ‘task’ vs. ‘non-task’ classification process and determination of task engagement networks.
6.4 Task vs. non-task classification accuracy using networks constructed using three connectivity measures: correlation (R), directed information (DI), convergent cross-mapping (CCM).
Each point within the boxplots represents 10-fold cross-validation accuracy of a participant.
6.5 Task engagement networks generated from the three network construction methods. Each node represents one of the 14 regions of interest (Acc: accumbens, Amyg: amygdala, caudate, Hipp: hippocampus, dACC: dorsal anterior cingulate cortex, dlPFC: dorsolateral prefrontal cortex, dmPFC: dorsomedial prefrontal cortex, lOFC: lateral orbitofrontal cortex, mOFC: medial orbitofrontal cortex, parahipp: parahippocampus, postCC: posterior cingulate cortex, rACC: rostral anterior cingulate cortex, temporal lobe, vlPFC: ventrolateral prefrontal cortex). The thickness of the edges represents edge strength, as described by (6.2).
6.6 Node centrality of each region in the task engagement networks. The bar plots represent node degree for undirected (R) networks and outdegree for directed networks (DI and CCM).
6.7 Task engagement networks derived using inter-region correlational networks.
6.8 Task engagement networks derived using inter-region DI networks.
6.9 Task engagement networks derived using inter-region CCM networks.
6.10 Task engagement networks in left and right hemispheres generated using CCM.
6.11 Node centrality of each region in the left and right task engagement networks.
6.12 Task engagement network including inter-hemispheric connections.
6.13 Histogram of relative theta band power in left dorsolateral PFC of subject-2 for task and non-task periods.
6.14 Proportion of optimal features from each subband.
6.15 Outdegree of the regions in theta band TEN.
7.1 EEG channel locations plotted on a 2-D head diagram. Channels plotted beyond the head limit extend below the head center’s horizontal plane.
8.1 Scalp topographical maps of average betweenness centrality.
8.2 Single-channel classification performance comparison.
8.3 Receiver operating characteristic (ROC) curves comparison between 3 node centrality measures.
9.1 Illustration of cross-mapping between X and Y when X influences Y but Y has no (or minimal) effect on X.
9.2 Correlation coefficients of the two estimated timeseries, $\hat{X}|M_Y$ and $\hat{Y}|M_X$, with respect to library length (L). Coupling strengths: $\beta_{xy} = 0.05$ and $\beta_{yx} = 0.5$. $\hat{X}|M_Y$ represents the influence $X \to Y$, and $\hat{Y}|M_X$ represents the influence $Y \to X$.
9.3 Difference between $\rho_{X \to Y}$ and $\rho_{Y \to X}$ (denoted by $\rho_{\text{diff}}$) as a function of coupling strength $\beta_{yx}$. $\beta_{xy} = 0.5$.
9.4 Difference between $\rho_{P \to Q}$ and $\rho_{Q \to P}$ (denoted by $\rho_{\text{diff}}$) as a function of coupling strength $\beta_{yx}$, with volume conduction. $\beta_{xy} = 0.5$.
9.5 The effect of noise on CCM and FDCCM estimates. $\rho_{\text{diff}}$ as a function of increasing signal-to-noise ratio (SNR) for different coupling strengths $\beta_{yx}$.
10.1 Example time-frequency representations (spectrograms) derived from two arbitrarily chosen channels. The spectrograms correspond to a Parkinson’s patient from PD dataset-1. Time resolution: 0.5-second windows with 95% overlap. Frequency resolution: 5 Hz, up to 200 Hz.
10.2 HC vs. PD classification receiver operating characteristic (ROC) curves - PD dataset-1.
10.3 HC vs. PD classification receiver operating characteristic (ROC) curves - PD dataset-2.
10.4 Scalp topographical maps of average betweenness centralities of healthy controls (HC) and PD patients (ON and OFF) from PD dataset-1.
10.5 Scalp topographical maps of average betweenness centralities of healthy controls (HC) and PD patients (ON and OFF) from PD dataset-2.

Chapter 1
Introduction

1.1 Challenges in Brain Network Analysis

Modern network science has opened exciting new opportunities for understanding the brain as a complex system of interacting units. The increasing availability of rich neuroimaging data, combined with advances in mathematical modeling, contributed to this perspective of the brain as a large-scale network. This approach has revealed fundamental aspects of the typical brain-network organization such as small-worldness [4]. Small-world networks comprise highly connected hub nodes and efficient long-distance connections. Node centrality measures how well-connected a node is in a graph, so these hub nodes have relatively high node centralities. The resulting broad degree distribution implies that different nodes of brain networks differ widely in terms of their centrality.

In the past few decades, functional neuroimaging techniques such as functional magnetic resonance imaging and electroencephalography have become cost-effective while the data analysis techniques have become more finely tuned. Consequently, a plethora of studies have attempted to elucidate the complex network properties of the human brain. However, the majority of the existing brain network studies are limited to expressing patterns of activity in terms of correlations or symmetric connections between brain regions [5–7]. While the underlying
While the underlying 1 2anatomical pathways between brain regions or populations of neurons are bidirec- tional, we cannot assume that the connectivity is symmetrical. Although there are several linear and nonlinear measures of undirected correlation, estimating the directionality in brain networks is an important and largely unaddressed issue [8]. This dissertation proposes methods to estimate and analyze causal interactions between brain regions and demonstrate their applicability on human brain data. Although several techniques and studies exist for causal brain network analysis, most of them are limited to data acquired through functional magnetic resonance imaging (fMRI) [9,10]. fMRI is an indirect measure of neural activity and recorded at a low temporal resolution, which is not ideal for capturing cognitive functions that involve finer temporal dynamics. Electrical signals such as electroencephalo- gram and local field potentials can sample neural activity at 100–1000x higher time resolution than fMRI, making it more suitable to assess temporal dynamics. Another significant challenge with decoding brain network organization from distributed electrodes is an imbalance in the number of measurements between nodes/edges. A given brain region (node) may have anywhere from 1 to 5 physi- cal electrodes measuring it, depending on the region’s size and the specific clinical placement of the electrode. Besides, electrode positions vary between subjects. This imbalance makes it harder to interpret channel-level networks, where each node corresponds to specific electrode contact. Despite identifying channels be- longing to specific brain regions, the best way to combine these signals without losing helpful information is still unclear. Current studies are based on a simplis- tic assumption that an average of the measured recordings from channels in the same region represent the neural activity of that region. 
1.2 Research Overview

The main objective of this dissertation is to propose methods to infer and analyze causal interactions using electrophysiological signals. We apply the proposed methods and demonstrate their validity in two specific domains: (1) decoding human cognitive control, and (2) Parkinson’s disease detection.

1.2.1 Part I: Decoding Cognitive Control

This study uncovers the network mechanisms associated with cognitive control by distinguishing cognitive task-based connectivity from resting-state connectivity. The ability to consolidate large-scale heterogeneous channel-level networks into smaller region-level interactions can help generalize the findings to broader groups and make useful observations from a physiological standpoint. To this end, we develop a novel node-merging mechanism by which one can translate a channel-level network into a more interpretable region-level network.

The proposed method is called maximal variance node merging (MVNM). In MVNM, each new inter-region edge is estimated as the linear combination of the connections between channels in two different regions that maximizes variance. This process is equivalent to finding the first principal component of the original edges between the two regions using principal component analysis (PCA) – a dimensionality reduction method that attempts to reduce the number of variables while preserving as much information as possible. We construct these networks using electrical activity measured from cortical and subcortical electrodes while subjects exert cognitive effort to perform a task. We then use the network connections as features to distinguish task states from non-task states. The rationale of the MVNM algorithm is that a larger variance implies a broader spread, which helps differentiate between the two classes: task and non-task.
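The merging step described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation (which is detailed in Chapter 6): it assumes that the channel-level edge weights between two regions have been collected into a matrix with one row per data segment and one column per channel pair. The function name, the array layout, the mean-centering step, and the random data are all hypothetical.

```python
import numpy as np

def merge_edges_max_variance(edges):
    """Merge channel-level edges between two regions into one region-level edge.

    edges: array of shape (n_segments, n_channel_pairs); this layout is an
    assumption made for illustration.
    Returns the merged edge value per segment and the unit-norm weights.
    """
    edges = np.asarray(edges, dtype=float)
    centered = edges - edges.mean(axis=0)
    # The first right singular vector of the centered data is the direction
    # of maximal variance, i.e. the first principal component.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    weights = vt[0]
    # Resolve the sign ambiguity of the SVD so results are reproducible.
    if weights[np.argmax(np.abs(weights))] < 0:
        weights = -weights
    merged = centered @ weights
    return merged, weights

# Hypothetical data: 200 segments; region A has 2 channels and region B has 3,
# giving 6 directed A -> B channel-level edges with different spreads.
rng = np.random.default_rng(0)
edges = rng.normal(size=(200, 6)) * np.array([0.5, 1.0, 2.0, 0.3, 0.8, 1.5])

merged, weights = merge_edges_max_variance(edges)
print("variance of merged edge:", merged.var())
print("largest single-edge variance:", edges.var(axis=0).max())
```

Because the weights have unit norm, the merged edge has at least as much variance as any single channel-level edge, which is the property the MVNM rationale relies on.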
1.2.2 Part II: Parkinson’s Disease Detection

In this part of the dissertation, we propose a novel causal connectivity method and apply it to accurately distinguish Parkinson’s patients from healthy controls using network features. Existing data-driven causal connectivity methods vary from linear to nonlinear, model-based to model-free, and time-domain to frequency-domain [1,11]. The most commonly applied causal connectivity measures in neuroscience are depicted in Fig. 1.1. The heterogeneity of neural interactions makes it challenging to determine a unifying causal model. Therefore, any method that relies on a mathematical model is inherently limited by the model assumptions. Although a few model-free measures like convergent cross-mapping (CCM) and directed information (DI) exist, they are limited to time-domain interactions and are sensitive to noise. To overcome these barriers, we propose a model-free metric that estimates causal influence between interacting elements using nonlinear state-space reconstruction.

[Figure 1.1: Commonly used effective connectivity metrics, arranged by domain (time-domain, frequency-domain) and by approach (model-based, model-free). Metrics shown: Granger causality (GC), dynamic causal modeling, directed transfer function, partial directed coherence, frequency-domain GC, directed information, transfer entropy, and convergent cross-mapping.]

The proposed solution, frequency-domain convergent cross-mapping (FDCCM), exploits the frequency-domain dynamics to estimate causal interactions from the power spectra of measured neural recordings. The rationale behind FDCCM is based on the following intuition: since the cause variable influences the effect variable, we can reconstruct the causal timeseries by finding its signature in the power spectrum of the effect timeseries. We first apply the proposed method on toy data by simulating coupled logistic maps, an established mathematical system for simulating complex dynamics. We then demonstrate our approach on two Parkinson’s datasets comprising resting-state scalp EEG.
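The toy-data setup and the cross-mapping intuition can be sketched in a few lines. The block below illustrates ordinary time-domain CCM on two coupled logistic maps; it is not the FDCCM algorithm, which operates on power spectra and is described in Chapter 9. The map equations, the parameter values (`r_x`, `r_y`, `beta_xy`, `beta_yx`), the direction convention for the coupling terms, the embedding settings, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def coupled_logistic_maps(n, r_x=3.8, r_y=3.5, beta_xy=0.02, beta_yx=0.1,
                          x0=0.4, y0=0.2):
    """Simulate two coupled logistic maps.

    Assumed convention: beta_xy scales the effect of Y on X, and beta_yx
    scales the effect of X on Y.
    """
    x = np.empty(n)
    y = np.empty(n)
    x[0], y[0] = x0, y0
    for t in range(n - 1):
        x[t + 1] = x[t] * (r_x - r_x * x[t] - beta_xy * y[t])
        y[t + 1] = y[t] * (r_y - r_y * y[t] - beta_yx * x[t])
    return x, y

def delay_embed(series, dim, tau):
    """Rows are lagged vectors [s(t), s(t - tau), ..., s(t - (dim - 1) * tau)]."""
    n_rows = len(series) - (dim - 1) * tau
    cols = [series[(dim - 1 - k) * tau:(dim - 1 - k) * tau + n_rows]
            for k in range(dim)]
    return np.column_stack(cols)

def cross_map_skill(effect, cause, dim=2, tau=1):
    """Correlation between `cause` and its estimate from the shadow manifold
    reconstructed from `effect` (higher skill suggests cause -> effect)."""
    manifold = delay_embed(effect, dim, tau)
    target = cause[(dim - 1) * tau:]
    dist = np.linalg.norm(manifold[:, None, :] - manifold[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)              # a point is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :dim + 1]  # dim + 1 nearest neighbors
    nn_dist = np.take_along_axis(dist, nn, axis=1)
    # Exponential weights relative to the nearest-neighbor distance.
    weights = np.exp(-nn_dist / np.maximum(nn_dist[:, :1], 1e-12))
    weights /= weights.sum(axis=1, keepdims=True)
    estimate = (weights * target[nn]).sum(axis=1)
    return np.corrcoef(estimate, target)[0, 1]

x, y = coupled_logistic_maps(600)
x, y = x[100:], y[100:]  # discard the initial transient

skill_x_to_y = cross_map_skill(effect=y, cause=x)  # X estimated from M_Y
skill_y_to_x = cross_map_skill(effect=x, cause=y)  # Y estimated from M_X
print("cross-map skill for X -> Y:", skill_x_to_y)
print("cross-map skill for Y -> X:", skill_y_to_x)
```

In a fuller analysis, the skill would be recomputed for increasing library lengths to check that it converges, which is the property that gives convergent cross-mapping its name.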
1.3 Dissertation Structure and Outline

The dissertation is divided into Part I – Decoding Cognitive Control and Part II – Parkinson’s Disease Detection. The subsequent chapters are organized as follows:

• Chapter 2 provides a brief background on brain networks and summarizes commonly used effective (causal) connectivity measures.

PART I – Decoding Cognitive Control

• Chapter 3 provides the necessary background for chapters 4, 5, and 6. We present an introduction to human cognitive control and the experimental setup and data description for Part I of the dissertation.
• Chapter 4 presents an approach for differentiating task engagement in a cognitive task from resting-state activity using spectral power features.
• Chapter 5 presents an approach for differentiating the two mental states, task and non-task, using correlational and causal brain networks.
• Chapter 6 presents a novel method called maximal variance node merging to translate channel-level networks (presented in chapter 5) into region-level networks. This chapter also classifies the task and non-task states and compares the proposed method with existing methods.

PART II – Parkinson’s Disease Detection

• Chapter 7 provides the necessary background for chapters 8, 9, and 10. We present an introduction to Parkinson’s disease and present details of the two Parkinson’s datasets used in this study.
• Chapter 8 presents an approach for differentiating Parkinson’s patients from healthy controls using functional connectivity.
• Chapter 9 presents a novel effective connectivity method called frequency-domain convergent cross-mapping (FDCCM). We also present the results of the proposed method on a toy dataset.
• Chapter 10 applies FDCCM to the Parkinson’s datasets and develops classifiers to distinguish Parkinson’s patients from healthy controls.

1.4 Summary of Contributions

The specific contributions of each section of the dissertation are outlined below:

Part I: Decoding Cognitive Control – Chapters 3 to 6.
• We analyze local field potentials (LFP) from 10 human subjects to discover frequency-dependent biomarkers of cognitive conflict. We show that spectral power features in predefined frequency bands can classify task and non-task segments with a median accuracy of 88.1%. We also demonstrate that the theta (4–8 Hz) band and high gamma (65–200 Hz) band oscillations are modulated during task performance.
• We construct and compare three different brain networks (one functional and two effective networks). We derive causal networks based on a technique called convergent cross-mapping (CCM) [12] and show that the causal networks help identify regions of interest associated with task engagement. Thus, we utilize distributed brain connectivity analysis to detect task engagement and identify potential biomarkers for cognitive control.
• We propose a novel technique called maximal variance node merging (MVNM) to estimate region-level interactions. Unlike channel-level networks, region-level networks are more interpretable and relevant for clinical translation.
• We introduce task engagement networks (TENs), formed by combining the most explanatory network interactions from multiple subjects. The TENs can be further analyzed to identify significant regions that may serve as stimulation sites.
• We demonstrate that the causal inter-region networks can differentiate mental states associated with task performance from resting-state activity with 85.2% median accuracy. A previous analysis using the same data attained 78% accuracy [13].
• We show that subband networks constructed from bandpass-filtered signals also encode task-specific activity. In particular, theta band (4–8 Hz) networks play a significant role in detecting task engagement, consistent with prior findings that theta-band oscillations are modulated during cognitive control.

Part II: Parkinson’s Disease Detection – Chapters 7 to 10.
• We use graph-theoretic network measures derived from non-invasive electroencephalography (EEG) to develop neural decoders that can differentiate Parkinson’s disease (PD) patients from healthy controls (HC). Using functional networks, we demonstrate that PD patients can be distinguished from healthy controls with 89% accuracy, approximately 4% higher than the state-of-the-art on the same dataset.
• We introduce a spectral measure of causality called frequency-domain convergent cross-mapping, or FDCCM. Unlike existing spectral measures of effective connectivity, FDCCM is model-free and can infer nonlinear interactions.
• We describe the algorithm and validate our method on a toy dataset. We illustrate the effect of coupling strength and noise on the quality of causal inference, and demonstrate that FDCCM is more robust to external noise than CCM.
• We apply our method to resting-state scalp EEG recordings from two Parkinson’s datasets. We employ graph analysis to showcase the difference in betweenness centrality between the patients and controls.
• We demonstrate that machine learning classifiers based on FDCCM causal networks can differentiate between Parkinson’s patients and demographically matched healthy controls with 97% accuracy. The performance of FDCCM is shown to be better than that of correlational networks and CCM networks on both datasets.

Chapter 2
Background: Measures of Effective Connectivity

2.1 Brain Networks

Describing complex systems such as the brain as a network of interacting elements can provide useful insights. A graph is defined as a collection of nodes and edges. The interaction between two nodes is described by the strength of the edge(s) between them. A graph with N nodes is characterized by an adjacency matrix of size N × N, whose elements indicate the edge strengths between the nodes.
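To make the adjacency-matrix representation concrete, the sketch below builds an N × N matrix from N recorded channels. Using the absolute Pearson correlation as the edge strength is an assumption made purely for illustration (it yields a symmetric, correlational network); the function name is hypothetical, and a causal measure would generally produce an asymmetric matrix instead.

```python
import numpy as np


def correlation_adjacency(signals):
    """Return an N x N adjacency matrix for an array of shape (N, T).

    Edge strength is the absolute Pearson correlation between channels
    (an illustrative choice); self-connections are set to zero.
    """
    adjacency = np.abs(np.corrcoef(np.asarray(signals, dtype=float)))
    np.fill_diagonal(adjacency, 0.0)
    return adjacency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.standard_normal((4, 1000))  # 4 channels, 1000 samples
    print(correlation_adjacency(demo).round(3))
```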
Network science provides a particularly appropriate framework to study several mechanisms in the brain by treating neural elements (a population of neurons, a sub-region) as nodes in a graph and neural interactions (synaptic connections, information flow) as its edges.

In order to answer questions and make observations about various cognitive, pathological, or behavioral mechanisms, it is critical to define the network connections appropriately. Brain connectivity is broadly categorized into structural connectivity, functional connectivity, and effective connectivity. Structural connectivity represents large-scale anatomical connections between cortical regions. Functional and effective connectivity are generally estimated from time series of brain dynamics [14,15]. The fundamental distinction between the two is that functional networks represent patterns of cross-correlation, while effective networks represent patterns of causal interactions. Several tools and network measures can be adopted following the network construction, depending on the type of network and application [14].

The applications of network analysis of the brain range from understanding the dynamics behind cognition and behavior to diagnosing and monitoring diseases. Network analysis is also beneficial in the treatment and rehabilitation of multiple clinical conditions [16]. Several studies show that disorders such as epilepsy, Alzheimer’s disease, schizophrenia, and autism can be characterized as connectome abnormalities.

2.2 Summary of Effective Connectivity Measures

Causal connections can be inferred from dynamic causal modeling (DCM) [17], which is based on biologically plausible neural mass models, or can be estimated from data. Other than DCM, there are numerous effective connectivity measures in the literature. However, the vast majority of them fall under one of three families: Granger causality-based measures, coherence-based measures, and information theory-based measures [1].
A summary of these measures is presented in Table 2.1. Some of the most commonly applied measures are described in this section.

Table 2.1: Data-driven effective connectivity measures, summarized from [1].

GC-Based                       Coherence-Based                    Information-Theoretic
Linear GC                      Directed coherence (DC)            Transfer entropy
Conditional GC                 Directed transfer function (DTF)   Symbolic transfer entropy
Partial GC                     Partial DC (PDC)                   Partial transfer entropy
Copula-based GC                Direct DTF                         KL divergence
Partial frequency-domain GC    Extended DC and PDC                Directed partial mutual information
Multi-variate versions         Phase locked value

2.2.1 Granger Causality

Granger causality analysis is a well-established methodology that is widely applied in the neuroscience community [18]. Granger causality (from here on called GC), or ‘Wiener-Granger causality’, was first introduced by Wiener in 1956 and later formalized by Granger in 1969 [19]. Studies have shown that GC achieves results similar to dynamic causal modeling [20, 21] and also yields plausible estimates of human seizure propagation pathways [22]. GC methodology helps answer two key questions that provide supporting evidence for a causal interaction. First, can the activity measured by electrode A be predicted now, if the activity measured by electrode B in the past is known? Second, is this better than knowing only the past of A [23]?

Mathematical Formulation of GC: Suppose there are two time series X and Y, and the aim is to investigate the causal interaction from X to Y. We learn two auto-regressive models of order m that describe the dynamics in Y: a full model and a reduced model. The full model (FM) predicts Y(t) at time index t using the past values of X and Y, and is written as follows:

\[ Y(t) = \sum_{j=1}^{m} \left[ a_j X(t-j) + b_j Y(t-j) \right] + \epsilon_{FM}. \]

The reduced model (RM) predicts Y(t) using only the past values of Y (i.e., by excluding X) and is given by

\[ Y(t) = \sum_{j=1}^{m} b_j Y(t-j) + \epsilon_{RM}. \]

Here, $\epsilon_{FM}$ and $\epsilon_{RM}$ denote the prediction errors of FM and RM, respectively.
If the variance of $\epsilon_{RM}$ ($\sigma^2_{RM}$) is higher than the variance of $\epsilon_{FM}$ ($\sigma^2_{FM}$), i.e., if the reduced model is less accurate than the full model, it implies a causal interaction from X to Y. The magnitude of causality is given by

\[ \log\left( \frac{\sigma^2_{RM}}{\sigma^2_{FM}} \right). \]

Limitations of GC: Despite its wide usage, there are several shortcomings to GC-based techniques [24]. Granger causality analysis neither establishes nor requires causality; it can, however, provide evidence in support of a hypothesis about causal interactions [23]. The GC approach is more suitable for stochastic, linear, and strongly coupled systems, whereas interactions between neuronal subsystems are nonlinear and are characterized by moderate to weak coupling. More importantly, GC estimates are based on model assumptions. As described in [12], the critical requirement of GC is separability, namely that information about a causative factor is independently unique to one variable and that it can be removed by eliminating that variable from the model. This separability is characteristic of purely stochastic and linear systems and reflects the view that systems can be understood one piece at a time rather than as a whole. There is a need to develop more suitable techniques to study the complex nonlinear, bidirectional, weakly coupled interactions in the brain.

2.2.2 Directed Transfer Function

Consider a k-dimensional time series X(t), represented by $(X_1(t), X_2(t), \ldots, X_k(t))$ at time t. The directed transfer function (DTF) is based on multi-variate auto-regressive (MVAR) modeling of the multivariate time series. According to the MVAR model, the signal can be expressed, at a given time t, as a weighted sum of its p previous values and a k-dimensional random noise vector E(t):

\[ X(t) = \sum_{\tau=1}^{p} A(\tau) X(t-\tau) + E(t), \]

where $A(\tau)$ are the model coefficients and p is called the model order. The above expression can be transformed into the frequency domain by computing its Fourier transform.
The resultant equation is given by

\[ X(f) = A^{-1}(f) E(f) = H(f) E(f), \]

where X(f), A(f), and E(f) are the transforms of X(t), $A(\tau)$, and E(t), respectively. H(f) is the transfer function of the system, and its elements $H_{ij}(f)$ indicate the causal influence from the j-th input to the i-th output at frequency f. The DTF from $X_j$ to $X_i$, denoted by $\beta_{ij}$, is defined as [25,26]:

\[ \beta^2_{ij}(f) = |H_{ij}(f)|^2. \tag{2.1} \]

Normalized DTF is defined as

\[ \gamma^2_{ij}(f) = \frac{|H_{ij}(f)|^2}{\sum_{m=1}^{k} |H_{im}(f)|^2}. \tag{2.2} \]

2.2.3 Partial Directed Coherence

Partial directed coherence (PDC), introduced in 2001 [27], is based on the frequency-domain MVAR model representation (like DTF). The PDC from $X_j$ to $X_i$ is defined as:

\[ \pi_{ij}(f) = \frac{\bar{A}_{ij}(f)}{\sqrt{\sum_{m=1}^{k} \bar{A}_{mj}(f) \bar{A}^{*}_{mj}(f)}}, \]

where k is the number of channels and $\bar{A}_{ij}(f)$ represents the elements of the matrix $\bar{A}(f) = I - A(f)$. That is,

\[ \bar{A}_{ij}(f) = \begin{cases} 1 - \sum_{\tau=1}^{p} a_{ij}(\tau) e^{-2\pi\sqrt{-1}\, f \tau}, & \text{if } i = j, \\ -\sum_{\tau=1}^{p} a_{ij}(\tau) e^{-2\pi\sqrt{-1}\, f \tau}, & \text{otherwise,} \end{cases} \]

where $a_{ij}(\tau)$ is the (i, j) element of $A(\tau)$. An essential difference between DTF and PDC is that DTF is normalized with respect to the structure that receives the signal, whereas PDC is normalized with respect to the structure that sends the signal.

2.2.4 Transfer Entropy

Transfer entropy, or TE, is the most well-known non-parametric effective connectivity measure used in neuroscience research. TE is an information-theoretic approach that models time series as random processes and measures causal effects as a function of their probability distributions. Assume that two random time series X(t) and Y(t) can be approximated by generalized Markov processes as follows, where m and n indicate the orders of the random time series X and Y, respectively:

\[ X^m_t = (X(t), \ldots, X(t-m+1)), \qquad Y^n_t = (Y(t), \ldots, Y(t-n+1)). \]

Note that when there is no causal interaction from X to Y, we can assume that the following equation holds:

\[ p(Y(t+1) \mid Y^n_t, X^m_t) = p(Y(t+1) \mid Y^n_t). \]

Transfer entropy is defined as the Kullback-Leibler divergence (KLD) between the two conditional distributions in the previous equation: $p(Y(t+1) \mid Y^n_t, X^m_t)$ and $p(Y(t+1) \mid Y^n_t)$.
That is,

\[ TE(X \to Y) = \sum_{Y(t+1),\, Y^n_t,\, X^m_t} p(Y(t+1), Y^n_t, X^m_t) \, \log\left( \frac{p(Y(t+1) \mid Y^n_t, X^m_t)}{p(Y(t+1) \mid Y^n_t)} \right). \]

2.3 Model-Free Effective Connectivity Measures

This section presents two measures that infer causality from observational data without relying on mathematical models: directed information (DI) and convergent cross-mapping (CCM). Consequently, these measures can estimate nonlinear causal interactions. Note that all our analyses in this dissertation utilize one or both of these measures.

2.3.1 Directed Information

Directed information is an information-theoretic measure that estimates causal interactions between two jointly distributed sequences (two time series). It quantifies the amount of causal information in one time series explained by the other time series. DI was initially defined for discrete-time, discrete-valued random processes and was later extended to discrete-time, continuous-valued processes [28,29]. Let $X^N$ and $Y^N$ be N samples of continuous-valued sequences X and Y. Directed information from $X^N$ to $Y^N$, denoted $I(X^N \to Y^N)$, can be written as

\[ I(X^N \to Y^N) = h(Y^N) - h(Y^N \| X^N), \]

where $h(Y^N)$ and $h(Y^N \| X^N)$ represent the differential entropy of the random vector $Y^N$ and the differential entropy of $Y^N$ causally conditioned on $X^N$, respectively. The quantity $h(Y^N \| X^N)$ is defined as

\[ h(Y^N \| X^N) = \sum_{n=1}^{N} h(Y_n \mid Y^{n-1}, X^n). \]

DI is not restricted to a particular class of statistical models and is non-parametric [30]. Two widely used causality metrics, Granger causality and transfer entropy, are closely related to DI [19, 31]. If the signals are assumed to originate from an autoregressive model with Gaussian noise, DI is equivalent to Granger causality [30]. DI can be extended to more general systems than transfer entropy; it is not limited to stationary Markov processes and can quantify instantaneous causality [32].
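For the Gaussian autoregressive special case mentioned above, where DI coincides with Granger causality, the full-versus-reduced model comparison of Section 2.2.1 can be sketched as follows. This is a minimal illustration rather than the estimator used in this dissertation: it assumes zero-mean signals (no intercept term), fits both models by ordinary least squares, and uses hypothetical function names and an arbitrary model order.

```python
import numpy as np


def lagged(v, m):
    """Columns v(t-1), ..., v(t-m) for t = m, ..., len(v) - 1."""
    return np.column_stack([v[m - j:len(v) - j] for j in range(1, m + 1)])


def granger_causality(cause, effect, m=2):
    """Return log(var_RM / var_FM) for the direction cause -> effect."""
    x = np.asarray(cause, dtype=float)
    y = np.asarray(effect, dtype=float)
    target = y[m:]
    own_past = lagged(y, m)      # regressors of the reduced model
    other_past = lagged(x, m)    # extra regressors of the full model

    def residual_power(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.mean((target - design @ coef) ** 2)

    var_rm = residual_power(own_past)
    var_fm = residual_power(np.hstack([own_past, other_past]))
    return np.log(var_rm / var_fm)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(2000)
    y = np.zeros(2000)
    for t in range(1, 2000):
        y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.5 * rng.standard_normal()
    print("X -> Y:", granger_causality(x, y))
    print("Y -> X:", granger_causality(y, x))
```

Because the reduced model is nested in the full model, the in-sample value is never negative; judging whether a small positive value is meaningful requires a separate significance test, which is omitted here.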
2.3.2 Convergent Cross-Mapping

Dynamical system theory suggests that each causally linked variable in a dynamical system can be used to estimate the state of another variable, since they share a common manifold (M). Also, when one variable X drives another variable Y, information about the states of X can be recovered from Y. In other words, the ability to predict X from Y is a necessary condition to establish a causal link from X to Y. This rationale forms the basis for convergent cross-mapping.

Convergent cross-mapping, introduced by Sugihara et al. in 2012 [12], tests for causation between two time series X and Y by looking at the (temporal) correspondence between their shadow manifolds. The shadow manifolds, $M_X$ and $M_Y$, are constructed from time-lagged coordinates of the time series values of X and Y, respectively. Specifically, CCM measures the extent to which the nearby points in $M_Y$ estimate the states of X. This process of estimating states of X from $M_Y$ (or vice versa) is called cross-mapping.

Consider two time series of length L, $X^L$ and $Y^L$. As a result of time-delayed embedding, the shadow manifold $M_X$ is formed by the time-lagged coordinate vectors $\mathbf{x}_t = (x_t, x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-(E-1)\tau})$ for $t \in [1 + (E-1)\tau, L]$. The embedding thus depends on two parameters: the time delay $\tau$ and the embedding dimension E. The cross-map of $\mathbf{x}_t$, denoted $\hat{\mathbf{x}}_t \| M_Y$, is computed by first identifying $\mathbf{y}_t$ and its E + 1 nearest neighbors in $M_Y$, denoted $\{\mathbf{y}_1(t), \mathbf{y}_2(t), \ldots, \mathbf{y}_{E+1}(t)\}$. As a consequence of Takens’ theorem, $M_X$ is diffeomorphic to $M_Y$, i.e., each point in $M_X$ can be mapped to a unique point in $M_Y$ [12]. Therefore, the neighborhood of $\mathbf{y}_t$ is mapped to a set of points in $M_X$, represented as $\{\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_{E+1}(t)\}$. The weighted mean of $\{\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_{E+1}(t)\}$ provides an estimate of $\mathbf{x}_t$, as shown in the equation

\[ \hat{\mathbf{x}}_t \| M_Y = \sum_{i=1}^{E+1} w_i \mathbf{x}_i(t). \]

The weighting $w_i$ is based on the distance between $\mathbf{y}_t$ and its i-th nearest neighbor $\mathbf{y}_i(t)$, as given by the equations

\[ w_i = \frac{u_i}{\sum_{j=1}^{E+1} u_j} \quad \text{and} \quad u_i = \exp\left[ -\frac{d(\mathbf{y}_t, \mathbf{y}_i(t))}{d(\mathbf{y}_t, \mathbf{y}_1(t))} \right], \]

where $d(\cdot, \cdot)$ represents the Euclidean distance.

A benefit of using CCM for causal inference is that, unlike Granger-related causality measures, CCM does not assume ‘separability’ between variables, which is usually valid in linear and strongly coupled nonlinear systems. The separability assumption implies that if X causes Y, then the prediction accuracy of the variable Y from the historical record of a set of variables U diminishes when X is excluded from U. Granger’s approach could lead to ambiguity in biological and ecological networks due to non-separable dynamics. Additionally, complex systems such as the brain are characterized by moderate to weak coupling and are affected by unobserved/external variables.

PART I: DECODING COGNITIVE CONTROL

Chapter 3
Introduction to Cognitive Control

This chapter provides a brief introduction to cognitive control networks in humans and describes the task and data used in this study to decode human cognitive control. Hence, this chapter provides the necessary background for Chapters 4, 5, and 6.

3.1 Human Cognitive Control Definition

In contemporary cognitive neuroscience, cognitive control refers to the adaptation of information processing, and hence behaviors, depending on current goals. This ability to coordinate thoughts and actions in relation to internal goals is often required in our everyday lives and forms the basis of higher cognitive processes such as planning and reasoning. Cognitive control involves a broad range of mental processes that can be consolidated into three key elements:
updating (monitoring and changing working memory contents), shifting (flexible changes between task-sets or goals), and inhibition (overriding habitual or prepotent/default responses) [33, 34]. For example, an application of successful cognitive control in everyday life is switching from using social media to studying for an upcoming test while inhibiting the desire to return. A closely related counterpart of cognition, attention, relates to the mechanisms used for selecting environmental information relevant for generating adaptive responses (i.e., responses that are appropriate for the subject’s context or task). In the past 25 years, substantial theoretical and experimental progress has been made in deciphering the neural mechanisms that enable cognitive control [35–40]. However, there is no clear consensus regarding the neural signature of cognitive control, and a great deal still remains to be understood [41,42].

3.2 Physiological Correlates of Cognitive Control

Mechanisms of control have been studied from two different perspectives [43,44]. The first approach is based on understanding the specialized processing within individual brain regions, especially in the prefrontal cortex [45–47]. The second approach is focused on the distributed processing across different brain regions organized as large-scale complex networks. The latter approach has become more popular recently due to advances in brain imaging and network analysis techniques. The methods in this dissertation are also based on the second approach. Familiarity with the neuroscience of cognition is essential to validate and interpret the empirical findings of our study.

A number of studies indicate that the prefrontal cortex (PFC) subserves cognitive control [45,46].
Specifically, the lateral PFC/dorsolateral PFC has been observed to exhibit higher activation while implementing cognitive control in several functional magnetic resonance imaging studies in humans [48–52]. This role of the PFC is supported by transcranial brain stimulation studies indicating that cognitive enhancement can be achieved by stimulating the dorsolateral PFC [53,54]. Moreover, neurophysiological studies of cognitive control that analyze the oscillatory activity of signals recorded from the PFC suggest that theta activity increases with the need for control [55–60].

3.3 Cognitive Control Deficits and Psychiatric Disorders

Since cognitive control is crucial in making goal-directed behavioral decisions, it impacts social stability, success, and other measures of quality of life [61]. Cognitive dysfunction is a common feature of psychiatric disorders such as schizophrenia, bipolar or unipolar depression, anxiety disorders, and substance use disorders [62]. Dysregulation of the brain’s cognitive control systems is evident across multiple disorders (i.e., transdiagnostically) [62–64]. Brain networks provide a unifying framework for characterizing how information is encoded in the brain and for understanding the neurobiological basis of psychiatric disorders. In particular, aberrations in large-scale brain networks that implement cognitive control have been reported to underpin virtually all psychiatric disorders [65].

3.4 The Multi-Source Interference Task (MSIT)

The Multi-Source Interference Task (MSIT) was developed by Bush et al. in 2003 to test cognitive control in normal and pathological conditions, such as schizophrenia, attention-deficit/hyperactivity disorder (ADHD), and obsessive-compulsive disorder (OCD). The task reliably and robustly activates the cingulo-frontal-parietal (CFP) network in individual healthy subjects [66]. The CFP network plays a crucial role in cognitive processing.
Functional imaging studies showed that the MSIT could be used to (i) identify the cognitive/attention network in normal volunteers and (ii) test its integrity in people with neuropsychiatric disorders [67]. EEG studies also show that the MSIT reliably modulates brain electrical activity related to cognitive control and attention [68]. Thus, electrical activity (from EEG or LFP) obtained during the performance of this task may be useful for exploring the functioning of cognitive/attentional networks in healthy and clinical populations. Subsequent experiments also showed that MSIT performance is affected by electrical stimulation [69,70].

3.5 Data Description

3.5.1 Experimental Setup

The experimental setup, as depicted in Fig. 3.1(a), consisted of 1–5 blocks of trials. Each trial block had 32 or 64 trials. The task stimulus in each trial comprised an array of three numbers (the target being 1, 2, or 3) presented at the center of a computer screen. The subjects were asked to report, via button press, the value of the unique number that differed from the other two distractors (see Fig. 3.1(b)). If the distractors were ‘0’ and the target’s position matched its value, the trials were called congruent trials. Otherwise, they were referred to as incongruent trials. Each experimental run comprised a roughly equal number of congruent and incongruent trials. The incongruent trials were also characterized by a slightly lower success rate of 97.1 ± 5.52%, compared to 100% ± 2.47% for the congruent trials [13].

3.5.2 Subjects

Fourteen subjects participated in the experiment while they were hospitalized for invasive epilepsy monitoring and subsequent seizure localization. The use of intracranial recordings from patients undergoing epilepsy monitoring to study cognitive phenomena is gaining popularity [45,71–73]. Each subject had a history of drug-resistant complex-partial seizures.
We discarded the data from 4 out of the 14 subjects due to the lack of statistically sufficient task/non-task recordings. The remaining ten subjects were included in this study. All surgical decisions, including the location, type, and number of electrodes, were made by clinicians independent of this study. The participants were informed that participation in the experiment would not affect their treatment. They were allowed to withdraw at any point during the task. According to the study sponsor guidelines, each participant gave fully informed consent. The original study was approved by the Institutional Review Board of Massachusetts General Hospital and the US Army Human Research Protection Office. The present study re-analyzed a publicly available, de-identified copy of the published dataset, and thus did not require further review [13].

Figure 3.1: The Multi-Source Interference Task. (a) Block structure of the MSIT trials. (b) Examples of congruent and incongruent trials.

3.5.3 Signal Acquisition and Preprocessing

Local field potential (LFP) signals were recorded through depth electrodes surgically implanted for seizure monitoring in each participant. Between five and nine electrodes with diameters of 0.8–1.0 mm were placed in each hemisphere. Each electrode consisted of 8–16 platinum/iridium contacts. The distribution of electrodes is illustrated in Fig. 3.2. The Multi-Modality Visualization Tool was used to create the visualization [74]. The signals were acquired at a 2 kHz sampling rate via neural signal processor recording systems from Blackrock Microsystems Inc., Salt Lake City, UT. All signals were referenced to a scalp EEG electrode. Electrodes that showed excessive line noise (60 Hz), were close to the seizure focus (based on clinical reports), or contained other artifacts found on visual inspection were removed. Each channel was down-sampled to 1000 Hz.
The line noise and its harmonics were removed. Adjacent channels were then bipolar re-referenced to each other to alleviate the effect of volume conduction [7]. These data were previously reported in [13], where they were analyzed using a different approach.

Electrode Localization

Spatial coordinates of the electrodes were determined manually through post-operative computerized tomography (CT). Pre-operative T1-weighted MRI images were aligned with the anatomical CT images through a volumetric image co-registration method utilizing the FreeSurfer software package [75, 76]. An electrode labeling algorithm was employed to estimate the probability that a particular brain region contributes to the signal’s source at each electrode [77]. The regions of interest were parcellated based on the Desikan-Killiany-Tourville brain atlas [78]. The number of bipolar re-referenced channels ranged between 64 and 195 based on the subjects’ electrode montage, and these channels were mapped to 17–23 regions.

Figure 3.2: Glass brain models showing the electrode locations in eight views: (a) anterior, (b) posterior, (c) inferior, (d) superior, (e) lateral left, (f) lateral right, (g) medial left, (h) medial right. Colors represent different subjects.

3.5.4 Defining Task and Non-Task Segments

The neural activity recorded during MSIT blocks is referred to as task data, and the data recorded during rest periods (before or after task blocks) is referred to as non-task data. To differentiate between the task and non-task states, the signals were divided into multiple task segments and non-task segments. On average, each MSIT trial was approximately 4 s long; each trial’s actual duration varied depending on the subject’s reaction time. The time when the fixation cross was presented during a trial was marked as the trial’s start. The time duration between two consecutive fixation crosses determined the length of each trial. The minimum trial length for most subjects was approximately 3.8 s.
Therefore, the 3.8-second time segments from every trial’s start were labeled as ‘task’ data. The signals recorded during rest periods were windowed into ‘non-task’ segments with a window length equal to the minimum task duration. Overlapping windows were used when the number of non-task segments was less than the number of task segments. If the amount of task data was less than the amount of non-task data, only a subset of the non-task data was used for classification. Thus, the two classes were balanced to make sure the classifiers were not biased. The multidimensional time series corresponding to task and non-task segments were then used to construct task and non-task networks.

Chapter 4
Decoding Cognitive Control Using Spectral Features

4.1 Introduction

Neuropsychiatric disorders are the leading cause of disability in the United States; about one in every five American adults experiences mental illness. Existing treatments for mental illness are less than 50% effective, which calls for a better understanding of the mechanisms underlying these disorders, an understanding that could in turn help alleviate impaired cognitive control. Electrical deep brain stimulation (DBS) has been shown to be promising, but it suffers from ambiguous clinical outcomes, which limit its usage [79–81]. The action mechanisms of DBS are still unclear, and several questions are yet to be answered: What should be the target stimulation site? What are the best stimulation parameters (e.g., current, frequency, and duty cycle)? Dysfunctional decision-making and cognitive control are common features in a wide range of mental disorders such as depression, addiction, anxiety disorders, autism spectrum disorders, schizophrenia, and obsessive-compulsive disorder (OCD) [62, 82]. Cognitive control involves restricting and controlling default responses in favor of a more desired adaptive response.
Developing an adaptive system that can calibrate stimulation in response to predefined biomarkers (for cognitive effort) may improve DBS efficacy, but this requires further knowledge about its neural signatures [83, 84]. A closed-loop direct brain stimulation strategy was developed in a recent study to enhance cognitive control [69]. However, it is still unclear when to automatically activate the intervention. Therefore, further research is necessary to discover the best biomarkers for cognitive control in humans. This chapter attempts to uncover frequency-dependent biomarkers by classifying task engagement from background (non-task) activity. To this end, we utilize local field potential signals recorded as ten subjects perform the Multi-Source Interference Task (MSIT), a well-established experimental paradigm to study cognitive control [66]. The MSIT has been shown to evoke connectivity changes related to cognitive impairment in major depressive disorder (MDD), OCD, and schizophrenia [85, 86]. A similar decoding of task activity based on fixed canonical correlation analysis was demonstrated in [13]. However, the canonical correlation operators are not readily implementable on an implanted device. The main advantage of the approach presented in this chapter is that the proposed features make it feasible to develop, implement, and evaluate an adaptive neuromodulation mechanism using existing hardware devices.

4.2 Materials and Methods

4.2.1 Data Description

The experimental setup, data collection, and preprocessing are described in Chapter 3.

4.2.2 Features and Classification

Extracting task and non-task segments

The onset of the fixation cross marked the beginning of each trial. The trials lasted at least 3.8 s. Therefore, 3.8 s segments of neural recordings from the onset of each trial constitute task data. The recordings during the rest phase, i.e., before and after the task activity, are considered non-task data (see Fig. 3.1(a)).
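The segmentation described above can be sketched as index arithmetic on the recording. The 3.8 s segment length and the 1000 Hz sampling rate come from the text; the function names, the way overlap is spread evenly across a rest period, and the example onset times are illustrative assumptions rather than the exact procedure used in this study.

```python
def task_segments(trial_onsets_s, seg_len_s=3.8, fs=1000):
    """Return (start, stop) sample indices of fixed-length task segments."""
    n = int(round(seg_len_s * fs))
    starts = [int(round(t * fs)) for t in trial_onsets_s]
    return [(s, s + n) for s in starts]


def rest_segments(rest_start_s, rest_stop_s, n_needed, seg_len_s=3.8, fs=1000):
    """Window one rest period into n_needed equal-length segments.

    Windows overlap only when the rest period is too short for n_needed
    non-overlapping windows (assumed strategy: evenly spaced starts).
    """
    n = int(round(seg_len_s * fs))
    start = int(round(rest_start_s * fs))
    span = int(round(rest_stop_s * fs)) - start
    if n_needed < 1 or span < n:
        return []
    if n_needed == 1:
        return [(start, start + n)]
    if span - n < n_needed - 1:
        raise ValueError("rest period too short for distinct windows")
    step = min(n, (span - n) // (n_needed - 1))
    return [(start + i * step, start + i * step + n) for i in range(n_needed)]


if __name__ == "__main__":
    print(task_segments([0.0, 4.1, 8.3]))      # hypothetical onsets (s)
    print(rest_segments(0.0, 10.0, n_needed=5))
```

Requesting as many non-task windows as there are task segments keeps the two classes balanced, mirroring the balancing step described in Section 3.5.4.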
Unbiased classification of the two mental states requires class balancing. For subjects with insufficient non-task data, overlapping windows were used to create more segments.

Feature extraction and selection

Spectral powers in predefined frequency bands were used as features to differentiate between the two classes. Prior studies demonstrate that spectral-power-based features can distinguish psychiatric and neurological disorders [87–90]. Five frequency bands between 4 Hz and 200 Hz were considered: theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), low gamma (30–55 Hz), and high gamma (65–200 Hz). Additionally, the total power (1–200 Hz) was used as a feature. The spectral powers were computed as averaged periodogram estimates in the frequency bands using the MATLAB function bandpower. Consequently, a subject with N channels has 6×N total features. The number of bipolar re-referenced channels ranged between 64 and 195, depending on the subject's electrode montage.

The features were then ranked using the Bayes factor. The Bayesian approach, unlike p-values, yields an intuitive interpretation of the evidence supported by data [91]. Although significant p-values can provide evidence against the null hypothesis, the converse may not be true: a non-significant p-value is not necessarily evidence for the null hypothesis [92]. This ambiguity and the lack of a formal calculus of inference make p-values difficult to interpret [93]. In contrast, the Bayes factor (sometimes referred to as likelihood ratio) measures the strength of relative evidence that the data provide for one hypothesis versus the other. Consider two possible hypotheses H0 and H1 for a given dataset. The evidence for the hypotheses can be compared by looking at the posterior odds Ω, which can be computed as

$$\Omega = \frac{\Pr(H_0 \mid \text{data})}{\Pr(H_1 \mid \text{data})} = \frac{\Pr(\text{data} \mid H_0)}{\Pr(\text{data} \mid H_1)} \cdot \frac{\Pr(H_0)}{\Pr(H_1)}.$$

In practice, it is reasonable to set the prior odds Pr(H0)/Pr(H1) = 1 so as to be unbiased towards either hypothesis. The data in the two classes were balanced, making this a reasonable assumption. This assumption reduces Ω to the ratio of the marginal likelihoods Pr(data|H0) and Pr(data|H1). This ratio, known as the Bayes factor BF, is given by

$$BF = \frac{\Pr(\text{data} \mid H_0)}{\Pr(\text{data} \mid H_1)}.$$

The likelihoods were computed assuming the data in both classes are Gaussian.

[Figure 4.1: Classification accuracy as a function of the number of features (k). The reported values are mean 10-fold cross-validation accuracy. Each plot represents a different subject. Axes: number of features (k) vs. accuracy (%).]

4.3 Results

4.3.1 Classifier Performance

For each subject, the data were split into ten subsets via sequential sub-sampling to ensure that test samples are not in the training samples' temporal vicinity. One of the ten folds acts as the test set in each iteration, while the other nine act as the training data. This procedure ensures that all the data are tested and makes the reported accuracy less susceptible to overfitting to a particular split. Classification accuracy of each of the ten test sets was computed using the top k features (ranked using the Bayes factor) as inputs to linear support vector machine (SVM) classifiers. This process was repeated by changing the value of k up to a maximum of 600 features. The top k∗ features with maximum accuracy are considered the optimal features for each subject. The median value of k∗ was observed to be 184.

Fig. 4.1 illustrates how the accuracy changes as a function of k up to k = 300. The accuracy generally increases with k: all subjects attain near-maximum accuracy with 50 features, after which the improvement is marginal. The best accuracy for the 10 subjects varied between 82.5% and 95%. The median classification accuracy is 88.1%, with a standard deviation of 3.7%. These results show that frequency-domain features can reliably distinguish task states from non-task states.
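One way to read "sequential sub-sampling" is that each test fold is a contiguous block of segments in time, rather than a random draw. The sketch below illustrates that reading only; it is an assumption about the procedure, and the helper name `contiguous_folds` is hypothetical.

```python
def contiguous_folds(n_samples, n_folds=10):
    """Yield (train_idx, test_idx) pairs; each test fold is a contiguous block.

    Segments are assumed to be indexed in temporal order, so a test block is
    adjacent to training data only at its two boundaries.
    """
    if not 1 < n_folds <= n_samples:
        raise ValueError("need 1 < n_folds <= n_samples")
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # spread the remainder evenly
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Illustrative use: 25 time-ordered segments split into ten folds.
folds = list(contiguous_folds(25, n_folds=10))
```

Compared with a shuffled split, this keeps most test segments away from temporally neighbouring training segments. Segments at the fold boundaries remain adjacent to training data, so a gap between blocks may be worth adding when windows overlap.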
The low standard deviation shows the robustness of the approach across multiple subjects. The task and non-task classification rates (sensitivity and specificity) are 90 ± 4.7% and 87 ± 4.3%, respectively. These rates are considerably higher than those of a prior approach based on fixed canonical correlation analysis (FCCA) applied to the same data [13].

4.3.2 The Role of Theta and High Gamma Bands

Prior studies show that specific frequency bands could be associated with cognitive control [58, 60, 70]. To determine the contribution of each of the five bands, we categorize the optimal features into five groups based on their corresponding frequency bands. Let $\mathcal{F}$ be the set of all available features for a given subject. The number of features in $\mathcal{F}$, i.e., $|\mathcal{F}|$, depends on the number of channels recorded from the subject. The set of optimal features selected by the feature selection method is denoted by $\mathcal{F}^*$, where $\mathcal{F}^* \subset \mathcal{F}$. The proportion of optimal features (p) in a subband B is given by $p = |\mathcal{F}^*_B| / |\mathcal{F}^*|$, where $\mathcal{F}^*_B \subset \mathcal{F}^*$ is the set of optimal features corresponding to the frequency band B.

[Figure 4.2: Proportion of optimal features from a specific frequency band. Axes: subject number (1–10) vs. proportion of features; bars for theta, alpha, beta, low gamma, and high gamma.]

We observe that at least 40% of the optimal features originate from theta band (4–8 Hz) activity in six subjects. In the remaining four subjects, high gamma (65–200 Hz) is more involved. Fig. 4.2 presents the proportion of optimal features (p) in each of the five bands: theta, alpha, beta, low gamma, and high gamma. It shows that the theta (median p = 32.8%) and the high gamma (median p = 24.3%) bands play a dominant role in the MSIT task engagement.

4.3.3 Comparison of Regions

Next, we isolated the recordings from specific regions of interest and evaluated each region's ability to decode task states.
The ten regions of interest, each covered in at least nine subjects, are: amygdala, caudate, dorsal anterior cingulate cortex, dorsolateral prefrontal cortex, dorsomedial prefrontal cortex, hippocampus, lateral orbitofrontal cortex, medial orbitofrontal cortex, temporal lobe, and ventrolateral prefrontal cortex. Fig. 4.3 shows a comparison of classifier performance of the ten regions of interest.

[Figure 4.3: Task vs. non-task classification accuracy using recordings from a specific region.]

The results illustrate that a subset of optimal features from one brain region can maintain the decoding ability. The dorsolateral PFC is the most informative region in all ten subjects, with a median classification accuracy of 86.6 ± 5.2%. The number of features required to achieve the best accuracy was 115 (median of the ten subjects). We can attain an accuracy of 88.1% using all available regions (see Fig. 4.1), which is only marginally higher. The second and third best decoding performances are from the temporal lobe and the dorsomedial PFC, with median accuracies of 83.6% and 81.6%, respectively. The summary of prediction accuracies is presented in Table 4.1.

Table 4.1: Summary of task vs. non-task classification results. Random label assignment would result in a baseline accuracy of 50%.

Features          FCCA [13]   Proposed–all reg.   Proposed–dlPFC
Accuracy (%)      78.1        88.07               86.57
Sensitivity (%)   71.0        90.05               86.97
Specificity (%)   79.2        87                  85.07

4.4 Conclusion

Effective treatment of psychiatric diseases requires detecting cognitive control states, identifying lapses in control, and delivering neurostimulation as patients go about their daily lives. The ability to differentiate task and non-task mental states with high accuracy, as demonstrated in this chapter, is crucial to identify objective biomarkers of cognitive control. The cross-validation ensures that the results generalize across time, and the low standard deviation of 3.7% indicates the validity of the approach for multiple subjects.
The discriminatory capacity of the theta band (see Fig. 4.2) corroborates recent findings [58, 60]. Disorders such as schizophrenia and OCD have been linked to dysfunctional modulation in dlPFC [94, 95]. As shown in [53, 96], using dlPFC as a stimulation site may enhance cognitive control. The results in Fig. 4.3 support the hypothesis that cognitive effort involves modulation of dlPFC. Additionally, the decoders presented here utilize spectral powers in predefined frequency bands, which can be computed using existing neuromodulation devices without significant computational overhead.

Chapter 5
Decoding Cognitive Control Using Channel Level Networks

5.1 Introduction

Neuropsychiatric disorders are the leading cause of disability in the United States; about one in every five American adults experiences mental illness. Existing treatments for mental illness are less than 50% effective, which leaves many patients with behavioral disorders with degraded well-being [5]. The ineffectiveness of existing treatments is partly due to a lack of mechanistic understanding of the disorders and the consequent inability to address the cognitive symptoms. Dysfunctional cognitive control characterizes a wide range of mental disorders such as depression, addiction, anxiety disorders, autism spectrum disorders, and schizophrenia [62, 82, 97–99]. Therefore, there is a pressing need to study the neurological mechanisms underlying these disorders and develop new ways to rectify impaired cognitive control. Effective cognitive control involves restricting and controlling default responses in favor of a more desired adaptive response.

Electrical deep brain stimulation (DBS) has shown some promise in modulating the brain circuits underlying abnormal behaviors. It has been proposed as a more effective approach for treating disorders such as Parkinson's disease, major depressive disorder (MDD), and obsessive-compulsive disorder (OCD) [79–81].
However, it suffers from ambiguous clinical outcomes, which limits its usage [80]. Therefore, further research is required to understand its mechanisms of action fully. Additionally, the optimal parameters of stimulation are still unknown. Increasing evidence suggests that adapting DBS parameters to a desired psychological effect, commonly known as closed-loop DBS, can be helpful [69, 83, 84]. Since stimulation during healthy activity can interfere with normal function, it is critical to identify mental states of cognitive effort (e.g., decision making during a task) to target for intervention. However, there is no established signature for engagement in mentally demanding tasks.

This chapter decodes local field potential (LFP) signals recorded from ten participants performing the Multi-Source Interference Task. The goal is to differentiate task engagement from background mental activity. Brain connectivity can be used to discover biomarkers that distinguish psychiatric disorders [100–102]. To this end, LFP-based functional networks were utilized. Reliable detection of task engagement can provide objective biomarkers to trigger stimulation. Thus, it could facilitate the development of closed-loop neuromodulation to prospectively bias a decision-making process before it begins. A prior study showed that task and rest states could be separated [13]. However, we found that the analysis in [13] has a few drawbacks. In particular, the algorithm suffers from data leakage, which results in an unreliable evaluation of the classifiers. More specifically, (1) the feature extraction utilizes test data that should be independent of the training data, and (2) the training and test data were not temporally separated. The proposed network approach outperforms the previous methods by identifying task engagement with 89.7% median classification accuracy.

The rest of the chapter is organized as follows. Section 5.2 describes the task, data, and methods. Section 5.3 presents the results. Section 5.5 concludes the chapter.

5.2 Materials and Methods

5.2.1 Data Description

The experimental setup, data collection, and preprocessing are described in Chapter 3.

5.2.2 Class Label Assignment

The neural activity recorded during MSIT trials is referred to as task data, and the data recorded during rest periods (before or after trial blocks) are referred to as non-task data. To differentiate between the task and non-task states, the signals were divided into multiple task segments and non-task segments. The time when the fixation cross was presented during a trial was marked as the trial's start. The time duration between two consecutive fixation crosses determined the length of each trial. The duration of trials varies depending on the subjects' reaction times. Time segments of 3.8 seconds (the minimum trial duration for most subjects) from every trial's start were labeled 'task' data. The signals recorded during rest periods were windowed into 'non-task' segments with a window length equal to the minimum task duration. Overlapping windows were used when the number of non-task segments was less than the number of task segments. If the amount of task data was less than the amount of non-task data, only a subset of the non-task data was used for classification. Thus, the two classes were balanced to make sure the classifiers are not biased.

5.2.3 Feature Extraction

Functional networks were then constructed for each task/non-task segment by computing its correlation matrix. Fig. 5.1 depicts examples of the task and non-task functional networks in the form of adjacency matrices. Each entry in an adjacency matrix represents the strength of the connection (also known as the edge strength) between the two channels given by its row and column numbers.
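As a rough illustration of this step, the sketch below builds a correlation-based adjacency matrix for one segment and flattens its distinct edges into a feature vector. It is a minimal sketch, not the implementation used in this work: the function name is hypothetical, the input is synthetic noise, and taking the absolute value of the correlation is an assumption.

```python
import numpy as np


def functional_network_features(segment):
    """Build a correlational network from one LFP segment.

    segment : array of shape (n_samples, n_channels)
    Returns (adjacency, edge_features), where edge_features holds the
    upper-triangular entries, i.e., one value per undirected channel pair.
    """
    # Pearson correlation between every pair of channels (columns).
    adjacency = np.abs(np.corrcoef(segment, rowvar=False))  # abs() is an assumption
    np.fill_diagonal(adjacency, 0.0)  # self-connections carry no information
    rows, cols = np.triu_indices(adjacency.shape[0], k=1)
    return adjacency, adjacency[rows, cols]


# Synthetic stand-in for a 64-channel segment (3800 samples is illustrative).
rng = np.random.default_rng(0)
segment = rng.standard_normal((3800, 64))
adj, edges = functional_network_features(segment)
```

Because the matrix is symmetric, only the upper triangle is kept, which gives N(N−1)/2 edge features for N channels.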
[Figure 5.1: Functional networks of a randomly chosen task segment and a randomly chosen non-task segment from one subject, constructed using local field potential signals from 64 channels. Panels: (a) Task, (b) Non-Task. Axes: From Channel vs. To Channel.]

The adjacency matrices are symmetric matrices, indicating that they are undirected correlational networks. The diagonals were set to zeros for better visualization. To extract useful patterns in these networks and eliminate redundancy, we employ principal component analysis (PCA). PCA is a dimensionality reduction method that attempts to reduce the number of variables while preserving as much information as possible. The resultant principal components are unique linear combinations of the edge strengths such that their variance is maximized. Maximizing the variance aids in differentiating between the two states: task and non-task. The PCA coefficients were computed only using the training data to ensure no information leakage into the test data.

5.3 Results

5.3.1 PCA Features

Functional networks shown in Fig. 5.1 contain task-specific patterns of brain activity. A time series with N channels represents a network with N nodes and $\binom{N}{2}$ connections. For example, a 64-channel recording would create 2016 features. The high dimensionality of the data makes it challenging to identify the patterns. PCA transforms the feature space such that the variance of the projections (in the new space) is maximized. The resultant features are usually more separable, aiding the classification process. Fig. 5.2(a) illustrates that when edge strengths are used as features, the two classes are not easily separable. In Fig. 5.2(b), we present a two-dimensional scatter plot of the first two principal components of subject-1.
It can be observed that the task and non-task data are separable in the resultant feature space. Thus, PCA helps in finding patterns that differentiate task and non-task functional networks.

[Figure 5.2: Two-dimensional scatter plot of two features before and after PCA from task and non-task data of subject-1. Panels: (a) Edge Strengths (Feature 1 vs. Feature 2), (b) First two principal components (Principal Component 1 vs. Principal Component 2).]

[Figure 5.3: Accuracy, sensitivity and specificity of task vs. non-task classification.]

5.3.2 Classification Results

For each subject, the data were split into ten subsets via sequential sub-sampling to ensure that test samples are not in the training samples' temporal vicinity. One of the ten folds acts as the test set in each iteration, while the other nine act as the training data. This procedure, known as ten-fold cross-validation, ensures that all the data are tested and makes the reported accuracy less susceptible to overfitting to a particular split. The PCA features were used as inputs to linear support vector machine (SVM) classifiers. The mean classification accuracy of the ten test sets was recorded. This process was repeated by changing the number of features/inputs from 1 to 150. The features with the highest cross-validation accuracy are considered the optimal features.

Results of the classification for all subjects are presented in Table 5.1. The median task vs. non-task prediction accuracy is 89.7 ± 6.5%. This is a substantial improvement over the 78.1 ± 7.39% reported in [13]. The median task and non-task accuracies (sensitivity and specificity) are 93.7 ± 6.05% and 90.2 ± 9.69%, respectively. The summary of prediction accuracies is presented in Fig. 5.3.

Table 5.1: Accuracy of task vs. non-task classifiers for the ten subjects. Sensitivity and specificity represent the task and non-task accuracies, respectively.
Random label assignment would result in a baseline accuracy of 50%.

Sub.   Num. ch.   Acc.    Sens.    Spec.   Num. of PCs
1      64         87.17   95.22    79.13   36
2      150        92.27   94.55    90.00   92
3      141        86.49   92.78    80.53   113
4      162        98.75   100.00   97.50   14
5      189        85.20   79.17    90.77   13
6      183        94.29   95.71    92.86   53
7      130        77.08   86.94    67.22   115
8      194        93.93   91.43    96.43   73
9      195        93.54   96.67    90.42   108
10     126        82.63   87.32    77.69   149

5.3.3 Classifier Runtimes

Table 5.1 shows that the number of channels and, consequently, the number of optimal features, i.e., principal components (PCs), vary among the subjects. The computational complexity of the decoders is relatively low due to the simplicity of the approach. We computed the time taken to process a randomly chosen LFP segment, extract the features, and determine its classifier outcome for each subject. These classifiers can be used to detect task engagement in about 2 seconds or less, as illustrated in Table 5.2. The runtimes were calculated using MATLAB programs implemented on a general-purpose machine with an Intel Core i7-8565U processor and 16 GB memory.

Table 5.2: The time taken to predict whether the participant is engaged in the MSIT or not. Each column shows the mean and standard deviation of 100 runs of the algorithm on a 3.8-second multidimensional segment.

Subject    1      2      3      4      5      6      7      8      9      10
µ (sec.)   2.09   1.99   2.04   1.57   1.65   1.88   1.89   1.42   1.55   1.50
σ (sec.)   0.09   0.15   0.07   0.09   0.13   0.18   0.09   0.03   0.11   0.04

Table 5.3: Summary of task vs. non-task classification results. The classifiers were trained separately for each subject. The table presents mean and standard deviation values of classification accuracy, sensitivity and specificity for each of the three network construction methods.
Random label assignment would result in a baseline accuracy of 50%.

Connectivity type   Functional – R   Effective – DI   Effective – CCM
Accuracy            89.13 ± 6.5      84.7 ± 5.8       86.6 ± 5.06
Sensitivity         91.98 ± 6.05     87.7 ± 6.3       89.18 ± 5.3
Specificity         86.25 ± 9.69     81.9 ± 7.33      84.01 ± 7.4

5.4 Comparison with Effective Networks

Inter-channel DI and CCM connections calculated from the 64-dimensional time series of subject-1 are depicted in Fig. 5.4. The task vs. non-task classification accuracy using effective networks is presented in Table 5.3. The results from correlational networks are also shown for comparison. The correlational networks outperform causal networks in this context. This is possibly due to redundancy added by the additional features from the causal networks. Note that causal networks have twice as many edges as correlational networks due to their directionality. This leads to redundant information unless there is a way to combine these edges in a fashion that retains useful information. The next chapter presents a way to consolidate these networks into more meaningful region-level networks.

[Figure 5.4: Channel-level effective (causal) networks of subject-1. The networks were constructed using local field potential signals from 64 channels. Panels: (a) DI, (b) CCM.]

5.5 Discussion and Conclusion

Cognitive control refers to the effortful deployment of cognitive resources for adaptive response to the environment. Intact cognition consists of interrelated executive functions, including updating (i.e., monitoring working memory), inhibition (resisting prepotent responses), and shifting (switching between mental states) [43]. There is no well-defined neural signature for the mental effort associated with these coordinated cognitive processes that facilitate decision-making.
Several studies show that cognitive control deficits are central symptoms in many psychopathological conditions, including schizophrenia, depression, and addiction. Interventions such as DBS that target the cognitive control networks could help mitigate the symptomatic distress, functional impairments, and diminished quality of life prevalent across psychiatric disorders. Developing effective therapeutics through adaptive neuromodulation requires rapid detection of focused mental activity.

This chapter demonstrates that such mental states are encoded in the functional connectivity of the brain. Fig. 5.2 illustrates that neural correlates of task-related mental effort can be separated from background activity. The results in Fig. 5.3 and Table 5.1 show that task engagement can be detected with high accuracy. The proposed approach attains a median accuracy of 89.7%, a substantial increase from the 78.1% reported in [13]. There is some variation between the subjects, partly due to variation in the electrode implant setups; the number of channels varied between 64 and 195. Moreover, the non-task data involved recordings during free behavior without any restrictions or standardization between the participants. This heterogeneity in the non-task activity might explain why the classifiers have slightly higher sensitivity than specificity. A limitation of the current work is that it considers only a single task. Future work will focus on applying the algorithm to multiple cognitive tasks.

Chapter 6
Maximal Variance Node Merging and Decoding Cognitive Control Using Region-Level Networks

6.1 Introduction

Neuropsychiatric disorders impose an enormous global disease burden that leads to premature mortality and degraded quality of life. Existing treatments for these disorders are less than 50% effective [103]. Many patients with mental illness do not get relief from gold-standard clinical therapies, resulting in a pressing need for new treatments.
Recent studies indicate that these treatments might emerge from measuring and remediating cognitive deficits underpinning mental illness, e.g., through electric stimulation [84, 104, 105]. Dysfunctional decision-making and cognitive control are common features in a wide range of mental disorders such as depression, addiction, anxiety disorders, autism, and schizophrenia [62, 82, 106]. Cognitive control is a set of interrelated executive functions, including updating (i.e., monitoring working memory), inhibition (resisting prepotent responses), and shifting (switching between mental sets) [43]. It is, therefore, crucial for accommodating daily life requirements, ultimately affecting the quality of life [5]. The ability to remediate cognitive control might thus be a new approach to treating mental disorders.

Cognitive control can be measured in real time through standard laboratory tasks, and direct brain stimulation can improve the performance during these tasks [69, 70, 96]. It is, however, unclear when this type of intervention should be automatically triggered to achieve a desired psychological effect. Increasing evidence indicates, however, that cognitive dysfunction in mental disorders can be described as aberrant patterns of interactions between neural elements in a large-scale brain network [101, 102, 104, 107]. It should therefore be possible to decode this dysfunction by examining changes in functional network communication.

The majority of the existing brain network studies related to cognitive control are limited to functional connectivity [5, 6]. Although some effective connectivity studies exist, they are based on signals acquired through functional magnetic resonance imaging (fMRI) [9, 10]. fMRI is an indirect measure of neural activity and is recorded at a low temporal resolution, which is not ideal for capturing cognitive functions that involve finer temporal dynamics.
A better approach would be direct decoding of cognitive control and control lapses from electrical brain recordings [69].

A recent attempt to decode cognitive control task engagement reported successful classification [13, 108], but only analyzed functional connectivity. Such correlational analyses cannot identify the direction of information flow in brain networks, a critical variable for deciding how to stimulate. The analysis in [13] also suffered from data leakage, which may have led to unreliable evaluation of the classifiers. More specifically, (1) the feature extraction utilized test data that should be independent of the training data, and (2) the training and test data were not temporally separated. Another challenge is that the underlying datasets come from human participants with highly variable placement of their brain electrodes in each brain region. Data interpretation requires methods for measuring and controlling for that variability across participants.

Deriving meaningful causal networks from task-related activity has remained a challenge. Granger causality has been used in numerous neuroscience applications [19]. However, Granger's approach could lead to ambiguity in biological and ecological networks due to non-separable dynamics [24]. This chapter uses directed information and convergent cross-mapping, causal inference techniques that are not based on Granger's theory. The problem of electrode variability can be addressed by merging several channels (electrodes) associated with a region to form region-level networks. In most prior research, region-level signals were computed by averaging signals from multiple channels in a region [109]. These averaged signals do not necessarily correlate with the tasks, as classifiers based on these networks fail to achieve high accuracy (as described in Section IV). We overcome this challenge by using the proposed novel maximal variance node merging (MVNM) approach.
Causal cognitive control networks based on electrophysiological signals have not been constructed before. These networks not only confirm known network properties of cognitive control but also lead to new findings that can be explored further.

The contributions of this chapter are five-fold.

1. We construct and compare three different brain networks: one functional and two effective networks. We derive causal networks based on a technique called convergent cross-mapping (CCM) [12] and show that the causal networks help identify regions of interest associated with task engagement. Thus, we utilize distributed brain connectivity analysis to not only detect task engagement, but also identify potential biomarkers for cognitive control.

2. We propose a novel technique referred to as maximal variance node merging (MVNM) to estimate region-level interactions. Unlike channel-level networks, region-level networks are more interpretable and more relevant for clinical translation.

3. We introduce and present task engagement networks (TENs) by combining the most explanatory network interactions from multiple subjects. The TENs can be further analyzed to identify significant regions that may be useful as stimulation sites.

4. We demonstrate that the causal inter-region networks can differentiate mental states associated with task performance from resting-state activity with 85.2% median accuracy. A previous analysis using the same data attained 78% accuracy [13].

5. We show that subband networks constructed from bandpass-filtered signals also encode task-specific activity. In particular, theta band (4–8 Hz) networks play a major role in detecting task engagement, consistent with prior findings that theta band oscillations are modulated during cognitive control.

6.2 Materials and Methods

6.2.1 Data Description

The experimental setup, data collection, and preprocessing are described in Chapter 3.
6.2.2 Connectivity Measures

Network science provides a particularly appropriate framework to study brain mechanisms by treating neural elements (a population of neurons, a subregion) as nodes in a graph and neural interactions (synaptic connections, information flow) as its edges [110]. Structural connectivity represents large-scale anatomical connections between cortical regions. Functional and effective connectivities are generally estimated from the time series of brain signals. A fundamental distinction between the two is that the edge weights in functional networks correspond to cross-correlation coefficients, while those in effective networks correspond to patterns of causal interactions. Quantifying (correlative or causative) interactions between time series is of particular interest in studying complex network systems such as the brain [14]. This study estimates and analyzes functional connectivity and effective connectivity between the electrodes and the regions. The effective networks were constructed using two methods: directed information (DI) and convergent cross-mapping (CCM). These methods were described in Chapter 2.

Instead of assuming parametric models such as the auto-regressive model used by Granger causality, DI is based on information theory [29]. It can therefore measure nonlinear interactions and is not dependent on accurate estimation of model parameters. A benefit of using CCM for causal inference is that, unlike Granger-related causality measures, CCM does not assume 'separability' between variables [12]. The separability assumption is usually valid in linear and strongly coupled non-linear systems. Complex systems such as the brain are characterized by moderate to weak coupling and are affected by unobserved/external variables.

The cross-mapping in CCM was implemented in MATLAB based on the algorithm presented in [111]. A library of L points is constructed, and the correlation coefficient $\rho_{X\hat{X}}$ between X and its cross-mapped estimate $\hat{X}$ is used as an indicator of the influence of X on Y. The causal connection from Y to X can be determined analogously. A library length (L) of 3000 is used in this study since all signals are at least 3000 samples long. The time delay τ and the embedding dimension E are chosen to be 5 and 10, respectively, after extensive trial and error.

6.2.3 Maximal Variance Node Merging and Region-Level Networks

A significant challenge in this specific application space, read-out of cognition from distributed electrodes, is an imbalance in the number of measurements between nodes/edges. A given brain region (node) may have anywhere from 1 to 5 physical electrodes measuring it, depending on the size of the region and the specific clinical placement of the electrode. This imbalance makes it harder to interpret channel-level networks, where each node corresponds to a specific electrode contact. Similarly, since electrode positions vary between subjects, it is unclear which channels can be safely averaged/combined. That combination is achievable if we can identify channels as belonging to specific brain regions, such that we can work in terms of the dominant signal within each region and the inter-region interactions.

[Figure 6.1: The MVNM algorithm and graph visualization of a sample causal network before and after MVNM. The causal network has three regions with two channels (each) in regions 1 and 2, and three channels in region 3. Panels: (a) Original network, (b) Inter-region network after MVNM, (c) Illustration of the MVNM algorithm. N is the number of task/non-task segments.]

[Figure 6.2: Functional (correlative) and effective (causal) networks of subject-1 constructed from a randomly chosen task segment. Panels: (a) Functional – R, (b) Effective – DI, (c) Effective – CCM. Axes: From region vs. To region. Regions shown: LAmyg, LHipp, LdaCC, LdlPFC, LdmPFC, LmOFC, LpostCC, Ltemporal, RAmyg, RCaudate, RHipp, RdaCC, RdlPFC, RdmPFC, RlOFC, Rtemporal, RvlPFC.]

We introduce maximal variance node merging (MVNM) as an approach to combine nodes in a channel-level network that were mapped to the same brain region to generate a region-level network. Each node in a region-level network is associated with a brain region. This enables us to interpret the network connections better and, consequently, detect specific task engagement regions.

First, the channels were organized into their corresponding regions based on electrode localization results. Then, region-level networks were constructed from channel-level networks using MVNM. For each pair of regions, all the network connections between the regions (inter-region connections) were replaced by one representative connection in the case of undirected networks.
Note that intra-region connections were discarded. There would be two resultant connections in directed networks: one representing all the edges from region-A to region-B and the other for the edges from region-B to region-A. Fig. 6.1(a) and Fig. 6.1(b) depict an example network before and after the node merging process, respectively.

These new inter-region edges were estimated by computing the optimal linear combination of the original edges to maximize the variance over time. This process is equivalent to finding the first principal component of the original edges between the two regions using principal component analysis (PCA), a dimensionality reduction method that attempts to reduce the number of variables while preserving as much information as possible. This approach is beneficial since the goal is to use these edges as features for classification. The MVNM algorithm's rationale is that a larger variance implies a broader spread in the feature space, which helps find the decision boundary.

Fig. 6.1(c) illustrates this process for two regions A and B with m and n channels, respectively. There can be mn edges between the two regions in an undirected network (functional) and 2mn edges in a directed network (effective). Intra-region connections can be ignored here. If there are N such networks in the dataset, it can be represented by a matrix of dimensions $N \times mn$, denoted by X. Note that N is the number of time-series segments (task and non-task) extracted for a given subject. There would be two such matrices, $X_{in}$ and $X_{out}$, for effective networks. The optimal linear combination of the columns of X that maximizes its variance is given by Xw, where the $mn \times 1$ weight vector w is computed as:

$$w = \arg\max_{\|w\|=1} \left\{ w^{T} X^{T} X w \right\}. \tag{6.1}$$

The optimal linear combination of these connections, Xw, provides the inter-region edge strengths for the N networks. Note that Xw is the first principal component of X.
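As an illustration of (6.1), the merging step can be sketched in a few lines of NumPy. This is a minimal sketch, not the implementation used in the study: the function name `mvnm_merge` and the random example data are hypothetical, and (6.1) is applied literally to X, whereas standard PCA would first subtract the column means (whether X was centered in the original analysis is not stated here).

```python
import numpy as np

def mvnm_merge(X):
    """Merge the inter-region edges of two regions into one edge per network.

    X : (N, mn) array; row i holds the mn channel-level edge strengths
        between the two regions in network i (one task or non-task segment).
    Returns (w, z): the unit-norm weight vector of (6.1) and the N merged
    inter-region edge strengths z = X w.
    """
    X = np.asarray(X, dtype=float)
    # eigh returns eigenvalues of the symmetric matrix X^T X in ascending
    # order, so the last eigenvector maximizes w^T X^T X w with ||w|| = 1.
    _, eigvecs = np.linalg.eigh(X.T @ X)
    w = eigvecs[:, -1]
    # The sign of an eigenvector is arbitrary; fix it for reproducibility.
    if w.sum() < 0:
        w = -w
    return w, X @ w

# Hypothetical example: N = 200 segments, regions with m = 2 and n = 3 channels.
rng = np.random.default_rng(0)
X = rng.random((200, 6))
w, z = mvnm_merge(X)
```

For a directed network, the same function would be called once on $X_{in}$ and once on $X_{out}$, giving the two inter-region edges.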
In effective networks, the incoming and outgoing connections were processed separately to determine the two inter-region connections. Note that the adjacency matrices of the effective networks (Fig. 6.2(b) and Fig. 6.2(c)) are not symmetric, which reflects the directionality of the networks.

6.2.4 Edge Importance Score and Task Engagement Networks (TENs)

Once all the networks were computed, PCA of the edges was used for feature extraction. Task engagement networks were constructed from the most significant connections that were effective in identifying MSIT states. Note that principal components of the network edges were used as the features for classification. The PCA assigns a weight to each network connection to compute optimal principal components. These PCA coefficients were used as an indicator of the significance of the connections. For a given connection between two regions, the sum of its coefficients linked to the top p features (principal components) is defined as its importance score. The edges were then ranked based on their importance scores, and the q highest-ranked edges were deemed significant edges, represented by the set $S_{sub}$ for a given subject sub. Parameters p = 10 and q = 60 were used in our analysis. The parameters p and q were chosen to yield high classification accuracy (see Section 6.2.5). This algorithm is depicted in Fig. 6.3.

Figure 6.3: Flowchart showing the key steps of 'task' vs. 'non-task' classification process and determination of task engagement networks.

A 10-fold cross-validation split was performed on the data before the PCA, as shown in Fig. 6.3. Since the cross-validation results in ten models for each subject, $S_{sub}$ is the intersection of significant edges from the ten models. This is given by the expression $S_{sub} = \bigcap_{fold} S_{sub,fold}$.
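The ranking and intersection steps above can be sketched as follows. This is a hedged illustration rather than the study's code: the function names and the synthetic feature matrix are hypothetical, and the coefficients are summed in absolute value (an assumption, so that positive and negative loadings do not cancel; the text only says "sum of its coefficients").

```python
import numpy as np

def significant_edges(F, p=10, q=60):
    """Return the indices of the q edges with the highest importance score.

    F : (N, d) array of region-level edge strengths (N networks, d edges).
    """
    Fc = F - F.mean(axis=0)                    # center before PCA
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Fc, full_matrices=False)
    # Assumption: absolute PCA coefficients over the top p components.
    scores = np.abs(Vt[:p]).sum(axis=0)
    top = np.argsort(scores)[::-1][:q]
    return set(top.tolist())

def subject_edge_set(F, train_folds, p=10, q=60):
    """Intersect the significant-edge sets obtained on each training fold."""
    sets = [significant_edges(F[idx], p, q) for idx in train_folds]
    return set.intersection(*sets)

# Hypothetical data: 300 networks with 120 inter-region edges each.
rng = np.random.default_rng(1)
F = rng.standard_normal((300, 120))
blocks = np.array_split(np.arange(300), 10)
train_folds = [np.concatenate(blocks[:k] + blocks[k + 1:]) for k in range(10)]
S_sub = subject_edge_set(F, train_folds)
```

Because the intersection can only shrink the set, $S_{sub}$ contains at most q edges and may contain far fewer when the folds disagree.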
Taking the intersection across folds ensures that the edges are not specific to a subset of the data and generalize across the dataset. The mean network of all $S_{sub}$ constitutes the task engagement network $S_{TEN}$. That is, the edge strength $e_{AB}$ from region A to region B is defined as

$$e_{AB} = \frac{\sum_{sub} e_{AB,sub}}{|sub|}, \tag{6.2}$$

where $e_{AB,sub} \in S_{sub}$ and $|sub|$ is the number of subjects. Therefore, the edges in the TEN signify their prevalence among multiple subjects.

6.2.5 Task vs. Non-Task Classification

The PCA features were used as inputs to linear support vector machine (SVM) classifiers to distinguish task data from non-task data. For a given subject, training and testing sets for each of the ten folds were selected via sequential sub-sampling to ensure that test samples are not in the temporal vicinity of the training samples. Essentially, data from each class are sequentially partitioned into ten subsets (folds). In each iteration, one of the ten folds is used for testing, while the other nine are used for training. This setup ensures that all the data are tested, and the classifiers are less prone to overfitting. Fig. 6.3 depicts the classification process. Classifiers with a varying number of inputs were trained with a maximum limit of 100 features, and the testing accuracy with the optimal number of features is reported.

6.2.6 Subband Networks

Neurophysiological studies indicate that theta activity increases with the need for cognitive control [55, 58–60]. We constructed subband networks to analyze the network interactions of frequency-specific neural activity. First, the neural recordings were bandpass filtered into five pre-defined subbands to quantify the role of individual frequency bands on task engagement. These bandpass filtered signals were then used to construct band-specific networks, named subband networks. The frequency bands considered in the study are theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), gamma-1 (30-55 Hz), and gamma-2 (65-100 Hz).
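A subband decomposition of this kind can be sketched with SciPy as shown below. Several details are assumptions made only for illustration: the sampling rate `fs` is hypothetical, zero-phase Butterworth band-pass filters are used, and `order=3` is chosen because `butter` with `btype="bandpass"` returns a filter of twice the requested order (so a 6th-order band-pass); the exact design used in the study may differ.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Band edges in Hz, as listed in the text.
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30),
         "gamma1": (30, 55), "gamma2": (65, 100)}

def subband_signals(x, fs, order=3):
    """Zero-phase band-pass filter x (time along the last axis) into each band."""
    out = {}
    for name, (lo, hi) in BANDS.items():
        # Second-order sections are numerically safer for narrow low bands.
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        # Forward-backward filtering cancels the phase response of the filter.
        out[name] = sosfiltfilt(sos, x, axis=-1)
    return out

fs = 1000.0                       # hypothetical sampling rate in Hz
t = np.arange(10000) / fs         # 10 s of samples
x = np.sin(2 * np.pi * 6 * t) + np.sin(2 * np.pi * 40 * t)
bands = subband_signals(x, fs)    # the 6 Hz tone lands in "theta", 40 Hz in "gamma1"
```

Each filtered signal can then be passed to the same network construction step (R, DI or CCM) to obtain one network per band.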
The bandpass filtering was implemented using 6th order Butterworth IIR filters. All signals were filtered bidirectionally to avoid undesired phase shifts introduced by the filtering, which can affect causal inference.

6.3 Results

6.3.1 Identifying Task States

Figure 6.4: Task vs. non-task classification accuracy (%) using networks constructed using three connectivity measures: correlation (R), directed information (DI), convergent cross-mapping (CCM). Each point within the boxplots represents 10-fold cross-validation accuracy of a participant.

Table 6.1: Summary of task vs. non-task classification results. The classifiers are subject specific. The table presents median and interquartile range values (in %) of classification accuracy, sensitivity (task accuracy) and specificity (non-task accuracy) for each of the three network construction methods. Random label-assignment would result in a baseline accuracy of 50%. The highest value in each row is marked with an asterisk (bold in the original table).

Network    Functional                       Effective
Method     FCHA [13]      R+MVNM            DI+MVNM         CCM+MVNM
Acc.       78.1 ± 7.39    82.74 ± 7.45      80.85 ± 4.9     85.17 ± 5.0 *
Sens.      71.0 ± 10.3    84.86 ± 8.57      82.58 ± 8.57    87.49 ± 8.77 *
Spec.      79.2 ± 7.7     83.7 ± 9.31 *     79.74 ± 6.38    82.12 ± 13.2

The SVM models were evaluated based on how accurately they could distinguish between task and non-task states for all subjects. Their classification accuracy, sensitivity (true-positive rate), and specificity (true-negative rate) were calculated; the task data was considered as the positive class. Fig. 6.4 presents the SVM classification accuracy for all the subjects evaluated based on the three network construction approaches. The reported values represent the mean accuracy over 10-fold cross-validation.
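The evaluation protocol can be sketched as follows: each class is split into ten contiguous blocks (the sequential sub-sampling of Section 6.2.5), and accuracy, sensitivity and specificity are computed with the task class as positive. The function names, label coding (1 = task, 0 = non-task) and toy labels are assumptions for illustration; the classifier itself is omitted.

```python
import numpy as np

def sequential_folds(y, k=10):
    """Yield (train_idx, test_idx), splitting each class into k contiguous blocks."""
    y = np.asarray(y)
    per_class = [np.array_split(np.flatnonzero(y == c), k) for c in np.unique(y)]
    for i in range(k):
        test = np.concatenate([class_blocks[i] for class_blocks in per_class])
        train = np.setdiff1d(np.arange(y.size), test)
        yield train, test

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity with label 1 (task) as positive."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return (tp + tn) / y_true.size, tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 50 task segments followed by 50 non-task segments.
y = np.array([1] * 50 + [0] * 50)
splits = list(sequential_folds(y))
acc, sens, spec = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

Keeping the blocks contiguous, rather than shuffling, is what keeps test samples away from the temporal neighbourhood of the training samples within each class.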
Fig. 6.4 illustrates that all network types attain accuracies substantially higher than the baseline of 50%, suggesting that there are patterns of interaction in the neural activity that can be used to identify task engagement. CCM networks perform the best with 85.2 ± 5.0% accuracy. The high accuracy indicates that inter-region interactions contain useful task-related information.

The median accuracy and interquartile range values across the ten subjects are summarized in Table 6.1. The median accuracy, sensitivity, and specificity of all MVNM-based networks are close to or above 80%. The interquartile range values are also low, indicating the algorithm's reliability and robustness across multiple human subjects. The slightly lower specificity compared to sensitivity may be attributed to heterogeneity in the non-task data. Note that the non-task data represent the free behaviour of the subjects between trials. All three methods outperform fixed canonical coherence analysis (FCHA) presented in [13].

6.3.2 Classification Results Using Mean Region Networks

This section presents a summary of task vs. non-task classification results using networks computed from averaged time-series in each region. The local field potential signals from all channels within a region of interest are averaged, and the resulting time-series is used to generate the networks. It can be observed that this approach attains about 10% lower classification accuracy compared to MVNM. Results from MVNM and FCHA are also included in Table 6.2 for comparison.

Table 6.2: Summary of task vs. non-task classification results. The table presents median and interquartile range values (in %) of classification accuracy, sensitivity (task accuracy) and specificity (non-task accuracy) for each of the three network connectivity measures: R, DI and CCM.

Method                   Averaged time-series                             MVNM
Conn.    FCHA [13]       R               DI             CCM              R              DI             CCM
Acc.     78.1 ± 7.39     75.79 ± 7.59    70.04 ± 10.7   66.9 ± 8.87      82.74 ± 7.45   80.85 ± 4.9    85.17 ± 5.0
Sens.    71.0 ± 10.3     78.45 ± 11.39   69.8 ± 23      56.53 ± 26.36    84.86 ± 8.57   82.58 ± 8.57   87.49 ± 8.77
Spec.    79.2 ± 7.7      76.7 ± 14.1     67.97 ± 9.6    82.89 ± 16.84    83.7 ± 9.31    79.74 ± 6.38   82.12 ± 13.2

6.3.3 Task Engagement Networks

Fig. 6.5 depicts the TENs from functional (R) and effective networks (DI and CCM). Each edge in the graphs represents the number of times a specific connection appears in $S_{TEN}$. Out of the fourteen regions of interest that emerged from the analysis, connections between some regions are more prominent than the others. More interestingly, the graph visualizations in Fig. 6.5(c) showcase a close resemblance to the results in Fig. 6.5(b), which are based on an independent causal approach.

To quantify the importance of each of the regions in task engagement, we measured the node centralities of the regions in each TEN (see Fig. 6.6). Node degree and outdegree were computed for functional and (two) effective networks, respectively. The dorsolateral PFC (dlPFC) and temporal lobe show more centrality in all three cases, although the distinction is more prominent in effective networks. In correlation networks, the difference between regions is less noticeable, making it harder to discriminate key hubs using just correlation analysis. However, dlPFC has a considerably higher outdegree in effective networks estimated from causal interactions, followed by temporal lobe and ventrolateral PFC (vlPFC).

Figure 6.5: Task engagement networks generated from the three network construction methods. (a) Correlation. (b) Directed information. (c) Convergent cross-mapping. Each node represents one of the 14 regions of interest (Acc: accumbens, Amyg: amygdala, caudate, Hipp: hippocampus, dACC: dorsal anterior cingulate cortex, dlPFC: dorsolateral prefrontal cortex, dmPFC: dorsomedial prefrontal cortex, lOFC: lateral orbitofrontal cortex, mOFC: medial orbitofrontal cortex, parahipp: parahippocampus, postCC: posterior cingulate cortex, rACC: rostral anterior cingulate cortex, temporal lobe, vlPFC: ventrolateral prefrontal cortex). The thickness of the edges represents edge strength, as described by (6.2).

Figure 6.6: Node centrality of each region in the task engagement networks. (a) Correlation. (b) Directed information. (c) Convergent cross-mapping. The bar plots represent node degree for undirected (R) networks and outdegree for directed networks (DI and CCM).

6.3.4 Subject-Specific Task Engagement Networks

Task engagement networks (TENs) for each subject are presented in this section. Fig. 6.7, Fig. 6.8 and Fig. 6.9, respectively, show subject-specific TENs derived using correlation coefficient (R), DI and CCM. By visual inspection, we can observe that dlPFC and the temporal lobe are major hubs in the graphs. This observation is more pronounced in causal networks (DI and CCM).
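The hub analysis above rests on two simple operations: averaging the subject-level edge sets into a TEN as in (6.2), and counting outgoing edges per region. A minimal sketch is given below under stated assumptions: $e_{AB,sub}$ is treated as an indicator (1 if the edge from A to B belongs to $S_{sub}$, 0 otherwise), outdegree is taken as the unweighted count of outgoing TEN edges, and the edge sets are invented toy data.

```python
import numpy as np

def task_engagement_network(edge_sets, n_regions):
    """Average subject-level edge indicators into a TEN adjacency matrix, cf. (6.2).

    edge_sets : list with one set of (source, target) region-index pairs per subject.
    Assumption: an edge contributes 1 for every subject whose set contains it.
    """
    ten = np.zeros((n_regions, n_regions))
    for S in edge_sets:
        for a, b in S:
            ten[a, b] += 1.0
    return ten / len(edge_sets)

def outdegree(ten):
    """Number of outgoing edges with non-zero strength for each region (row)."""
    return np.count_nonzero(ten, axis=1)

# Hypothetical significant-edge sets for three subjects over four regions.
S = [{(0, 1), (0, 2), (3, 0)}, {(0, 1), (2, 3)}, {(0, 1), (0, 3)}]
ten = task_engagement_network(S, 4)
deg = outdegree(ten)
```

For an undirected (correlation) TEN, the adjacency matrix is symmetric and the same row count gives the node degree.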
The regions of interest are labelled as follows: Acc– accumbens, Amyg– amygdala, caudate, Hipp– hippocampus, dACC– dorsal anterior cingulate cortex, dlPFC– dorsolateral prefrontal cortex, dmPFC– dorsomedial prefrontal cortex, lOFC– lateral orbitofrontal cortex, mOFC– medial orbitofrontal cortex, parahipp– parahippocampus, postCC– posterior cingulate cortex, rACC– rostral anterior cingulate cortex, temporal lobe, vlPFC– ventrolateral prefrontal cortex.

Figure 6.7: Task engagement networks derived using inter-region correlational networks.

Figure 6.8: Task engagement networks derived using inter-region DI networks.

Figure 6.9: Task engagement networks derived using inter-region CCM networks.

6.3.5 TENs in Left and Right Hemispheres

Task engagement networks for both the hemispheres are shown in Fig. 6.10. These directed networks were constructed using CCM. The corresponding node centralities are also presented in Fig. 6.11. The networks depicted in Fig. 6.12(a) include intrahemispheric (within-hemisphere) and interhemispheric (cross-hemisphere) links. The node centrality values indicate no significant differences in the main hub regions in both hemispheres: dlPFC, temporal and vlPFC. Some interhemispheric links can be observed between the high centrality nodes. However, these connections should not be confused with true information flow or structural connectivity. They may indicate functional correlations or zero-lag interactions between the nodes.

6.3.6 Increased Theta Band Activity During Task Performance

Greater theta band activity is known to be associated with cognitive control [55, 56, 58–60, 72]. Using bandpass filtering, the neural activity in the task and non-task periods was divided into five subbands: theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), gamma-1 (30-55 Hz), and gamma-2 (65-100 Hz).
Relative power in the theta band was computed as a ratio of the spectral power in the 4-8 Hz frequency range to the total power spectral density in 4-100 Hz. Fig. 6.13 shows a distribution of relative power in theta band in the left dlPFC of subject-2 in task vs. non-task segments. Theta power during the task and non-task states has a log-normal distribution, with means 0.28 and 0.21, respectively, demonstrating an increased theta band power during this cognitive control task. We also observed an enhanced dlPFC theta activity in seven out of the ten subjects. The increased mean relative power in the seven subjects is also associated with p-values < 0.05 and Bayes factor > 9 (in 6 of the 7 cases) according to the two-sample t-test. This is consistent with a wide range of prior reports implicating modulated theta-band activity during cognitive control [55, 56, 58–60, 112].

Figure 6.10: Task engagement networks in left and right hemispheres generated using CCM. (a) Left hemisphere. (b) Right hemisphere.

Figure 6.11: Node centrality (outdegree) of each region in the left and right task engagement networks. (a) Left hemisphere. (b) Right hemisphere.

Figure 6.12: Task engagement network including inter-hemispheric connections. (a) Task engagement network. (b) Node centrality (outdegree).

Figure 6.13: Histogram of relative theta band power in left dorsolateral PFC of subject-2 for task and non-task periods.
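The relative theta power defined in Section 6.3.6 can be illustrated with a short NumPy sketch. The spectral estimator is an assumption: a plain periodogram is used here for brevity, while the estimator used in the study is not specified in this section; the sampling rate and test signal are likewise hypothetical.

```python
import numpy as np

def relative_theta_power(x, fs, theta=(4.0, 8.0), total=(4.0, 100.0)):
    """Ratio of spectral power in the theta band to the power in 4-100 Hz."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # remove the DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2             # unnormalized periodogram
    in_theta = (freqs >= theta[0]) & (freqs <= theta[1])
    in_total = (freqs >= total[0]) & (freqs <= total[1])
    # The normalization constant of the periodogram cancels in the ratio.
    return psd[in_theta].sum() / psd[in_total].sum()

fs = 1000.0                     # hypothetical sampling rate in Hz
t = np.arange(4000) / fs        # 4 s segment
# Two equal-amplitude tones, one inside the theta band and one outside it.
x = np.sin(2 * np.pi * 6 * t) + np.sin(2 * np.pi * 40 * t)
ratio = relative_theta_power(x, fs)   # close to 0.5 for this test signal
```

Computing this ratio per segment and comparing the task and non-task distributions gives the kind of histogram shown in Fig. 6.13.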
6.3.7 Theta Band Network Interactions Differentiate Task and Non-task States

To evaluate the role of inter-region activity in each of the five subbands during the MSIT, we decode the task states using the subband networks. The aim is to discern and compare the discriminative ability of network dynamics associated with each of the five subbands. Since CCM networks have the best performance, CCM-based subband networks are used. We observe that the median classification accuracy of the subband networks is 79.5%, marginally less than the networks without frequency filtering. This implies that subband dynamics also encode task-related information. By measuring the number of significant edges corresponding to each subband, we observe that 4-8 Hz activity distinguishes task states better than other frequencies. This is illustrated in Fig. 6.14.

TENs built using only significant theta subband edges also highlight the influence of dlPFC and temporal lobes in the MSIT, as shown in Fig. 6.15. The dorsomedial PFC (dmPFC) attains the third highest centrality. This shows that theta oscillations (especially in dlPFC) and their network-level interactions can act as biomarkers for task engagement.

Figure 6.14: Proportion of optimal features from each subband (theta, alpha, beta, gamma-1, gamma-2), shown as the percentage of top features for each patient (P1 to P10).

6.4 Discussion

Despite its pressing need, devising an effective neurological mechanism-based treatment to enhance cognitive control is still a major challenge. This might be achieved through adaptive brain stimulation that intervenes when cognitive control lapses [69, 70, 113]. Developing such a treatment requires a deeper understanding of cognitive control that may be provided by network analysis.

The authors of [13] describe a unique approach to construct inter-region networks using canonical correlation analysis, called fixed canonical coherence analysis (FCHA).
Even though both FCHA and MVNM involve estimating a linear combination of multiple variables, there are fundamental distinctions between them. For a given pair of regions X and Y, the FCHA maximizes the coherence between multivariate time-series in X and Y. This optimization is accomplished by estimating the optimal linear combination of multiple channels within each region. MVNM is a novel method that optimizes the variance of the network interactions over time. Unlike FCHA, MVNM converts (possibly informative) channel-level networks into more interpretable region-level networks. This method can be applied to correlational or causal networks.

Figure 6.15: Outdegree of the regions in theta band TEN.

Averaging time-series recordings is a common approach to combine neural signals from the same anatomical region [109, 114]. Such techniques may discard valuable network dynamics. Classification results presented in Table 6.2 demonstrate that the averaged connectivity is not well-suited to uncover the optimal causal interactions between the regions. Table 6.2 shows that MVNM attains a 10% increase in accuracy compared to the time-series averaging. The proposed methods help detect task engagement with approximately 85% median accuracy, a notable improvement over [13]. In addition, the classifiers are characterized by low variation across cross-validation folds and across subjects, indicating the robustness of the approach. However, it should be noted that these classifiers are subject-specific. A generalized decoder for all subjects may lead to lower accuracy.

A high classification accuracy may not always lead to advances in understanding underlying brain mechanisms. Hence, we use graph analysis of the networks to determine TENs presented in Fig. 6.5.
TENs highlight changes in network interactions across the subjects due to task-related effort. The causal TENs highlight dorsolateral PFC, temporal lobe, and ventrolateral PFC as the most influential regions of interest with high outdegree. The observations from CCM and DI networks, which are independent causality measures, corroborate each other, supporting the validity of our approach. The involvement of dlPFC in cognitive control, especially in tasks involving conflict or inhibition of irrelevant information, is reported in prior research [45, 50, 70]. Schizophrenia and OCD have been linked to dysfunctional modulation in dlPFC and vlPFC [94, 95]. Our TENs emphasize the activation of dlPFC better than [13]. dlPFC was also used as a stimulation site to enhance cognitive control successfully [53]. Several studies suggest that increased frontal theta activity is associated with cognitive control [57–60, 70]. Fig. 6.14 shows that the majority of the discriminative features used in the classifiers originate from theta band activity in all the subjects. Fig. 6.15 implies that theta interactions in dlPFC, dmPFC and the temporal lobe play an important role in cognition. The alignment of our data-driven findings with prior studies validates the ability of our approach to discover true mechanistic network structure.

The role of the temporal lobe, which is outside the canonical frontoparietal cognitive control networks, is not entirely understood [115, 116]. This is partly because most cognitive control studies are focused on the frontal regions instead of observing a global brain network. It is shown in [117] that the temporal lobe had the largest response to a cognitive task among non-frontal regions. Alternately, the temporal lobe weighting could be related to the participants' epilepsy.
Patients with temporal lobe epilepsy have been reported to suffer from dysfunctional control characterized by interactions between the epileptogenic temporal lobe and the PFC [118, 119].

Even though the MSIT has been shown to activate cognition/attention networks [66, 85, 120], clinical translation towards a viable treatment for psychiatric disorders presents significant challenges. First, any task performance can include several other behavioral and physiological mechanisms that may not be related to cognitive effort. For example, task engagement can provoke anxiety, leading to engagement of emotional arousal networks. Therefore, further research involving multiple tasks and a larger cohort of subjects is needed to form robust conclusions about the neural encoding of cognitive control. Another challenge is that the efficacy of adaptive DBS on psychiatric disorders is not sufficiently understood. We believe that objective network biomarkers at an electrophysiological level can help detect and rectify cognitive dysfunction [69, 104]. Future studies may also explore non-invasive recording/stimulating modalities.

6.5 Conclusion

Network representations can decode cognitive control and other mental functions by identifying relevant cross-region interactions. We show that network connections between dorsolateral PFC, temporal lobe and ventrolateral PFC were discriminative through graph analysis of causal task engagement networks. Moreover, independent causal inference techniques (DI and CCM) indicate a higher outdegree in those regions, supporting the potential for dlPFC as a stimulation site [53, 96, 104, 121]. Subband network analysis reveals that enhanced cognitive control is associated with modulated theta band activity.

There is substantial evidence supporting stimulation-induced modulation of pathological network activity as a therapeutic mechanism of treatments such as deep brain stimulation [122–124].
The methods developed in this chapter enable the discovery of objective biomarkers, localized regions of interest, and band oscillatory activity associated with task engagement. This knowledge can facilitate a transition from symptom-based treatments to more effective mechanism-based treatments for mental illness.

The network analysis methods described in this chapter may be used to describe other neurological phenomena beyond the specific task considered here. For example, the algorithm used to determine task engagement networks can also be used to identify network interactions that are dysfunctional in specific disorders. Such an inference can be made by comparing patients to healthy controls (instead of task vs. non-task).

PART II: PARKINSON'S DISEASE DETECTION

Chapter 7 Introduction to Parkinson's Disease

7.1 Parkinson's Disease

Parkinson's disease (PD) is the second most common neurodegenerative disease, affecting 1 in every 100 adults over the age of 65. PD is characterized by a gradual loss of dopaminergic nigrostriatal neurons in a part of the brain called the substantia nigra. Its common clinical symptoms include progressive tremor, rigidity, bradykinesia (slowness of movement), unstable posture, balance and gait abnormalities, and dysphonia (voice disorders). Eventually, the neural loss can spread to non-motor regions, leading to cognitive defects ranging from mild cognitive impairment (MCI) to dementia. Parkinson's disease can be challenging to diagnose in its early stages. We currently have no definitive diagnostic test for PD; the diagnosis is clinical, and an autopsy is considered necessary for disease confirmation. This lack of reliable diagnosis is partly because patients' symptoms vary vastly. PD has traditionally been considered idiopathic (no known cause), and there is no cure. Reliable biomarkers of symptomatic and presymptomatic disease can help with an automatic and accurate diagnosis.
In addition, biomarkers provide objective targets for developing and evaluating the clinical efficacy of new treatments.

7.2 Network Biomarkers of Parkinson's Disease

A biological marker or biomarker is defined as a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention [125, 126]. One of the main goals for biomarker identification is facilitating or improving disease diagnosis. This objective is critical in the case of PD because the accuracy of gold-standard clinical diagnosis is only about 80% and has not improved in the last 30 years [127]. This accuracy is much lower in patients with no severe symptoms. Since PD is a degenerative disease, longitudinal treatment after diagnosis determines the patients' well-being. Another application of biomarkers in PD is monitoring disease progression and demonstrating treatment efficacy. Therefore, a necessary characteristic of biomarkers is to accurately differentiate patients and susceptible individuals from healthy controls.

Several studies have shown that whole-brain network connectivity estimated from resting-state or task-related functional MRI are altered in patients with PD [128–132]. Some functional imaging studies found disrupted connectivity in crucial networks such as sensorimotor network [133, 134] and default mode network (DMN) [135] in PD patients [136]. Recent findings also suggest that network measures such as node centrality of specific regions (modeled as nodes in a graph) can differentiate PD patients from healthy controls [129, 136–138]. The majority of these studies are focused on functional imaging data and functional (undirected) networks. This dissertation is focused on novel methods to extract those network biomarkers by estimating and analyzing effective (causal) brain networks in Parkinson's patients.
Table 7.1: Parkinson's dataset-1 subject demographics (mean ± SD) reproduced from [2]. Each control was matched to a person with PD with respect to their age, sex and handedness.

Condition                    PD             Control
Age (years)                  62.62±8.32     63.50±9.66
Sex                          8M 8F          7M 9F
Handedness                   All R          All R
MMSE                         28.94±1        29.19±1.10
NAART                        46±6.27        49.12±7.14
BDI Off medication           9.44±5.07      3.27±3.20
BDI On medication            7.76±5.05      -
UPDRS III Off medication     41.5±12.95     -
UPDRS III On medication      33.68±10.86    -

Abbreviations: MMSE = Mini Mental State Exam; UPDRS = Unified Parkinson's Disease Rating Scale (motor); Dx = Parkinson's diagnosis; BDI = Beck Depression Inventory; NAART = North American Adult Reading Test.

7.3 Parkinson's Data Used in the Dissertation

7.3.1 Subjects

We used two Parkinson's datasets in this study. The first dataset includes EEG data from 15 PD patients (mean age 63.2 ± 8.2 years) and 16 healthy, age-matched control participants (mean age 63.5 ± 9.6 years). The PD and control groups include eight and nine females, respectively. All PD patients were diagnosed by a movement disorder specialist at Scripps Clinic in La Jolla, California. All participants provided written consent as per the Institutional Review Board of the University of California, San Diego, and the Declaration of Helsinki. Additional information about this data can be found in [2, 139]. The demographic and rating scale measures of the subjects are presented in Table 7.1.

The second dataset includes scalp EEG recordings from 27 patients with PD who were recruited from the Albuquerque, New Mexico community and an equal number of demographically matched (sex and age) controls. The PD and control groups did not differ on education or premorbid intelligence measurements. All participants were evaluated using Mini-Mental State Exam (MMSE) and achieved a score above 26. All procedures were approved by the University of New Mexico Office of the Institutional Review Board, and the participants were paid $20/hour. The data were also reported in previous studies [3, 140, 141], and can be downloaded from [142]. Subject demographics and assessment scores are presented in Table 7.2.

Table 7.2: Parkinson's dataset-2 subject demographics (mean ± SD) reproduced from [3]. Each control was age and sex matched to a person with PD.

Condition                PD             Control
Sex                      17M 10F        17M 10F
Age                      69.5±8.7       69.5±9.3
MMSE                     28.7±1         28.8±1
UPDRS                    22.2±10.3      -
Years since Dx           5.7±4.2        -
EEG recording (min)      3.59±1         3.63±1.8
BDI                      7.6±5.3        4.8±4.8
Years of Ed              17.3±3.3       16.6±3.1
Years of Ed (Parents)    12.5±3.8       12.5±3.1
LED (mg)                 707.4±448.6    -
NAART                    45.2±10.3      47.1±7.5

Abbreviations: MMSE = Mini Mental State Exam; UPDRS = Unified Parkinson's Disease Rating Scale (motor); Dx = Parkinson's diagnosis; BDI = Beck Depression Inventory; NAART = North American Adult Reading Test; LED = L-Dopa equivalence dose in mg.

Both datasets include data collected from PD patients in 'on medication' (PD-ON) and 'off medication' (PD-OFF) states. Data from on and off medication were collected on different days. The patients discontinued their dopaminergic medicines at least 12 hours before the experiment for the PD-OFF phase.

7.3.2 EEG Recordings

In PD dataset-1, EEG data were acquired using a 32-channel BioSemi ActiveTwo system, sampled at 512 Hz. Resting data were recorded for at least 3 min while the participants were told to fixate on a cross presented on a screen. PD dataset-2 consists of EEG signals recorded via Ag/AgCl electrodes with a sampling rate of 500 Hz on a 64-channel Brain Vision system. The signals were referenced to the 'CPz' channel, resulting in 63 time series.
This analysis considers resting-state EEG signals of one-minute duration recorded while the participants had their eyes closed (unlike PD dataset-1). The EEG channel locations in the two datasets are depicted in Fig. 7.1.

We high-pass filtered the signals at a 0.5 Hz cut-off to remove low-frequency drift. We also filtered all signals using a 6th-order IIR filter to remove the power-line noise and its harmonics. We used two-way (bidirectional) filtering to avoid any phase shift that could affect causal inference between the signals.

Figure 7.1: EEG channel locations plotted on a 2-D head diagram. (a) PD Dataset-1 (32 channels). (b) PD Dataset-2 (63 channels). Channels plotted beyond the head limit extend below the head center’s horizontal plane.

Chapter 8

Distinguishing Parkinson’s Disease Patients Using Functional Brain Networks

8.1 Introduction

Parkinson’s disease (PD) is a neurodegenerative disorder that predominantly affects dopamine-producing (“dopaminergic”) neurons in a specific area of the brain called the substantia nigra, which impacts communication pathways of the brain. PD affects the lives of more than 10 million people worldwide and is expected to become more prevalent in the future [143]. The main symptoms of PD are tremor, muscle stiffness, bradykinesia (slowness of movement), unstable posture, balance and gait abnormalities, and dysphonia (voice disorders). Non-motor symptoms can range from mild cognitive impairment (MCI) to dementia.

Diagnosis of PD remains complicated, especially in patients without severe symptoms. The accuracy of gold-standard clinical diagnosis is only about 80% and has not improved in the last 30 years [127]. Considering that most PD patients develop dementia in 15-20 years, there is an urgent need to identify biomarkers for early diagnosis, monitor disease progression, and establish efficacious therapies.
It is established that cognitive dysfunction in neurological disorders can be described as aberrant patterns of interactions between neural elements in a large-scale brain network [101, 102, 144]. We hypothesize that network analysis may hold the key to understanding the electrophysiological basis of PD.

Although functional network analysis of Parkinson’s has been addressed in past literature, the majority of these studies are limited to functional magnetic resonance imaging (fMRI) [134,145–147]. Scalp EEG is well suited for clinical, commercial, and research purposes because of its non-invasive nature and wide availability. More importantly, EEG can sample neural activity at 100–1000x higher time resolution than fMRI, making it more suitable to assess temporal dynamics. Previous EEG research on Parkinson’s focused on spectral features or event-related potentials [3, 148–151]. However, these approaches do not consider simultaneous interactions between multiple brain areas, i.e., EEG network functional connectivity.

This chapter presents a functional network analysis to decode scalp EEG signals and detect node centrality modulations indicative of PD. Recent studies demonstrated statistical differences in network measures such as node centrality between PD patients and healthy controls [129, 136–138]. To the best of our knowledge, this is the first EEG-based machine learning analysis that uses node centrality features to differentiate between Parkinson’s patients and healthy controls.

8.2 Materials and Methods

8.2.1 Data

The dataset used in this chapter is Parkinson’s dataset-2, described in Chapter 7. The data included scalp EEG recordings from 27 patients with PD who were recruited from the Albuquerque, New Mexico community and an equal number of demographically matched (sex and age) controls. All participants were evaluated using the Mini-Mental State Exam (MMSE) and achieved a score above 26.
The PD and control groups did not differ on any education or premorbid intelligence measurements. All procedures were approved by the University of New Mexico Office of the Institutional Review Board, and the participants were paid $20/hour. Each PD patient visited the lab twice: on medication (PD-ON) and off medication (PD-OFF). In the PD-OFF phase, the patients took their most recent dose of dopaminergic medicines at least 12 hours before the experiment.

EEG was recorded from Ag/AgCl electrodes with a sampling rate of 500 Hz on a 64-channel Brain Vision system. The signals were referenced to the ‘CPz’ channel, resulting in 63 timeseries. The original dataset consisted of two one-minute segments per subject: eyes-open and eyes-closed conditions. This analysis uses resting-state EEG signals of one-minute duration recorded under the eyes-open condition. The data were also reported in previous studies [3,140,141], and can be downloaded from https://narayanan.lab.uiowa.edu/article/datasets. Power line noise and its harmonics were removed using 6th-order IIR filters.

8.2.2 Feature Extraction Using Network Analysis

Functional networks were constructed by computing the absolute Pearson’s correlation coefficient between all pairs of channels. In this case, each EEG channel is a node. Therefore, an edge with high connectivity indicates a strong correlation between the interacting channels. Since a value between 0 and 1 represents each connection, these are weighted undirected networks. For each subject, the one-minute recording was divided into eleven 30-second segments with 90% overlap. Functional networks for each of the 11 segments were computed. The final representative network was the mean of these 11 networks. The mean networks were used to minimize the effects of non-stationarity.

The node centrality of a given node measures its importance within the network. We compute three node centrality metrics: betweenness centrality, node degree, and eigenvector centrality. Betweenness centrality measures the extent to which a given node falls in the shortest path between any two other nodes [152]. Node degree quantifies the number of connections to a node. Eigenvector centrality is a measure of the influence a node has on a network and was found to be linked to firing rates of neurons [153]. The node centrality values were used as features for classification.

Fig. 8.1 shows the two-dimensional scalp topographic maps depicting the average betweenness centrality of nodes in the three groups: healthy controls (HC), PD-ON, and PD-OFF. All three scalp maps were plotted on the same scale for comparison. We observe that the betweenness centrality in HC is higher in the mid-frontal region compared to PD-ON and PD-OFF.

Figure 8.1: Scalp topographical maps of average betweenness centrality. (a) Healthy controls (HC). (b) Parkinson’s, on medication (PD-ON). (c) Parkinson’s, off medication (PD-OFF).

8.2.3 Feature Selection and Classification

Naive Bayes classifiers with Gaussian kernels were trained to differentiate PD patients from healthy controls. The classifiers were modeled separately for PD-ON and PD-OFF patients. For single-channel classification, the betweenness centrality of a single node was used as the input feature. For multi-channel classification, sequential forward feature selection was employed to select the optimal features/channels. That is, we start with an empty candidate set. In each iteration, the feature that most reduces the classification error is added to the set. The process is stopped when the accuracy cannot be improved further. The classifiers were evaluated using leave-one-subject-out cross-validation. The cross-validation prevents overestimating the accuracy due to over-fitting of training data and ensures the models were evaluated on all subjects.

8.3 Results

8.3.1 Single-Channel Classification

Fig. 8.2(a) and Fig. 8.2(b) depict single-channel classification performance for HC vs.
PD-ON and HC vs. PD-OFF, respectively. Each value is the mean cross-validation accuracy. The best HC vs. PD-ON accuracy was 75.93% for channel C2. For HC vs. PD-OFF, the highest accuracy of 79.63% was attained by channel PO7.

Figure 8.2: Single-channel classification performance comparison. (a) Healthy controls vs. PD-ON. (b) Healthy controls vs. PD-OFF.

8.3.2 Multi-Channel Classification

The results presented here are based on Naive Bayes classifiers with Gaussian kernels. We observed that these models performed the best with minimal overfitting compared with other machine learning models such as support vector machines, linear discriminant analysis, and decision trees. The receiver operating characteristic (ROC) curves comparing the three node-centrality measures are depicted in Fig. 8.3. ROC curves can be used to evaluate the performance of binary classifiers. The higher the area under the ROC curve (AUC), the better the models distinguish between the two classes. The plots illustrate that betweenness centrality differentiates the healthy subjects from PD patients, with the highest AUC in PD-ON and PD-OFF conditions. The betweenness centrality-based classifiers achieved an AUC of 91.63% and 88.6% for HC vs. PD-ON and PD-OFF, respectively.

The individual classification results are presented in Table 8.1. It can be observed that betweenness centrality (BC) outperforms node degree (ND) and eigenvector centrality (EC) in both cases. For HC vs. PD-ON, the accuracy, sensitivity, and specificity were 88.9% each. Betweenness centralities of the nodes represented by EEG electrodes FT9, PO5 and PO7 were the most discriminatory between the two classes. In the case of HC vs. PD-OFF, the accuracy, sensitivity, and specificity were 88.89%, 92.59%, and 88.6%, respectively. Betweenness centralities of P5, PO7 and PO3 were chosen as the optimal features for HC vs. PD-OFF.

Figure 8.3: Receiver operating characteristic (ROC) curves comparison between 3 node centrality measures (betweenness, node degree, eigenvector); axes: false positive rate vs. true positive rate. (a) Healthy controls vs. PD-ON. (b) Healthy controls vs. PD-OFF.

Table 8.1: Leave-one-subject-out cross validation results summary for HC vs. PD classification. Sensitivity and specificity represent the healthy control and PD accuracies, respectively. Random label-assignment would result in a baseline accuracy of 50%.

Classifier         HC vs. PD-ON            HC vs. PD-OFF           HC vs. PD (ON+OFF)
Features           BC      ND      EC      BC      ND      EC      LEAPD [3]
Accuracy (%)       88.9    74.1    83.3    88.9    83.3    63      85.2
Sensitivity (%)    88.9    63      85.2    92.6    77.8    63      88.9
Specificity (%)    88.9    85.2    81.5    88.2    88.9    63      81.5
AUC (%)            91.6    63.92   77.2    88.6    82.7    61.2    93.8

Abbreviations: HC = Healthy controls, PD = Parkinson’s disease, BC = Betweenness centrality, ND = Node degree, EC = Eigenvector centrality, LEAPD = Linear-predictive-coding EEG Algorithm for PD [3].

8.4 Discussion

There is a growing consensus in modern neuroscience that human brain function is encoded as complex small-world networks. In other words, functional brain networks contain a combination of dense local connectivity and sparse yet efficient long-distance (global) connectivity. As a result of this small-worldness, some nodes (hub nodes) are more important than others. This node importance can be measured using node centrality metrics such as betweenness centrality. Our work showed that betweenness centrality differentiates PD patients from age-matched controls. We also show that this effect is independent of the patients’ medication status.
Besides PD, modulated betweenness centrality has also been implicated in other neurological disorders such as Alzheimer’s, Schizophrenia and Epilepsy [144,154–156].

Our method achieved higher leave-one-out cross-validation accuracy (88.9%) than the state-of-the-art (85.2%) on this dataset [3]. The decoders in [3] are based on spectral properties of individual channels but do not take into account the interactions between the channels.

One limitation of the proposed approach is that the neural activity is recorded from the scalp. Scalp EEG is typically affected by confounding factors such as volume conduction. Also, compared to fMRI, EEG has a lower spatial resolution, making it difficult to localize the source of the activity. However, the proposed approach is amenable to real-time applications since it only requires 1-minute resting-state EEG recordings. All existing methods use longer recordings or employ computationally complex algorithms like deep learning [3, 150,157].

8.5 Conclusion

This work demonstrates that metrics like betweenness centrality can measure how functional networks encode PD [158]. We employed graph analysis to develop neural classifiers that accurately separate PD patients from healthy age-matched controls regardless of their medication status. Such decoders can assist clinicians as a cost-effective diagnostic tool, crucial for prognostic and therapeutic purposes. These decoders can also be used to find biomarkers of PD for developing interventional therapies such as adaptive deep-brain stimulation or transcranial direct-current stimulation [159]. Future research can be directed towards validating this approach on multiple datasets. Automatic monitoring of disease progression in Parkinson’s can also be explored using similar network-based methods.
Chapter 9

Frequency-Domain Convergent Cross-Mapping: A Novel Causal Connectivity Measure

9.1 Introduction

Data-driven effective connectivity measures can vary from linear to nonlinear, model-based to model-free, and time-domain to frequency-domain [1, 160]. The majority of the existing measures are based on Granger’s approach [1, 18, 19]. However, these Granger-based measures such as Granger causality, directed transfer function (DTF), and partial directed coherence (PDC) are only applicable to strongly-coupled, linear, and stochastic systems [23,24]. Granger causality is also model-based and does not always reveal causal interactions [12,23].

A newly developed method of assessing causation between timeseries, convergent cross-mapping (CCM) [12], has been shown to identify causal relationships that Granger causality may miss because the system is nonlinear or deterministic. CCM has been applied in a limited number of studies to characterize neurological disorders [161–164]. CCM is based on time-domain dynamics and does not include spectral information. Frequency-based methods such as PDC [27], DTF [26] and dynamic causal modeling [17] have been shown to be useful in detecting brain disorders [165–169]. However, these methods are derived using Granger’s approach and are model-based [170]. This chapter introduces the first frequency-based causality measure that can infer nonlinear causal interactions between neuronal populations without requiring any model estimation.

The proposed method, called frequency-domain convergent cross-mapping (FDCCM), is a nonlinear state-space reconstruction technique similar to CCM [171]. For two causally coupled neuronal dynamics, the rationale behind causal inference using FDCCM is based on the following intuition. Since the cause-variable influences the effect-variable, we can reconstruct the causal timeseries by finding its signature in the power spectrum of the effect timeseries.
In other words, we utilize the mapping between the power spectra of the two timeseries to estimate the causal influence of one on the other. This chapter illustrates the principle of our approach on simple toy data, coupled logistic maps, and shows that FDCCM reliably estimates interactions.

In this chapter, we introduce a spectral measure of causality called frequency-domain convergent cross-mapping or FDCCM. Unlike existing spectral measures of effective connectivity, FDCCM is model-free and can infer nonlinear interactions. We describe the algorithm and validate our method on a toy dataset. We illustrate the effect of coupling strength and noise on the quality of causal inference, and demonstrate that FDCCM is more robust to external noise than CCM.

9.2 A Brief Introduction to Convergent Cross-Mapping (CCM)

In dynamical systems, causally interacting variables (e.g., two electrode recordings) share a trajectory in the underlying state-space called the ‘attractor’ space. In other words, each time-point corresponds to a location in this space. Mathematical theorems guarantee that the temporal sequence of a single variable has sufficient information about the entire system’s dynamics. Accordingly, the dynamics of one variable constrain the dynamics of other variables, and can be used to reconstruct the original global attractor topology.

Consider two timeseries, X and Y, part of a deterministic dynamical system denoted by M. We can then express the temporal dynamics of X in a delay-coordinate state-space that consists of the set of D-dimensional state vectors: x(t) = {x(t), x(t − 1), . . . , x(t − (D − 1))}. The time delays are assumed to be 1 for simplicity. This transformed state-space of X is called its attractor manifold M_X. This process of transforming a sequence into its delay-coordinate space is called time-delay embedding.
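As a minimal sketch of the time-delay embedding just described, the following Python function stacks lagged copies of a series so that each row is one state vector {x(t), x(t − 1), . . . , x(t − (D − 1))}. The function name `delay_embed`, the `tau` argument, and the use of NumPy are illustrative assumptions; they are not part of the method description above.

```python
import numpy as np

def delay_embed(x, D, tau=1):
    """Return the delay-coordinate matrix of a 1-D series (hypothetical helper).

    Row i corresponds to time t = (D - 1) * tau + i and holds
    [x(t), x(t - tau), ..., x(t - (D - 1) * tau)].
    """
    x = np.asarray(x, dtype=float)
    n_rows = x.size - (D - 1) * tau
    if n_rows <= 0:
        raise ValueError("series too short for the requested embedding")
    # Column j is the series delayed by j * tau samples.
    cols = [x[(D - 1 - j) * tau:(D - 1 - j) * tau + n_rows] for j in range(D)]
    return np.column_stack(cols)

# Example: embed a short ramp with D = 3 and unit delay.
M_X = delay_embed(np.arange(5), D=3)
print(M_X)
```

Each row of `M_X` is one point on the shadow manifold; the first D − 1 samples have no complete history and therefore produce no row.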
As proven by Takens’ theorem [172], a general principle in dynamical systems is that the states of the global attractor M have a one-to-one mapping to the local attractors M_X and M_Y. Consequently, the local attractors (also known as shadow manifolds) M_X and M_Y have a one-to-one correspondence with each other.

Based on this property, a protocol for inferring causation in complex systems was proposed by Sugihara et al., using K-nearest-neighbor state-space reconstruction [12]. To understand the intuition behind this method, called convergent cross-mapping (CCM), consider two variables X and Y with asymmetric interaction. That is, X influences Y but not vice-versa. The aim is to infer the causal interactions from observational timeseries X and Y. Since there is a causal connection from X to Y, the history of Y has information about X. In other words, a local neighborhood in M_Y corresponds to a local neighborhood in M_X. Therefore the ‘cause-variable’ X can be accurately reconstructed using the nearest neighbors in the shadow manifold M_Y, if and only if there is a causal connection from X to Y. As the causal influence of X on the dynamics of Y increases, more information about X is encoded in the manifold M_Y constructed from a fixed number of observations of Y. This rationale acts as the basis for causal inference using CCM.

9.3 Frequency-Domain Convergent Cross-Mapping (FDCCM)

9.3.1 The Basic Concept

In the proposed frequency-domain convergent cross-mapping (FDCCM), we extend the idea of causal inference using nonlinear state-space reconstruction to the frequency domain. Intuitively, any linear transformation of a manifold should preserve its topology: a corollary of random projection theory [173]. For a given frequency, the Fourier transform is a linear transformation. Hence, we can preserve the geometry by transforming the delay-coordinate space to frequency space, by computing the power spectrum of x(t).
This transformation is equivalent to computing the short-time Fourier transform (STFT) with pre-defined frequency bands. The time-delayed embedding in CCM is now replaced by spectrograms, such that each point in the resultant attractor manifold (M_X) is x(t) = {x_{f_1}(t), x_{f_2}(t), . . . , x_{f_D}(t)}, where the subscripts represent D different frequency bands. Before outlining the algorithm of FDCCM, we describe two key ingredients of the method: cross-mapping and convergence.

9.3.2 Cross-Mapping in FDCCM

If X has a causal influence on Y, then X will influence the frequency dynamics of Y. This ‘imprint’ of X on Y means that the topology of M_Y obtained from Y can be used to estimate values of X. This estimate at a given time instant t is called the cross-map of x(t) given M_Y, and is denoted x̂(t)|M_Y. If X and Y are causally coupled, then each point x(t) in M_X can be mapped to a unique point in M_Y [12].

To compute the cross-mapped estimates x̂(t)|M_Y, we use the simplex-projection algorithm as described by (9.1) and (9.2). We first obtain a small region around y(t), represented by its k nearest neighbors: {y_1(t), y_2(t), . . . , y_k(t)}. This neighborhood is then mapped to a set of points in M_X, represented as {x_1(t), x_2(t), . . . , x_k(t)}. To form a bounding simplex in D-dimensional space, we need k ≥ D + 1. The weighted mean of {x_1(t), x_2(t), . . . , x_k(t)} provides the estimate of x(t) as shown in the equation,

\[
\hat{\mathbf{x}}(t)\,|\,M_Y = \sum_{i=1}^{D+1} w_i\, \mathbf{x}_i(t). \tag{9.1}
\]

The weighting w_i is based on the distance between y(t) and its i-th nearest neighbor y_i(t) as given by the equations,

\[
w_i = \frac{u_i}{\sum_{j=1}^{D+1} u_j}
\quad \text{and} \quad
u_i = \exp\!\left[ -\frac{d(\mathbf{y}(t), \mathbf{y}_i(t))}{d(\mathbf{y}(t), \mathbf{y}_1(t))} \right] \tag{9.2}
\]

where d(., .) represents the Euclidean distance.

The cross-mapping is carried out for all L points in the timeseries X. The correlation coefficient between the original timeseries and the estimated timeseries, i.e., ρ_{x x̂}, is used as an indicator of the influence of X on Y. The cross-mapping ŷ(t)|M_X can be estimated analogously. Fig.
9.1 illustrates the cross-mapping at time instant t for a bi-variate example where X influences Y, but there is no causal link from Y to X. The state-space reconstruction of X using M_Y (i.e., x̂(t)|M_Y) would be accurate due to the effect X has on Y, but not vice-versa.

Figure 9.1: Illustration of cross-mapping between X and Y when X influences Y but Y has no (or minimal) effect on X. (a) Estimating x(t) from M_Y with high accuracy. (b) Estimating y(t) from M_X with low accuracy.

9.3.3 Convergence

For practical application, the cross-mapping estimates of timeseries are evaluated using the correlation coefficient, mean absolute error, or similar metrics. We use the correlation coefficient (ρ) as the accuracy metric in this study. The total number of time points, i.e., the ‘library’ of points in the attractors used for cross-mapping, is called the library length (L). As L increases, attractors get denser in the state-space, resulting in closer nearest neighbors and more accurate estimation. This increase in cross-mapping accuracy with increasing L is a key property of causal interactions [12]. We use the cross-mapping accuracy at maximum L to infer network interactions.

9.3.4 Algorithm

Here, we outline the main steps in computing FDCCM between two timeseries. For a given library length L, the basic algorithm for cross-mapping X using the shadow attractor M_Y is given by:

• Compute the time-frequency spectrograms M_X and M_Y. Note that L is the number of time windows and D is the number of frequency bands.
• For each time index t = 1 to L, find the D + 1 nearest neighbors of y(t) in M_Y.
• Generate weights w_i according to (9.2).
• Estimate x̂(t)|M_Y using (9.1).
• Calculate the correlation coefficient ρ_{X̂|M_Y} between X = {x(t) : 1 ≤ t ≤ L} and X̂ = {x̂(t)|M_Y : 1 ≤ t ≤ L}.

9.4 Toy Data Analysis

9.4.1 Data Generation

Three datasets are used in this analysis. The first is a toy dataset that simulates nonlinear causal interactions between two variables.
The second and third datasets contain experimental resting-state scalp EEG recordings from Parkinson’s patients and healthy controls.

9.4.2 Toy Data

The logistic map is a well-established nonlinear dynamic equation that generates periodic and chaotic behavior [174]. Despite its mathematical simplicity, a logistic map exhibits a high degree of complexity. Two logistic maps with chaotic dynamics can be coupled to create a complex system akin to biological systems. Two timeseries in a coupled logistic map can be correlated, uncorrelated, or anti-correlated at different times [12]. The coupled logistic map timeseries X and Y were synthesized using the following equations:

\[
\begin{aligned}
x(t+1) &= x(t)\,\bigl[\,r_x\,(1 - x(t)) - \beta_{xy}\, y(t)\,\bigr] \\
y(t+1) &= y(t)\,\bigl[\,r_y\,(1 - y(t)) - \beta_{yx}\, x(t)\,\bigr].
\end{aligned} \tag{9.3}
\]

The variables X and Y have a nonlinear dependence on their own past values parameterized by the growth rates r_x and r_y. The coupling constants β_yx and β_xy characterize the coupling strengths from X to Y and from Y to X, respectively. The variables have chaotic dynamics at values of r_x and r_y above 3.57. We chose r_x = 3.65 and r_y = 3.77, so that we are in the chaotic regime of the logistic map [111].

9.4.3 Convergence

We synthesized coupled logistic maps, X and Y, using (9.3) with parameters: r_x = 3.65, r_y = 3.77, β_xy = 0.05, and β_yx = 0.5. We generated 25,000 samples and discarded the first 10,000 to avoid the effects of the transient behaviour of the model. The remaining 15,000 points were used for the analysis, assuming a 500 Hz sampling rate, i.e., a total duration of thirty seconds. To estimate spectrograms and FDCCM, we used a 5 Hz frequency resolution, up to 200 Hz, analogous to the Parkinson’s data.

Fig. 9.2 illustrates how the cross-mapping accuracy between X and Y grows with increasing library length L. The increasing trend of ρ_{X̂|M_Y} and ρ_{Ŷ|M_X} validates the causal influence from X to Y and Y to X, respectively.
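The toy system (9.3) and the cross-mapping step of (9.1) and (9.2) can be sketched together in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used for the reported results: the function names, the initial conditions (0.4, 0.2), and the brute-force neighbor search are hypothetical choices, and two-dimensional delay vectors stand in for spectrogram rows purely to keep the example short. For FDCCM, `cross_map` would instead receive the L × D spectrogram matrices.

```python
import numpy as np

def coupled_logistic(n_samples, rx=3.65, ry=3.77, beta_xy=0.05, beta_yx=0.5,
                     x0=0.4, y0=0.2, n_discard=10_000):
    """Simulate the coupled logistic maps of (9.3) and drop the transient."""
    total = n_samples + n_discard
    x, y = np.empty(total), np.empty(total)
    x[0], y[0] = x0, y0  # ASSUMPTION: initial conditions are not given in the text
    for t in range(total - 1):
        x[t + 1] = x[t] * (rx * (1 - x[t]) - beta_xy * y[t])
        y[t + 1] = y[t] * (ry * (1 - y[t]) - beta_yx * x[t])
    return x[n_discard:], y[n_discard:]

def cross_map(MX, MY):
    """Estimate every row of MX from the D+1 nearest neighbors of the
    same-time row of MY, following (9.1) and (9.2)."""
    L, D = MY.shape
    k = D + 1
    # Pairwise Euclidean distances between all rows of MY.
    sq = np.sum(MY ** 2, axis=1)
    d = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * MY @ MY.T, 0.0))
    np.fill_diagonal(d, np.inf)               # a point is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]        # indices of the k nearest neighbors
    dk = np.take_along_axis(d, idx, axis=1)   # their distances, nearest first
    u = np.exp(-dk / np.maximum(dk[:, :1], 1e-12))  # u_i of (9.2), guarded against 0
    w = u / u.sum(axis=1, keepdims=True)            # w_i of (9.2)
    return np.einsum("lk,lkd->ld", w, MX[idx])      # weighted mean of (9.1)

x, y = coupled_logistic(15_000)
for L in (250, 500, 1000, 2000):
    # ASSUMPTION: D = 2 delay vectors replace spectrogram rows in this demo.
    MX = np.column_stack([x[1:L + 1], x[:L]])
    MY = np.column_stack([y[1:L + 1], y[:L]])
    rho_x = np.corrcoef(MX[:, 0], cross_map(MX, MY)[:, 0])[0, 1]
    rho_y = np.corrcoef(MY[:, 0], cross_map(MY, MX)[:, 0])[0, 1]
    print(L, round(rho_x, 3), round(rho_y, 3))
```

Printing the two correlations for growing L gives a rough, time-domain analogue of a convergence curve; the exact values depend on the assumed initial conditions and embedding.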
It can be observed that the difference ρ_{X̂|M_Y} − ρ_{Ŷ|M_X} is always positive, supporting the fact that β_yx > β_xy.

Figure 9.2: Correlation coefficients of the two estimated timeseries, X̂|M_Y and Ŷ|M_X, with respect to library length (L). Coupling strengths: β_xy = 0.05 and β_yx = 0.5. X̂|M_Y represents the influence X → Y, and Ŷ|M_X represents the influence Y → X.

9.4.4 Effect of Coupling Strength

We denote the difference ρ_{X̂|M_Y} − ρ_{Ŷ|M_X} by ρ_diff, which is a measure of relative causal influence from X to Y. To test how the relative causality varies with coupling strength, we fix β_xy at 0.5, and estimate ρ_diff at different values of β_yx. Fig. 9.3 demonstrates that ρ_diff increases with increasing β_yx. More importantly, when β_yx < 0.5 = β_xy, ρ_diff is negative. Each value in the plot is the average result of 100 simulations. Although the direction of causality is correct, the magnitude of causal strength is not reliable when there is weak coupling. Note that there is a monotonic increase in ρ_diff only for coupling strengths higher than 0.35.

Figure 9.3: Difference between ρ_{X→Y} and ρ_{Y→X} (denoted by ρ_diff) as a function of coupling strength β_yx. β_xy = 0.5.

9.4.5 Effect of Volume Conduction

Scalp EEG signals tend to be highly correlated with each other due to volume conduction. The volume conduction effect is a result of a mixture of concurrently active brain and non-brain electrical sources. One way to simulate the effect of volume conduction is to use a mixing matrix M that transforms source activity into measured activity as described in [175]. We assume that x(t) and y(t) synthesized using (9.3) are two cortical sources. The mixing matrix M contains the weights that mix the source activities (x(t) and y(t)) to two scalp activities (p(t) and q(t)).
This transformation can be expressed as

\[
\begin{pmatrix} p(t) \\ q(t) \end{pmatrix} = M \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} \tag{9.4}
\]

where M is a 2×2 matrix. We assume

\[
M = \begin{pmatrix} 15 & 1.5 \\ 2 & 20 \end{pmatrix}.
\]

When β_xy = 0.5 and β_yx = 0.6, the Pearson’s correlation coefficient between x and y is 0.7. As a result of volume conduction, the correlation between p and q increases to 0.79, an increase of 0.09.

We repeat the analysis shown in Fig. 9.3 with volume conduction by fixing β_xy at 0.5, and estimating ρ_diff at different values of β_yx. These results are presented in Fig. 9.4. Comparing the results with Fig. 9.3, it can be noticed that volume conduction does not have a notable effect on FDCCM estimates.

Figure 9.4: Difference between ρ_{P→Q} and ρ_{Q→P} (denoted by ρ_diff) as a function of coupling strength β_yx, with volume conduction. β_xy = 0.5.

9.4.6 Effect of Noise

Since real-world signals such as electrophysiological data are affected by environmental and measurement noise, it is important to study the effect of noise. It is known that cross-mapped estimates of CCM deteriorate as more noise is present in the data [12, 111]. To characterize the effect of noise on FDCCM, we simulate noisy timeseries given by the equations,

\[
\begin{aligned}
x(t+1) &= x(t)\,\bigl[\,r_x\,(1 - x(t)) - \beta_{xy}\, y(t)\,\bigr] + \epsilon_x(t) \\
y(t+1) &= y(t)\,\bigl[\,r_y\,(1 - y(t)) - \beta_{yx}\, x(t)\,\bigr] + \epsilon_y(t).
\end{aligned} \tag{9.5}
\]

Here, ε_x and ε_y are the noise terms, which were modeled as additive Gaussian noise with zero mean and standard deviation σ. We repeated the simulations at different signal-to-noise ratios (SNR), and at different coupling strengths β_yx = {0.6, 0.7, 0.8, 0.9}. Note that β_xy was kept constant at 0.5. We evaluated (time-domain) CCM and FDCCM by quantifying the effect of noise level on ρ_diff.

This effect of noise on CCM and FDCCM is presented in Fig. 9.5. For β_yx > 0.5, we expect ρ_diff > 0, as demonstrated in Fig. 9.3. Additionally, we expect ρ_diff to be higher for higher values of β_yx as the influence of X on Y becomes stronger.
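A minimal sketch of the noisy simulation in (9.5) follows. The text does not state how σ was derived from a target SNR, so this sketch assumes SNR (dB) = 20 log10(std of the noise-free series / σ). The function names, the initial conditions, the random seed, and the clipping of the state to [0, 1] (added only to keep the noisy iteration bounded) are all hypothetical choices and should not be read as the procedure behind Fig. 9.5.

```python
import numpy as np

def sigma_for_snr(signal, snr_db):
    """Noise standard deviation for a target SNR in dB (assumed definition:
    SNR = 20 * log10(std(signal) / sigma))."""
    return float(np.std(signal)) / 10.0 ** (snr_db / 20.0)

def noisy_coupled_logistic(n, sigma, beta_yx, beta_xy=0.5, rx=3.65, ry=3.77,
                           x0=0.4, y0=0.2, seed=0):
    """Iterate (9.5): coupled logistic maps with additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    eps = sigma * rng.standard_normal((n, 2))  # zero-mean noise, std = sigma
    x, y = np.empty(n), np.empty(n)
    x[0], y[0] = x0, y0  # ASSUMPTION: initial conditions are not given in the text
    for t in range(n - 1):
        x_next = x[t] * (rx * (1 - x[t]) - beta_xy * y[t]) + eps[t, 0]
        y_next = y[t] * (ry * (1 - y[t]) - beta_yx * x[t]) + eps[t, 1]
        # ASSUMPTION: clip to [0, 1] so that noise cannot push the map out of
        # its valid range and make the iteration diverge.
        x[t + 1] = min(max(x_next, 0.0), 1.0)
        y[t + 1] = min(max(y_next, 0.0), 1.0)
    return x, y

# Noise-free reference run, then a run at an assumed target SNR of 16 dB.
clean_x, _ = noisy_coupled_logistic(5000, sigma=0.0, beta_yx=0.6)
sigma = sigma_for_snr(clean_x, snr_db=16.0)
x, y = noisy_coupled_logistic(5000, sigma=sigma, beta_yx=0.6)
print("sigma used:", sigma)
```

Repeating the last three lines over a grid of SNR values and coupling strengths, and computing ρ_diff for each run, would reproduce the structure of the experiment described above under these assumptions.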
When the noise is high, i.e., the SNR is 0 dB, the two timeseries become uncorrelated, resulting in ρ_diff = 0. As the SNR increases, and the signal tends to dominate the noise, we notice that ρ_diff converges to more accurate values that are greater than zero. We can observe that both methods showcase the expected trend at high SNR: ρ_diff ∝ β_yx. However, empirical data often contains a moderate level of noise. It is, therefore, important to determine the threshold at which the methods become unreliable. As shown in Fig. 9.5(a), when the SNR is less than 16 dB, CCM does not perform as expected at different coupling strengths. Fig. 9.5(b) shows that as long as the SNR is not too low, i.e., for SNR ≥ 3 dB, FDCCM results in reliable estimates of ρ_diff. These plots illustrate that FDCCM is more robust to noise.

Figure 9.5: The effect of noise on CCM and FDCCM estimates. ρ_diff as a function of increasing signal-to-noise ratio (SNR, in dB) for different coupling strengths β_yx. (a) CCM (threshold marked at 16 dB). (b) FDCCM (threshold marked at 3 dB).

Chapter 10

Distinguishing Parkinson’s Patients Using Causal Networks

10.1 Introduction

The human brain is an efficient organization of 100 billion (10^11) neurons anatomically connected by about 100 trillion (10^14) synapses over multiple scales of space and functionally interactive over multiple scales of time [176]. The recent mathematical and conceptual development of network science combined with the technological advancement of measuring neuronal dynamics motivated the field of network neuroscience.
Network science provides a particularly appropriate frame- work to study several mechanisms in the brain by treating neural elements (a population of neurons, a sub-region) as nodes in a graph and neural interactions (synaptic connections, information flow) as its edges. The central goal of network neuroscience is to link macro-scale human brain network topology to cognitive functions and clinical disorders. It is established that neurological disorders and cognitive phenomena can be described as aberrant patterns of interactions be- tween neural elements in a large-scale brain network [101,102,144]. Apart from the anatomical/structural connectivity, measured neural dynam- ics can be used to estimate functional and effective networks. While functional 106 107 connectivity is defined as a statistical dependence between the neurophysiological signals, effective connectivity characterizes patterns of causal interactions [152]. While the underlying anatomical pathways between brain regions or populations of neurons are bidirectional, we cannot assume that the connectivity is symmet- rical. Although there are several linear and nonlinear measures of undirected correlation, estimating the directionality in brain networks is an important and largely unaddressed issue [8]. This chapter applies FDCCM to real-world datasets and evaluate its performance in detecting biomarkers for Parkinson’s disease using resting-state EEG. The specific contributions of this chapter are outlined below: • We estimate effective networks by applying FDCCM on resting-state scalp EEG recordings from two Parkinson’s datasets. We employ graph analysis to showcase the difference in betweenness centrality between the patients and controls. • We demonstrate that machine learning classifiers based on FDCCM causal networks can differentiate between Parkinson’s patients and demographi- cally matched healthy controls with 97% accuracy. 
The performance of FDCCM is shown to be better than that of correlational networks and CCM networks on both datasets. This constitutes a novel machine-learning approach to diagnosing Parkinson’s disease with EEG.

10.2 Materials and Methods

10.2.1 Data

The datasets used in this chapter are described in Chapter 7.

10.2.2 Network Features

The analysis of the Parkinson’s data consists of three steps: constructing networks, computing the centrality of every node in each network, and learning HC vs. PD classifiers using the node centralities as features. Each of these steps is described below.

First, we construct networks using thirty-second epochs of the EEG signals. We split the recordings into multiple thirty-second epochs with 90% overlap. Since each subject in PD dataset-1 has a three-minute recording, there are 51 such epochs per subject, leading to 51 networks. In the case of PD dataset-2, the one-minute recordings produce 11 networks. The average of these networks defines the resting-state connectivity for each subject. Each network was estimated using three connectivity measures: correlation coefficient, CCM (time-domain) and FDCCM.

To construct networks using FDCCM, we determine the spectrograms (time-frequency matrices) for each EEG recording. Fig. 10.1 presents spectrograms associated with two different electrodes from a PD patient in PD dataset-1. To generate the spectrograms, we compute the power spectrum with 0.5-second sliding windows with 95% overlap. The frequency resolution used was 5 Hz, up to 200 Hz. These spectrograms are then used to estimate the causal connectivity between all pairs of channels.

Thus, each healthy control subject has a representative functional or effective connectivity graph. Each PD patient has two graphs (on and off medication). We then compute the betweenness centrality of the nodes in these graphs. For a subject with N channels, effective networks can have up to N(N − 1) connections. That is, 992 connections if N = 32, and 3906 connections if N = 63.
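The epoch and connection counts quoted in Section 10.2.2 follow from simple sliding-window arithmetic. The sketch below is only an illustration: the helper names `count_windows` and `count_directed_edges` are hypothetical, and the window count assumes that only full-length windows are kept (no padding at the end of the recording), which is an assumption rather than something stated in this chapter.

```python
from fractions import Fraction


def count_windows(duration_s, window_s, overlap_pct):
    """Number of full-length sliding windows that fit in a recording.

    Assumes consecutive windows are shifted by window_s * (1 - overlap)
    and that a trailing partial window is discarded (an assumption).
    """
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap_pct must be in [0, 100)")
    duration = Fraction(duration_s)
    window = Fraction(window_s)
    if duration < window:
        return 0
    # Exact rational arithmetic avoids floating-point rounding in the step.
    step = window * (100 - overlap_pct) / 100
    return int((duration - window) // step) + 1


def count_directed_edges(n_channels):
    """Maximum number of directed connections without self-loops."""
    return n_channels * (n_channels - 1)


if __name__ == "__main__":
    # Thirty-second epochs with 90% overlap (a 3-second shift).
    print(count_windows(180, 30, 90))  # three-minute recording: 51
    print(count_windows(60, 30, 90))   # one-minute recording: 11
    # Directed edges for 32 and 63 channels.
    print(count_directed_edges(32))    # 992
    print(count_directed_edges(63))    # 3906
```

Under these assumptions the counts agree with the 51 and 11 epochs and the 992 and 3906 connections reported in the text.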
Node centrality is a way to extract interpretable information from the networks while reducing the number of features. Betweenness centrality measures the extent to which a given node falls on the shortest path between any two other nodes [152]. Therefore, it is a measure of the importance of the node as a bridge between other nodes in the graph.

Figure 10.1: Example time-frequency representations (spectrograms) derived from two arbitrarily chosen channels, (a) Channel-X and (b) Channel-Y. The spectrograms correspond to a Parkinson’s patient from PD dataset-1. Axes: time (s) vs. frequency (Hz), with power in dB. Time resolution: 0.5-second windows with 95% overlap. Frequency resolution: 5 Hz, up to 200 Hz.

10.2.3 Classification

We train separate classifiers to differentiate PD-ON and PD-OFF from controls. The betweenness centralities of the nodes in each subject’s network are used as features. Since each subject has 32 or 63 features, sequential forward selection was employed to select the optimal features/channels. We employ Naive Bayes classifiers with Gaussian kernels, which learn a nonlinear decision boundary between the two classes. We observed that these models perform better than linear models such as support-vector machines (SVM) and linear discriminant analysis, and better than other nonlinear models such as polynomial SVM and decision trees. The classifiers were evaluated using leave-one-subject-out cross-validation. This cross-validation guards against overestimating the accuracy due to over-fitting of the training data and ensures that the models are evaluated on all subjects.

10.3 Results

We employ three network connectivity measures: functional (correlational) networks based on Pearson’s correlation coefficient, and effective (causal) networks based on CCM and FDCCM.
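For all three network types, the classifier features are node betweenness centralities as defined in Section 10.2.2. The following minimal sketch illustrates that definition on a toy directed graph, using a breadth-first variant of Brandes' algorithm. It assumes unweighted edges and does not normalize the scores; how the dissertation thresholds or weights the FDCCM networks before computing centrality is not reproduced here, and the function name and the toy graph are hypothetical.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for a directed, unweighted graph.

    adj maps every node to a list of its successor nodes.
    """
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        order = []                        # nodes in order of discovery
        preds = {v: [] for v in adj}      # predecessors on shortest paths
        n_paths = {v: 0 for v in adj}     # number of shortest paths from source
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:           # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes inwards.
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    return centrality


if __name__ == "__main__":
    # Toy "hub" graph: three peripheral nodes that only talk through the hub.
    star = {
        "hub": ["x", "y", "z"],
        "x": ["hub"],
        "y": ["hub"],
        "z": ["hub"],
    }
    print(betweenness_centrality(star))
```

In the toy graph every shortest path between two peripheral nodes passes through the hub, so the hub collects all of the betweenness while the peripheral nodes score zero; this is the sense in which the measure identifies bridge nodes.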
We compare the three methods by evaluating their performance in differentiating PD (ON and OFF) patients from healthy controls.

Receiver operating characteristic (ROC) curves for the classification models based on the three network types are compared in Fig. 10.2 and Fig. 10.3 for PD dataset-1 and PD dataset-2, respectively. The ROC curves plot the true-positive rate against the false-positive rate for a given binary classifier at different model thresholds. As classifiers attain higher true-positive rates and lower false-positive rates, the curve moves closer to the top-left corner of the plot. Fig. 10.2 and Fig. 10.3 show that FDCCM outperforms CCM and correlation, making it the preferred connectivity measure.

We report the leave-one-subject-out classification accuracy, sensitivity and specificity of the models in Tables 10.1 and 10.2. FDCCM achieves the highest accuracy and AUC in both datasets. In PD dataset-1, FDCCM-based decoders can distinguish PD patients (both on and off medication) from healthy controls with 96.8% accuracy (AUC=0.98): a substantial improvement over the other two methods. The optimal features selected by the feature selection correspond to EEG electrodes F7 and PO4, for HC vs. PD-ON; and CP5, O1, and Oz, for HC vs. PD-OFF. The optimal features and the decoding performance vary depending on the connectivity measure used. In PD dataset-2, FDCCM achieves 96.23% accuracy (AUC=0.96) and 88.68% accuracy (AUC=0.92) for HC vs. PD-ON and HC vs. PD-OFF, respectively. These accuracies are substantially higher than the best accuracy of 85.2% (AUC=0.94) previously reported on this dataset [3].

10.4 Discussion

Most PD patients develop dementia in 15-20 years. The accuracy of gold-standard clinical diagnosis of PD is only about 80% and has not improved in the last 30 years [127]. Existing studies on Parkinson’s were either limited to functional magnetic resonance imaging (fMRI) data [134, 145–147], or focused on spectral features [3,148–151].
However, these approaches do not consider simultaneous interactions between multiple brain areas. We believe that network analysis may hold the key to identifying biomarkers for early diagnosis, monitoring disease progression, and establishing efficacious therapies.

Modern neuroscience has shown that human brain networks exhibit high levels of clustering, a pattern indicative of a small-world architecture [4, 177]. In other words, some nodes (hub nodes) play a more important role in information transfer between different regions. This node importance can be measured using node centrality metrics such as betweenness centrality. Altered betweenness centrality has been implicated in a wide variety of neurological disorders such as Alzheimer’s disease, schizophrenia and epilepsy [144,154–156]. Some prior studies on PD also showed an altered hub organization in PD subjects relative to HC subjects. In particular, they showed that important nodes, i.e., nodes with higher betweenness centrality, lost significance and that nodes with a less central role became more important [129,178,179].

Fig. 10.4 and Fig. 10.5 illustrate this distinction between patients and healthy age-matched controls on two independent resting-state datasets. The scalp topographical maps of average betweenness centrality in each group (HC, PD-ON and PD-OFF) show that the spatial distribution of betweenness centrality varies between the HC and PD groups. The healthy controls have higher node centrality in the mid-frontal regions, while the PD groups show higher values in mid-parietal regions. Note that these scalp maps are derived from FDCCM networks.

Figure 10.2: HC vs. PD classification receiver operating characteristic (ROC) curves - PD dataset-1. (a) HC vs. PD-ON. (b) HC vs. PD-OFF. Each panel plots true positive rate against false positive rate for the correlation coefficient, CCM and FDCCM networks.

Figure 10.3: HC vs. PD classification receiver operating characteristic (ROC) curves - PD dataset-2. (a) HC vs. PD-ON. (b) HC vs. PD-OFF. Each panel plots true positive rate against false positive rate for the correlation coefficient, CCM and FDCCM networks.

Table 10.1: Summary of HC vs. PD classification results for PD Dataset-1. The table presents classification accuracy, sensitivity (PD accuracy), specificity (HC accuracy) and AUC for each of the three network construction methods. Random label-assignment would result in a baseline accuracy of 50%. The highest values among the three methods are marked with an asterisk (*).

HC vs. PD-ON
Connectivity   Acc. (%)   Sens. (%)   Spec. (%)   AUC (0-1)   Selected Channels
Correlation    70.97      46.67       93.7        0.51        CP1
CCM            80.64      80          81.25       0.75        FC1, O1
FDCCM          96.8*      93.3*       100*        0.98*       F7, PO4

HC vs. PD-OFF
Connectivity   Acc. (%)   Sens. (%)   Spec. (%)   AUC (0-1)   Selected Channels
Correlation    87.1       86.67       87.5        0.78        C3, Oz
CCM            70.97      60          81.25       0.55        P3, Pz
FDCCM          96.8*      93.3*       100*        0.98*       CP5, O1, Oz

Abbreviations: HC = Healthy controls, PD = Parkinson’s disease, Acc. = Accuracy, Sens. = Sensitivity, Spec. = Specificity, AUC = Area under the ROC curve.

Table 10.2: Summary of HC vs. PD classification results for PD Dataset-2. The table presents classification accuracy, sensitivity (PD accuracy), specificity (HC accuracy) and AUC for each of the three network construction methods. Random label-assignment would result in a baseline accuracy of 50%. The highest values among the three methods are marked with an asterisk (*).

HC vs. PD-ON
Connectivity   Acc. (%)   Sens. (%)   Spec. (%)   AUC (0-1)   Selected Channels
Correlation    71.7       81.48       61.54       0.73        FC1, CP1
CCM            79.24      88.89       69.23       0.745       FC2, PO3
FDCCM          96.23*     96.3*       96.15*      0.96*       CP1, Oz, FCz, C5

HC vs. PD-OFF
Connectivity   Acc. (%)   Sens. (%)   Spec. (%)   AUC (0-1)   Selected Channels
Correlation    84.91      85.18*      84.62       0.86        FC1, P7, CP2, AF7
CCM            86.79      85.18*      88.46       0.84        FT9, C3, P4, FC2, AF7, PO4, P6
FDCCM          88.68*     81.48       96.15*      0.92*       FC1, TP9, F4, P2

Abbreviations: HC = Healthy controls, PD = Parkinson’s disease, Acc. = Accuracy, Sens. = Sensitivity, Spec. = Specificity, AUC = Area under the ROC curve.

We quantify these changes by building classifiers that accurately differentiate the two groups, independent of the patients’ medication status. Our method achieved more than 96% leave-one-subject-out cross-validation accuracy on both datasets for HC vs. PD-ON, and 96.8% and 88.68% for HC vs. PD-OFF on PD dataset-1 and PD dataset-2, respectively. In PD dataset-2, FDCCM attains an accuracy of 96.2% and an AUC of 0.96, which is substantially higher than the best accuracy (85.2%) on this dataset using existing methods [3]. The decoders in [3] are based on spectral properties of individual channels but do not take into account the interactions between the channels.

The neural activity analyzed here was recorded from the scalp, which is affected by confounding factors such as volume conduction [7]. Also, compared to fMRI, EEG has a lower spatial resolution, making it difficult to localize the source of the activity. Tables 10.1 and 10.2 show that the optimal EEG channels selected by our algorithm can change depending on the classification problem and the connectivity metric used.
Future research can focus on network analysis of source-localized EEG signals or invasive recordings such as local field potentials to gain more insight into the underlying neural mechanisms of PD. A benefit of using EEG is that it can sample neural activity at 100–1000x higher time resolution than fMRI, making it more suitable for assessing temporal dynamics. While more research is necessary to develop clinically applicable decoders for PD diagnosis, our work indicates that FDCCM characterizes separability between PD patients and controls.

Figure 10.4: Scalp topographical maps of average betweenness centralities of healthy controls (HC) and PD patients (ON and OFF) from PD dataset-1. (a) HC. (b) PD-ON. (c) PD-OFF.

Figure 10.5: Scalp topographical maps of average betweenness centralities of healthy controls (HC) and PD patients (ON and OFF) from PD dataset-2. (a) HC. (b) PD-ON. (c) PD-OFF.

10.5 Conclusion

In conclusion, this study provides a novel strategy for constructing causal networks by utilizing the spectral dynamics of electrophysiological signals [171]. We showed that our method could be applied to recognize altered network patterns in patients with PD. We conducted graph analysis and classification analysis, and demonstrated that FDCCM helps quantify these changes between patients and healthy controls. Given its excellent classification performance in distinguishing between healthy individuals and PD patients on and off dopaminergic medication, FDCCM could detect abnormalities and track disease progression using EEG signals. These decoders, in combination with graph theory, can also be used to develop interventional therapies such as adaptive deep-brain stimulation or transcranial direct-current stimulation [159]. Due to its non-invasiveness and wide availability, scalp EEG is also optimal for clinical, commercial, and research purposes.
Further research on causal connectivity of cortical activity and comparison with source-level connectivity can help understand the underlying pathophysiology of neurodegenerative and neuropsychiatric disorders.

Chapter 11

Conclusion and Future Directions

11.1 Implications for Causal Network Analysis

Causality is an epistemological concept whose mathematical definition is not well-established. True causality is often difficult to estimate from a model or a set of equations because one’s intuitive understanding of causality becomes inherently constrained when one tries to build a model. The heterogeneity of neural interactions makes it challenging to determine a unifying causal model. Electrophysiological signals such as the electroencephalogram (EEG) directly measure the brain’s electrical activity (unlike fMRI, which measures blood flow) at relatively high temporal resolution, making them more suitable for causal inference. Hence, this dissertation studies effective (causal) connectivity by applying model-free methods to electrophysiological signals.

There are (at least) two distinct ways to define causality [17]. First, in terms of time-precedence, i.e., causes precede their effects: the intuition behind Granger’s prediction [19]. Second, in terms of physical influence/control, i.e., changing one (the cause) changes the other (the effect). We adopt the latter definition and develop a state-space reconstruction technique that infers causality using the spectral dynamics of electrophysiological recordings.

The method proposed in Chapter 9, called frequency-domain convergent cross-mapping (FDCCM), is based on the rationale used in convergent cross-mapping (CCM) [12] to infer dynamic causality. While CCM relies on time-delay embedding a timeseries into a higher-dimensional space, the proposed method, FDCCM, incorporates spectral information through time-frequency embedding of timeseries.
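The difference between the two embeddings can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the FDCCM implementation of Chapter 9: `delay_embed` builds the lagged state vectors used in time-delay embedding, while `power_frames` shows one plausible reading of a time-frequency embedding, in which each state vector is the power spectrum of a short sliding window. Both function names are hypothetical, and the rectangular window, the naive DFT and the power scaling are simplifying assumptions.

```python
import cmath


def delay_embed(x, dim, tau):
    """Time-delay embedding: state at t is (x[t], x[t-tau], ..., x[t-(dim-1)*tau])."""
    start = (dim - 1) * tau
    if start >= len(x):
        raise ValueError("timeseries too short for this dim and tau")
    return [tuple(x[t - k * tau] for k in range(dim)) for t in range(start, len(x))]


def power_frames(x, window, step):
    """Time-frequency embedding sketch: one power spectrum per sliding window.

    Uses a rectangular window and a naive DFT (assumptions made for brevity);
    each frame holds the power at frequency bins 0 .. window // 2.
    """
    frames = []
    for begin in range(0, len(x) - window + 1, step):
        segment = x[begin:begin + window]
        spectrum = []
        for k in range(window // 2 + 1):
            coeff = sum(
                segment[n] * cmath.exp(-2j * cmath.pi * k * n / window)
                for n in range(window)
            )
            spectrum.append(abs(coeff) ** 2 / window)
        frames.append(tuple(spectrum))
    return frames


if __name__ == "__main__":
    # Lagged coordinates of a short ramp.
    print(delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2))
    # A pure oscillation: every frame concentrates its power in one bin.
    oscillation = [1, 0, -1, 0, 1, 0, -1, 0]
    for frame in power_frames(oscillation, window=4, step=2):
        print([round(p, 6) for p in frame])
```

In the time-delay case the state vectors are built from raw samples, so additive noise enters every coordinate directly; in the spectral case each coordinate aggregates a whole window of samples.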
By using power spectra instead of raw time-domain data, FDCCM blurs the effect of noise, as illustrated in Fig. 9.5. Thus, FDCCM overcomes a significant weakness of CCM [111].

There is no universal model or metric to infer causality in complex networks such as the brain. The choice depends on the merits and demerits of the various approaches. As observed from Fig. 9.3, FDCCM may not be reliable at low coupling strengths; this lack of reliability at some coupling strengths is also a limitation of CCM, as demonstrated in [111]. In its current form, FDCCM is a bivariate measure of relative causality. Further research is required to extend it to a multivariate measure that can infer the causal effect among more than two timeseries, e.g., multiple channels in a brain region [164,180].

11.2 Implications for Neurological and Psychiatric Disorders

It is established that a wide variety of neurological disorders can be described as aberrant patterns of interactions between neural elements in a large-scale brain network [102,136,144,181–184]. The clinical application of brain network analysis confirms that pathological patterns accumulate in network hubs, and the network topology constrains their spread/neurodegeneration. Identifying dysfunctional brain circuits can aid in detecting, tracking, and predicting patterns of disease in brain disorders.

A variety of psychiatric disorders are linked to cognitive dysfunction. These include, but are not limited to, major depressive disorder [99,185], obsessive-compulsive disorder (OCD) [186], schizophrenia [187], autism spectrum disorders [188] and addiction [97]. One of the critical bottlenecks in diagnosing such disorders is the lack of well-established neural mechanisms that explain the pathology. We believe that effective network analysis, a less explored approach than functional network analysis, may hold the key to this problem.
From both a physiological standpoint and an empirical or statistical standpoint, directed networks can provide up to twice the information provided by functional networks, since N nodes admit N(N − 1) directed connections but only N(N − 1)/2 undirected ones.

The methods proposed in this dissertation, combined with prior neuroscience knowledge, may help understand the network patterns reflective of the neurological basis of these disorders. For example, the involvement of dlPFC in cognitive control, especially in tasks involving conflict or inhibition of irrelevant information, is reported in prior research [45, 50, 70]. Schizophrenia and OCD have been linked to dysfunctional modulation in dlPFC and vlPFC [94, 95]. Our research also suggests that theta interactions in dlPFC, dmPFC, and the temporal lobe play an important role in cognition.

Many patients with mental illnesses characterized by impaired cognitive control get no relief from gold-standard clinical treatments, resulting in a pressing need for new alternatives. Enhancing cognitive control can dramatically impact the health and well-being of millions of patients suffering from refractory mental disorders. Modulating the neural activity, e.g., through deep brain stimulation (DBS), during mental states associated with executive function can achieve this goal. Detecting such complex states of engagement in mentally demanding tasks is a challenge. Since these tasks recruit multiple brain regions, fundamental mechanisms underlying cognitive control cannot be observed solely from localized neural recordings. The findings presented in this research demonstrate that network science can help discover these patterns.

Although initial studies of DBS have shown promise, clinical trials have yielded inconsistent results [79–81, 189, 190]. Recent studies suggest that adaptive deep-brain stimulation (aDBS) can be used to improve cognitive control in psychiatric patients prospectively [122–124].
Developing such closed-loop neuromodulation treatments requires automatic and reliable detection of cognitive effort in humans. However, therapeutic efficacy is critically related to the connectivity of the stimulation target site or target brain region [191]. Task engagement networks (see Chapter 6) emphasize the activation of dlPFC due to its relatively higher node centrality. Similarly, in Chapter 10, we observed that certain brain regions have higher betweenness centrality, implying they are more connected to other brain regions. Although further research is required to generalize these findings [129, 156, 178], our work indicates that causal network analysis could help answer the following questions that are crucial for stimulation therapies [70, 80, 81, 192]. What is a good biomarker that acts as the trigger for intervention? What is the optimal target site for a specific clinical outcome? Future work can apply the proposed methods to other disorders to discover network biomarkers. These biomarkers can be used to develop and evaluate adaptive brain stimulation therapies.

Besides cognitive disorders, this dissertation also demonstrates the applicability of the proposed methods to neurodegenerative disorders. We show that network features of the estimated causal networks, namely the betweenness centrality of different brain regions, are modulated in patients suffering from Parkinson’s disease. The classification models built from these network features can assist clinicians as a cost-effective diagnostic tool, crucial for prognostic and therapeutic purposes. Note that these classifiers utilize scalp-EEG signals for identifying biomarkers of Parkinson’s disease. Due to its non-invasiveness and wide availability, scalp EEG is also optimal for clinical, commercial, and research purposes. These methods can also be used to study disease progression, and develop/evaluate stimulation therapies [159].

References

[1] Hanieh Bakhshayesh, Sean P. Fitzgibbon, Azin S.
Janani, Tyler S. Grummett, and Kenneth J. Pope. Detecting connectivity in EEG: A comparative study of data-driven effective connectivity measures. Computers in Biology and Medicine, 111:103329, 2019.
[2] Jobi S George, Jon Strunk, Rachel Mak-McCully, Melissa Houser, Howard Poizner, and Adam R Aron. Dopaminergic therapy in Parkinson’s disease decreases cortical beta band coherence in the resting state and increases cortical beta band power during executive control. NeuroImage: Clinical, 3:261–270, 2013.
[3] Md Fahim Anjum, Soura Dasgupta, Raghuraman Mudumbai, Arun Singh, James F. Cavanagh, and Nandakumar S. Narayanan. Linear predictive coding distinguishes spectral EEG features of Parkinson’s disease. Parkinsonism & Related Disorders, 79:79–85, 2020.
[4] Danielle S. Bassett and Edward T. Bullmore. Small-world brain networks revisited. The Neuroscientist, 23(5):499–516, 2017, https://doi.org/10.1177/1073858416667720. PMID: 27655008.
[5] Liang Shi, Jiangzhou Sun, Xinran Wu, Dongtao Wei, Qunlin Chen, Wenjing Yang, Hong Chen, and Jiang Qiu. Brain networks of happiness: dynamic functional connectivity among the default, cognitive and salience networks relates to subjective well-being. Social Cognitive and Affective Neuroscience, 13(8):851–862, 08 2018, https://academic.oup.com/scan/article-pdf/13/8/851/25891643/nsy059.pdf.
[6] Scott Marek, Kai Hwang, William Foran, Michael N. Hallquist, and Beatriz Luna. The contribution of network organization and integration to the development of cognitive control. PLOS Biology, 13(12):1–25, 12 2016.
[7] Andre Bastos and Jan-Mathijs Schoffelen. A tutorial review of functional connectivity analysis methods and their interpretational pitfalls. Frontiers in Systems Neuroscience, 9, 01 2016.
[8] V. Sakkalis. Review of advanced techniques for the estimation of brain connectivity measured with EEG/MEG. Computers in Biology and Medicine, 41(12):1110–1117, 2011. Special Issue on Techniques for Measuring Brain Connectivity.
[9] Timothy Ham, Alex Leff, Xavier de Boissezon, Anna Joffe, and David J. Sharp. Cognitive control and the salience network: An investigation of error processing and effective connectivity. J. of Neuroscience, 33(16):7091–7098, 2013, https://www.jneurosci.org/content/33/16/7091.full.pdf.
[10] Ekaterina Dobryakova, Maria Assunta Rocca, Paola Valsasina, John DeLuca, and Massimo Filippi. Altered neural mechanisms of cognitive control in patients with primary progressive multiple sclerosis: An effective connectivity study. Human Brain Mapping, 38(5):2580–2588, 2017, https://onlinelibrary.wiley.com/doi/pdf/10.1002/hbm.23542.
[11] Karl J Friston. Functional and effective connectivity: a review. Brain connectivity, 1(1):13–36, 2011.
[12] George Sugihara, Robert May, Hao Ye, Chih-hao Hsieh, Ethan Deyle, Michael Fogarty, and Stephan Munch. Detecting causality in complex ecosystems. Science, 338(6106):496–500, 2012, https://science.sciencemag.org/content/338/6106/496.full.pdf.
[13] Nicole Provenza, Angelique Paulk, Noam Peled, Maria Restrepo, Sydney Cash, Darin Dougherty, Emad Eskandar, David Borton, and Alik Widge. Decoding task engagement from distributed network electrophysiology in humans. J. of Neural Engineering, 16:056015, 08 2019.
[14] Edward Bullmore and Olaf Sporns. Complex brain networks: Graph theoretical analysis of structural and functional systems. Nature reviews. Neuroscience, 10:186–98, 03 2009.
[15] Christopher J. Honey, Rolf Kötter, Michael Breakspear, and Olaf Sporns. Network structure of cerebral cortex shapes functional connectivity on multiple time scales. Proceedings of the National Academy of Sciences, 104(24):10240–10245, 2007, https://www.pnas.org/content/104/24/10240.full.pdf.
[16] Danielle S. Bassett, Ankit N. Khambhati, and Scott T. Grafton. Emerging frontiers of neuroengineering: A network science of brain connectivity. Annual Review of Biomedical Engineering, 19(1):327–352, 2017, https://doi.org/10.1146/annurev-bioeng-071516-044511.
[17] Pedro A. Valdes-Sosa, Alard Roebroeck, Jean Daunizeau, and Karl Friston. Effective connectivity: Influence, causality and biophysical modeling. NeuroImage, 58(2):339–361, 2011.
[18] Steven L. Bressler and Anil K. Seth. Wiener–Granger causality: A well established methodology. NeuroImage, 58(2):323–329, 2011.
[19] C. W. J. Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3):424–438, 1969.
[20] Olivier David, Isabelle Guillemain, Sandrine Saillet, Sebastien Reyt, Colin Deransart, Christoph Segebarth, and Antoine Depaulis. Identifying neural drivers with functional MRI: An electrophysiological validation. PLOS Biology, 6(12):1–15, 12 2008.
[21] Margarita Papadopoulou, Karl Friston, and Daniele Marinazzo. Estimating directed connectivity from cortical recordings and reconstructed sources. Brain Topography, 32(4):741–752, Jul 2019.
[22] Teresa Murta, Alberto Leal, Marta I. Garrido, and Patrícia Figueiredo. Dynamic causal modelling of epileptic seizure propagation pathways: A combined EEG–fMRI study. NeuroImage, 62(3):1634–1642, 2012.
[23] Mike Cohen. Analyzing Neural Time Series Data. MIT Press, 2004.
[24] Patrick A. Stokes and Patrick L. Purdon. A study of problems encountered in Granger causality analysis from a neuroscience perspective. Proceedings of the National Academy of Sciences, 114(34):E7063–E7072, 2017, https://www.pnas.org/content/114/34/E7063.full.pdf.
[25] Marcin Jan Kaminski and Katarzyna J Blinowska. A new method of the description of the information flow in the brain structures. Biological cybernetics, 65(3):203–210, 1991.
[26] Maciej Kamiński, Mingzhou Ding, Wilson A Truccolo, and Steven L Bressler. Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biological cybernetics, 85(2):145–157, 2001.
[27] Luiz A Baccalá and Koichi Sameshima. Partial directed coherence: a new concept in neural structure determination. Biological cybernetics, 84(6):463–474, 2001.
[28] James L. Massey. Causality, feedback and directed information. In Intl. Symposium on Information Theory and its applications (ISITA), 1990.
[29] R. Malladi, G. Kalamangalam, N. Tandon, and B. Aazhang. Identifying seizure onset zone from the causal connectivity inferred using directed information. IEEE J. of Selected Topics in Signal Processing, 10(7):1267–1283, Oct 2016.
[30] Pierre Olivier Amblard and Olivier J. J. Michel. On directed information theory and Granger causality graphs. J. of Computational Neuroscience, 30(1):7–16, 2011.
[31] Thomas Schreiber. Measuring information transfer. Phys. Rev. Lett., 85:461–464, Jul 2000.
[32] Y. Liu and S. Aviyente. The relationship between transfer entropy and directed information. In 2012 IEEE Statistical Signal Processing Workshop (SSP), pages 73–76, 2012.
[33] Akira Miyake and Naomi P Friedman. The nature and organization of individual differences in executive functions: Four general conclusions. Current directions in psychological science, 21(1):8–14, 2012.
[34] Naomi P Friedman and Akira Miyake. Unity and diversity of executive functions: Individual differences as a window on cognitive structure. Cortex, 86:186–204, 2017.
[35] Randall W Engle and Michael J Kane. Executive attention, working memory capacity, and a two-factor theory of cognitive control. The psychology of learning and motivation: Advances in research and theory, 44:145–199, 2004.
[36] Earl K Miller and Jonathan D Cohen. An integrative theory of prefrontal cortex function. Annual review of neuroscience, 24(1):167–202, 2001.
[37] Randall C O’Reilly. Biologically based computational models of high-level cognition. science, 314(5796):91–94, 2006.
[38] Matthew M Botvinick, Todd S Braver, Deanna M Barch, Cameron S Carter, and Jonathan D Cohen. Conflict monitoring and cognitive control. Psychological review, 108(3):624, 2001.
[39] Patricia S Goldman-Rakic. The prefrontal landscape: implications of functional architecture for understanding human mentation and the central executive. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 351(1346):1445–1453, 1996.
[40] Etienne Koechlin and Christopher Summerfield. An information theoretical approach to prefrontal executive function. Trends in cognitive sciences, 11(6):229–235, 2007.
[41] Jonathan D. Cohen. Cognitive Control, chapter 1, pages 1–28. John Wiley and Sons, Ltd, 2017.
[42] J. Bruce Morton, Fredrick Ezekiel, and Heather A. Wilk. Cognitive control: Easy to identify but hard to define. Topics in Cognitive Science, 3(2):212–216, 2011.
[43] Gabriele Gratton, Patrick Cooper, Monica Fabiani, Cameron S. Carter, and Frini Karayanidis. Dynamics of cognitive control: Theoretical bases, paradigms, and a view for the future. Psychophysiology, 55(3):e13016, 2018, https://onlinelibrary.wiley.com/doi/pdf/10.1111/psyp.13016.
[44] Caterina Gratton, Haoxin Sun, and Steven E Petersen. Control networks and hubs. Psychophysiology, 55(3):e13032, 2018.
[45] Elliot Smith, Guillermo Horga, Mark Yates, Charles Mikell, Garrett Banks, Yagna Pathak, Catherine Schevon, Guy McKhann, Benjamin Hayden, Matthew Botvinick, and Sameer Sheth. Widespread temporal coding of cognitive control in the human prefrontal cortex. Nature Neuroscience, 22:1–9, 11 2019.
[46] Etienne Koechlin, Chrystèle Ody, and Frédérique Kouneiher. The architecture of cognitive control in the human prefrontal cortex. Science, 302(5648):1181–1185, 2003.
[47] K. Richard Ridderinkhof, Markus Ullsperger, Eveline A. Crone, and Sander Nieuwenhuis. The role of the medial frontal cortex in cognitive control. Science, 306(5695):443–447, 2004.
[48] Frédérique Kouneiher, Sylvain Charron, and Etienne Koechlin. Motivation and cognitive control in the human prefrontal cortex. Nature neuroscience, 12(7):939–945, 2009.
[49] Daniel H Weissman, AS Perkins, and Marty G Woldorff. Cognitive control in social situations: a role for the dorsolateral prefrontal cortex. Neuroimage, 40(2):955–962, 2008.
[50] Angus W. MacDonald, Jonathan D. Cohen, V. Andrew Stenger, and Cameron S. Carter. Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science, 288(5472):1835–1838, 2000, https://science.sciencemag.org/content/288/5472/1835.full.pdf.
[51] Maria Medalla and Helen Barbas. Synapses with inhibitory neurons differentiate anterior cingulate from dorsolateral prefrontal pathways associated with cognitive control. Neuron, 61(4):609–620, 2009.
[52] Stefanie M. Beck, Hannah S. Locke, Adam C. Savine, Koji Jimura, and Todd S. Braver. Primary and secondary rewards differentially modulate neural activity dynamics during working memory. PLOS ONE, 5(2):1–13, 02 2010.
[53] Nili Metuki, Tal Sela, and Michal Lavidor. Enhancing cognitive control components of insight problems solving by anodal tDCS of the left dorsolateral prefrontal cortex. Brain Stim., 5(2):110–115, 2012.
[54] Mohammad Ali Salehinejad, Elham Ghanavai, Reza Rostami, and Vahid Nejati. Cognitive control dysfunction in emotion dysregulation and psychopathology of major depression (MD): Evidence from transcranial brain stimulation of the dorsolateral prefrontal cortex (DLPFC). Journal of Affective Disorders, 210:241–248, 2017.
[55] Hanlin Tang, Hsiang-Yu Yu, Chien-Chen Chou, Nathan E Crone, Joseph R Madsen, William S Anderson, and Gabriel Kreiman. Cascade of neural processing orchestrates cognitive control in human frontal cortex. eLife, 5:e12352, feb 2016.
[56] Patrick S. Cooper, Aaron S.W. Wong, W.Ross Fulham, Renate Thienel, Elise Mansfield, Patricia T. Michie, and Frini Karayanidis. Theta frontoparietal connectivity associated with proactive and reactive cognitive control processes. NeuroImage, 108:354–363, 2015.
[57] D. Dvorak, A. Shang, S. Abdel-Baki, W. Suzuki, and A. A. Fenton. Cognitive behavior classification from scalp EEG signals. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(4):729–739, 2018.
[58] James F. Cavanagh and Michael J. Frank. Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8):414–421, 2014.
[59] Angelos Angelidis, Muriel Hagenaars, Dana van Son, Willem van der Does, and Peter Putman. Do not look away! Spontaneous frontal EEG theta/beta ratio as a marker for cognitive control over attention to mild and high threat. Biological Psychology, 135:8–17, 2018.
[60] Patrick S. Cooper, Frini Karayanidis, Montana McKewen, Samuel McLellan-Hall, Aaron S.W. Wong, Patrick Skippen, and James F. Cavanagh. Frontal theta predicts specific cognitive control-induced behavioural changes beyond general reaction time slowing. NeuroImage, 189:130–140, 2019.
[61] Adele Diamond. Executive functions. Annual Review of Psychology, 64(1):135–168, 2013, https://doi.org/10.1146/annurev-psych-113011-143750.
[62] Lisa M. McTeague, Julia Huemer, David M. Carreon, Ying Jiang, Simon B. Eickhoff, and Amit Etkin. Identification of common neural circuit disruptions in cognitive control across psychiatric disorders. American J. of Psychiatry, 174(7):676–685, 2017, https://doi.org/10.1176/appi.ajp.2017.16040400.
[63] Lisa M McTeague, Madeleine S Goodkind, and Amit Etkin. Transdiagnostic impairment of cognitive control in mental illness. Journal of psychiatric research, 83:37–46, 2016.
[64] Amit Etkin, Anett Gyurak, and Ruth O’Hara. A neurobiological approach to the cognitive deficits of psychiatric disorders. Dialogues in clinical neuroscience, 15(4):419, 2013.
[65] Vinod Menon. Brain networks and cognitive impairment in psychiatric disorders. World Psychiatry, 19(3):309, 2020.
[66] George Bush, Lisa Shin, J Holmes, Bruce Rosen, and B Vogt. The multi-source interference task: Validation study with fMRI in individual subjects. Molecular psychiatry, 8:60–70, 01 2003.
[67] George Bush and Lisa M Shin. The multi-source interference task: an fMRI task that reliably activates the cingulo-frontal-parietal cognitive/attention network. Nature protocols, 1(1):308–313, 2006.
[68] Alberto J González-Villar and Maria T Carrillo-de-la Peña. Brain electrical activity signatures during performance of the multisource interference task. Psychophysiology, 54(6):874–881, 2017.
[69] Ishita Basu, Ali Yousefi, Britni Crocker, Rina Zelmann, Angelique C Paulk, Noam Peled, Kristen K Ellard, Daniel S Weisholtz, G. Rees Cosgrove, Thilo Deckersbach, Uri T Eden, Emad N Eskandar, Darin D Dougherty, Sydney S Cash, and Alik S Widge. Closed loop enhancement and neural decoding of human cognitive control. bioRxiv, 2020, https://www.biorxiv.org/content/early/2020/04/25/2020.04.24.059964.full.pdf.
[70] A. Widge, S. Zorowitz, Ishita Basu, Angelique Paulk, S. Cash, E. Eskandar, Thilo Deckersbach, E. Miller, and Darin Dougherty. Deep brain stimulation of the internal capsule enhances human cognitive control and prefrontal cortex function. Nature Communications, 10:1536, 04 2019.
[71] Youssef Ezzyat, Paul Wanda, Deborah Levy, Allison Kadel, Ada Aka, Isaac Pedisich, Michael Sperling, Ashwini Sharan, Bradley Lega, Alexis Burks, Robert Gross, Cory Inman, Barbara Jobst, Mark Gorenstein, Kathryn Davis, Gregory Worrell, Michal Kucewicz, Joel Stein, Richard Gorniak, and Michael Kahana. Closed-loop stimulation of temporal cortex rescues functional networks and improves memory. Nature Communications, 9, 12 2018.
[72] Elliot Smith, Garrett Banks, Charles Mikell, Sydney Cash, Shaun Patel, Emad Eskandar, and Sameer Sheth. Frequency-dependent representation of reinforcement-related information in the human medial and lateral prefrontal cortex. J.
of Neuroscience, 35:15827–15836, 12 2015. [73] Carina R. Oehrn, Simon Hanslmayr, Juergen Fell, Lorena Deuker, Nico A. Kremers, Anne T. Do Lam, Christian E. Elger, and Nikolai Axmacher. Neural communication patterns underlying conflict detection, resolu- tion, and adaptation. J. of Neuroscience, 34(31):10438–10452, 2014, https://www.jneurosci.org/content/34/31/10438.full.pdf. [74] O. Felsenstein, N. Peled, E. Hahn, A. P. Rockhill, L. Folsom, T. Gholipour, K. Macadams, N. Rozengard, A. C. Paulk, D. Dougherty, S. S. Cash, A. S. 134 Widge, M. Ha¨ma¨la¨inen, and S. Stufflebeam. Multi-modal neuroimaging analysis and visualization tool (mmvt), 2019, 1912.10079. [75] Andrew R. Dykstra, Alexander M. Chan, Brian T. Quinn, Rodrigo Zepeda, Corey J. Keller, Justine Cormier, Joseph R. Madsen, Emad N. Eskandar, and Sydney S. Cash. Individualized localization and cortical surface-based registration of intracranial electrodes. NeuroImage, 59(4):3563–3570, 2012. [76] http://surfer.nmr.mgh.harvard.edu/. [77] https://github.com/pelednoam/ieil. [78] Rahul S. Desikan et al. An automated labeling system for subdividing the human cerebral cortex on mri scans into gyral based regions of interest. NeuroImage, 31(3):968–980, 2006. [79] Joa˜o Massano and Carolina Garrett. Deep brain stimulation and cognitive decline in parkinson’s disease: A clinical review. Frontiers in Neurology, 3:66, 2012. [80] Alik S. Widge, Donald A. Malone, and Darin D. Dougherty. Closing the loop on deep brain stimulation for treatment-resistant depression. Frontiers in Neuroscience, 12:175, 2018. [81] Alik Widge and Darin Dougherty. Deep brain stimulation for treatment- refractory mood and obsessive-compulsive disorders. Current Behavioral Neuroscience Reports, 2, 08 2015. [82] Amy F.T. Arnsten and Katya Rubia. Neurobiological circuits regulating attention, cognitive control, motivation, and emotion: Disruptions in neu- rodevelopmental psychiatric disorders. J. 
of the American Academy of Child & Adolescent Psychiatry, 51(4):356–367, 2012. [83] Nicole Provenza, Evan Matteson, Anusha Allawala, Adriel Barrios- Anderson, Sameer Sheth, Ashwin Viswanathan, Elizabeth McIngvale, Eric 135 Storch, Michael Frank, Nicole Mclaughlin, Jeffrey Cohn, Wayne Goodman, and David Borton. The case for adaptive neuromodulation to treat severe intractable mental disorders. Frontiers in Neuroscience, 13, 02 2019. [84] Alik S. Widge, Kristen K. Ellard, Angelique C. Paulk, Ishita Basu, Ali Yousefi, Samuel Zorowitz, Anna Gilmour, Afsana Afzal, Thilo Deckersbach, Sydney S. Cash, Mark A. Kramer, Uri T. Eden, Darin D. Dougherty, and Emad N. Eskandar. Treating refractory mental illness with closed-loop brain stimulation: Progress towards a patient-specific transdiagnostic approach. Experimental Neurology, 287:461–472, 2017. [85] Christopher Davey, Murat Yucel, Nicholas Allen, and Ben Harrison. Task- related deactivation and functional connectivity of the subgenual cingulate cortex in major depressive disorder. Frontiers in psychiatry / Frontiers Research Foundation, 3:14, 02 2012. [86] Stephan Heckers, Anthony P. Weiss, Thilo Deckersbach, Donald C. Goff, Robert J. Morecraft, and George Bush. Anterior cingulate cortex activation during cognitive interference in schizophrenia. American J. of Psychiatry, 161(4):707–715, 2004. [87] T. Xu, K. R. Cullen, A. Houri, K. O. Lim, S. C. Schulz, and K. K. Parhi. Classification of borderline personality disorder based on spectral power of resting-state fMRI. In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pages 5036–5039, 2014. [88] Keshab K. Parhi and Zisheng Zhang. Discriminative ratio of spectral power and relative power features derived via frequency-domain model ratio with application to seizure prediction. IEEE Transactions on Biomedical Circuits and Systems, 13(4):645–657, 2019. [89] Yun Park, Lan Luo, Keshab K. Parhi, and Theoden Netoff. 
Seizure prediction with spectral power of eeg using cost-sensitive 136 support vector machines. Epilepsia, 52(10):1761–1770, 2011, https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1528-1167.2011.03138.x. [90] Zisheng Zhang and Keshab K. Parhi. Seizure detection using regression tree based feature selection and polynomial SVM classification. In 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 6578–6581, 2015. [91] Jeffrey Rouder, Paul Speckman, Dongchu Sun, Richard Morey, and G.J. Iverson. Bayesian t test for accepting and rejecting the null hypothesis. Psychonomic bulletin & review, 16:225–37, 05 2009. [92] Steven Goodman. A dirty dozen: Twelve p-value misconceptions. Sem- inars in Hematology, 45(3):135–140, 2008. Interpretation of Quantitative Research. [93] Christian Keysers, Valeria Gazzola, and Eric-JanWagenmakers. Using bayes factor hypothesis testing in neuroscience to establish evidence of absence. Nature Neuroscience, 23:788–799, 07 2020. [94] Matilde Vaghi, Petra Ve´rtes, Manfred Kitzbichler, Annemieke Apergis- Schoute, Febe Flier, Naomi Fineberg, Akeem Sule, Rashid Zaman, Valerie Voon, Prantik Kundu, Edward Bullmore, and Trevor Robbins. Specific fron- tostriatal circuits for impaired cognitive flexibility and goal-directed plan- ning in obsessive-compulsive disorder: Evidence from resting-state func- tional connectivity. Biological Psychiatry, 81, 08 2016. [95] Tyler A. Lesh, Andrew J. Westphal, Tara A. Niendam, Jong H. Yoon, Michael J. Minzenberg, J. Daniel Ragland, Marjorie Solomon, and Cameron S. Carter. Proactive and reactive cognitive control and dorso- lateral prefrontal cortex dysfunction in first episode schizophrenia. Neu- roImage: Clinical, 2:590–599, 2013. 137 [96] Laura Dubreuil-Vall, Peggy Chau, Giulio Ruffini, Alik S. Widge, and Joan A. Camprodon. tDCS to the left DLPFC modulates cognitive and physiological correlates of executive function in a state-dependent manner. 
Brain Stimulation, 12(6):1456–1463, 2019. [97] Ning Ma, Ying Liu, Nan Li, Chang-Xin Wang, Hao Zhang, Xiao-Feng Jiang, Hu-Sheng Xu, Xian-Ming Fu, Xiaoping Hu, and Da-Ren Zhang. Ad- diction related alteration in resting-state brain connectivity. NeuroImage, 49(1):738–744, 2010. [98] Joel T. Nigg, G. Mark Knottnerus, Michelle M. Martel, Molly Nikolas, Kevin Cavanagh, Wilfried Karmaus, and Marsha D. Rappley. Low blood lead levels associated with clinically diagnosed attention-deficit/hyperactivity disorder and mediated by weak cognitive control. Biological Psychiatry, 63(3):325– 331, 2008. [99] Christina L. Fales, Deanna M. Barch, Melissa M. Rundle, Mark A. Mintun, Abraham Z. Snyder, Jonathan D. Cohen, Jose Mathews, and Yvette I. She- line. Altered emotional interference processing in affective and cognitive- control brain circuitry in major depression. Biological Psychiatry, 63(4):377– 384, 2008. [100] S. H. Chu, C. Lenglet, M. W. Schreiner, B. Klimes-Dougan, K. Cullen, and K. K. Parhi. Classifying treated vs. untreated MDD adolescents from anatomical connectivity using nonlinear SVM. In 2018 40th Annual Interna- tional Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 1–4, 2018. [101] Bhaskar Sen, Gail A. Bernstein, Bryon A. Mueller, Kathryn R. Cullen, and Keshab K. Parhi. Sub-graph entropy based network approaches for classi- fying adolescent obsessive-compulsive disorder from resting-state functional MRI. NeuroImage: Clinical, 26:102208, 2020. 138 [102] Tingting Xu, Kathryn R. Cullen, Bryon Mueller, Mindy W. Schreiner, Kelvin O. Lim, S. Charles Schulz, and Keshab K. Parhi. Network anal- ysis of functional brain connectivity in borderline personality disorder using resting-state fMRI. NeuroImage: Clinical, 11:302–315, 2016. 
[103] Vikram Patel, Dan Chisholm, Rachana Parikh, Fiona J Charlson, Louisa Degenhardt, Tarun Dua, Alize J Ferrari, Steve Hyman, Ramanan Laxmi- narayan, Carol Levin, Crick Lund, Mar´ıa Elena Medina Mora, Inge Petersen, James Scott, Rahul Shidhaye, Lakshmi Vijayakumar, Graham Thornicroft, and Harvey Whiteford. Addressing the burden of mental, neurological, and substance use disorders: key messages from disease control priorities, 3rd edition. The Lancet, 387(10028):1672–1685, 2016. [104] Widge AS Sullivan C, Olsen S. Deep brain stimulation for psychiatric dis- orders: from focal brain targets to cognitive networks. NeuroImage, page Accepted, 2020. [105] Quentin Huys, Tiago Maia, and Michael Frank. Computational psychiatry as a bridge from neuroscience to clinical applications. Nature Neuroscience, 19:404–413, 02 2016. [106] Thomas Goschke. Dysfunctions of decision-making and cog- nitive control as transdiagnostic mechanisms of mental disor- ders: advances, gaps, and needs in current research. Interna- tional J. of Methods in Psychiatric Research, 23(S1):41–57, 2014, https://onlinelibrary.wiley.com/doi/pdf/10.1002/mpr.1410. [107] Edward Bullmore and Olaf Sporns. The economy of brain network organi- zation. Nature reviews. Neuroscience, 13:336–49, 04 2012. [108] Sandeep Avvaru, Nicole Provenza, Alik Widge, and Keshab K. Parhi. De- coding human cognitive control using functional connectivity of local field 139 potentials. In 43rd Annual International Conference of the IEEE Engineer- ing in Medicine and Biology Society (EMBC), Oct 31st – Nov 1st 2021. [109] Matthew Stanley, Malaak Moussa, Brielle Paolini, Robert Lyday, Jonathan Burdette, and Paul Laurienti. Defining nodes in complex brain networks. Frontiers in Computational Neuroscience, 7:169, 2013. [110] Q. Yu, Y. Du, J. Chen, J. Sui, T. Adale¯, G. D. Pearlson, and V. D. Calhoun. Application of graph theory to assess static and dynamic brain connectivity: Approaches for building brain graphs. 
Proceedings of the IEEE, 106(5):886– 906, 2018. [111] Dan Mønster, Riccardo Fusaroli, Kristian Tyle´n, Andreas Roepstorff, and Jacob F. Sherson. Causal inference from noisy time-series data — testing the convergent cross-mapping algorithm in the presence of noise and external influence. Future Generation Computer Systems, 73:52–62, 2017. [112] Sandeep Avvaru, Nicole Provenza, Alik Widge, and Keshab K. Parhi. Spec- tral features based decoding of task engagement: The role of theta and high gamma bands in cognitive control. In 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Oct 31st – Nov 1st 2021. [113] A. Koenig, D. Novak, X. Omlin, M. Pulfer, E. Perreault, L. Zimmerli, M. Mi- helj, and R. Riener. Real-time closed-loop control of cognitive load in neu- rological patients during robot-assisted gait training. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 19(4):453–464, 2011. [114] Shu-Hsien Chu, Keshab Parhi, and Christophe Lenglet. Function-specific and enhanced brain structural connectivity mapping via joint modeling of diffusion and functional mri. Scientific Reports, 8, 03 2018. [115] Theodore P. Zanto and Adam Gazzaley. Fronto-parietal network: flexible hub of cognitive control. Trends in Cognitive Sciences, 17(12):602–603, 2013. 140 [116] Ian H. Harding, Murat Yu¨cel, Ben J. Harrison, Christos Pantelis, and Michael Breakspear. Effective connectivity within the frontoparietal control network differentiates cognitive control and working memory. NeuroImage, 106:144–153, 2015. [117] John Kerns, Jonathan Cohen, Angus MacDonald, Raymond Cho, V Stenger, and Cameron Carter. Anterior cingulate conflict monitoring and adjust- ments in control. Science, 303:1023–6, 03 2004. [118] J. Stretton and P.J. Thompson. Frontal lobe function in temporal lobe epilepsy. Epilepsy Research, 98(1):1–13, 2012. [119] John M. Hudson, Kenneth A. Flowers, and Kerri L. Walster. 
Attentional control in patients with temporal lobe epilepsy. J. of Neuropsychology, 8(1):140–146, 2014, https://bpspsychub.onlinelibrary.wiley.com/doi/pdf/10.1111/jnp.12008. [120] Luca Cocchi, Ben Harrison, Jesus Pujol, Ian Harding, Alex Fornito, Christos Pantelis, and Murat Yucel. Functional alterations of large-scale brain net- works related to cognitive control in obsessive-compulsive disorder. Human brain mapping, 33:1089–106, 05 2012. [121] Alik S. Widge and Earl K. Miller. Targeting Cognition and Networks Through Neural Oscillations: Next-Generation Clinical Brain Stimulation . JAMA Psychiatry, 76(7):671–672, 07 2019. [122] Chioma Anidi, Johanna J. O’Day, Ross W. Anderson, Muhammad Furqan Afzal, Judy Syrkin-Nikolau, Anca Velisar, and Helen M. Bronte-Stewart. Neuromodulation targets pathological not physiological beta bursts during gait in parkinson’s disease. Neurobiology of Disease, 120:107–117, 2018. [123] Gabriel Dippel, Moritz Mu¨ckschel, Tjalf Ziemssen, and Christian Beste. Demands on response inhibition processes determine modulations of theta band activity in superior frontal areas and correlations with pupillometry – 141 implications for the norepinephrine system during inhibitory control. Neu- roImage, 157:575–585, 2017. [124] Cameron Mcintyre, Warren Grill, David Sherman, and N.v Thakor. Cellular effects of deep brain stimulation: Model-based analysis of activation and inhibition. J. of neurophysiology, 91:1457–69, 05 2004. [125] A. W. Michell, S. J. G. Lewis, T. Foltynie, and R. A. Barker. Biomarkers and Parkinson’s disease. Brain, 127(8):1693–1705, 06 2004, https://academic.oup.com/brain/article- pdf/127/8/1693/842701/awh198.pdf. [126] Diane B. Miller and James P. O’Callaghan. Biomarkers of parkinson’s dis- ease: Present and future. Metabolism, 64(3, Supplement 1):S40–S46, 2015. Biomarkers: Current Status and Future Trends. [127] G. Rizzo, M. Copetti, S. Arcuti, D. Martino, A. Fontana, and G. Logroscino. 
Accuracy of clinical diagnosis of parkinson disease. Neurology, 86(6):566– 576, 2016. [128] Martin Go¨ttlich, Thomas F Mu¨nte, Marcus Heldmann, Meike Kasten, Jo- hann Hagenah, and Ulrike M Kra¨mer. Altered resting state brain networks in parkinson’s disease. PloS one, 8(10):e77336, 2013. [129] Hugo-Cesar Baggio, Roser Sala-Llonch, Ba`rbara Segura, Maria-Jose´ Marti, Francesc Valldeoriola, Yaroslau Compta, Eduardo Tolosa, and Carme Junque´. Functional brain networks and cognitive deficits in parkinson’s disease. Human Brain Mapping, 35(9):4620–4634, 2014. [130] Hugo C Baggio, Ba`rbara Segura, and Carme Junque. Resting-state func- tional brain networks in parkinson’s disease. CNS neuroscience & therapeu- tics, 21(10):793–801, 2015. 142 [131] Masafumi Fukuda, Christine Edwards, and David Eidelberg. Functional brain networks in parkinson’s disease. Parkinsonism & related disorders, 8(2):91–94, 2001. [132] Patr´ıcia Klobusˇiakova´, Radek Marecˇek, Jan Fousek, Eva Vy`tvarova´, and Irena Rektorova´. Connectivity between brain networks dynamically reflects cognitive status of parkinson’s disease: A longitudinal study. Journal of Alzheimer’s Disease, 67(3):971–984, 2019. [133] Alessandro Tessitore, Alfonso Giordano, Rosa De Micco, Antonio Russo, and Gioacchino Tedeschi. Sensorimotor connectivity in parkinson’s disease: the role of functional neuroimaging. Frontiers in neurology, 5:180, 2014. [134] Alessandro Tessitore, Mario Cirillo, and Rosa De Micco. Functional con- nectivity signatures of parkinson’s disease. Journal of Parkinson’s disease, 9(4):637–652, 2019. [135] Olaia Lucas-Jime´nez, Natalia Ojeda, Javier Pen˜a, Mar´ıa Dı´ez-Cirarda, Al- berto Cabrera-Zubizarreta, Juan Carlos Go´mez-Esteban, Mar´ıa A´ngeles Go´mez-Beldarrain, and Naroa Ibarretxe-Bilbao. Altered functional connec- tivity in the default mode network is associated with cognitive impairment and brain anatomical changes in parkinson’s disease. Parkinsonism & re- lated disorders, 33:58–64, 2016. 
[136] Jinping Fang, Huimin Chen, Zhentang Cao, Ying Jiang, Lingyan Ma, Huizi Ma, and Tao Feng. Impaired brain network architecture in newly diag- nosed parkinson’s disease based on graph theoretical analysis. Neuroscience Letters, 657:151–158, 2017. [137] Rene L. Utianski, John N. Caviness, Elisabeth C.W. van Straaten, Thomas G. Beach, Brittany N. Dugger, Holly A. Shill, Erika D. Driver- Dunckley, Marwan N. Sabbagh, Shyamal Mehta, Charles H. Adler, and 143 Joseph G. Hentz. Graph theory network function in parkinson’s disease as- sessed with electroencephalography. Clinical Neurophysiology, 127(5):2228– 2236, 2016. [138] Haiyan Liao, Jinyao Yi, Sainan Cai, Qin Shen, Qinru Liu, Lin Zhang, Junli Li, Zhenni Mao, Tianyu Wang, Yuheng Zi, Min Wang, Siyu Liu, Jun Liu, Chunyu Wang, Xiongzhao Zhu, and Changlian Tan. Changes in degree centrality of network nodes in different frequency bands in parkinson’s dis- ease with depression and without depression. Frontiers in Neuroscience, 15, 2021. [139] Nicko Jackson, Scott R Cole, Bradley Voytek, and Nicole C Swann. Char- acteristics of waveform shape in parkinson’s disease detected with scalp electroencephalography. eneuro, 6(3), 2019. [140] Arun Singh, Sarah Pirio Richardson, Nandakumar Narayanan, and James F. Cavanagh. Mid-frontal theta activity is diminished during cognitive control in parkinson’s disease. Neuropsychologia, 117:113–122, 2018. [141] Arun Singh, Rachel C. Cole, Arturo I. Espinoza, Darin Brown, James F. Ca- vanagh, and Nandakumar S. Narayanan. Frontal theta and beta oscillations during lower-limb movement in parkinson’s disease. Clinical Neurophysiol- ogy, 131(3):694–702, 2020. [142] https://narayanan.lab.uiowa.edu/article/datasets. [143] https://www.parkinson.org/. [144] Cornelis J Stam. Modern network science of neurological disorders. Nature Reviews Neuroscience, 15(10):683–695, 2014. [145] Lin-lin Gao and Tao Wu. The study of brain functional connectivity in parkinson’s disease. 
Translational neurodegeneration, 5(1):1–7, 2016. 144 [146] Linqiong Sang, Jiuquan Zhang, Li Wang, Jingna Zhang, Ye Zhang, Pengyue Li, Jian Wang, and Mingguo Qiu. Alteration of brain functional networks in early-stage parkinson’s disease: A resting-state fMRI study. PLOS ONE, 10(10):1–19, 10 2015. [147] Caterina Gratton, Jonathan M Koller, William Shannon, Deanna J Greene, Baijayanta Maiti, Abraham Z Snyder, Steven E Petersen, Joel S Perlmutter, and Meghan C Campbell. Emergent Func- tional Network Effects in Parkinson Disease. Cerebral Cortex, 29(6):2509–2523, 05 2018, https://academic.oup.com/cercor/article- pdf/29/6/2509/28650418/bhy121.pdf. [148] B.T. Klassen, J.G. Hentz, H.A. Shill, E. Driver-Dunckley, V.G.H. Evidente, M.N. Sabbagh, C.H. Adler, and J.N. Caviness. Quantitative EEG as a predictive biomarker for parkinson disease dementia. Neurology, 77(2):118– 124, 2011, https://n.neurology.org/content/77/2/118.full.pdf. [149] Menorca Chaturvedi, Florian Hatz, Ute Gschwandtner, Jan G. Bogaarts, Antonia Meyer, Peter Fuhr, and Volker Roth. Quantitative EEG (QEEG) measures differentiate parkinson’s disease (PD) patients from healthy con- trols (HC). Frontiers in Aging Neuroscience, 9, 2017. [150] Rajamanickam Yuvaraj, U Rajendra Acharya, and Yuki Hagiwara. A novel parkinson’s disease diagnosis index using higher-order spectra features in EEG signals. Neural Computing and Applications, 30(4):1225–1235, 2018. [151] Claudia Lainscsek, Manuel Hernandez, Jonathan Weyhenmeyer, Terrence Sejnowski, and Howard Poizner. Non-linear dynamical analysis of EEG time series distinguishes patients with parkinson’s disease from healthy in- dividuals. Frontiers in Neurology, 4, 2013. 145 [152] Mikail Rubinov and Olaf Sporns. Complex network measures of brain con- nectivity: Uses and interpretations. NeuroImage, 52(3):1059–1069, 2010. Computational Models of the Brain. [153] Jack McKay Fletcher and Thomas Wennekers. 
From structure to activity: Using centrality measures to predict neuronal activity. International Journal of Neural Systems, 28(02):1750013, 2018. PMID: 28076982. [154] Christopher Wilke, Gregory Worrell, and Bin He. Graph analysis of epilep- togenic networks in human partial epilepsy. Epilepsia, 52(1):84–93, 2011. [155] Marjolein MA Engels, Cornelis J Stam, Wiesje M van der Flier, Philip Scheltens, Hanneke de Waal, and Elisabeth CW van Straaten. Declining functional connectivity and changing hub locations in alzheimer’s disease: an EEG study. BMC neurology, 15(1):1–8, 2015. [156] Hu Cheng, Sharlene Newman, Joaqu´ın Gon˜i, Jerillyn S Kent, Josselyn How- ell, Amanda Bolbecker, Aina Puce, Brian F O’Donnell, and William P Hetrick. Nodal centrality of functional network in the differentiation of schizophrenia. Schizophrenia research, 168(1-2):345–352, 2015. [157] Shu Lih Oh, Yuki Hagiwara, U Raghavendra, Rajamanickam Yuvaraj, N Arunkumar, M Murugappan, and U Rajendra Acharya. A deep learn- ing approach for parkinson’s disease diagnosis from EEG signals. Neural Computing and Applications, 32(15):10927–10933, 2020. [158] Sandeep Avvaru and Keshab K. Parhi. Betweenness centrality in resting- state functional networks distinguishes parkinson’s disease. In 44rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), July 2022. [159] M Beudel and P Brown. Adaptive deep brain stimulation in parkinson’s disease. Parkinsonism & related disorders, 22:S123–S126, 2016. 146 [160] Jun Cao, Yifan Zhao, Xiaocai Shan, Hua-liang Wei, Yuzhu Guo, Liangyu Chen, John Ahmet Erkoyuncu, and Ptolemaios Georgios Sarrigiannis. Brain functional and effective connectivity based on electroencephalogra- phy recordings: A review. Human Brain Mapping, 43(2):860–879, 2022. [161] Karin Schiecke, Britta Pester, Diana Piper, Martha Feucht, Franz Ben- ninger, Herbert Witte, and Lutz Leistritz. 
Advanced nonlinear approach to quantify directed interactions within eeg activity of children with temporal lobe epilepsy in their time course. EPJ Nonlinear Biomedical Physics, 5:3, 2017. [162] Christine Beauchene, Subhradeep Roy, Rosalyn Moran, Alexander Leonessa, and Nicole Abaid. Comparing brain connectivity metrics: a didactic tutorial with a toy model and experimental data. Journal of Neural Engineering, 15(5):056031, 2018. [163] Sandeep Avvaru, Noam Peled, Nicole R. Provenza, Alik S. Widge, and Keshab K. Parhi. Region-level functional and effective network analysis of human brain during cognitive task engagement. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 29:1651–1660, 2021. [164] Joseph McBride, Xiaopeng Zhao, Nancy Munro, Gregory Jicha, Freder- ick Schmitt, Richard Kryscio, Charles Smith, and Yang Jiang. Sugihara causality analysis of scalp EEG for detection of early alzheimer’s disease. NeuroImage. Clinical, 7:258–65, 12 2015. [165] Abdolkarim Saeedi, Maryam Saeedi, Arash Maghsoudi, and Ahmad Shal- baf. Major depressive disorder diagnosis based on effective connectivity in EEG signals: A convolutional neural network and long short-term memory approach. Cognitive Neurodynamics, 15(2):239–252, 2021. [166] Dong Wang, Doutian Ren, Kuo Li, Yiming Feng, Dan Ma, Xiangguo Yan, and Gang Wang. Epileptic seizure detection in long-term EEG recordings 147 by using wavelet-based directed transfer function. IEEE Transactions on Biomedical Engineering, 65(11):2591–2599, 2018. [167] Yunfeng Lu, Lin Yang, Gregory A Worrell, and Bin He. Seizure source imag- ing by means of fine spatio-temporal dipole localization and directed transfer function in partial epilepsy patients. Clinical Neurophysiology, 123(7):1275– 1283, 2012. [168] Ana Paula S de Oliveira, Ma´ıra Arau´jo de Santana, Maria Karoline S An- drade, Juliana Carneiro Gomes, Marcelo CA Rodrigues, and Wellington P dos Santos. 
Early diagnosis of parkinson’s disease using EEG, machine learning and partial directed coherence. Research on Biomedical Engineer- ing, 36(3):311–331, 2020. [169] Gang Wang, Zhongjiang Sun, Ran Tao, Kuo Li, Gang Bao, and Xiangguo Yan. Epileptic seizure detection based on partial directed coherence analysis. IEEE journal of biomedical and health informatics, 20(3):873–879, 2015. [170] D Chicharro. On the spectral formulation of granger causality. Biological cybernetics, 105(5):331–347, 2011. [171] Sandeep Avvaru and Keshab K. Parhi. Effective brain connectivity ex- traction by frequency-domain convergent cross-mapping (FDCCM) and its application in parkinson’s disease classification. Submitted to IEEE Trans- actions on Biomedical Engineering (TBME)), 2022. [172] Floris Takens. Detecting strange attractors in turbulence. In Dynamical systems and turbulence, Warwick 1980, pages 366–381. Springer, 1981. [173] Surya Ganguli and Haim Sompolinsky. Compressed sensing, sparsity, and dimensionality in neuronal information processing and data analysis. Annual review of neuroscience, 35:485–508, 2012. [174] Robert M May. Simple mathematical models with very complicated dynam- ics. Springer, 2004. 148 [175] Clemens Brunner, Martin Billinger, Martin Seeber, Timothy R. Mullen, and Scott Makeig. Volume conduction influences scalp-based connectivity estimates. Frontiers in Computational Neuroscience, 10, 2016. [176] Alex Fornito, Andrew Zalesky, and Edward T. Bullmore, editors. Chapter 1 - An Introduction to Brain Networks. Academic Press, San Diego, 2016. [177] Olaf Sporns, Christopher J. Honey, and Rolf Ko¨tter. Identification and classification of hubs in brain networks. PLOS ONE, 2(10):1–14, 10 2007. [178] Sue-Jin Lin, Tobias R. Baumeister, Saurabh Garg, and Martin J. McKeown. Cognitive profiles and hub vulnerability in parkinson’s disease. Frontiers in Neurology, 9, 2018. 
[179] Yuko Koshimori, Sang-Soo Cho, Marion Criaud, Leigh Christopher, Mark Jacobs, Christine Ghadery, Sarah Coakeley, Madeleine Harris, Romina Mizrahi, Clement Hamani, Anthony E. Lang, Sylvain Houle, and Antonio P. Strafella. Disrupted nodal and hub organization account for brain network abnormalities in parkinson’s disease. Frontiers in Aging Neuroscience, 8, 2016. [180] Adam B Barrett, Lionel Barnett, and Anil K Seth. Multivariate granger causality and generalized variance. Physical Review E, 81(4):041907, 2010. [181] Mary-Ellen Lynall, Danielle S. Bassett, Robert Kerwin, Peter J. McKenna, Manfred Kitzbichler, Ulrich Muller, and Ed Bullmore. Functional connec- tivity and brain networks in schizophrenia. J. of Neuroscience, 30(28):9477– 9487, 2010, https://www.jneurosci.org/content/30/28/9477.full.pdf. [182] Jason S. Nomi and Lucina Q. Uddin. Developmental changes in large-scale network connectivity in autism. NeuroImage: Clinical, 7:732–741, 2015. 149 [183] Betty M. Tijms, Alle Meije Wink, Willem de Haan, Wiesje M. van der Flier, Cornelis J. Stam, Philip Scheltens, and Frederik Barkhof. Alzheimer’s dis- ease: connecting findings from graph theoretical studies of brain networks. Neurobiology of Aging, 34(8):2023–2036, 2013. [184] Gemma Modinos, Johan Ormel, and Andre´ Aleman. Altered activation and functional connectivity of neural systems supporting cognitive control of emotion in psychosis proneness. Schizophrenia Research, 118(1):88–97, 2010. [185] Jw Hwang, Natalia Egorova, Xq Yang, Wy Zhang, Jing Chen, Xy Yang, Lj Hu, S Sun, Y Tu, and Jian Kong. Subthreshold depression is associated with impaired resting- state functional connectivity of the cognitive control network. Translational Psychiatry, e683, 11 2015. [186] Froukje E. de Vries, Stella J. de Wit, Odile A. van den Heuvel, Dick J. Velt- man, Danielle C. Cath, Anton J. L. M. van Balkom, and Ysbrand D. van der Werf. 
Cognitive control networks in ocd: A resting-state connectivity study in unmedicated patients with obsessive-compulsive disorder and their unaf- fected relatives. The World J. of Biological Psychiatry, 20(3):230–242, 2019, https://doi.org/10.1080/15622975.2017.1353132. [187] Alex Fornito, Jong Yoon, Andrew Zalesky, Edward T. Bullmore, and Cameron S. Carter. General and specific functional connectivity distur- bances in first-episode schizophrenia during cognitive control performance. Biological Psychiatry, 70(1):64–72, 2011. [188] Marjorie Solomon, Sally J. Ozonoff, Neil Cummings, and Cameron S. Carter. Cognitive control in autism spectrum disorders. International J. of Devel- opmental Neuroscience, 26(2):239–247, 2008. [189] Donald Malone, Darin Dougherty, Ali Rezai, Linda Carpenter, Ger- hard Friehs, Emad Eskandar, Scott Rauch, Steven Rasmussen, Andre 150 Machado, Cynthia Kubu, Audrey Tyrka, Lawrence Price, Paul Sty- pulkowski, Jonathon Giftakis, Mark Ph.D, Paul Malloy, Stephen Salloway, and Benjamin Greenberg. Deep brain stimulation of the ventral cap- sule/ventral striatum for treatment-resistant depression. Biological psychi- atry, 65:267–75, 11 2008. [190] Paul E Holtzheimer, Mustafa M Husain, Sarah H Lisanby, Stephan F Tay- lor, Louis A Whitworth, Shawn McClintock, Konstantin V Slavin, Joshua Berman, Guy M McKhann, Parag G Patil, Barry R Rittberg, Aviva Abosch, Ananda K Pandurangi, Kathryn L Holloway, Raymond W Lam, Christo- pher R Honey, Joseph S Neimat, Jaimie M Henderson, Charles DeBat- tista, Anthony J Rothschild, Julie G Pilitsis, Randall T Espinoza, Georgios Petrides, Alon YMogilner, Keith Matthews, DeLea Peichel, Robert E Gross, Clement Hamani, Andres M Lozano, and Helen S Mayberg. Subcallosal cin- gulate deep brain stimulation for treatment-resistant depression: a multisite, randomised, sham-controlled trial. The Lancet Psychiatry, 4(11):839–849, 2017. 
[191] Michael D Fox, Randy L Buckner, Hesheng Liu, M Mallar Chakravarty, Andres M Lozano, and Alvaro Pascual-Leone. Resting-state networks link invasive and noninvasive brain stimulation across diverse psychiatric and neurological diseases. Proceedings of the National Academy of Sciences, 111(41):E4367–E4375, 2014. [192] Adrian W. Laxton, Nir Lipsman, and Andres M. Lozano. Chapter 25 - deep brain stimulation for cognitive disorders. In Andres M. Lozano and Mark Hallett, editors, Brain Stimulation, volume 116 of Handbook of Clinical Neurology, pages 307–311. Elsevier, 2013.