A novel explainable machine learning approach for EEG-based brain-computer interface systems

Electroencephalographic (EEG) recordings can be of great help in decoding the open/close hand’s motion preparation. To this end, cortical EEG source signals in the motor cortex (evaluated in the 1-s window preceding movement onset) are extracted by solving inverse problem through beamforming. EEG sources epochs are used as source-time maps input to a custom deep convolutional neural network (CNN) that is trained to perform 2-ways classification tasks: pre-hand close (HC) versus resting state (RE) and pre-hand open (HO) versus RE. The developed deep CNN works well (accuracy rates up to 89.65±5.29%\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$89.65 \pm 5.29\%$$\end{document} for HC versus RE and 90.50±5.35%\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$90.50 \pm 5.35\%$$\end{document} for HO versus RE), but the core of the present study was to explore the interpretability of the deep CNN to provide further insights into the activation mechanism of cortical sources during the preparation of hands’ sub-movements. Specifically, occlusion sensitivity analysis was carried out to investigate which cortical areas are more relevant in the classification procedure. Experimental results show a recurrent trend of spatial cortical activation across subjects. In particular, the central region (close to the longitudinal fissure) and the right temporal zone of the premotor together with the primary motor cortex appear to be primarily involved. Such findings encourage an in-depth study of cortical areas that seem to play a key role in hand’s open/close preparation.


Introduction
Scalp electroencephalography (EEG) is a noninvasive technique that collects the electrical fields produced by the brain and indirectly reveals its underlying activity [40]. EEG is the gold-standard diagnostic technique for several neurological diseases, as well as in neuroscience and cognitive research [9,13,20,21,54]. EEG is commonly exploited in brain-computer interface systems (BCI), whose ultimate goal is to allow the brain to directly communicate with an external device by decoding subject's intentions from EEG signals and converting them into a set of suitable commands [31,44]. EEG is relatively affordable, widely spread, easy to use, and generally well tolerated by subjects. Unfortunately, EEG entails some non trivial limitations: (1) a poor signal-to-noise ratio (SNR) which reflects on the the brain's waves of interest to be corrupted by multiple sources of noise called artifacts [36]; (2) EEG recordings are non-stationary signals, and thus their statistical characteristics vary across time [58]; (3) poor spatial resolution caused by volume conduction effects [34]; (4) high inter-subject variability which limits the ability of a classifier, trained over a cohort of subjects, to generalize well across subjects [45]. One of the greatest potentials of Deep Learning (DL) is the ability to generalize well even in presence of complex inputs [28]. In the context of the EEG analysis, this results in the possibility of identifying patterns relevant to classification even in presence of additional irrelevant waves in the EEGs.
However, the main limitation in the application of DL to EEG processing lays in the relatively small number of samples generally available in EEG databases, as compared to the number of samples that are typically available in databases meant for computer vision or natural language processing (NLP) applications, which made DL so powerful in such fields [7,59]. The DL has been applied so far to EEG mainly in the following fields: motor imagery (MI) (22%), mental workload (16%), emotion recognition (16%), seizure detection (14%), event related potential detection (10%), sleep stage scoring (9%) and other studies (13%) [8]. MI is the topic more related to the focus of the present study: motor preparation.
BCI systems based on MI generally require the user to imagine performing a given movement, in a sustained way, in order to allow the system to learn how to classify the imagined movement with good accuracy [32]. However, sustained motor imagery is not natural neither comfortable for the user; moreover, it requires an intensive training and causes a delay between the onset of imagination and the time the desired control is issued [38]. Conversely, in motor preparation investigation, the subject performs or attempts to perform the movement and the behavior of EEG signals collected before motion onset/attempt is investigated to predict the intended movement [52]. Decoding the preparation of the movement, whether it is actually implemented or just attempted (for example, in case the subject has a motor disability hindering motor implementation), would be far more natural and immediately decodable [42]. Furthermore, the mechanisms of motor preparation are still not clear to scientists. Previous studies showed that premotor cortex is activated contralaterally during motor preparation, which was observed by fMRI/NIRS as well as in EEG signals [56], however it is not clear whether and how the different sub-areas of premotor cortex work together to develop motion planning. For the all the aforementioned motivations, it is reasonable to consider motor preparation a key field of research and, in line with such consideration, motor preparation of different sub-movements of upper limbs was investigated in [35]. Frames of EEG signals preceding motion's onset were compared with frames of EEGs collected in absence of any motion planning (resting) [35]. Mammone et al. [35] reached an accuracy of 90:30 AE 5:6% in pre-movement versus resting discrimination and of 62:47 AE 6:7% in the discrimination of the preparation of different sub-movements. A deep convolutional neural network (CNN) was designed and trained through stratified time-frequency maps of 210 EEG source locations in the premotor and primary motor cortex. However, no interpretation of the achieved results was provided. In the present work, we aim at achieving good accuracy in motor planning classification of hands' movements, by designing a deep CNN to be trained over single channel images (time vs. sources), and also to provide an intrpretation of the achieved results in terms of involvement of the different cortical areas. As artificial neural networks act as a black-box, an explainable machine learning (EML) approach is here proposed to interpret the achieved results by exploring the behavior of the trained network [37]. In summary, the aim of the present work is twofold: (1) To design a novel deep CNN that, by processing EEG source signals in the motor cortex, is able to discriminate the phases of preparation of hands' movements (open/close) from resting (no movement planning); (2) To explain the achieved results by means of EML, in order to assess which EEG sources (i.e., which cortical locations) play a decisive role in the classification of hand's motor preparation phases. The final aim is to find out possible areas in the motor cortex that are mainly involved in planning hands' movements. In fact, while it is well known which areas are most involved in the implementation of movements of the different parts of the body [5], it is indeed known that the activation of the movement is triggered by relatively well localized areas in the primary motor cortex, it is not well known whether and how motor planning is spatially organized. EML could yield a significant contribution in this field [11]. To this end, a deep CNN was designed and trained to discriminate hand's opening (HO) and hand's closing (HC) motion preparation phases from resting (RE) phases. The present analysis is focused on the classification of HC versus RE and HO versus RE. The training database was constructed by processing EEG signals collected from 15 subjects recruited within a BCI study conducted by Ofner et al. [41]. The paradigm introduced in [41] provided that the subject performed cue-based movements starting from a neutral rest position. The developed CNN receives as input epochs of EEG source signals in the time interval of 1 s preceding motion onset (named time-source maps herein). Such source signals were estimated by solving the inverse problem starting from EEG scalp signals. Source locations belonging to the motor cortex are then included in the analysis. The developed CNN was able to discriminate premov (HC or HO) versus RE with an average accuracy of 90%. An occlusion sensitivity analysis was subsequently carried out by passing the time-source maps as input to the network to evaluate which sources (i.e., cortical locations) are estimated to be more relevant in the classification of HC/HO from RE. We could observe a recurrent spatial pattern across subjects that show greater activation of the left part of the motor cortex in the central area, close to the longitudinal fissures between the two hemispheres, together with the extreme right part of the motor cortex belonging to the temporal lobe.
The paper is organized as follows: In Sect. 2 the proposed method is presented. The preprocessing steps including beamforming technique and the cortical sources extraction are also described. Section 3 shows the proposed deep CNN for the pre-movements tasks classification; whereas, Sect. 4 introduces the salient cortical source recovery procedure by means of EML (i.e., occlusion sensitivity analysis). In Sect. 5 experimental results are reported. Section 6 discusses the achieved findings and Sect. 7 concludes the paper.

Methodology
The proposed methodology is shown in Fig. 1. It includes the following processing modules:

Extraction of premotor EEG epochs
In the present research, a publicly available database of EEG recordings co-registered with signals collected from motion sensors [41] was used to construct the train and test dataset. The database can be found at http://bnci-horizon-2020.eu/database/data-sets together with detailed information about channel layout, recording settings and paradigm description. The study involved 15 healthy subjects (aged 27 AE 5 years, nine of the them are females). EEG signals were acquired by Ofner et al. [41] by means of 61 active EEG electrodes and four 16-channel amplifiers (g.tec medical engineering GmbH, Austria). Right mastoid channel was used as reference one and AFz was set as ground channel. EEG signals were band-pass-filtered between 0.01 Hz and 200 Hz (eighth-order Chebyshev filter), notch filtered at 50 Hz and sampled at 512 Hz. The database consists of a motor execution and a motor imagery part. Since the goal of the present study was to investigate motor preparation, the first part was included in the analysis. During the experiment, subjects remained seated on a comfortable chair and an anti-gravity exoskeleton (Hocoma, Switzerland) supported their right arm. The paradigm consisted in executing cue-based movements of the right upper limb starting from a neutral position (lower arm extended to 120 degree and in a neutral rotation, hand half open) [41]. The experiment consisted of 10 runs, every run included 6 trials and each trial included one hand open (HO), one hand close (HC) and one rest (RE) cues. The timeline of the paradigm can be summarized as follows: at second 0, a fixation cross appeared on a computer screen, positioned in front of the subject, to attract her/his gaze on it and limit eye movements. At second 2, the cue of the task to be performed (HC/HO/RE) appeared on the computer screen. After task execution, the subject moved her/his hand back to the starting neutral position. In order to train a neural network to decode motor preparation phases from EEG signals, a dataset of EEG epochs preceding motion onset was necessary. To this purpose, the onset of movement was estimated by processing motion data collected by glove sensors. Specifically, the onset of movement was detected by processing the signals recorded from motion sensors embedded in the glove, following the procedure described in [41]. The marked onset timing was manually checked for all of the 1800 pre-motion epochs under examination and the frames (epochs) of EEG signals preceding the marked onset were extracted accordingly. Such epochs were included in the analysis together with a balanced set of resting EEG epochs. Specifically, 900 EEG epochs (derived from 10 runs Â 6 trials Â 15 subjects) per movement class (hand open/hand close) were taken into account. In order to generate a balanced dataset, a comparable number of resting state EEG epochs was extracted. In the end, 2700 EEG epochs (derived from 10 runs Â 6 trials Â 15 subjects Â 3 classes) were extracted from the EEG recordings and included in the dataset. As regards the choice of the length of the frame preceding motion onset, it was set at 1 s after taking into account the typical timeline of motor related cortical potentials (MRCP) which are brain waves that arise together with movements' preparation and initiation [35,47].

Inverse problem solution and extraction of the cortical EEG sources
It is known that the EEG has a very good temporal resolution but a poor spatial resolution, due to volume conduction effects [14,16,40]. Inverse problem solution is a possible way to deal with such effects. In the proposed methodology, EEG signals are used to reconstruct a set of source signals where every source signal represents the contribution of a source location (current dipole) located in the cortex [14,16]. Solving the ''inverse problem'' means reconstructing source locations' contribution to the overall EEG signals collected at the scalp. EEGs can be hypothesized to be the projection of sources' contributions from cortical locations to scalp sensors through a ''forward model'' [10]. Such forward model takes into account the structural and conductive properties of brain tissues. In the frequency range of EEG signals, the quasi-static approximation of Maxwell's equations can be assumed hence the forward model becomes linear [17] and be formulated as follows: q r ðtÞ is the 3 dimensional directed current dipole associated to cortical location ''r'' (where r ¼ 1; . . .; Ns and Ns is the number of possible source locations in the cortex); L is knows as ''lead field'' matrix, which represents the head model that projects the current dipole q r ðtÞ into the scalp potential x(t) [17]. The number of sources Ns is typically larger than the number of channels Nc thus estimating q r ðtÞ from x(t) is inherently an ill-posed problem. The adopted head model consists of 2000 cortical locations (Ns ¼ 2000) whereas the number of scalp channels of the EEG recordings analysed in the present work is Nc ¼ 61. In this work, the New York Head (NYH) forward model, developed by Haufe et al. [18], was adopted. Such head model is based on the popular ICBM152 anatomy, a nonlinear average of T1-weighted structural MR images collected from 152 adults. By solving the inverse problem, cortical current dipoles q r (t)) are estimated starting from the recorded EEG signals x(t) and from the lead field matrix L. Several inverse problem solution approaches can be found in the literature on EEG source imaging: minimumnorm solutions, beamformers, and dipole modeling [14,49]. Beamforming solves the inverse problem by maximizing the contribution of a given source location while suppressing contributions from the other ones and was proved very effective in BCI applications by Grosse-Wentrup et al. [15]. The premotor and primary motor cortex are considered crucial in movement planning and execution [19,22]. Such regions fall in the Brodmann's Areas 4 and 6 of the brain [3]. Each one of the 2000 available source locations was associated to the corresponding Brodmann Area through its Montreal Neurological Institute (MNI) stereotaxic coordinates. MNI coordinates of every source locations were known. First of all, they were indeed converted into Talairach coordinates [27] and then matched with Talairach Atlas labels [23], in order to come up with the corresponding Brodmann area of every source location. In the end, 210 locations belonging to Brodmann areas 4 and 6 were selected out of the 2000 available ones.
3 Deep learning-based system for premovements tasks classification

Convolutional neural network
CNN is a well-known deep learning model widely used especially in computer vision [4,30,48,50], image classification [2,6,33] and pattern recognition [12,29,55]. It is composed of subsequence layers of convolution, activation, pooling followed by a multilayer fully connected neural network for classification purpose [26]. The convolutional layer includes a bank of J filters used to estimate the dot product (i.e., covolution operation) with the input map T sized t 1 Â t 2 . More specifically, each filter (sized j 1 Â j 2 ) performs the convolution with the selected local area and sweeps over the input representation with a specific stride using the same values of weights. This operation results in the so-called features maps Z of size z 1 Â z 2 : and where p is the zero padding parameter. The activation layer introduces nonlinearity in the model. Specifically, here, the Rectified Linear Unit (ReLU) is employed for its ability to achieve good generalization and training time [39]. The pooling performs a downsampling operation of the feature maps resulting from the previous layer. The max pooling operation is used for its good translation-invariant properties [46]. It has a filter sized j 1 Â j 2 that scans the input feature map with stride s. This operation outputs a reduced map sized t 1 Â t 2 , with: and The CNN ends with a standard feed-forward MLP composed of a softmax output function used for performing the discrimination tasks.

Design of the deep CNN
The proposed CNN is developed to accept as input timesource maps (i.e., EEG sources epochs) sized t 1 Â t 2 , where t 1 ¼ 210 represents the number of sources taken into account in this study, and t 2 ¼ 512 represents the number of samples included in 1s temporal epoch before the movement onset. The deep learning model consists of three stacked modules of convolutional (conv i , with i ¼ 1; 2; 3), ReLU and max pooling (mpool i ) layers followed by a common MLP for performing the 2-ways classification: HC versus RE and HO versus RE. Figure 2 shows

Learning parameters set-up
The Adaptive Moment (Adam) optimization procedure [25] was used to train the proposed deep CNN (Fig. 2), using mini-batches size of 28. Training options were set up by using the practical recommendations reported in [1,25], specifically: learning rate a ¼ 10 À2 , first moment decay rate b 1 ¼ 0:9, and second moment decay rate b 2 ¼ 0:999. 4 Explainable deep learning system: salient cortical source recovery

Occlusion sensitivity analysis
Occlusion analysis has been widely used in image classification to show the sensitivity of a pre-trained CNN to different areas of an input image [57]. It consists in systematically occluding different patches of the input data with a grey mask and estimating the related effect on the network output. For each mask location, the discrimination is performed using a pre-trained CNN and estimating the change in classification score for a specific class than the initial prediction (input without occlusion). Such changes in classification result in the so called heatmap or saliency map H with a coloration ranging from blue to red and with the same input dimension. This representation reveals which area of the image is the most essential for the classification. Specifically, red color corresponds to higher values and consequently represents the most significant area that contributed to identify the specified class. When this region is occluded, the classification performance decreases. Blue color corresponds to lower values and represents the areas not relevant during the discrimination task. In this study, the occlusion technique is applied to recover the cortical sources that are activated during the (open/close) hand's movement preparation. Given a subject under analysis, the eth EEG source epoch (sized 210 Â 512) is repeatedly occluded with a 42 Â 256 pixel grey mask that moves across the input data with a vertical and horizontal stride of 21 and 51, respectively. It is worth noting that the dimension and stride of the mask has been set-up empirically, after several experiments. For each position of the mask, the 2-way discrimination task (i.e., HC vs. RE or HO vs. RE) is performed by using the Fig. 2 Architecture of the proposed deep CNN. It consists of three convolutional layers (followed by ReLU nonlinearity) and three max pooling layers. The network ends with a two-hidden layers MLP employed to perform the two-way classifications: HC versus RE and HO versus RE proposed pre-trained CNN (Sect. 3.1). The output is a heatmap sized 210 Â 512. As an example, Fig. 3a shows the input map (i.e., EEG sources epoch) when a portion is occluded by a gray mask, whereas Fig. 3b reports the achieved saliency representation map. In this case, as can be seen, the red area roughly corresponding to the sources ranged between (130-180) in the temporal window (0.7-0.9), denotes the most relevant zone in the classification process. Further considerations and analyses are reported in Sect. 5.2.

Saliency maps segmentation through k-means
In order to provide a deeper understanding of which subareas in the motor cortex gave the largest contribution to decode movements' planning, a segmentation of the saliency maps was necessary in order to extract the high saliency zones automatically. To this end, the k-means clustering algorithm was applied to partition each saliency map to k ¼ 10 clusters. K-means is a widely applied clustering algorithm [53]. Its aim is to gather data points into a given number of clusters by following an iterative four-step procedure: 1. the initial cluster centers are set randomly; 2. data points are assigned to the nearest cluster by estimating the euclidean distance between the data point and the cluster centers, in this way clusters are redefined; 3. update the clusters' centers; 4. go back to step 2. and repeat the procedure from 2. to 4.
until the cluster centers do not change or the specified iteration number is reached.
After applying k-means to the saliency maps, the cluster associated to the highest saliency values was detected, the corresponding highly salient sources were extracted and mapped onto the cortex by red dots.

Classification performance of the premovements tasks
The classification performance of the proposed deep CNN were evaluated using standard metrics (accuracy, recall, precision, F1-score): Recall where TP and TN are true positive and negative, respectively, whereas, FP and FN are false positive and negative, respectively [43].  cross-validation technique was applied (with k ¼ 10), in particular: the train set included 70% of data (i.e., EEG sources epochs) and the test set the remaining 30%. Table 1 reports results of the HC versus RE classification. Remarkable discrimination values were observed in all subjects, reporting average recall, precision, F1-score and accuracy of 89:14 AE 7:24% and 91:19 AE 7:88%, 89:69 AE 4:98%, 89:65 AE 5:29%, respectively. It is worth noting that the highest individual classification performance was achieved by Sb 08 with accuracy of 98:02 AE 2:10%, F1-score of 97:94 AE 2:21%, recall of 96:03 AE 4:20% and precision of 100%, while the lowest individual classification performance was achieved by Sb 07 . However, also in this case high discrimination scores were observed, but with higher standard variation: accuracy of 79:76 AE 10:11%, F-score of 79:86 AE 9:60%, recall of 80:16 AE 12:77% and precision of 80:83 AE 12:36%. Table 2 reports results of the HO versus RE classification. Also in this scenario very good performance were observed (average recall of 89:31 AE 8:02%, average precision of 93:04 AE 7:66%, average F1-score of 90:41 AE 5:32% and average accuracy of 90:50 AE 5:35%). Notably, Sb 08 and Sb 02 achieved the best performances in terms of accuracy and F1-score; while, Sb 07 reported the worst  individual classification performance with an accuracy value of 75:79 AE 9:72% and F1-score of 76:95 AE 5:87%. Hence, the proposed deep CNN reported very good performance for both HC versus RE and HO versus RE discrimination task. This result was also confirmed by the analysis of the Precision-Recall Curve (PRC, shown in Fig. 4) and in particular by measuring the corresponding Area Under the PRC. Specifically, the average AUPRC over all subjects was estimated, reporting AUPRC rates up to 89:36 AE 9:02% and 90:1 AE 8:93%, for the HC versus RE and HO versus RE classification tasks, respectively.

Salient cortical source locations recovery
Occlusion sensitivity analysis was performed by using the proposed pre-trained deep CNN to estimate the saliency of every EEG source. Specifically, for each subject, an averaged saliency map was estimated by averaging the saliency maps corresponding to the HC/HO EEG epochs correctly classified during the testing procedure and herein denoted asH tk Sb i , where tk represents the pre-movement task (i.e., HC/HO) and Sb i is the subject under analysis (with i=1,2,..15). As an example, Fig. 5a illustrates the average saliency representation of Subject 08 while preparing to perform hand closing (H HC Sb 0 8 ). Saliency is encoded with a coloration going from blue (low saliency) to red (high saliency). Highly salient EEG source locations can be recovered by detecting the EEG sources associated to red areas. Notably, red areas denote that the classification score decreases when the corresponding local regions of the input were hidden by the mask, which means that the occluded area is relevant to classification. In the example map shown in Fig. 5a, the area located around 0.85s and approximately associated to the EEG sources ranging from 70 to 170, looks colored in red. Hence, it resulted relevant to the decision making. Average saliency maps were then segmented as described in Sect. 4.2. Figure 5b depicts the clustered saliency map of Subject 08. Notably, the red area represents the cluster with the highest saliency and refers to the most relevant EEG sources, which were then mapped onto the cortex (red dots in Fig. 5c). In the example shown in Fig. 5c), EEG sources located in the left central zone (close to the longitudinal fissure) and in the right-temporal zone contributed the most to decoding hand closing motor planning. Following the aforementioned procedure, salient source locations in HC and HO motor preparation were estimated for every subject and are shown in Fig. 6. It is worth to note the a recurrent spatial pattern of cortical activation (left central zone close to the longitudinal fissure and right-temporal zone) which occurred similarly during HC or HO motor preparation. Such pattern occurred in 10 out of 15 subjects during hand closing preparation and in 11 out of 15 subject in hand opening preparation. Nine out of 15 subjects exhibited the same spatial pattern in HC as well as in HO motor planning.

Discussion
The present research aims at exploring the interpretability and explainability of the proposed DL-based system in order to provide further insight into the hidden mechanism of cortical sources activation when the brain is preparing hand's open/close movement. To this end, the dataset [41] composed of EEG signals recorded from 15 subjects who performed several repetitions of hand's open and close always starting from a common neutral resting position. EEG signals recorded during resting condition (i.e., no motion planning) were also analyzed. First, a dataset of EEG epochs of 1s preceding motion onset was constructed. The onset of motion was determined through the signals collected by motion sensors embedded in a glove that the participant had worn throughout the experiment. Beamforming was then applied to EEG epochs to solve the inverse problem and reconstruct the electrical sources in the cortical locations belonging to the primary motor and premotor cortex (i.e., 210 cortical locations, as described in Sect. 2.2). Next, such premotor EEG source epochs (1s width) were used as input to a customized deep CNN to perform the following binary classifications: HC versus RE and HO versus RE, reporting very good discrimination performance: average accuracy rate up to 89:65 AE 5:29% and 90:50 AE 5:35%, respectively. Hence, the temporal trend of electrical sources in the motor cortex allows in principle for motion planning discrimination from resting phases. However, since the ultimate goal of the present study was to provide an in-depth understanding of which cortical locations contributed the most to the discrimination of HC/HO motion planning from resting phase in the frame of 1s preceding the execution of the movement, an occlusion sensitivity analysis was proposed. Specifically, after training and testing the proposed CNN, EEG time-source epochs were systematically occluded with a grey mask and used as input to the pre-trained deep CNN, producing the so called heatmaps or saliency maps. This technique allowed to highlight which areas of the input map (i.e., EEG time-sources epochs) were relevant to the decision making process. In order to detect the high saliency regions in the map, k-means clustering technique was applied and the high-saliency cluster was identified as described in Sect. 5.2. By detecting the high saliency areas in the timesources maps, the corresponding highly relevant source locations could be pinpointed. The more relevant sources were then mapped onto the cortical surface and represented with red dots. As can been seen in Fig. 6, a recurring pattern can be detected in each subject. Specifically, the cortical sources located in the central area of the motor cortex (close to the longitudinal fissure) and the temporal zone of the right motor cortex resulted highly relevant during HC/HO movement's planning preparation. It is to be noted that, to date, it is still not well known whether and how motion planning is spatially organized over the motor cortex. A contralateral involvement of the premotor cortex in motion planning was reported in the literature [56] but further details about sub-areas involvement are still to be investigated. Hence, our findings may shed a new light on motor preparation and suggest that the aforementioned motor cortical regions (i.e., central and temporal right) are the mostly involved in the HC/HO sub-movement preparation. It is also worth noting that intra-subject differences can be observed. For example, in Subjects 01 and 07 (Fig. 6), only the right-central subregion resulted highly relevant to HC detection; whereas, the left-central and also the right-temporal subregions look involved in HO detection. To the best of our knowledge, this is the first attempt to study motor preparation through explainable machine learning. Furthermore, this is the first work that attempts to detect the subareas of motor cortex that are more salient to the preparation open/close hand's movement. Recurrent spatial patterns of cortical activation could be detected across subjects, namely, the central area close to the longitudinal fissure and the right temporal area of the premotor and primary motor cortex. However, the proposed methodology has some limitations. First, the number of EEG channels used in this study was 61, we think that using a higher number of electrodes would have a positive impact on inverse problem solution, leading to a more accurate cortical source reconstruction. Second, the number of EEG epochs used to train the proposed CNN was limited. Overall, each class (HC, HO, RE) included only 60 EEG epochs. Third, movement's onset was marked by processing the data collected by the motion sensors embedded in the glove that the participant used to wear during the experiment. Motion data collected through the glove are smooth and do not allow to detect onset instantaneously, which means the epochs used for training may have captured the early ms of motion implementation, causing, in principle, the similar activation patterns visible in HC and HO (Fig. 6). For the aforementioned reasons, in the future, we intend not only to enroll a larger cohort of subject and record high-density EEGs (128-256 channels) but, for a more precise motion onset detection, EEG will be co-registered with electromiography (EMG). In addition, the analysis of EEG data is always a spatio-temporal process that is first related to the spiking activities of cortical circuits, i.e., by individual neurons cooperating to a task. This process has also a spectral component that is superimposed in the global approach proposed in [24], the NeuCube computational architecture based on brain-Inspired approaches. The concept of the spiking neural networks (SNN) is at the basis of the complex model that allows for on-going learning and classification over time [23,51]. NeuCube allows for generating a deep unsupervised learning spatiotemporal spike sequences in a scalable 3D SNN reservoir. This can be relevant for adaptation to new data, possibly in real-time modality, which is one of the future objectives also for the architecture here proposed.

Conclusion
In this paper, we proposed a novel deep CNN capable of classifying time-source maps (i.e., EEG sources epochs) related to hands' sub-movements (open/close) phase from resting state, achieving remarkable results, namely average accuracy of 89:65 AE 5:29% and 90:50 AE 5:35% in HC versus RE and HO versus RE discrimination task, respectively. Furthermore, in order to investigate which cortical source has mostly contributed in the classification of hand's motor preparation phase, EML was applied. Occlusion sensitivity analysis allowed to produce suitable saliency maps, from which to identify the most relevant areas of the input. The highest saliency region was detected though k-means clustering technique and the enclosed cortical sources were mapped onto the cortical surface. Experimental results mainly showed that the central and the right-temporal cortical sub-regions are activating while the subject was planning hand's movements (i.e., HC/HO). It is to be noted that the cortical activation rules that govern the motion planning are still not well known. Hence, on the basis of the achievements here reported, we believe that the proposed approach may be considered to be an interesting breakthrough in BCI applications.
Funding This work was co-funded by the European Commission, the European Social Fund and the Calabria Region (code: C39B18000080002). The authors are the only responsible for this publication and the European Commission and the Calabria Region decline any responsibility for the use that may be made of the information in it held. This work was also supported by the UK Engineering and Physical Sciences Research Council (EPSRC) (EP/ M026981/1, EP/T021063/1, EP/T024917/1).