Recognition of Human Inner Emotion Based on Two-Stage FCA-ReliefF Feature Optimization

Currently, there is growing interest in emotion recognition. Representing emotional states is a very challenging issue. Considering the computational cost and generalization capability required for practical applications, a series of common time-domain and frequency-domain features are extracted from physiological signals to represent different emotional states. To reduce feature dimensionality and improve emotion recognition accuracy, a two-stage feature optimization method based on feature correlation analysis (FCA) and the ReliefF algorithm is proposed to select critical features. Firstly, FCA is employed to analyze the redundancy between features; then ReliefF is adopted to analyze the correlation between features and categories, and the optimal feature subset is obtained with the two-stage FCA-ReliefF feature optimization method. A support vector machine is employed as the classifier to evaluate classification performance in this investigation. The effectiveness of the proposed method is validated on two publicly available multimodal emotion datasets, the Augsburg Biosignal Toolbox (AuBT) and the Database for Emotion Analysis Using Physiological Signals (DEAP). Compared with recent similar studies, the method developed in this research for emotion recognition is stable and competitive, and its accuracy reaches 98.40% (AuBT) and 92.34% (DEAP).


Introduction
Emotion plays an important role in people's daily life and work as a general physiological and psychological phenomenon. Different emotional states can affect people's learning, memory and decision-making [23]. Studies have shown that positive emotional states are conducive to enhancing body-mind health, improving work efficiency, and making correct decisions, while negative emotional states often adversely affect people's normal judgment, work and life [37]. In recent years, research on emotion recognition has received high attention in the field of human-computer interaction, with applications such as distance education, driver status evaluation and rehabilitation medical treatment [24].
Emotion state can be identified by external features such as facial expressions, phonetic intonation, and body posture, or by internal features such as physiological signals. A method based on a space-time geometric representation of human landmark configurations and derived tools was proposed to identify emotional states by analyzing the dynamics of body parts [16]. Su et al. [33] employed features of facial expression and speech response to represent emotion state with a cell-coupled long short-term memory (LSTM) network with an L-skip fusion mechanism, achieving an optimal accuracy of 76.9%. However, external features are easily disguised artificially, so results obtained from physiological signals are relatively objective. In recent years, research on emotion recognition based on physiological signals has made great achievements. Gannouni et al. [9] proposed a method of zero-time windowing-based epoch estimation and relevant electrode identification for emotion recognition with electroencephalography. Ganapathy et al. [8] used electrodermal activity signals with multiscale convolutional neural networks to identify valence and arousal states. A deep convolutional neural network was adopted for emotion detection on a physiological signal dataset (AMIGOS) from the perspective of arousal and valence, and the final results were 76% and 75%, respectively [28].
Dimensionality reduction of features is usually an important step in emotion recognition based on physiological signals; it aids data visualization and understanding, reduces model training time, overcomes the curse of dimensionality, and improves model predictive performance [44]. It is generally believed that the features of a single type of physiological signal have certain limitations in characterizing the original signal. Studies have shown that recognition accuracy based on fused features of various types of physiological signals is better than that based on a single type [25,45]. However, fused features often lead to a high feature dimension, which not only complicates computation but also introduces noise through irrelevant and redundant features. How to effectively select and optimize features is therefore a crucial issue for improving the efficiency of emotion identification. Common feature reduction and selection methods can be divided into filter, wrapper and embedded algorithms; whatever the type, the purpose is to remove irrelevant (or redundant) features [36]. A bi-directional long short-term memory (Bi-LSTM) and convolutional neural network (CNN) model was proposed to select features, which outperformed independent component analysis (ICA) [42]. Shi et al. [31] proposed a method that combined principal component analysis (PCA) with the support vector machine classifier to decrease the feature dimensions. Zhang et al. [47] adopted the ReliefF algorithm to assess the performance of features in high-dimensional electroencephalogram (EEG) data. Ghosh et al. [10] proposed a wrapper-filter combination of the ant colony optimization algorithm, in which subset evaluation used a filter method instead of a wrapper method to reduce computational complexity.
However, the single feature optimization method has limited performance. Hence, it is necessary to explore more effective feature optimization strategies to achieve better representation of emotional states.
In our previous work [25], we proposed the feature representation method of multi-nonlinear feature integration and multi-channel information feature fusion. Four nonlinear features, namely approximate entropy (ApEn), sample entropy (SaEn), fuzzy entropy (FuEn) and wavelet packet entropy (WpEn) were employed to reflect emotional states deeply with each type of physiological signal, and then the features of different physiological signals were fused to represent the emotional states from multiple perspectives.
Meanwhile, in order to make full use of the advantages of different classifiers and avoid the limitations of a single classifier, a team-collaboration identification strategy based on the fusion of support vector machine (SVM), decision tree (DT) and extreme learning machine (ELM) was suggested to achieve emotion identification. Emotional state representation is a major issue in emotion identification, and exploring effective representation methods is very important for improving recognition accuracy. Hence, in this research, a novel emotion recognition method based on two-stage feature optimization of feature correlation analysis and ReliefF (FCA-ReliefF) with a support vector machine (SVM) is proposed to improve the efficiency of emotion identification through effective emotional state representation with new feature extraction and optimization methods. Initially, a set of common time-domain and frequency-domain features of multiple types of physiological signals are extracted and integrated so as to reflect the emotional states more fully. Then, the two-stage FCA-ReliefF feature optimization method is applied to implement effective feature selection and optimization: FCA realizes preliminary feature selection by analyzing the correlation between features, and ReliefF further optimizes the features to obtain the optimal feature subset for emotion state representation. Finally, the SVM classifier is adopted to verify the performance of the proposed method on the AuBT and DEAP datasets; the decision tree (DT) classifier is additionally used to verify the applicability of the method on the AuBT dataset. The rest of this paper is organized as follows. In Section 2, the experimental databases, selected features, feature optimization method, and model performance evaluation method are described.
In Section 3, the achieved experimental results are presented and discussed and compared with other works from literature. Finally, Section 4 concludes this paper.

Experimental Data
In this investigation, in order to effectively realize emotion classification, two publicly available datasets, AuBT [39] and DEAP [18], are introduced and applied to validate the effectiveness and superior performance of the proposed method. The flowchart of emotion classification is shown in Figure 1.

Figure 1
Emotion recognition flow chart based on the proposed method

AuBT
The public emotional dataset AuBT records four different emotional states, including sadness, pleasure, joy, and anger, each induced by carefully hand-picked music. At the same time, four kinds of physiological signals were recorded: skin conductivity (SC), respiration change (RSP), electromyogram (EMG) and electrocardiogram (ECG). After processing, each signal was two minutes in length; the sampling frequency of the ECG was 256 Hz, and 32 Hz for the other three signals. The data collection lasted for 25 days. A total of 25 samples were collected for each emotion, and 100 samples for each type of signal. The content of AuBT is summarized in Table 1. Four types of physiological signals (ECG, EMG, RSP, SC) are employed to detect the emotion states in this investigation.

DEAP
The well-known DEAP emotion dataset consists of recordings of 32 participants (s01-s32), aged between 19 and 37 (50% female). In the dataset, 32 channels of EEG and 8 channels of peripheral physiological signals were collected. The peripheral physiological signal types mainly include RSP, galvanic skin response (GSR), electrooculogram (EOG), EMG, blood volume pressure (BVP) and skin temperature (SKT). The signals of the 32 participants were recorded as each watched 40 one-minute-long excerpts of music videos. Participants rated each video in terms of the levels of arousal and valence from 1 to 9. Each participant watched 40 trials of videos, and the total number of trials within the DEAP dataset was 1280. In this research, the original data are down-sampled to 128 Hz. The preprocessed version of the DEAP dataset is summarized in Table 2. In this investigation, four types of physiological signals are employed.

Preprocessing
Physiological signals are subject to external interference during the acquisition process, including improper operation, instrument noise, electromagnetic interference, etc., which will cause the original signal to be distorted, and affect the subsequent analysis [17]. Wavelet transform (WT) can decompose the effective signal and noise on different scales to achieve the purpose of denoising [13]. As suggested in [3,38], the 'db2' wavelet is employed to decompose the ECG and BVP signals in three layers. A 5-layer decomposition of the EMG and GSR signals is performed using the 'Sym6' wavelet. The 'db4' wavelet is adopted to perform 4-layer decomposition of the RSP signal. The 'db6' wavelet is applied to decompose the SC signal in five layers.
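Wavelet denoising of this kind can be illustrated with a minimal sketch. The following uses a single-level Haar transform with soft thresholding of the detail coefficients, which is far simpler than the db2/Sym6/db4/db6 multi-level decompositions described above; the function name and threshold value are illustrative assumptions:

```python
import numpy as np

def haar_denoise(x, thresh):
    """Illustrative one-level Haar wavelet soft-threshold denoising.
    (The paper uses deeper db2/Sym6/db4/db6 decompositions.)"""
    x = np.asarray(x, dtype=float)
    n = len(x) - len(x) % 2                   # trim to even length
    a = (x[0:n:2] + x[1:n:2]) / np.sqrt(2)    # approximation coefficients
    d = (x[0:n:2] - x[1:n:2]) / np.sqrt(2)    # detail coefficients
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)  # soft threshold
    y = np.empty(n)
    y[0::2] = (a + d) / np.sqrt(2)            # inverse Haar transform
    y[1::2] = (a - d) / np.sqrt(2)
    return y
```

The approximation branch passes low-frequency physiological content through unchanged, while small high-frequency details (typically noise) are shrunk toward zero.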
In order to quickly calculate the eigenvalues of the physiological signals, each 120 s original time-series signal in AuBT is divided into 10 non-overlapping 12 s samples. In the end, a total of 1000 samples are obtained for each type of signal. Similarly, in order to calculate the eigenvalues of the physiological signals in DEAP, after removing the 3 s baseline signals, each trial is divided into 10 non-overlapping 6 s samples. Therefore, there are 400 samples for each subject, and the data of 10 subjects (S01-S10) are selected to verify the effectiveness of the proposed method.
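The segmentation step above amounts to a simple reshape. A minimal sketch (function name and the random placeholder signal are assumptions):

```python
import numpy as np

def segment(signal, fs, seg_seconds, n_segments):
    """Split a 1-D signal into non-overlapping fixed-length segments."""
    seg_len = int(fs * seg_seconds)
    needed = seg_len * n_segments
    assert len(signal) >= needed, "signal too short"
    return np.asarray(signal[:needed]).reshape(n_segments, seg_len)

# AuBT ECG: 120 s at 256 Hz -> 10 non-overlapping 12 s segments
ecg = np.random.randn(120 * 256)
segs = segment(ecg, fs=256, seg_seconds=12, n_segments=10)
print(segs.shape)  # (10, 3072)
```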

Labeling
Emotion reflects the psychological and physiological states of people's various feelings, behaviors and thoughts. However, it is comparatively difficult to determine how to define and distinguish emotions. Since emotions are complex outcomes with many elements, two common types of emotion models were applied in this research. One is a discrete emotional model [15], and the other is a continuous emotional model [27].
In the discrete emotion categories, a variety of emotions including happiness, shyness, interest, fear, joy, sadness, anger, disgust and surprise are defined as basic human emotions. When AuBT was applied, since the emotional states of the experiment were designed as discrete categories, the discrete emotional model was used to classify four different emotions, namely sadness, pleasure, joy, and anger. In the continuous emotional model, the emotional states are represented in the valence-arousal space; the DEAP dataset belongs to this emotion model. Therefore, a 2-dimensional valence-arousal space is used to evaluate the continuous emotion model, with both coordinate axes ranging from 1 to 9 and the median value of 5 used as the dividing line of the four quadrants. According to Koelstra et al. [18], a rating greater than 5 is high and less than 5 is low, yielding low valence/high arousal (LVHA), low valence/low arousal (LVLA), high valence/high arousal (HVHA), and high valence/low arousal (HVLA).
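The quadrant labeling can be sketched as a small helper (the function name is an assumption; a rating of exactly 5 is not specified in the original scheme, so the `>` comparison here is a choice):

```python
def quadrant(valence, arousal, midpoint=5.0):
    """Map a (valence, arousal) pair on the 1-9 scale to one of the four
    DEAP quadrants: LVHA, LVLA, HVHA, HVLA."""
    v = "HV" if valence > midpoint else "LV"
    a = "HA" if arousal > midpoint else "LA"
    return v + a

print(quadrant(7.2, 3.1))  # HVLA
```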

Feature Extraction for Physiological Signals Representation
Feature extraction is used to extract characteristic parameters from physiological signals in order to reflect different emotion states. Different types of features are used for emotion classification based on physiological signals, such as frequency-domain, time-domain, time-frequency-domain and non-linear features [4]. However, the various feature extraction methods have their own applicable conditions. Considering practicability and generalization ability, the commonly used time-domain and frequency-domain features are extracted, which helps to control the total execution time and ensure the overall recognition accuracy.
For time-domain features, seven statistical characteristics, namely the maximum, minimum, mean, standard deviation, variance, median, and range, are calculated from each segment based on the waveform of the raw physiological signal. To further make the feature representation robust, the signal sequence is subjected to first-order and second-order differencing respectively. The first-order difference reflects the trend and speed of change of the signal and can be used to detect local extreme points; the second-order difference can be used to detect local inflection points. Then, the same seven features are computed from each segment after first-order and second-order differencing respectively. After the above processing, a total of 21 time-domain features are obtained. Feature extraction in the time domain involves the following formulas (1)-(5):

$$u_x = \frac{1}{N}\sum_{n=1}^{N} X_n \qquad (1)$$

$$\sigma_x = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(X_n - u_x\right)^2} \qquad (2)$$

$$v_x = \sigma_x^2 = \frac{1}{N}\sum_{n=1}^{N}\left(X_n - u_x\right)^2 \qquad (3)$$

$$X_n^{1\mathrm{diff}} = X_{n+1} - X_n, \quad n = 1, \ldots, N-1 \qquad (4)$$

$$X_n^{2\mathrm{diff}} = X_{n+2} - 2X_{n+1} + X_n, \quad n = 1, \ldots, N-2 \qquad (5)$$

where $u_x$ represents the mean; $\sigma_x$ the standard deviation; $v_x$ the variance; $1\mathrm{diff}$ the first-order difference; $2\mathrm{diff}$ the second-order difference; $X_n$ the n-th sample value of the raw signal; and $N$ the length of the data to be analyzed.
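The 21 time-domain features described above can be sketched as follows (function names are assumptions):

```python
import numpy as np

def stats7(x):
    """The seven statistics used per sequence: max, min, mean,
    standard deviation, variance, median, range."""
    x = np.asarray(x, dtype=float)
    return [x.max(), x.min(), x.mean(), x.std(), x.var(),
            float(np.median(x)), x.max() - x.min()]

def time_domain_features(x):
    """21 time-domain features: 7 statistics on the raw segment and on
    its first- and second-order differences."""
    x = np.asarray(x, dtype=float)
    return stats7(x) + stats7(np.diff(x)) + stats7(np.diff(x, n=2))

feats = time_domain_features(np.sin(np.linspace(0, 10, 256)))
print(len(feats))  # 21
```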
In order to obtain the frequency components contained in the signal, the Fast Fourier Transform (FFT) [20], which exploits periodicity and symmetry and greatly improves computational efficiency, is employed to transform the signal from the time domain to the frequency domain. Then, six features, containing maximum, minimum, mean, standard deviation, median, and extreme value, are computed. Transforming a signal from the time domain to the frequency domain involves the following formulas [22]:

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \qquad (6)$$

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega \qquad (7)$$

When $f(t)$ satisfies the Fourier integral condition, formula (6) is called the Fourier transform of $f(t)$, and the integral operation of formula (7) is the corresponding inverse transform.
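A minimal sketch of the six frequency-domain statistics, computed on the one-sided FFT magnitude spectrum (interpreting the paper's "extreme value" as the range of the spectrum is an assumption, as is the function name):

```python
import numpy as np

def freq_domain_features(x):
    """Six statistics of the one-sided FFT magnitude spectrum:
    max, min, mean, std, median, and range ('extreme value')."""
    mag = np.abs(np.fft.rfft(np.asarray(x, dtype=float)))
    return [mag.max(), mag.min(), mag.mean(), mag.std(),
            float(np.median(mag)), mag.max() - mag.min()]

feats = freq_domain_features(np.sin(2 * np.pi * 5 * np.arange(256) / 256))
print(len(feats))  # 6
```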

Feature Optimization Based on FCA-ReliefF
Studies have shown that the features of a single physiological signal usually cannot fully indicate a certain emotion [40]. On the contrary, the recognition rate based on feature fusion of multiple types of physiological signals is significantly higher than that based on single-signal features, since the complementarity of different features characterizes emotional states from multiple perspectives [7]. Therefore, in this paper, the features obtained from the physiological signals of different channels are concatenated and then input into the classifier. However, correlated and redundant features have an impact on classification [32]. In addition, the higher the feature dimension, the more computation is required, which not only increases computation time but also reduces the generalization ability of the system. Feature selection and optimization, based on the principle that removing some correlated features does not diminish the ability of the remaining features to express information [20], is an essential step: it reduces feature dimensions by removing category-independent and redundant features, and preserves the most applicable features for classification [1,11]. In order to remove category-independent and redundant features, reduce the time and space complexity of the algorithm, and improve the accuracy of the machine learning model, a two-stage feature optimization method based on the FCA-ReliefF algorithm is proposed in this investigation. The suggested feature optimization method is as follows: (1) analyze the correlation between the original features and eliminate highly correlated features to reduce the dimension of the original feature set; (2) use the ReliefF algorithm to further search the feature space retained by the correlation-based dimension reduction and select the optimal feature subset.

Feature Correlation Analysis
There are two kinds of commonly used correlation analysis: the correlation between features [35], and the correlation between features and categories [43]. The FCA method is adopted to decrease the feature dimension according to the correlation between features. The core idea of FCA is to remove the redundant one of two correlated features in the original fused feature set, so as to make the correlation between the remaining features lower [34]. The proposed FCA steps are as follows: Step 1. Construct a recognition model using a classifier, calculate the classification ability of each individual feature, and sort the original features according to the classification results. The SVM is selected as the classifier in this investigation.
Step 2. Calculate the linear correlation coefficient between every two features according to equations (8) and (9):

$$r_{ij} = \frac{\sum_{L=1}^{K}\left(x_{Li} - \bar{x}_i\right)\left(x_{Lj} - \bar{x}_j\right)}{\sqrt{\sum_{L=1}^{K}\left(x_{Li} - \bar{x}_i\right)^2}\sqrt{\sum_{L=1}^{K}\left(x_{Lj} - \bar{x}_j\right)^2}} \qquad (8)$$

$$\bar{x}_i = \frac{1}{K}\sum_{L=1}^{K} x_{Li} \qquad (9)$$

where $x_{Li}$ represents the i-th eigenvalue of the L-th sample in the feature set; $K$ is the total number of samples; and $r_{ij}$ represents the linear correlation coefficient between the i-th feature and the j-th feature. When $i = j$, $r_{ij} = 1$, i.e., the autocorrelation coefficient. After Step 2, the following correlation coefficient matrix is obtained:

$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix}$$

It is a symmetric matrix: the coefficients satisfy $r_{ij} = r_{ji}$, and the elements $r_{nn}$ on the main diagonal are the autocorrelation coefficients of the features, with $r_{nn} = 1$.
Step 3. Set a selection threshold $\delta$ ($0 \le \delta \le 1$). When the correlation coefficient between two original features satisfies $r_{ij} > \delta$, referring to the ranking result of Step 1, the feature with the larger contribution rate is retained and the other is eliminated (when one of two features satisfying the threshold has already been removed, the other feature is unconditionally retained). When performing FCA, an appropriate threshold needs to be selected because the number of retained features varies with it. The smaller the threshold, the fewer features are retained and the lower the correlation between them. In general, $\delta$ is chosen in [0.85, 0.95]. In this investigation, $\delta$ is set to 0.9 for the AuBT database, while it ranges over [0.85, 0.92] for the DEAP database. The features preserved after FCA constitute a new feature subset, which is then further reduced with ReliefF to obtain the optimal feature subset.
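The three FCA steps can be sketched as follows. The per-feature ranking from Step 1 (e.g. single-feature SVM accuracy) is assumed to be supplied externally; the function name is an assumption:

```python
import numpy as np

def fca_select(X, ranking, delta=0.9):
    """Correlation-based feature elimination sketch of the FCA step.
    X: (samples, features) array; ranking: feature indices sorted from
    most to least discriminative (Step 1 result).
    For every pair with |r_ij| > delta, keep the better-ranked feature."""
    R = np.corrcoef(X, rowvar=False)          # Step 2: correlation matrix
    removed = set()
    for pos, i in enumerate(ranking):         # Step 3: threshold + ranking
        if i in removed:
            continue
        for j in ranking[pos + 1:]:
            if j not in removed and abs(R[i, j]) > delta:
                removed.add(j)                # j is lower-ranked: eliminate
    return [i for i in ranking if i not in removed]
```

With a feature that is a near copy of another, only the better-ranked of the pair survives, while uncorrelated features pass through untouched.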

Secondary Dimension Reduction Using ReliefF
ReliefF is an improved algorithm based on Relief [26]. Relief is a simple and efficient feature selection algorithm based on feature weights: according to the relevance of each feature to the category, a weight is assigned to each feature, and features whose weight is less than a certain threshold are deleted. However, Relief is limited to feature selection between two classes. ReliefF extends this to feature selection among multiple classes, and has achieved good experimental results in the fields of gesture recognition and speech classification [12,19]. The implementation steps of the algorithm are as follows: Step 1. Initialize $W(A) = 0$ for $A = 1, \ldots, p$, where $p$ is the number of features.
Step 2(a). Randomly select any sample i S from the sample set S, and select k neighbor samples from the same category as the sample i S .
Step 2(b). Arbitrarily select k neighbor samples from each category different from the sample i S .
Step 3. For $A = 1, \ldots, p$, update the weight of each feature. In the standard ReliefF formulation, the update is

$$W(A) = W(A) - \sum_{j=1}^{k}\frac{\operatorname{diff}(A, S_i, H_j)}{mk} + \sum_{C \neq \operatorname{class}(S_i)}\frac{P(C)}{1 - P(\operatorname{class}(S_i))}\sum_{j=1}^{k}\frac{\operatorname{diff}(A, S_i, M_j(C))}{mk}$$

where $H_j$ denotes the j-th nearest hit (same-class neighbor), $M_j(C)$ the j-th nearest miss in class $C$, $P(C)$ the prior probability of class $C$, and $\operatorname{diff}(A, \cdot, \cdot)$ the normalized difference of two samples on feature $A$. Steps 2-3 are repeated $m$ times. The selection of the number of sampled points $m$ and the number of neighbors $k$ is determined by the actual situation of the dataset [46]. The larger the weight $W(A)$, the more feature $A$ is considered to contribute to the classification. Then, by setting a threshold to eliminate the invalid and redundant features, feature dimension reduction is finally achieved.
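The steps above can be sketched as a minimal NumPy implementation, assuming numeric features normalized by their range and Manhattan distance on the normalized features for neighbor search (both common choices, not stated in the paper):

```python
import numpy as np

def relieff(X, y, m=None, k=3, rng=None):
    """Minimal ReliefF sketch for numeric features and multiple classes,
    following the standard weight update with class-prior weighting."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                       # avoid division by zero
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))
    m = m or n
    W = np.zeros(p)
    for i in rng.integers(0, n, size=m):        # Step 2(a): sample S_i
        d = np.abs(X - X[i]) / span             # diff(A, S_i, .) per feature
        dist = d.sum(axis=1)
        same = np.where((y == y[i]) & (np.arange(n) != i))[0]
        hits = same[np.argsort(dist[same])[:k]]
        W -= d[hits].mean(axis=0) / m           # = sum(diff) / (m * k)
        for c in classes:                       # Step 2(b): nearest misses
            if c == y[i]:
                continue
            other = np.where(y == c)[0]
            miss = other[np.argsort(dist[other])[:k]]
            W += prior[c] / (1 - prior[y[i]]) * d[miss].mean(axis=0) / m
    return W
```

On data where one feature separates the classes and another is noise, the discriminative feature receives a clearly larger weight.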

Classifiers to Evaluate the Performance
In order to complete the emotion classification task and verify the effectiveness of the proposed feature selection method, the classical supervised SVM classifier, which has been proved effective in a variety of pattern recognition tasks, is employed. In this research, the SVM classifier is used to recognize different emotions.
The central idea of SVM is to map the input data to a high-dimensional feature space and then use a kernel function (linear, polynomial, or radial basis function) to determine a hyperplane with the largest possible margin in that space. Therefore, SVM can achieve both linear and non-linear classification simply through different kernel functions. To reduce the computational cost, the LIBSVM software package developed by Chang and Lin [5], which applies the one-versus-one method for multi-class tasks, is utilized in our study. In addition, considering solving speed and feasibility, we used the radial basis function (RBF) as the kernel function. For the parameters of SVM, the penalty parameter and the kernel parameter are obtained by searching the parameter spaces $10^{-2}, 10^{-1}, \ldots, 10^{4}$ and $10^{-4}, 10^{-3}, \ldots, 10^{4}$ respectively, with the exponent stepped by one, to find the optimal values.
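The RBF kernel and the exponent grids described above can be sketched as follows (variable names are assumptions; this is a sketch of the kernel and search grid, not of LIBSVM itself):

```python
import numpy as np

def rbf_kernel(X1, X2, gamma):
    """RBF kernel matrix K[i, j] = exp(-gamma * ||x1_i - x2_j||^2)."""
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2 * X1 @ X2.T)
    return np.exp(-gamma * np.maximum(sq, 0))

# exponent grids matching the parameter search ranges described above
C_grid = 10.0 ** np.arange(-2, 5)      # 10^-2 ... 10^4, step 1 in exponent
gamma_grid = 10.0 ** np.arange(-4, 5)  # 10^-4 ... 10^4, step 1 in exponent
```

Each (C, gamma) pair would then be evaluated (e.g. by cross-validation) and the best-performing pair retained.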
In addition, the hold-out method is employed to train and test the constructed classifier model; that is, each experiment randomly selects the training and testing sets. For the AuBT dataset, 80% of the samples in each experiment are used for training and 20% for testing. For the DEAP dataset, 75% is used for training and 25% for testing. To avoid the contingency of experimental results, the average of 10 experimental runs is taken as the final identification result.
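The repeated hold-out protocol can be sketched as a split generator (the function name and seed are assumptions; the final accuracy would be the mean over the 10 splits):

```python
import numpy as np

def holdout_splits(n_samples, train_frac, n_repeats, seed=0):
    """Generate repeated random hold-out (train, test) index splits."""
    rng = np.random.default_rng(seed)
    n_train = int(round(n_samples * train_frac))
    for _ in range(n_repeats):
        idx = rng.permutation(n_samples)
        yield idx[:n_train], idx[n_train:]

# AuBT protocol: 80/20 split repeated 10 times over 1000 samples
splits = list(holdout_splits(1000, 0.8, 10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 800 200
```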

Results and Discussion
In this section, the results obtained by applying the methods described above are presented. The publicly available emotion datasets, AuBT and DEAP, are employed to test the overall classification performance of the proposed method. The experimental results are analysed and compared with reported studies, demonstrating the relatively good performance of the suggested method in identifying emotions through peripheral physiological signals.

Recognition Rates without Dimensional Reduction
In this section, in order to verify the universality of the extracted features, the decision tree (DT) classifier is employed in addition to the SVM classifier. The average emotion recognition rates of the four emotional states based on single-signal features are shown in Table 3. It can be concluded that the features of EMG are easier to identify than those of the other signals, whether the SVM or DT classifier is used, and that the SVM classifier generally performs better than the DT classifier.
The features extracted from the four physiological signals (SC, RSP, EMG and ECG) are fused to improve the accuracy of emotion recognition. The fused features are complementary and can represent emotional states from multiple perspectives. Table 4 shows the results for each emotional state with multi-signal feature fusion. The results demonstrate that the recognition rates with feature fusion are significantly improved compared with single-signal features, which shows that the feature fusion method proposed in this paper effectively improves the identification rate. Besides, the emotional state of anger is easier to identify than the other three no matter which classification method is used, suggesting that when people get angry, their physiological signals also change greatly. On the contrary, the emotion of pleasure is more difficult to classify. In terms of the extracted features, the SVM classifier performs better than DT, consistent with the single-signal results.

Recognition Rates with Dimensional Reduction
As illustrated in Section 2.4, to improve the accuracy of emotion recognition, the suggested two-stage feature optimization method of FCA and the ReliefF algorithm is employed.
During the process of FCA, setting different thresholds results in different feature dimensions. The smaller the threshold, the fewer features are retained and the lower the correlation between them. However, this does not mean that a smaller threshold yields better features for identification. In order to select a reliable threshold, comparative experiments were conducted with different thresholds. The number of features obtained by setting different thresholds for feature reduction, and the corresponding identification accuracy, are shown in Figure 2. Comprehensively considering the two factors of feature dimension and recognition rate, the performance is best when 0.9 is selected as the threshold for preliminary screening.
Figure 2
Feature dimension and recognition rate with different thresholds

After using the feature selection method based on FCA, the feature dimension was reduced from the original 108 to 80. To further reduce the feature dimensionality, the ReliefF algorithm is adopted to execute secondary dimension reduction on the remaining 80 features. Referring to [5,41,46], when determining the deletion threshold, the 5%, 8%, 10%, 15%, 20%, 25%, 30%, 35%, and 40% of features with the smallest weights are removed respectively. The experimental results show that the recognition rate is best when 25% of the features are deleted. Therefore, this work removes the 25% of features with the smallest weights and finally obtains the optimal feature set. After two-stage feature optimization, the total dimension is reduced from 108 to 60, which is 44% lower than the original 108-dimensional feature set. Meanwhile, the recognition rate using FCA-ReliefF in combination with SVM is raised from 96.79% to 98.40%, which indicates that the FCA-ReliefF method is effective.

Experiments Conducted on DEAP
The major target of this experiment is to further verify the overall performance of the proposed FCA-ReliefF method for emotion identification. Therefore, the samples of 10 subjects (S01-S10) were selected from the publicly available DEAP dataset for validation in this assessment. As shown in Table 5, as the feature dimension increases, the recognition accuracy usually increases. However, this does not mean that the higher the dimension, the better the classification accuracy: when the feature dimension increases, noise is introduced by redundant features. It can also be seen in Table 5 that the features of EMG are more effective, while the features of GSR are less representative. Besides, due to differences in sensors, elicitation materials, subjects, age, etc., the performance of the same signal differs between persons. Hence, using a single signal for emotion recognition affects the stability of the results, which illustrates the necessity of combining different physiological signals.
The average classification accuracy over varying number of signal channels using SVM is shown in Table 6. It can be seen from the result analysis that the more channel information is integrated, the higher the identification accuracy is. With the increase of the number of channels, the influence of increasing channels on the identification accuracy gradually weakens. In Table 6, the recognition accuracy with the features fusion of four physiological signals is 14.31% higher than that of single signal feature.
In order to remove the redundant features and improve the emotion recognition accuracy, the proposed FCA-ReliefF feature selection optimization method was employed. The experiments were conducted on each subject respectively. During the process of FCA, comprehensively considering the two factors of feature dimension and recognition rate, for each subject, the optimal threshold was selected for preliminary screening. When determining the number of removed features during ReliefF, the appropriate deletion feature quantity was obtained. Table 7 presents the results of using proposed FCA-ReliefF feature selection method. As can be seen, the classification results were improved by using the proposed feature selection optimization method for each subject, as shown in Figure  3. In Figure 3, no matter with or without dimensional reduction, it can be clearly found that S10 has the highest identification accuracy, while S04 or S05 has the lowest identification efficiency. Due to individual differences, the effects of the methods are different on each subject. In general, compared with the results without dimensional reduction, the average recognition accuracy of the proposed FCA-ReliefF feature selection method is 1.34% higher, as well as about 42% of the raw 108 features removed. The experimental results prove the effectiveness of the proposed feature selection method.

Figure 3
Comparison of the recognition rate before and after feature optimization for each subject. Note: results marked with * were obtained by training and testing on each subject individually and then taking the overall average; results without * were obtained by cross-subject training and testing.
To demonstrate the advantages of the proposed FCA-ReliefF feature selection method, Table 8 compares our results with other state-of-the-art studies conducted on the DEAP dataset.
As shown in Table 8, many researchers have investigated emotion recognition on the DEAP database. In [14], an ensemble convolutional neural network (ECNN) was employed for multimodal emotion recognition, verifying that multimodal physiological signals combined with EEG (EEG+GSR+RB+EOG) perform better than EEG signals alone. Cimtay et al. [6] proposed a hybrid fusion strategy based on multiple modalities including facial expressions, GSR and EEG, yielding a maximum one-subject-out accuracy of 91.51% and a mean accuracy of 53.87%. Shen et al. [30] combined a convolutional neural network (CNN) with a recurrent neural network using long short-term memory (LSTM) cells, achieving accuracies of 94.22% and 94.58% in valence and arousal, respectively. Deep feature clustering (DFC), combining deep neural networks (DNN) with SVM, was proposed for emotion recognition with short processing time and a performance of 81.3% in [6]. An LSTM-based deep learning algorithm with a softmax classifier was used for emotion classification of EEG signals, reaching 82.01% in [29]. In Table 8, [6,14,30] address individual (subject-dependent) emotion recognition, and [2,21,29] cross-subject emotion recognition. In this investigation, the proposed method is employed for four-class emotion recognition of individual subjects. The 10 subjects were selected consecutively and not deliberately; hence, for individual-subject four-class emotion recognition, the number of subjects does not affect the overall performance analysis. The proposed method presents inspiring performance, with a recognition accuracy of 92.34% for individual-subject emotion recognition.

Conclusions
Emotion recognition is an essential part of improving the performance of human-machine interaction procedures. A new emotion recognition approach based on FCA-ReliefF feature optimization is proposed in this research. The overall performance of the proposed method is validated on two publicly available multimodal emotion datasets, AuBT (emotional states labeled as four basic human emotions: joy, anger, sadness, and pleasure) and DEAP (emotions defined as HAHV, LAHV, LALV, and HALV in an arousal-valence space). Moreover, the results on both public datasets were compared with the performances reported by state-of-the-art emotion recognition approaches using the same datasets. The experimental results of the multi-view analysis unanimously indicate that the proposed FCA-ReliefF feature optimization method can effectively represent the emotional state and improve the recognition accuracy to 98.40% (AuBT) and 92.34% (DEAP). At present, the method does not incorporate EEG signals for emotion recognition. Future studies will concentrate on integrating EEG characteristics and on combining emotion recognition with robot-aided motion rehabilitation, which would help serve stroke patients with more humanized motion training.

Data Availability
The public emotional dataset, AuBT, used to support the research of this paper is available from: http://www.informatik.uni-augsburg.de/en/chairs/hcm/projects/aubt/
The public emotional dataset, DEAP, used to support the research of this paper is available from: http://www.eecs.qmul.ac.uk/mmv/datasets/deap/download.html