Determining the effective features in sequential forward feature selection algorithm
Journal Name:
- Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi
Abstract (2. Language):
In recent years, pattern recognition and machine
learning have become very active research areas
because they apply to a wide variety of fields,
including brain-computer interfaces and commercial
and financial applications. No single technique
solves every classification problem, so it helps to
identify the most discriminative features with a
feature selection algorithm. Such systems are
generally built in four steps: 1- preprocessing,
2- feature extraction, 3- feature selection and 4-
classification/regression. Among these steps,
feature extraction and feature selection are vital
for representing input signals in a reduced feature
space and for identifying discriminative
information, which makes fast and high-performance
applications possible.
Because there are many kinds of feature extraction
methods, in some cases hundreds of features may be
calculated to describe the input signals. However, a
very large number of features can reduce both
decision performance and speed, two crucial
parameters in pattern recognition and machine
learning. To eliminate these disadvantages and
reduce the number of features, several techniques
have been proposed. The machine learning community
uses them to select the most suitable feature subset
from all the extracted features in order to increase
the performance of the proposed model.
The sequential forward feature selection (SFFS) and
sequential backward feature selection (SBFS)
algorithms are among the most widely used feature
selection techniques in the literature. In this
study, we extracted Continuous Wavelet Transform
based features from BCI Competition 2005 Data Set I.
Afterwards, the most stable and effective of the
extracted features were selected with the SFFS and
SBFS techniques. BCI Competition 2005 Data Set I
contains electrocorticogram (ECoG) based
brain-computer interface signals recorded from a
subject with epilepsy on two different days about
one week apart. In both sessions, the ECoG signals
were recorded while the subject was asked to imagine
moving either the left small finger or the tongue.
The signals were acquired with an 8x8 platinum ECoG
electrode grid (64 points in total) placed on the
contralateral (right) motor cortex. All recordings
were made at a sampling rate of 1 kHz, giving 3000
samples per channel for every trial. BCI Competition
2005 Data Set I consists of 278 training trials (139
for finger movements and 139 for tongue movements)
recorded in the first session and 100 test trials
recorded in the second session. Each trial lasted 3
seconds.
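The recording figures above fix the size of the raw data, and the arithmetic can be checked with a few lines of Python. This is only a sketch: the constant names and the (trials, channels, samples) layout are assumptions made for illustration, not something the abstract specifies.

```python
# Figures taken from the abstract; names and array layout are illustrative assumptions.
SAMPLING_RATE_HZ = 1000      # 1 kHz sampling rate
TRIAL_SECONDS = 3            # each trial lasts 3 seconds
N_CHANNELS = 8 * 8           # 8x8 ECoG electrode grid
N_TRAIN, N_TEST = 278, 100   # first-session and second-session trials

samples_per_trial = SAMPLING_RATE_HZ * TRIAL_SECONDS
values_per_trial = N_CHANNELS * samples_per_trial


def raw_shape(n_trials):
    """Shape of the raw signal array under the assumed layout."""
    return (n_trials, N_CHANNELS, samples_per_trial)


if __name__ == "__main__":
    print("training set:", raw_shape(N_TRAIN))
    print("test set:", raw_shape(N_TEST))
    print("raw values per trial:", values_per_trial)
```

Every trial therefore carries 64 x 3000 raw values before any feature extraction, which is why reducing the signals to a compact feature set matters for speed.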
In this paper, the SFFS and SBFS algorithms were
tested for determining the effective features after
the feature extraction procedure, and they were
compared in terms of classification accuracy and
speed. These methods use cross validation on the
training data set to find the features that give the
highest validation accuracy, and the sub-training
data sets are selected randomly. As a result,
different features may be selected on every run of
these methods, and the choice of features affects the
test performance of the model, positively or
negatively. This paper proposes a method to overcome
this disadvantage of random selection. To show the
robustness of the proposed method, the SFFS and SBFS
algorithms were run 1000 times in the training
stage. The features that were selected more often
than a predetermined threshold were then taken as
the effective features. The results showed that SFFS
is approximately 40 times faster than SBFS and
provides 22% higher classification accuracy.
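The repeated-run idea described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the classifier, the number of folds, the subset size, or the threshold, so the nearest-centroid classifier, the 5 folds, the toy data, and all function names (`forward_select`, `stable_features`, `make_toy_data`) are hypothetical. The search shown is a plain greedy forward selection.

```python
import random
from collections import Counter


def centroid_accuracy(train, test, feats):
    """Accuracy of a nearest-centroid classifier (an assumed stand-in) on `feats`."""
    sums, counts = {}, Counter()
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += x[f]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        return min(centroids, key=lambda y: sum(
            (x[f] - c) ** 2 for f, c in zip(feats, centroids[y])))

    return sum(predict(x) == y for x, y in test) / len(test)


def cv_score(data, feats, folds):
    """Mean validation accuracy over the given folds (lists of trial indices)."""
    scores = []
    for fold in folds:
        held = set(fold)
        train = [d for i, d in enumerate(data) if i not in held]
        test = [data[i] for i in fold]
        scores.append(centroid_accuracy(train, test, feats))
    return sum(scores) / len(scores)


def forward_select(data, n_features, k, rng, n_folds=5):
    """One greedy forward-selection run on a freshly randomized fold split."""
    idx = list(range(len(data)))
    rng.shuffle(idx)  # the random sub-training sets that make runs differ
    folds = [idx[i::n_folds] for i in range(n_folds)]
    selected = []
    while len(selected) < k:
        remaining = [f for f in range(n_features) if f not in selected]
        best = max(remaining, key=lambda f: cv_score(data, selected + [f], folds))
        selected.append(best)
    return selected


def stable_features(data, n_features, k, runs, threshold, seed=0):
    """Repeat the selection and keep features chosen in more than `threshold` of runs."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(runs):
        votes.update(forward_select(data, n_features, k, rng))
    return sorted(f for f, v in votes.items() if v / runs > threshold), votes


def make_toy_data(n_trials, n_features, rng):
    """Synthetic two-class data in which only feature 0 is informative."""
    data = []
    for i in range(n_trials):
        label = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        x[0] += 5.0 if label else -5.0
        data.append((x, label))
    return data


if __name__ == "__main__":
    toy = make_toy_data(60, 6, random.Random(1))
    stable, votes = stable_features(toy, n_features=6, k=2, runs=20, threshold=0.5)
    print("stable features:", stable)
    print("votes per feature:", dict(votes))
```

Counting votes across runs turns a selection that depends on one random split into a frequency estimate, so a feature that wins only on a lucky split falls below the threshold. A backward variant would start from the full feature set and remove one feature per step, which evaluates larger subsets at every step.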
Abstract (Original Language):
In this study, two methods used to determine the effective features after feature extraction in pattern
recognition and machine learning applications, the sequential forward feature selection (SFFS; AİYÖS in
Turkish) and sequential backward feature selection (SBFS; AGYÖS in Turkish) algorithms, are compared in
terms of classification accuracy and speed. While these methods determine, by cross validation on the
training set, the features that give the highest validation accuracy, the sub-training sets are selected
randomly. Consequently, different features may be selected on every run of these methods, and the selection
of different features in turn affects the test performance of the proposed model positively or negatively.
In this study, a method is proposed to eliminate the disadvantage of this random selection. To demonstrate
the stability of the proposed method, the SFFS and SBFS algorithms are run 1000 times in the training stage,
and the features selected more often than a specified threshold are designated as effective features.
According to the results, the SFFS algorithm is approximately 40 times faster than SBFS and provides 22%
higher classification accuracy.