Item and Test Parameters Estimations for Multiple Choice Tests in the Presence Of
Missing Data: The Case of SBS4
Journal Name:
- Eğitim Bilimleri Araştırmaları Dergisi (EBAD)
Abstract (2. Language):
Missing data can be defined as the absence of an intended observation. Missing data
is very common when collecting data for research, especially in the behavioral and
social sciences. Some participants overlook questions unintentionally, skip some
questions deliberately, or simply refuse to answer. Data loss is also frequent in
longitudinal studies, where some participants die or move away. Sometimes missing data
arises from the measurement instrument or its administration: questions or items may
not be understandable or legible, or the conditions of administration may be unsuitable.
Naturally, no inference can be made about the measured trait when an intended
observation is absent. Data loss leads to restriction of range in the data set and
undermines the representativeness of the sample. In the presence of missing data,
estimates may be substantially biased and the power of statistical analyses may diminish.
Researchers usually prefer to work with complete data sets, but these are often quite
difficult to obtain, and missing data makes the researchers' job considerably harder.
Missing data is an important problem in statistical analysis. Moreover, handling it is
not easy, because the standard statistical methods developed in the early 20th century,
which are still widely used, generally require complete data sets and do not offer any
satisfactory solution for missing data.
Since the 1970s, many methods have been developed for dealing with missing data.
These methods can be classified into four categories: methods based on deletion, methods
based on simple imputation, methods based on maximum likelihood, and methods based on
multiple imputation. Methods based on deletion and simple imputation are regarded as
traditional methods and are now available in most statistical software. The other, more
advanced methods involve more complex algorithms and require special software.
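To make the two traditional categories concrete, the following is a minimal sketch of one deletion method (listwise deletion) and two simple imputation methods (zero imputation and item mean imputation) applied to dichotomously scored responses. The function names, the use of `None` as the missing-value marker, and the toy `responses` matrix are illustrative assumptions and are not taken from the study.

```python
from typing import List, Optional

Row = List[Optional[float]]  # one respondent; None marks a missing response


def listwise_delete(data: List[Row]) -> List[Row]:
    """Deletion: keep only respondents with no missing response."""
    return [row for row in data if all(v is not None for v in row)]


def zero_impute(data: List[Row]) -> List[Row]:
    """Simple imputation: score every missing response as 0 (incorrect)."""
    return [[0 if v is None else v for v in row] for row in data]


def item_mean_impute(data: List[Row]) -> List[Row]:
    """Simple imputation: replace a missing response with the item's observed mean.

    Assumes every item has at least one observed response.
    """
    n_items = len(data[0])
    means = []
    for j in range(n_items):
        observed = [row[j] for row in data if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in data]


# Hypothetical toy data: 4 respondents x 3 items, scored 1 (correct) or 0.
responses = [
    [1, 0, 1],
    [1, None, 0],
    [0, 1, None],
    [1, 1, 1],
]

complete_cases = listwise_delete(responses)
zero_filled = zero_impute(responses)
mean_filled = item_mean_impute(responses)
```

Even in this toy case the trade-offs are visible: listwise deletion discards half of the respondents, zero imputation treats an omitted answer as a wrong one, and item mean imputation produces fractional scores that are no longer dichotomous.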
Missing data patterns and mechanisms are crucial in choosing an appropriate method for
handling missing data. Traditional methods based on deletion and simple imputation can
be used if the data are missing completely at random (MCAR). Methods based on maximum
likelihood and multiple imputation can be used if the data are missing completely at
random or missing at random (MAR). Most missing data methods are not meaningful to use
if the mechanism is non-ignorable. Estimates may be substantially biased, especially
when the data are not missing at random and the data set contains a large amount of
missing data.
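The difference between completely random and partially random missingness (commonly abbreviated MCAR and MAR) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the helper names, the binary `group` covariate, and the missingness rates are hypothetical choices for illustration and do not come from the study. Under MCAR every response has the same probability of being missing; under MAR that probability depends only on an observed variable.

```python
import random
from typing import List, Optional, Sequence


def make_mcar(values: Sequence[int], rate: float,
              rng: random.Random) -> List[Optional[int]]:
    """MCAR: each value is dropped with the same probability `rate`."""
    return [None if rng.random() < rate else v for v in values]


def make_mar(values: Sequence[int], group: Sequence[int],
             rate_if_0: float, rate_if_1: float,
             rng: random.Random) -> List[Optional[int]]:
    """MAR: the drop probability depends only on the observed `group` variable."""
    out: List[Optional[int]] = []
    for v, g in zip(values, group):
        rate = rate_if_1 if g == 1 else rate_if_0
        out.append(None if rng.random() < rate else v)
    return out


def missing_rate(values: Sequence[Optional[int]]) -> float:
    """Share of entries that are missing."""
    return sum(v is None for v in values) / len(values)


rng = random.Random(0)                        # fixed seed for reproducibility
n = 10_000
scores = [rng.randint(0, 1) for _ in range(n)]   # simulated 0/1 item scores
group = [i % 2 for i in range(n)]                # fully observed covariate

mcar = make_mcar(scores, 0.2, rng)
mar = make_mar(scores, group, 0.1, 0.4, rng)

mcar_rate = missing_rate(mcar)
mar_rate_g0 = missing_rate([m for m, g in zip(mar, group) if g == 0])
mar_rate_g1 = missing_rate([m for m, g in zip(mar, group) if g == 1])
```

In the MAR case the missing rate differs clearly between the two groups, which is why a method that ignores the observed covariate can produce biased estimates even though the mechanism is still ignorable for likelihood-based methods.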
Multiple choice tests are usually composed of dichotomously scored items. In this
type of categorical data, it is more difficult to deal with the missing data problem.
Well-defined prior knowledge about the nature of the lost data is needed. Missing data
mechanisms must be examined in detail. An appropriate missing data method has to be
determined and used to minimize the possibility of missing data bias.
In this study, estimates of item and test parameters for multiple choice tests composed
of dichotomously scored items are examined under different missing data methods. The
aim is to determine which missing data methods are suitable and which are unsuitable
for these kinds of tests.
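The abstract does not state which item and test parameters were estimated. As an assumption for illustration only, the sketch below uses two common classical test theory indices for dichotomous items: item difficulty (proportion correct) and the KR-20 reliability coefficient, KR-20 = k/(k-1) * (1 - sum(p_j * q_j) / variance of total scores). The function names and the toy data are hypothetical.

```python
from typing import List


def item_difficulty(data: List[List[int]]) -> List[float]:
    """Proportion of respondents answering each item correctly."""
    n = len(data)
    k = len(data[0])
    return [sum(row[j] for row in data) / n for j in range(k)]


def kr20(data: List[List[int]]) -> float:
    """KR-20 reliability for a complete 0/1 response matrix.

    Uses the population variance of total scores. Assumes at least two
    items and a non-zero total-score variance.
    """
    n = len(data)
    k = len(data[0])
    p = item_difficulty(data)
    sum_pq = sum(pj * (1 - pj) for pj in p)
    totals = [sum(row) for row in data]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical complete data: 4 respondents x 3 items.
complete = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

difficulties = item_difficulty(complete)
reliability = kr20(complete)
```

Both functions expect a complete matrix, so any missing responses have to be handled first. Running them on the output of different missing data methods is one way to compare how much each method shifts the estimates.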
Abstract (Original Language, translated from Turkish):
This study aims to examine the relationships among item and test parameters estimated with different missing data methods for multiple choice tests in the presence of missing data, and to determine which missing data methods are suitable for such tests. Designed as basic research using the correlational survey model, the study ran its analyses on data from Booklet A of the 2011 SBS (Seviye Belirleme Sınavı, Level Determination Exam) Mathematics Test, covering 527,517 respondents. Twelve missing data methods were used in the analyses: 'listwise deletion' from the deletion-based methods; 'zero imputation', 'series mean imputation', 'case mean imputation', 'mean of nearby points imputation', 'median of nearby points imputation', 'linear interpolation' and 'linear trend at point imputation' from the simple imputation methods; 'regression imputation', the 'expectation-maximization algorithm' and 'data augmentation' from the maximum likelihood methods; and 'Markov chain Monte Carlo' from the multiple imputation methods. The findings show that, when missing data is not ignorable, an appropriate missing data method must be used in statistical estimation for multiple choice tests. Deletion-based methods and zero imputation are not suitable for this kind of data. Simple imputation methods are likely to produce biased estimates. Maximum likelihood and multiple imputation methods are considered the most suitable missing data methods for this kind of data.