Bireyselleştirilmiş Bilgisayarlı Sınıflama Testlerinde Madde Havuzu Özelliklerinin Test Uzunluğu ve Sınıflama Doğruluğu Üzerindeki Etkisi

The Effects of Item Pool Characteristics on Test Length and Classification Accuracy in Computerized Adaptive Classification Testing

Res. Asst. Ceylan GÜNDEĞER & Prof. Dr. Nuri DOĞAN

ABSTRACT
This study examined the effects of item pool distribution and item pool size on average test length and average classification accuracy in computerized adaptive classification testing (CACT). To this end, three item selection methods, the random item selection method (RISM), Maximum Fisher Information (MFI), and Kullback-Leibler Information (KLI), were compared in peaked and broad item pools of 50, 100, 200, and 300 items. Ability parameters (θ) for 1000 examinees were generated from N(0, 1) within the range [-3, 3]. In the peaked item pools, a parameters were drawn from U[0.5, 2.0], b parameters from N(1, 0.4), and c parameters from N(0.15, 0.05); in the broad item pools, a parameters were drawn from U[0.5, 2.0], b parameters from N(1, 1.5), and c parameters from N(0.15, 0.05). The simulation was carried out in R. In all item pools, RISM produced the longest average test length, while MFI and KLI performed very similarly. As item pool size increased, tests became shorter and classification accuracy decreased, although accuracy remained above 0.90 in all conditions. In addition, peaked item pools yielded shorter tests and higher test efficiency, while classification accuracy did not change. These results suggest that, in CACT, peaked item pools containing many items make it possible to build shorter tests with high classification accuracy.

KEYWORDS: Computerized Adaptive Classification Testing, Item Pool Distribution, Item Pool Size, Test Length, Classification Accuracy
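
The simulation design summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' R code. It is written in Python and rests on several assumptions that the abstract does not confirm: the response model is taken to be the three-parameter logistic (3PL) model without a scaling constant, the second argument of each N(·, ·) is read as a standard deviation, c values are clamped to [0, 0.5], and the cut score `CUT` and half-width `DELTA` are hypothetical values chosen only for illustration. Both information criteria are evaluated at the cut score here, which is one common choice in classification testing and may differ from the study's setup.

```python
import math
import random


def make_pool(n_items, b_sd, rng):
    """Draw (a, b, c) item parameters for one pool.

    b_sd = 0.4 gives a peaked pool, b_sd = 1.5 a broad pool
    (assumption: the second argument of N(., .) is a standard deviation).
    """
    pool = []
    for _ in range(n_items):
        a = rng.uniform(0.5, 2.0)
        b = rng.gauss(1.0, b_sd)
        # Assumption: clamp c so it stays a valid guessing probability.
        c = min(max(rng.gauss(0.15, 0.05), 0.0), 0.5)
        pool.append((a, b, c))
    return pool


def draw_theta(rng, lo=-3.0, hi=3.0):
    """Rejection-sample an ability from N(0, 1) restricted to [lo, hi]."""
    while True:
        theta = rng.gauss(0.0, 1.0)
        if lo <= theta <= hi:
            return theta


def p_correct(theta, item):
    """3PL probability of a correct response (assumed model, no D constant)."""
    a, b, c = item
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def fisher_info(theta, item):
    """Fisher information of a 3PL item at theta."""
    a, _, c = item
    p = p_correct(theta, item)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def kl_info(item, theta_hi, theta_lo):
    """Kullback-Leibler divergence between responses at theta_hi and theta_lo."""
    p_hi = p_correct(theta_hi, item)
    p_lo = p_correct(theta_lo, item)
    return (p_hi * math.log(p_hi / p_lo)
            + (1.0 - p_hi) * math.log((1.0 - p_hi) / (1.0 - p_lo)))


def select_item(pool, used, criterion):
    """Return the index of the unused item that maximizes the criterion."""
    candidates = [i for i in range(len(pool)) if i not in used]
    return max(candidates, key=lambda i: criterion(pool[i]))


rng = random.Random(42)
peaked = make_pool(300, 0.4, rng)   # peaked pool, 300 items
broad = make_pool(300, 1.5, rng)    # broad pool, 300 items
thetas = [draw_theta(rng) for _ in range(1000)]

CUT = 1.0     # hypothetical cut score, not stated in the abstract
DELTA = 0.2   # hypothetical half-width around the cut score for KL

first_mfi = select_item(peaked, set(), lambda it: fisher_info(CUT, it))
first_kli = select_item(
    peaked, set(), lambda it: kl_info(it, CUT + DELTA, CUT - DELTA)
)
first_random = rng.choice(range(len(peaked)))  # random selection baseline
```

In a peaked pool many items have b values close to the center of the pool, so an information-based criterion evaluated near that point has many highly informative candidates to choose from. This is consistent with the shorter tests the abstract reports for peaked pools, although the sketch itself does not reproduce those results.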
