
Issues and Remedies in Composite Scoring: A Case of Joint EGP-ESP Test

Journal Name: The Journal of Language Teaching and Learning

Publication Year: 2013

Author: Rabbani Yekta, R.
Abstract: 
In comparing the high-stakes ESP tests administered in recent years in Iran for admission to master's and PhD programs, considerable uncertainty arises as to the nature of the scoring system used at large scale for these competitive exams. Whereas the modular PhD entrance exam consisted of a prerequisite EGP module followed by a specific-purpose module, participants in the master's-level counterpart sat for a joint, or blended, EGP-ESP subtest, which was then scored and reported as a composite percentile and interpreted, along with the other knowledge subtests, against a common national norm. In this article, we address the issues associated with these scoring models and the problems they create for our accountability system.
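The contrast between the two score-reporting models can be sketched in a few lines of code. The norm-group scores, the candidate's scores, and the equal weighting below are all hypothetical illustrations, not the actual exam's data or formula; the sketch only shows how a modular report preserves an examinee's EGP/ESP profile while a composite percentile collapses it into a single rank.

```python
# Minimal sketch (hypothetical data and weights) of modular vs. composite
# score reporting against a common norm group.

def percentile_rank(score, norm_group):
    """Percent of the norm group scoring strictly below `score`."""
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)

# Hypothetical norm-group raw scores (0-100 scale) for each section.
egp_norm = [35, 42, 50, 55, 61, 67, 72, 78, 84, 90]
esp_norm = [30, 38, 45, 52, 58, 64, 70, 76, 83, 88]

def modular_report(egp, esp):
    # Modular model (as in the PhD exam): each module keeps its own percentile.
    return {"EGP": percentile_rank(egp, egp_norm),
            "ESP": percentile_rank(esp, esp_norm)}

def composite_report(egp, esp, w_egp=0.5, w_esp=0.5):
    # Blended model (as in the master's exam): one weighted composite score,
    # one percentile against a composite norm.
    composite_norm = [w_egp * a + w_esp * b for a, b in zip(egp_norm, esp_norm)]
    return percentile_rank(w_egp * egp + w_esp * esp, composite_norm)

# A candidate strong in EGP (80) but weak in ESP (40): the modular report
# shows the uneven profile; the composite hides it behind a middling rank.
print(modular_report(80, 40))    # distinct percentiles per module
print(composite_report(80, 40))  # a single blended percentile
```

The diagnostic information lost in the second call is precisely what the composite-scoring critique in this article turns on: two candidates with opposite EGP/ESP profiles can receive identical composite percentiles.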

