|
Title:
|
EXPLORING RANKING CONSISTENCY OF GENERATIVE AI IN MOOC PLATFORM EVALUATION: A NON-PARAMETRIC APPROACH |
|
Author(s):
|
Victor K. Y. Chan |
|
ISBN:
|
978-989-8704-72-6 |
|
Editors:
|
Demetrios G. Sampson, Dirk Ifenthaler and Pedro Isaías |
|
Year:
|
2025 |
|
Edition:
|
Single |
|
Keywords:
|
Generative AI, MOOC Platforms, Ranking Consistency, Relative Rankings |
|
Type:
|
Full Paper |
|
First Page:
|
53 |
|
Last Page:
|
60 |
|
Language:
|
English |
|
|
Paper Abstract:
|
This paper extends a prior study on the consistency of generative Artificial Intelligence (AI) models in evaluating Massive Open Online Course (MOOC) platforms. While the original work focused on the consistency of direct numerical scores, this research investigates the consistency of the rankings derived from those scores. When evaluating platforms, the relative order (i.e., which platform is better than another) is often more critical to a decision-maker than the absolute scores, which may be subject to systematic biases. This study analyzes the scores of 31 MOOC platforms across eight dimensions as evaluated by two AI models, Claude+ and Dragonfly. A suite of non-parametric statistical methods is employed, including Spearman's Rank Correlation Coefficient (ρ), Kendall's Tau (τ), and the top-weighted Rank-Biased Overlap (RBO), to measure the concordance of the platform rankings produced by each model. The Wilcoxon Signed-Rank Test is used to assess systematic differences in scoring. Results indicate a moderate to strong monotonic correlation in rankings for dimensions such as (2) pedagogical design, (1) content/course quality, and (6) learner engagement, reinforcing the original study's findings of consistency. However, the RBO analysis reveals that this agreement is weaker for the top-ranked platforms, providing a more nuanced understanding of AI evaluation consistency. The systematic scoring bias found in the original study is also reaffirmed here. This rank-based analysis offers a robust alternative to score-based comparisons, mitigating the effects of differing internal scoring scales and highlighting the practical utility of AI evaluations for comparative decision-making. By shifting the focus from absolute scores to relative rankings, this study underscores the practical value of generative AI as a decision-support tool in educational technology evaluation. The findings not only enhance methodological rigor in AI-based assessments but also provide actionable insights for learners and institutions navigating an increasingly complex MOOC landscape. |
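The non-parametric measures named in the abstract can be sketched in a few lines of pure Python. This is a minimal illustration, not the paper's analysis: the platform scores below are hypothetical, the implementations assume no tied scores, and the RBO is the standard truncated top-weighted formulation with persistence parameter p (overlap at shallow depths is weighted most heavily):

```python
def rank(scores):
    # 1-based ranks, highest score gets rank 1 (assumes no ties)
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman_rho(x, y):
    # rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), valid without ties
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall_tau(x, y):
    # tau-a: (concordant pairs - discordant pairs) / (n choose 2)
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            s += 1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
    return s / (n * (n - 1) / 2)

def rbo(list1, list2, p=0.9):
    # Truncated Rank-Biased Overlap: (1 - p) * sum over depths d of
    # p^(d-1) * |overlap of the top-d prefixes| / d  (top-weighted)
    k = min(len(list1), len(list2))
    score = 0.0
    for d in range(1, k + 1):
        overlap = len(set(list1[:d]) & set(list2[:d]))
        score += (p ** (d - 1)) * overlap / d
    return (1 - p) * score

# Hypothetical scores for five platforms from two models (illustrative only)
model_a = [9.1, 8.4, 7.9, 6.5, 5.2]
model_b = [8.8, 8.9, 7.1, 6.9, 5.5]

print(spearman_rho(model_a, model_b))  # 0.9
print(kendall_tau(model_a, model_b))   # 0.8

# RBO compares the ranked lists themselves (platform indices, best first)
ranked_a = sorted(range(5), key=lambda i: -model_a[i])
ranked_b = sorted(range(5), key=lambda i: -model_b[i])
print(rbo(ranked_a, ranked_b))
```

Note how the two correlation coefficients are high here even though the two models disagree on which platform is best; the truncated RBO, being top-weighted, penalizes exactly that disagreement at rank 1, which mirrors the abstract's finding that agreement is weaker among the top-ranked platforms.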
|
|
|
|
|
|