Measuring Student Cognitive Engagement with GenAI-based Tutor Conversations (Poster)

Measuring cognitive engagement in AI tutor conversations requires moving beyond traditional behavioral metrics such as conversation length. Using the ICAP framework [1], we developed a scalable, reliable labeling method to classify engagement as Passive, Active, or Constructive. Two human raters independently coded 200 STEM-focused conversations, achieving high inter-rater reliability (Krippendorff’s Alpha = 0.82). We then trained an LLM-as-a-judge classifier, which closely matched the human labels (reliability = 0.77), enabling large-scale automation. This approach provides a robust, scalable way to analyze cognitive engagement in GenAI tutor interactions, paving the way for improved AI tutor design and deeper insight into student learning.
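As a rough illustration of the reliability step, the sketch below shows how agreement between two label sets (human vs. human, or human vs. LLM judge) could be quantified with Krippendorff's Alpha using the open-source `krippendorff` Python package. This is not the authors' pipeline: the label-to-code mapping, the `alpha_between` helper, the example labels, and the choice of nominal measurement are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code): Krippendorff's Alpha for ICAP labels
# using the open-source `krippendorff` package. Label encoding and example
# data are hypothetical.
import numpy as np
import krippendorff

# Assumed encoding of the three ICAP categories used in the poster.
ICAP_CODES = {"Passive": 0, "Active": 1, "Constructive": 2}


def alpha_between(rater_a, rater_b, level="nominal"):
    """Krippendorff's Alpha for two raters labeling the same conversations.

    `rater_a` / `rater_b` are equal-length lists of ICAP label strings;
    unknown or missing labels are treated as missing data (np.nan).
    """
    encode = lambda labels: [ICAP_CODES.get(x, np.nan) for x in labels]
    data = np.array([encode(rater_a), encode(rater_b)], dtype=float)
    return krippendorff.alpha(reliability_data=data, level_of_measurement=level)


# Hypothetical labels: two human raters, then human rater vs. LLM judge.
human_1 = ["Passive", "Active", "Constructive", "Active"]
human_2 = ["Passive", "Active", "Constructive", "Passive"]
llm_judge = ["Passive", "Active", "Active", "Active"]

print("Human vs. human alpha:", round(alpha_between(human_1, human_2), 2))
print("Human vs. LLM alpha:  ", round(alpha_between(human_1, llm_judge), 2))
```

Treating the labels as nominal is one reasonable choice; since ICAP levels are ordered, passing `level="ordinal"` would also be defensible.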
