Measuring Student Cognitive Engagement with GenAI-based Tutor Conversations-Poster

Jun 23

Measuring cognitive engagement in AI tutor conversations requires moving beyond traditional behavioral metrics such as conversation length. Using the ICAP framework [1], we developed a scalable, reliable labeling method that classifies engagement as Passive, Active, or Constructive. Two human raters independently coded 200 STEM-focused conversations and achieved high inter-rater reliability (Krippendorff’s alpha = 0.82). We then trained an LLM-as-a-judge whose labels closely matched the human labels (reliability of 0.77), enabling large-scale automation. This approach offers a scalable way to analyze cognitive engagement in GenAI tutor interactions, paving the way for improved AI tutor design and insights into student learning.
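
For readers unfamiliar with the reliability statistic cited above, the following is a minimal sketch of Krippendorff's alpha for nominal labels. It is an illustration only, not the authors' analysis code: the function name, the input layout (one list of labels per conversation), and the toy labels are assumptions, and the poster does not state whether the engagement levels were treated as nominal or ordinal.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list with one entry per coded item (here, a conversation);
    each entry is the list of labels that item received, one per rater.
    Items with fewer than two labels cannot be paired and are skipped.
    """
    # Coincidence matrix: each ordered pair of labels within an item
    # contributes 1 / (m - 1), where m is the number of labels on that item.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per label, and the total number of pairable labels.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n < 2:
        raise ValueError("need at least one item with two or more labels")

    # Observed and expected disagreement (off-diagonal mass only).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one label was ever used, so no disagreement is possible
    return 1 - observed / expected


# Invented toy labels for two raters (not data from the study).
toy_units = [
    ["Passive", "Passive"],
    ["Active", "Active"],
    ["Constructive", "Active"],
    ["Constructive", "Constructive"],
    ["Passive", "Active"],
]
toy_alpha = krippendorff_alpha_nominal(toy_units)
```

The same function could be applied to LLM-versus-human labels by treating the LLM judge as one more rater, although the poster does not say which statistic produced the 0.77 figure.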

Area(s) Researched: Student Affect

6-to-8, Student Affect

Michael Wiemeyer

MATHstream and UpGrade: Using Rapid, Large-Scale Experimentation for Data-Driven Improvements in a Digital Learning Tool

Moving the Needle on Elevating Teacher and Student Voice to Improve Mathematics Education for Priority Students