
Benchmarking of the GPT-4 Model in the Context of Parsons Problems

Tsubasa Okada

Pronouns: He/Him

Research Mentor(s): Barbara Ericson
Research Mentor School/College/Department: Computing / Information
Session: Session 1: 9:00 am – 9:50 am
Poster: 45

Abstract

The emergence of generative AI has drastically reshaped the landscape of computer science (CS) education. One important, ubiquitous technique in this field is the Parsons problem, a programming exercise that presents the learner with blocks of code to arrange rather than requiring the code to be written from scratch. The technique has continually evolved as new methods and extensions are proposed and implemented; one such extension is the "distractor," the intentional inclusion of incorrect code blocks, which exposes learners to common programming mistakes and strengthens their ability to recognize irrelevant pieces of code. While numerous proposals exist for incorporating generative AI into CS education, we propose a new scheme: employing generative AI to generate distractors, as well as other key components, of Parsons problems. Through prompt engineering, we develop an effective prompt for an algorithm built on the GPT-4 model, benchmarking its performance against appropriate automatic assessment criteria. We then conduct a quantitative and qualitative analysis of the aggregated score table, comparing results across initial conditions such as the programming exercise topic.

Engineering, Interdisciplinary, Social Sciences
