BYLINE: Inga Kiderra

News — Can you tell if what you’re reading right now was written by a human or generated by artificial intelligence? Do you care? Those are essentially the questions that University of California San Diego researchers asked in an experiment with ChatGPT at a regional high school. 

The researchers tested teachers and students with pairs of essays – one written by a high school student and the other generated by ChatGPT – and asked them to identify which essay was the work of a human and which the work of the AI language model. Teachers were right about 70% of the time. Students scored an average of 62%. 

Those may not seem like terrible marks; they’re passing grades, right? But the researchers say the numbers would be well above 90% if it were easy to tell the difference. 

Confidence didn’t correlate with accuracy either. People who thought they could spot the work of the chatbot didn’t do better than those who were less certain of their abilities. 

“We were surprised that teachers who had experience with ChatGPT or a history of teaching high school English found the task so challenging,” said senior author Gail Heyman, a professor of psychology in the UC San Diego School of Social Sciences.   

These findings underscore widespread concerns that students could turn in AI-generated essays as their own and get away with it.

“But also,” Heyman said, “one of the most interesting, and troubling, aspects of our study is that teachers performed worse on the identification task when the pair of essays included a student essay that was particularly well-written. In fact, many teachers said they guessed that the better-written essay was generated by ChatGPT. This finding suggests that teachers are more likely to ‘accuse’ a well-written essay of being produced by AI – which also has some potentially concerning implications in a real-world classroom setting.” 

The study, published in the journal Human Behavior and Emerging Technology, included 69 high school teachers and 140 high school students as participants. The essay topics were similar to ones that are commonly assigned in schools. (One topic, for instance, was: “Why is literature important?”) 

The study also surveyed the participants about their views on ChatGPT. 

Students reported greater optimism than their teachers about the future role of ChatGPT in education, and they rated possible academic integrity violations, such as submitting AI-generated essays as one’s own, less negatively than teachers did. 

Study co-author Riley Cox, a high school student who volunteered as a research assistant on the study, said: “It was exciting to me to watch my classmates and teachers figure out this new technology both from the perspective of a student and a psychology researcher. It was interesting to see that teachers had a lot of worries about ChatGPT that didn’t seem to concern students.” 

As one high school teacher who participated in the study commented, “I think ChatGPT could have some interesting applications in the classroom, but my concerns outweigh any positives. I am worried that we are watching the decline of original thought in our students, as well as their ability to persevere through hard work.” 

The researchers believe their study highlights some of the challenges, as well as the opportunities, that ChatGPT brings to education. 

“We’re on the verge of a major shift in educational practices as high-quality human-like content becomes increasingly available for anyone to use,” said co-author Tal Waltzer, a postdoctoral fellow in Heyman’s lab at UC San Diego. “How exactly we handle this transition raises important ethical considerations. For example, the fact that the paid subscription version of ChatGPT performs better on many standardized tests than the freely available version could exacerbate already existing concerns about equity in education.” 

Heyman and Waltzer plan to continue research in this area “to develop an empirical foundation for best practices regarding the ethical use of AI in secondary education,” Heyman said. They will investigate what kinds of activities enhance learning, to help figure out ways that ChatGPT might be used as a kind of collaborator. 

The research was supported in part by a grant from the National Science Foundation to Waltzer (SPRF-FR# 2104610).