The OpenAI chatbot and the future of higher education

There are many obvious ways to cheat in large (typically 1000+ students) undergraduate classes such as mine, and one of the frustrations one often has to deal with is the fact that while it might be easy to see — and to be relatively certain — that a student has committed academic dishonesty, it’s not always easy to prove that they have done so.

One example I see every year is the struggling student, who has never produced written work meriting more than a mere pass (sometimes through generous grading), somehow finishing a multiple-choice quiz in 5 minutes rather than the expected 30-ish minutes, while also achieving 100% for that quiz.

Yes, of course this is something I can mitigate through larger question banks and so forth, but I long ago resigned myself to focusing on teaching the students who want to learn, rather than the ones who merely want to get a degree. I have also accepted that the time required to prevent or police the sort of cheating that involves students sharing answers with each other isn’t worth the returns.

One might like to think that these students feel a twinge of embarrassment, but they probably don’t, and I can’t really bring myself to blame them for that as much as I would have a decade ago. They are overworked, operate in a world that asks too much of all of us, and are usually too young to know what integrity is, never mind care about it.

For written work, we have solutions like Turnitin, although even my colleagues subvert the value of such tools by not understanding what plagiarism is, resulting in further inefficiencies. You see, what Turnitin does is detect similarities between an essay submitted in my course and its large database of submissions from other universities, as well as the database that is “everything on the public Internet”.

If you misplace a quotation mark, or introduce a typographical error, Turnitin might pick up a similarity that doesn’t exist, or miss one that does. So, it’s pointless to use a “score” from Turnitin as proof of plagiarism. But, because some of my colleagues tell students that “X percent on your Turnitin report is okay”, I often have to deal with upset students who don’t understand why I penalise them for having “only” copied 5% of the words in their submission.

Compounding the problem is a gap in experience. I happen to have been at a university since 1991, and am therefore reasonably well-equipped to spot language that is suspicious in being atypical for the student, or surprisingly technical or astute in comparison to the average 19-year-old’s writing. Many of the people who mark work in courses like mine, however, have only been at a university for 4 years or so. They are also from generations that might not be as fastidious about academic dishonesty as I am, given that their entire lives have been lived in a world that remixes and borrows from other sources, and that (I think, but maybe this is a grumpy old man thing) cares less for the history of ideas than I do.

In short, it’s easy to cheat, and tools like Turnitin can be tricked (sometimes, in ways that I haven’t yet figured out — there were some cases this year that were obviously plagiarised, yet not detected by Turnitin). And of course, one errs on the side of generosity, and one also prioritises remedial/educative over punitive action.

To move on from my nostalgic sentiments: Artificial Intelligence-driven tools like the OpenAI chatbot have changed the landscape in fundamental ways, not only in academia but also in journalism and, more generally, in terms of humans knowing what information to trust, and even what credible information looks like in the first place.

I guess GPT-3 is old news, but playing with OpenAI’s new chatbot is mindblowing. https://t.co/so1TuXMQB0

We’re witnessing the death of the college essay in realtime. Here’s the response to a prompt from one of my 200-level history classes at Amherst

Solid A- work in 10 seconds pic.twitter.com/z1KPxiAc1O
— Corry Wang (@corry_wang) December 1, 2022

In the classroom — particularly classrooms in countries with large socioeconomic disparity, like South Africa — the chatbot also stands to exacerbate any digital divide that might exist between students, in that tech-savvy students who have always had access to the Internet will start using these tools first, often getting away with it, while kids who grew up in a multiple-occupancy shack, sharing one Internet-connected device between them, might carry on struggling to develop and express independent thought, on completely alien topics, in a completely alien setting.

For academic staff, discussion about a “pressure to publish” has been going on forever, because (in South Africa’s case at least) a large proportion of our university funding comes from outputs, where outputs might take the form of journal publications, but also include the headcount of people who get through the system and graduate.

So, just as staff are tempted (and sometimes encouraged) to publish junk in order to fill university coffers, students have their own incentive, which is to pass (and pass well), however that happens. They don’t necessarily have, or at least see, a reason why it’s important to learn, rather than merely pass.

And as noted above, when student numbers are vast, and markers are inexperienced, it’s not difficult to predict that we’ll be seeing many essays written by the OpenAI chatbot (and its successors), and that we will usually not notice. As a friend at my university remarked this morning, of course there will be some embarrassing or easy-to-spot outputs from these AI tools, but I’d imagine that most students would notice those, and simply click a button to generate a more plausible submission for consideration by the overworked (and often under-qualified) marker.

The challenge, in conclusion, is that we’d ideally develop a system that acknowledges that it’s easy to fake competence, but which nevertheless tries to somehow incentivise students to develop the competencies specified in course outcomes. A system that undermines or combats the idea that it’s only “marks” that count, rather than learning something.

But, this challenge collides immediately with the imperative of seeing students through to graduation — not only for subsidy purposes, but also so that they can get on with life with the advantage of a university degree — which means that they need some form of certification of what they have done.

That certification needs marks (or grades) as a comparative indicator of competence, so that they can compete effectively with others, whether for post-graduate places abroad, or simply in the workplace, anywhere.

Thinking about assessment, and how to protect its value from tools like this chatbot, should be a priority in the education sector. (Of course there are many other problems with assessment, but they are beyond the scope of this piece.)

But, as any of us who work in education know, the glacial pace of bureaucratic change means we’ll likely only start thinking about it quite a while from now, and doing something about it far later than that.

By Jacques Rousseau