
Stop calling it cheating.

Stop calling it “cheating”. Take a step back and consider why you set that assignment in that way, and what you hoped to achieve by it. There might be a better way.

They’ve used AI to cheat!

Now if you’ve ever said this, or even thought it, then some things are clear.

  • One: you have set a piece of work, usually called an assignment, to be completed outside the classroom.
  • Two: marking or grading the submission is important to you: perhaps the grade needs to be recorded, reported to stakeholders, or counted towards some certificate of achievement.
  • And three: you expected (or hoped) that the students would complete the work independently, using only what they know and perhaps some “approved” source material, such as a textbook or a website, so that what they handed in accurately reflected what they knew.

In this article I hope to encourage you to question all three of the above assumptions before setting an assignment, and a fourth besides: that you need to grade that assignment at all. It is only through such introspection that we can address the complaint that our assessments have become unreliable because of “cheating” with AI.

Why assess?

In this presentation on assessment, Tom Sherrington explains that assessment serves at least two different purposes: feedback and reporting. Formative assessment provides feedback to students and teachers, informing the teaching and learning process; an assessment designed to report progress to stakeholders serves that purpose, but is much less likely to have an impact on future learning.

We must therefore consider why we are assessing, and ensure the vast majority of our assessments are of the formative variety, giving students insights they can use to answer the question: “What do I need to do in order to achieve my goals?”

Formative assessment helps the teacher too, showing them where to direct their efforts in instruction and curriculum design. If the data shows that a topic is poorly understood, we can re-teach it; if, on the other hand, the students have grasped it early, we can move on more quickly.

When teachers complain that students have used generative AI (GAI) tools such as ChatGPT or Bard, what they usually mean is that some piece of creative work being used as a summative assessment appears to be the work of a GAI, and is therefore of little validity as a measure of progress. But this suggests an over-reliance on the validity of such assessments in the first place: “cheating” was entirely possible before GAI, in the form of copying, plagiarism and essay mills. And the idea that an essay completed without any such assistance would be an entirely valid, reliable measure of a student’s abilities is itself flawed. All assessment is an unreliable proxy for what we would really like to know, which is: “What have they retained about this topic (domain)?”

Someone conducting an educational assessment is generally interested in the ability of the result of the assessment to stand as a proxy for some wider domain (emphasis mine).

Dylan Wiliam

Generally, these complaints about cheating arise only around summative assessment: when the teacher needs to mark or grade the work because the result will be reported to stakeholders or will count towards an award (such as a diploma or certificate). But as we heard above from Sherrington and Wiliam, this type of assessment has limited validity and little impact on future learning.

Why the essay is dead

[Teachers should] assume that 100 percent of their students are using ChatGPT and other generative A.I. tools on every assignment, in every subject, unless they’re being physically supervised inside a school building.

Kevin Roose in the New York Times, 24th August 2023

It’s true: the independent essay or other creative written assignment is dead as a valid (reliable) measure of what students have learned. Even if you are testing different forms of knowledge – declarative knowledge as well as practical knowledge (skills) and conditional knowledge (judgement) – if the means of demonstrating that learning is an essay completed outside the classroom, you cannot rely on the results, given how easy GAI is to use on top of the more traditional methods of “cheating” mentioned above. Neither can we rely on so-called AI detectors: they produce too many false positives and false negatives, and students can learn to game the detector, or indeed get GAI to do so!

But you may have noticed that I have made the same point a few times now: this is only an issue if we need a reliable, summative assessment for the purposes of reporting to stakeholders or awarding a certificate. How many of your assignments genuinely have to be used in this way? Could you set a supervised assessment in class once per term, get enough data from that to feed your reporting systems, and switch all your other assignments to formative assessment that truly moves the needle of attainment?

[Image: vintage line drawing of a human head labelled with traits such as benevolence and cautiousness, of the kind once used by phrenologists]

All assessment measures a flawed proxy of what is inside students’ heads.
Image credit: rawpixel.com

Moving to formative assessment

In the UK, compulsory schooling (the equivalent of K-12) is assessed with terminal exams at 16 and at 18. We do not have a high-school diploma, a grade point average (GPA) or a tradition of graded essays and term papers, so it is easier in the UK to favour formative assessment. Although schools require performance data at least once per term, how that data is gathered in each subject is often a matter for the subject leader.

As a Head of Computing I would usually capture my data through a mixture of auto-marking tests – making good use of multiple-choice questions (MCQs) – and a short written test conducted in class maybe once per term. Students on GCSE courses (14-16, years 10 to 11) would sit two “mock exams”, in the summer of Y10 and around Christmas in Y11. A-level students (16-18, years 12 and 13) would sit a written test at the end of each unit, so around 20 tests across the two years. I would set lots of independent work to be completed outside the classroom, but crucially none of this would be marked or graded beyond a measure of effort – did they put sufficient work in?
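
As an aside, if your platform doesn’t auto-mark for you, a few lines of code will do it. Here is a minimal sketch in Python – the answer key and the student responses are invented examples, not from any real test – showing how the list of wrongly answered questions doubles as formative data about what to re-teach.

    # A minimal sketch of auto-marking an MCQ test (all data invented).
    ANSWER_KEY = {"Q1": "B", "Q2": "D", "Q3": "A", "Q4": "C"}

    def mark_mcq(responses):
        """Return the score and the questions answered incorrectly."""
        wrong = [q for q, correct in ANSWER_KEY.items()
                 if responses.get(q, "").strip().upper() != correct]
        return len(ANSWER_KEY) - len(wrong), wrong

    # One (invented) student's responses:
    score, to_revisit = mark_mcq({"Q1": "B", "Q2": "A", "Q3": "A", "Q4": "C"})
    print(f"Score: {score}/{len(ANSWER_KEY)}; re-teach: {to_revisit}")

Aggregated over a class, those “wrong” lists show at a glance which topics need re-teaching – exactly the formative use of assessment data described above.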

But importantly, I would use lessons to deliver new material, yes, but also to check for understanding, to help learners see what they need to do next, and to apply formative assessment techniques that really help them make progress. Let the students assess themselves against criteria you set (self-assessment) or mark each other’s work (peer assessment).

Or, in a practical programming lesson where they are all solving a series of problems, I would walk the room helping them, and they would help each other. Or, if it’s a GCSE or A-level class and I’ve set an exam question such as “How will robotics affect the world of work?”, I would give them ten minutes, then choose some students’ work to critique as a class, then give them more time to improve their own work: rinse and repeat. Without computers, a teacher visualiser is all you need, and this technique is explained here.

The “ungrading” movement

Ungrading is an approach that deviates from traditional grading systems, favouring a more feedback-centric model. Instead of focusing on scores or letter grades, the emphasis shifts towards providing detailed, constructive feedback, encouraging students to reflect on their learning and grow from their experiences.

Leon Furze

This movement away from graded assignments in the US sounds a lot like what goes on in many UK schools already, and I recommend US readers of this blog check out the link above, or Jesse Stommel’s blog post here. The case for ungrading is that a focus on grades drives students to engage in academic dishonesty. 

When the primary aim of education shifts towards attaining higher grades rather than gaining knowledge and honing skills, students are more likely to turn to GAI for completing their assignments.

Emily Pitts Donahoe

Indeed, for students with perhaps 20 essays each term – many with part-time jobs or caring responsibilities, and a GPA to maintain – using GAI is not “cheating”; it’s sandbagging their future. And as I wrote in my previous blog on GAI, ChatGPT can level the playing field for students with disabilities, or assist learners for whom English is an additional language.

So wherever you teach, moving away from graded assignments removes one of the drivers of “cheating”. If you can deliver sufficient reportable data with fewer graded assignments, then you will get more authentic work from the students.

Feedback and motivation

I want to go back to Tom Sherrington’s slides and revisit the purpose of assessment. Remember, if you’re grading, you’re not giving much formative feedback.

A component of learning, as students build their schema for any given knowledge domain, is a metacognitive process that drives motivation and intentionality: a knowledge of self – what do I know? What do I need to know/do/focus my attention and effort on in order to achieve goals? 

Tom Sherrington

Once we start giving feedback instead of grades, showing the learners that we care about their progress, then chances are they will care more about the process too. Assignments will become genuine expressions of what they can do, and they will value your feedback and become more motivated to do their best work. Not always, and not all students, but we will move the needle if we give it our best shot.

By mraharrisoncs

Freelance consultant, teacher and author, professional development lead for the NCCE, CAS Master Teacher, Computer Science lecturer.