A perennial question as technology advances is to what extent it will change or even replace the work traditionally done by humans. From self-checkout at the grocery store to AI’s ability to detect serious illnesses in medical scans, workers in every sector find themselves working alongside tools that can perform parts of their jobs. With the availability of AI tools in the classroom rising, accelerated by the pandemic and showing no signs of slowing down, education is yet another field where professional work is shared with tools such as AI.
We wondered about the role of AI in the specific educational practice of assessing student learning. Because grading student work and giving feedback takes so much time, many writing teachers are unable to assign longer writing tasks, and most students wait a long time to receive grades and feedback. AI-assisted grading therefore has significant potential to save time and support learning. So we wondered if AI grading and feedback systems could really help students as much as teachers.
“Teachers have the ability to say, ‘What were you trying to say?’ So instead of trying to understand what the AI was trying to say, we’re trying to fix what already exists.”
We recently completed an evaluation of an AI-powered platform that allows middle school students to draft, submit, and revise argumentative essays in response to pre-curated essay prompts. Each time a student clicks Submit, they receive proficiency-based scores (1 to 4) in each of four writing dimensions (claims and focus, support and evidence, organization, and language and style), along with comments tailored to each dimension that offer observations and suggestions for improvement. The scores and comments are all generated by the AI as soon as the student submits.
To compare the AI scores and feedback with those provided by real teachers, we held an in-person meeting with 16 middle school writing teachers who had used the platform with their students during the 2021-22 school year. After calibrating on the project rubric to ensure that the scores and suggestions were understood and applied consistently, each teacher was randomly assigned 10 essays (none from their own students) to score and provide feedback on. This yielded a total of 160 teacher-rated essays that could be directly compared to the scores and feedback given by the AI for the same essays.
How were the teachers’ scores similar to or different from the scores given by the AI?
We found that, on average, teachers scored essays lower than the AI did, with large differences in every dimension except claims and focus. For the overall score across all four dimensions (minimum 4, maximum 16), the average teacher score for these 160 essays was 7.6, while the average AI score for the same set of essays was 8.8. Looking at specific dimensions, Figure 1 shows that teachers and the AI tended to agree on high-scoring (4) and low-scoring (1) essays in the claims and focus and support and evidence dimensions, but disagreed on essays in the middle of the scale. In the organization and language and style dimensions, on the other hand, teachers were much more likely to score essays a 1 or 2, whereas the AI was more likely to give a 3; its scores were spread across 1 to 4, with more essays at 3 or 4.
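The arithmetic behind this comparison can be illustrated with a minimal sketch. It assumes each rating is stored as four dimension scores from 1 to 4; the field names, the helper functions, and the three toy essays below are hypothetical and are not the study’s data or the platform’s method.

```python
from statistics import mean

# Hypothetical keys for the four rubric dimensions named in the article.
DIMENSIONS = ("claims_focus", "support_evidence", "organization", "language_style")


def overall_score(rating):
    """Sum the four 1-4 dimension scores into an overall score (4 to 16)."""
    for dim in DIMENSIONS:
        if not 1 <= rating[dim] <= 4:
            raise ValueError(f"{dim} score must be between 1 and 4")
    return sum(rating[dim] for dim in DIMENSIONS)


def compare(teacher_ratings, ai_ratings):
    """Compare paired teacher and AI ratings of the same essays."""
    teacher_totals = [overall_score(r) for r in teacher_ratings]
    ai_totals = [overall_score(r) for r in ai_ratings]
    return {
        "teacher_mean": mean(teacher_totals),
        "ai_mean": mean(ai_totals),
        # Positive gap means the AI scored higher than the teacher.
        "mean_gap": mean(a - t for t, a in zip(teacher_totals, ai_totals)),
        # Share of essays where both raters gave the same score, per dimension.
        "exact_agreement": {
            dim: mean(1 if t[dim] == a[dim] else 0
                      for t, a in zip(teacher_ratings, ai_ratings))
            for dim in DIMENSIONS
        },
    }


def make_rating(scores):
    """Build a rating dict from four scores given in DIMENSIONS order."""
    return dict(zip(DIMENSIONS, scores))


# Toy, made-up ratings for three essays (illustration only).
teacher = [make_rating(s) for s in ([1, 1, 1, 1], [2, 2, 1, 2], [4, 4, 3, 3])]
ai = [make_rating(s) for s in ([1, 1, 2, 3], [3, 3, 3, 3], [4, 4, 3, 4])]

result = compare(teacher, ai)
print(result)
```

Pairing each teacher rating with the AI rating of the same essay, rather than comparing two unrelated averages, is what makes a direct comparison like the one described above possible.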

How were the comments written by the teachers similar to or different from those given by the AI?
During the meeting with the 16 teachers, we gave them the opportunity to discuss the scores and feedback they had given to their 10 essays. Even before they reflected on their specific essays, a common observation we heard was that when they had used the program in their own classrooms the previous year, the majority of students needed help interpreting the comments the AI gave. For example, students often read the comments but reported not knowing what they were being asked to do to improve their writing. According to the teachers, therefore, one immediate difference was their ability to phrase comments in developmentally appropriate language that suited their students’ needs and abilities.
“During the retrospective, we discussed how AI is good, even in the comments/feedback. The kids here now are used to more direct and honest feedback. It’s not about stroking the ego, it’s about solving problems, which is why you don’t always need two stars for one wish. Sometimes you need to get straight to the point.”
Another difference that emerged was that teachers focused on the essay as a whole: the flow, the voice, whether it was merely a summary or built an argument, whether the evidence fit the argument, and whether it all made sense together. Teachers reasoned that their tendency to give 2s in the argument-focused dimensions of claims and focus and support and evidence stemmed from their ability to see the entire essay in a way this AI cannot, since many AI systems are trained at the sentence level rather than on the whole essay.
The teachers’ harsher evaluation of organization similarly stemmed from their ability, unlike the AI, to grasp the order and flow of the entire essay. For example, teachers shared that while the AI can spot transition words, coach students to use more of them, and treat their use as evidence of good organization, a teacher can see whether a transition actually flows or was simply dropped into a set of incoherent sentences. In the language and style dimension, teachers again pointed out ways in which the AI is easier to trick, for example with a string of seemingly sophisticated vocabulary that impresses the AI but that a teacher would see as a series of words that does not add up to a sentence or an idea.
Can AI help teachers grade?
Appropriate assessment of student work is a time-consuming and very important part of teaching, especially when students are learning how to write. Steady practice with rapid feedback is necessary for students to become confident and solid writers, but most teachers lack the planning and grading time, and have too many students, to assign routine or lengthy writing while maintaining anything like work-life balance or sustainability in their careers.
The potential for AI to alleviate some of this burden is very important. Although the initial findings of this study show that teachers and AI approach assessment in somewhat different ways, we believe that AI could actually help teachers with grading if AI systems could be trained to take a more holistic view of essays, as teachers do, and to write feedback in language that is developmentally and contextually appropriate enough for students to act on the comments on their own. We believe that improving AI in these areas is a worthwhile effort, both to reduce the grading burden on teachers and, as a result, to give students more frequent opportunities to write, combined with immediate and informative feedback, so they can grow as writers.
