AI cheat tools are winning. Detection is not the point.



The videos are everywhere and the offer is always the same: let the AI do your homework and you won’t get caught.

According to a New York Times investigation, TikTok and YouTube are now filled with tutorials selling students on two types of tools. Humanizers rewrite AI-generated text so that it no longer reads like a chatbot. Self-writers do something sneakier: they feed words into a document over several hours, faking typos, deletions, and edits to make a finished essay look like a real writing session.

Both are designed to defeat the software teachers use to detect AI.

The same companies sell the disease and the cure.

Here’s the awkward part. Some of the companies that sell detection tools also sell tools that defeat them.

Grammarly, now owned by Superhuman, offers teachers an “authorship” checker that scans a document’s history for signs of AI. The same application will also generate text from scratch, “humanize” it, and rewrite phrases that could trigger a detector. GPTZero, a detector born as a Princeton thesis, can also write a complete article, with citations, in seconds.

The NYT also found that a marketer had created a fake teaching assistant persona on TikTok to market to students.

Jenny Maxwell, who heads up education at Superhuman, was direct about where this leads. The race between detection and evasion is, she said, “ultimately a dead end.” Her summary: “A bigger cat, a bigger mouse.”

The detectors barely work anyway.

Maxwell is right, because the cats aren’t very good.

Researchers at the University of Florida tested the five most popular AI text detectors and found false negative rates as high as 99.6 percent, with a single vocabulary tweak enough to defeat most of them, Digital Trends reported. The tools also return false positives, disproportionately singling out non-native English speakers.

So schools that discipline students on a detector’s verdict alone are on very thin ice. The technology they rely on is, by its own makers’ admission, losing.

From oral exams to internet blackouts

Given this, institutions are improvising, and the responses range from the sensible to the extreme. At the mild end, Harvard professors are leaning more on oral and paper-and-pencil exams, which a chatbot can’t take for you.

At the other extreme is coercion.

To stop cheating on its national medical school entrance exam, India ordered Telegram blocked for several days, The Register reported, after the test was canceled and rescheduled following an alleged leak. More than two million people take this exam for approximately 100,000 places.

Digital rights groups called the shutdown disproportionate and part of a broader pattern of governments cracking down on misuse of AI with very blunt instruments.

The number was always the problem.

Take a step back and the panic about cheating looks like a symptom of something bigger. Schools converted learning into a single number, the grade, a long time ago.

Philosopher C. Thi Nguyen calls this “value capture”: you adopt an external metric and then let it quietly replace what it was supposed to measure. In his book “The Score,” reviewed this week by MIT Technology Review, he points to GPA as the classic case. Students stop chasing understanding and start chasing the grade. It’s Goodhart’s Law in a backpack: when a measure becomes a target, it ceases to be a good measure.

AI is simply the most efficient optimizer ever invented for that purpose. If the goal of the essay is the score, not the thinking, then offloading the thinking is the rational move, even though studies warn that this kind of cognitive offloading lets real skills wither.

An accelerator pedal, no brake

The people building this technology are uneasy too.

Anthropic co-founder Jack Clark told the BBC that the industry “has an accelerator pedal, but it doesn’t have a brake pedal,” noting that Anthropic’s own model now writes most of its code. His company has called for a coordinated brake on frontier AI. Maxwell, on the other hand, argues that denying AI to students is “educational malpractice,” since they will use it at work anyway.

Both things can be true.

The detection arms race cannot be won, and detection was never the real issue. The harder question, the one schools have dodged for a century, is what the grade is actually for. AI did not create that problem. It just made it impossible to keep ignoring. Until someone answers it, the bigger cat will keep chasing the bigger mouse, and the mouse will keep winning.


