
Claude is a filthy cheater #programming #coding #softwareengineer #ai #artificialintelligence

@thecodingsloth
146.9K views · 9.8K likes · 0:35 · EN · Jun 3, 2026

Transcript

Cheating is bad, and this AI model cheated. "Splash news, yo!" Anthropic just dropped Claude Opus 4.8, and it's now ranked the number one AI model for programming. Except not really. The programming benchmarks for AI models are kind of broken. They're contaminated, too easy, and unreliable. The answers are public, and the automated graders they were using were wrong on about 32% of the trials. The last Claude model literally cheated on a benchmark by running git log to get the answers. If the old model cheated, the new one probably did too. So a startup called Data Curve built a new benchmark that models can't cheat on, and now, all of a sudden, ChatGPT is better than this new Claude model. Hmm. If you want more news like this, you can check out my free newsletter.